<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Operations on Letters From The Wild Side</title><link>/categories/operations/</link><description>Recent content in Operations on Letters From The Wild Side</description><generator>Hugo -- gohugo.io</generator><language>en-uk</language><lastBuildDate>Tue, 08 Jun 2021 00:00:00 +0000</lastBuildDate><atom:link href="/categories/operations/index.xml" rel="self" type="application/rss+xml"/><item><title>ML Engineering</title><link>/p/ml-engineering/</link><pubDate>Tue, 08 Jun 2021 00:00:00 +0000</pubDate><guid>/p/ml-engineering/</guid><description>&lt;p&gt;Last modified: Dec-14-2021, 10:35PM +08&lt;/p&gt;
&lt;h1 id="from-manual-to-semi-automatic"&gt;From Manual To Semi-Automatic
&lt;/h1&gt;&lt;p&gt;Before the advent of the concept &amp;ldquo;MLOps&amp;rdquo;, getting a single machine learning (ML) model to
production was tedious and laborious. Every single detail pertaining to the inputs,
model server, training and inference had to be defined explicitly. This was to ensure that the
input tensors met the strict requirements for being processed by user-defined
functions.&lt;/p&gt;
&lt;p&gt;To serve a single model, these predefined configurations have to be under version
control, as the ML field and its software ecosystem are accelerating at a near-exponential
pace. In addition to the model, version control has to be applied to the training data
as well as the software infrastructure that is used to host the model. A working
production pipeline is like a moving train loading and offloading compartments to keep up
with cutting-edge development.&lt;/p&gt;
&lt;p&gt;After the release of the &lt;a class="link" href="https://ieeexplore.ieee.org/abstract/document/5206848" title="ImageNet"
 target="_blank" rel="noopener"
 &gt;ImageNet&lt;/a&gt; dataset,
a tremendous effort went into pushing accuracy on it toward, and eventually past, the human baseline. In the early
2010s, the prior
&lt;a class="link" href="https://kr.nvidia.com/content/tesla/pdf/machine-learning/imagenet-classification-with-deep-convolutional-nn.pdf" title="baseline"
 target="_blank" rel="noopener"
 &gt;baseline&lt;/a&gt; was exceeded with the combination of readily available data, open-source frameworks and modern computing
resources that can be bought off the shelf. However, proficiency with these resources was restricted to experts
and those within the technical community. Developer sanity was largely dependent on up-to-date documentation,
or on comments within the source where documentation was absent.&lt;/p&gt;
&lt;p&gt;In the mid-2010s, a number of deep learning (DL) frameworks were designed to unify the
common primitives for building DL models. These include
&lt;a class="link" href="https://www.tensorflow.org/" title="TensorFlow"
 target="_blank" rel="noopener"
 &gt;TensorFlow&lt;/a&gt;, &lt;a class="link" href="https://keras.io/" title="Keras"
 target="_blank" rel="noopener"
 &gt;Keras&lt;/a&gt;,
&lt;a class="link" href="https://pytorch.org/" title="PyTorch"
 target="_blank" rel="noopener"
 &gt;PyTorch&lt;/a&gt;, &lt;a class="link" href="https://mxnet.apache.org/versions/1.8.0/" title="mxnet"
 target="_blank" rel="noopener"
 &gt;Apache MXNet&lt;/a&gt; and many others.&lt;/p&gt;
&lt;p&gt;To tackle the problem of productionizing models, one of the solutions explored was the use
of &lt;a class="link" href="https://www.docker.com/" title="Docker"
 target="_blank" rel="noopener"
 &gt;Docker&lt;/a&gt; containers to package both the dependencies and the actual model as lightweight
components that can be easily shared through a public repository. This approach greatly
democratized the deployment of DL models to common hosting targets such as public clouds
or in-house servers.&lt;/p&gt;
&lt;p&gt;The natural progression from using Docker containers was to add shell scripts,
cron jobs and triggers that automate the entire ML pipeline. Docker-based
workflows gave developers access to version-controlled resources, both locally on their laptops
and globally across different time zones.&lt;/p&gt;
&lt;h2 id="components-based-workflow"&gt;Components-Based Workflow
&lt;/h2&gt;&lt;p&gt;For organizations that need to scale to millions of containers in production, the
de facto solution is a container orchestration platform such as
&lt;a class="link" href="https://kubernetes.io/" title="Kubernetes"
 target="_blank" rel="noopener"
 &gt;Kubernetes&lt;/a&gt;. The platform allows hundreds or even
thousands of engineers to collaborate on different levels of a complex ML system. This
ranges from the low-level implementation of hardware drivers to the high-level design of
user interfaces such as drag-and-drop block diagrams.&lt;/p&gt;
&lt;p&gt;The low-code or no-code approach is an industry effort to lower the cognitive strain
of designing complex ML models. The design and implementation of mission-critical models
already require non-trivial engineering effort, so why should their deployment be unnecessarily complex?&lt;/p&gt;
&lt;p&gt;Behind the scenes of the components-based workflow lie Kubernetes applications such as
&lt;a class="link" href="https://argoproj.github.io/argo-workflows/" title="Argo Workflows"
 target="_blank" rel="noopener"
 &gt;Argo Workflows&lt;/a&gt;,
&lt;a class="link" href="https://tekton.dev/" title="Tekton"
 target="_blank" rel="noopener"
 &gt;Tekton&lt;/a&gt; and many others. These applications specify the
steps in an ML pipeline as containers that can be spun up sequentially or in parallel. These
steps can be expressed as a directed acyclic graph (DAG), which can be version controlled
and compiled for export to different hardware architectures.&lt;/p&gt;
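&lt;p&gt;As a minimal sketch of the DAG idea, and not of how Argo Workflows or Tekton are implemented, the Python snippet below uses the standard-library &lt;code&gt;graphlib&lt;/code&gt; module (Python 3.9 or later) to group pipeline steps into batches that could run in parallel. The step names and the &lt;code&gt;parallel_batches&lt;/code&gt; helper are hypothetical and chosen only for illustration.&lt;/p&gt;

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each step maps to the set of steps it depends on.
PIPELINE = {
    "ingest": set(),
    "validate": {"ingest"},
    "featurize": {"validate"},
    "train_a": {"featurize"},
    "train_b": {"featurize"},
    "evaluate": {"train_a", "train_b"},
}


def parallel_batches(dag):
    """Group steps into batches whose dependencies are all satisfied."""
    sorter = TopologicalSorter(dag)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    batches = []
    while sorter.is_active():
        # Every step returned here has no unfinished dependency left.
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        # Assumption: the whole batch finishes before the next one starts.
        sorter.done(*ready)
    return batches


if __name__ == "__main__":
    for index, batch in enumerate(parallel_batches(PIPELINE)):
        print(index, batch)
```

&lt;p&gt;In this simplified sketch, each batch stands in for a group of containers that an orchestrator could start together, with the two training steps landing in the same batch. A real workflow engine schedules each step as soon as its own dependencies finish rather than waiting for a whole batch.&lt;/p&gt;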
&lt;p&gt;Initially, we had manually designed, hand-tuned and hand-crafted models without A/B testing,
because the deployment of new models simply could not keep up with the development of a core
application (a 4 to 6 week cycle). Now we can churn out dozens of models daily in parallel, set to trigger on
the arrival of new data or on adjacent or overlapping time windows. The models that
pass evaluation are then uploaded to a model repository for further downstream
processes.&lt;/p&gt;
&lt;h2 id="cautionary-tales"&gt;Cautionary Tales
&lt;/h2&gt;&lt;p&gt;The majority of Kubernetes applications are rather new to the scene, and many more are emerging
to solve critical issues pertaining to storage, security, networking and other
peripherals. Choosing the right software stack requires an in-depth technical review of
existing solutions with respect to correctness, latency and cost.&lt;/p&gt;
&lt;p&gt;At the SME scale, a single competent ML engineer is the bare minimum for a
sufficiently complex ML system, one that serves as many requests as the procured CPU cores can handle with
default settings.&lt;/p&gt;
&lt;p&gt;At the enterprise scale, ML engineering is not well suited to being a one-person job. It is
better spread across different teams, each a subject matter expert in its own domain.&lt;/p&gt;
&lt;h2 id="whats-next"&gt;What&amp;rsquo;s Next
&lt;/h2&gt;&lt;p&gt;I am currently working on a regression pipeline, targeting TensorFlow.js models to be deployed
in a Flutter application hosted by Firebase. The pipeline is designed to be agnostic to
the regression problem domain. Future regression tasks include cryptocurrency market size,
health monitoring, renewable energy forecasts and EV tank-to-wheel
efficiency (70~90%).&lt;/p&gt;
&lt;p&gt;Other pipelines include tasks under the pillars of ML:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;classification&lt;/li&gt;
&lt;li&gt;density-estimation&lt;/li&gt;
&lt;li&gt;dimensionality-reduction&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Pipelines for generative models are on the roadmap as well.&lt;/p&gt;
&lt;p&gt;Incorporating accelerators such as GPUs or TPUs into the pipelines, to further parallelize existing
workflows, is also planned.&lt;/p&gt;
&lt;hr&gt;</description></item></channel></rss>