<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Classification on Letters From The Wild Side</title><link>/tags/classification/</link><description>Recent content in Classification on Letters From The Wild Side</description><generator>Hugo -- gohugo.io</generator><language>en-uk</language><lastBuildDate>Sun, 27 Jun 2021 00:00:00 +0000</lastBuildDate><atom:link href="/tags/classification/index.xml" rel="self" type="application/rss+xml"/><item><title>Flutter Web, TensorFlow and PyTorch Project</title><link>/p/flutter-web-tensorflow-and-pytorch-project/</link><pubDate>Sun, 27 Jun 2021 00:00:00 +0000</pubDate><guid>/p/flutter-web-tensorflow-and-pytorch-project/</guid><description>&lt;p&gt;Last modified: May-29-2022, 02:40PM +08&lt;/p&gt;
&lt;h1 id="progressive-web-app"&gt;Progressive Web App
&lt;/h1&gt;&lt;p&gt;Progressive web app (PWA) is a term that has become increasingly common in web
searches in 2021. To date, there have been over &lt;a class="link" href="https://medium.com/flutter/announcing-flutter-2-2-at-google-i-o-2021-92f0fcbd7ef9" target="_blank" rel="noopener"
 &gt;200,000&lt;/a&gt;
applications in the Play Store built using Flutter. At its core, a PWA is application software
that is hosted on a web server and accessed through a client such as a browser.&lt;/p&gt;
&lt;p&gt;The PWA paradigm, an alternative to traditional app development, is made possible by
ongoing work on a large number of modern web APIs such as Cache, WebGL, WebAssembly,
Bluetooth, File System, IndexedDB, Service Workers and many more.&lt;/p&gt;
&lt;p&gt;Over the last few decades, cybersecurity has been increasingly recognized as critical
infrastructure, and the Covid-19 pandemic further accelerated the shift towards the digital
landscape in an attempt to return to normalcy. The &lt;a class="link" href="https://go.crowdstrike.com/rs/281-OBQ-266/images/Report2021GTR.pdf" target="_blank" rel="noopener"
 &gt;2021 Global Threat
Report&lt;/a&gt; by
&lt;a class="link" href="https://www.crowdstrike.com/" target="_blank" rel="noopener"
 &gt;CrowdStrike&lt;/a&gt; provides details on how adversaries exploit
weaknesses present in present-day business and government infrastructures. Security is now
a first-class citizen in PWAs, as communications and/or data must be served over TLS
connections.&lt;/p&gt;
&lt;p&gt;A PWA looks, feels and navigates like any other web page with nested URLs, or it can
function as a &lt;a class="link" href="https://en.wikipedia.org/wiki/Single-page_application" target="_blank" rel="noopener"
 &gt;single page
application&lt;/a&gt;. It is also installable
and executes using the browser runtime. Additional functionality includes
the ability to work offline and to access device hardware, e.g., camera,
microphone or GPU(s), that is traditionally available only to native applications. The
app developer is also able to embed a PWA within a web page or vice versa using WebView on
mobile, resulting in a hybrid framework.&lt;/p&gt;
&lt;p&gt;A PWA is not limited to mobile platforms, even though there is increasing adoption on
&lt;a class="link" href="https://github.com/flutter-tizen/flutter-tizen" target="_blank" rel="noopener"
 &gt;smart watches&lt;/a&gt;. It is also compatible
across platforms, and looks beautiful and responsive at different resolutions and/or orientations.
It does this by treating different resolutions and platforms as multiple independent states,
resulting in a single codebase, which greatly streamlines rapid iteration of feature
development.&lt;/p&gt;
&lt;p&gt;Because a PWA is browser and platform independent, all you need is a stable internet
connection, which enables high interactivity, high performance (FPS) and low-latency responses
for simple to medium-complexity use cases. A PWA can be configured to work with any input type
including touch, mouse, keyboard, audio or gestures.&lt;/p&gt;
&lt;p&gt;The PWA paradigm trades native performance for flexibility.&lt;/p&gt;
&lt;h2 id="flutter-for-the-web"&gt;Flutter For The Web
&lt;/h2&gt;&lt;p&gt;A PWA framework in the spotlight for web development is &lt;a class="link" href="https://github.com/flutter/flutter" target="_blank" rel="noopener"
 &gt;Flutter Web&lt;/a&gt;, which
hit the stable
&lt;a class="link" href="https://medium.com/flutter/flutter-web-support-hits-the-stable-milestone-d6b84e83b425" target="_blank" rel="noopener"
 &gt;milestone&lt;/a&gt; in 2021.&lt;/p&gt;
&lt;p&gt;However, it takes considerable cognitive effort to navigate complex low-level APIs for
intermediate and advanced usage. As with any new and exciting framework, Flutter attracts
droves of developers who implement their own Flutter ports of existing applications, only
to be met with lacking &lt;a class="link" href="https://github.com/flutter/flutter/issues/69315" target="_blank" rel="noopener"
 &gt;documentation&lt;/a&gt;
and/or over-simplified examples that do not translate to real-world use cases.&lt;/p&gt;
&lt;p&gt;Usability issues aren&amp;rsquo;t unique to Flutter Web. Coming over from
&lt;a class="link" href="https://www.tensorflow.org/" target="_blank" rel="noopener"
 &gt;TensorFlow&lt;/a&gt; 1.x,
I saw intermediate and advanced usage hit the same kind of brick wall once the
developer crossed the novice speed limit. With TensorFlow 2.x, the engineering team
adopted &lt;a class="link" href="https://keras.io/" target="_blank" rel="noopener"
 &gt;Keras&lt;/a&gt; as their high-level API, with an emphasis on
progressive disclosure of complexity that greatly improved usability.&lt;/p&gt;
&lt;p&gt;In my neutral opinion, TensorFlow 2.x and &lt;a class="link" href="https://snapcraft.io/" target="_blank" rel="noopener"
 &gt;Snapcraft&lt;/a&gt; serve as good
starting points for communicating user/reference guides, targeting different expertise
levels. As such, I have a newfound appreciation for well-communicated technical documentation.
As someone starting with zero experience in web development and web technologies, Flutter Web
represents an enormous challenge with a tremendous investment in cognitive effort.
Previously, &lt;a class="link" href="https://streamlit.io/" target="_blank" rel="noopener"
 &gt;Streamlit&lt;/a&gt; was my go-to for rapid experimentation.
Jumping from Streamlit to full-fledged Flutter Web is akin to bungee jumping in the Grand Canyon,
a straight plunge to rock bottom.&lt;/p&gt;
&lt;p&gt;Flutter Web is not recommended for new, aspiring web developers with limited time, as
API implementations such as &lt;a class="link" href="https://medium.com/flutter/learning-flutters-new-navigation-and-routing-system-7c9068155ade" target="_blank" rel="noopener"
 &gt;Navigator
2.0&lt;/a&gt;
can get low-level and filled with boilerplate for intermediate and advanced use cases. There is
significant effort in reviewing third-party alternatives, where several packages replicate
similar use cases for complex APIs. Due to the complexity of the modern network of web technologies
and native platform APIs, community contributions are in great need.&lt;/p&gt;
&lt;p&gt;It is also this patchwork of volunteers and industry that allows a PWA built with
Flutter Web to exhibit near-native performance across different platforms and modern
&lt;a class="link" href="https://www.w3.org/" target="_blank" rel="noopener"
 &gt;W3C&lt;/a&gt;-compliant browsers. A Flutter Web PWA just works: there is no need for
app stores and no hard requirement to download or install any executable.&lt;/p&gt;
&lt;p&gt;The real test for the Flutter framework is emulating &lt;a class="link" href="https://www.wechat.com/" target="_blank" rel="noopener"
 &gt;WeChat&lt;/a&gt;,
which serves over 1 billion users and represents a super app, housing smaller apps within
its ecosystem.&lt;/p&gt;
&lt;h2 id="modern-browser-as-a-general-purpose-computing-platform"&gt;Modern Browser As A General Purpose Computing Platform
&lt;/h2&gt;&lt;p&gt;The browser has evolved from fetching web pages and reading emails to crunching
computation in a secure and sandboxed environment. Modern browsers have greatly enhanced productivity and
entertainment with a plugin ecosystem and a growing body of &lt;a class="link" href="https://developer.mozilla.org/en-US/docs/Web/API" target="_blank" rel="noopener"
 &gt;Web APIs&lt;/a&gt;.
A general-purpose modern browser represents an international community effort in pursuit of
a fair, open, privacy-preserving and highly accessible tool. Recent functionality
includes programming sandboxes (IDEs), screen casting and machine learning powered tools such as
autocomplete search.&lt;/p&gt;
&lt;h2 id="symbiosis-of-flutter-and-tensorflow"&gt;Symbiosis Of Flutter And TensorFlow
&lt;/h2&gt;&lt;p&gt;As a tinkerer, I have an itch to satisfy after witnessing the exponential advancement in modern
technologies. Hence the inspiration to create a tool that&amp;rsquo;s designed to initiate
creative and/or problem solving processes, replacing the user&amp;rsquo;s cognitive inertia with
productive work. This project came to fruition thanks to the well-defined interfaces of
public-facing APIs in seemingly unrelated fields (Flutter Web, TensorFlow,
&lt;a class="link" href="https://www.tensorflow.org/js" target="_blank" rel="noopener"
 &gt;TensorFlow.js&lt;/a&gt;,
&lt;a class="link" href="https://www.tensorflow.org/tfx" target="_blank" rel="noopener"
 &gt;TFX&lt;/a&gt;) across different languages (Python, C/C++/CUDA, JavaScript/TypeScript, Dart).&lt;/p&gt;
&lt;p&gt;Initial machine learning (ML) models were written with TFX pipelines, and the SavedModel
outputs were then converted to JSON format to be compatible with the browser. Separate
JavaScript/TypeScript scripts contain the logic for loading the converted browser-compatible
models and handling inference requests.&lt;/p&gt;
&lt;p&gt;These inference scripts are executed as callbacks from Dart classes upon accepting user
inputs in the app UI. For low complexity models that have just a single parameter and
accept a single input, response times are &amp;lt;1 second. For medium complexity models that
have multiple parameters and accept a single input, response times are also &amp;lt;1 second. For
high complexity models that require loading from content delivery networks (CDNs) and
accept open-ended inputs such as text, audio, image or video, response times range
from 3 to 10 seconds.&lt;/p&gt;
&lt;p&gt;Preliminary exploration suggests that these technologies are fully compatible and that
the advanced capabilities of the humble web browser are worth further investment.&lt;/p&gt;
&lt;h2 id="whats-next"&gt;What&amp;rsquo;s Next?
&lt;/h2&gt;&lt;p&gt;The roadmap includes expanding the set of low, medium and high complexity models under
strict performance restrictions, extending the application to process text, audio, image and video
data efficiently, and exploring different problem domains in the related fields of computer vision,
natural language processing and machine-generated content.&lt;/p&gt;
&lt;h2 id="summary"&gt;Summary
&lt;/h2&gt;&lt;p&gt;General purpose ML toolbox that is cross-platform and readily accessible through the internet.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;PWAs will continue to proliferate due to their flexibility&lt;/li&gt;
&lt;li&gt;Flutter Web hits stable milestone for production use&lt;/li&gt;
&lt;li&gt;Modern W3C-compliant browser with exciting APIs&lt;/li&gt;
&lt;li&gt;Flutter + machine learning frameworks = UI meets AI&lt;/li&gt;
&lt;li&gt;Roadmap&lt;/li&gt;
&lt;/ul&gt;
&lt;hr&gt;</description></item><item><title>ML Engineering</title><link>/p/ml-engineering/</link><pubDate>Tue, 08 Jun 2021 00:00:00 +0000</pubDate><guid>/p/ml-engineering/</guid><description>&lt;p&gt;Last modified: Dec-14-2021, 10:35PM +08&lt;/p&gt;
&lt;h1 id="from-manual-to-semi-automatic"&gt;From Manual To Semi-Automatic
&lt;/h1&gt;&lt;p&gt;Before the advent of the concept &amp;ldquo;MLOps&amp;rdquo;, getting a single machine learning (ML) model to
production was tedious and laborious. Every single detail pertaining to the inputs,
model server, training and inference had to be defined explicitly. This was to ensure the
input tensors followed a strict format so that they could be processed by user-defined
functions.&lt;/p&gt;
&lt;p&gt;To serve a single model, these predefined configurations have to be under version
control, as the ML field and software ecosystem are advancing at a near-exponential
pace. In addition to the model, version control has to be applied to the training data
as well as the software infrastructure that is used to host the model. A working
production pipeline is like a moving train loading and offloading compartments to keep up
with cutting-edge development.&lt;/p&gt;
&lt;p&gt;After the release of the &lt;a class="link" href="https://ieeexplore.ieee.org/abstract/document/5206848" title="ImageNet"
 target="_blank" rel="noopener"
 &gt;ImageNet&lt;/a&gt; dataset,
there was tremendous effort poured into surpassing the human baseline. In the early
2010s, deep convolutional networks set a new
&lt;a class="link" href="https://kr.nvidia.com/content/tesla/pdf/machine-learning/imagenet-classification-with-deep-convolutional-nn.pdf" title="baseline"
 target="_blank" rel="noopener"
 &gt;baseline&lt;/a&gt; with the combination of readily available data, open-source frameworks and modern computing
resources that can be bought off the shelf. However, proficiency with these resources was restricted to experts
and those within the technical community. Developer sanity was largely dependent on up-to-date documentation
or, where documentation was absent, on comments within the source.&lt;/p&gt;
&lt;p&gt;In the mid-2010s, a number of deep learning (DL) frameworks were designed to unify the
common primitives used in building these DL models. These include
&lt;a class="link" href="https://www.tensorflow.org/" title="TensorFlow"
 target="_blank" rel="noopener"
 &gt;TensorFlow&lt;/a&gt;, &lt;a class="link" href="https://keras.io/" title="Keras"
 target="_blank" rel="noopener"
 &gt;Keras&lt;/a&gt;,
&lt;a class="link" href="https://pytorch.org/" title="PyTorch"
 target="_blank" rel="noopener"
 &gt;PyTorch&lt;/a&gt;, &lt;a class="link" href="https://mxnet.apache.org/versions/1.8.0/" title="mxnet"
 target="_blank" rel="noopener"
 &gt;Apache MXNet&lt;/a&gt; and many others.&lt;/p&gt;
&lt;p&gt;To tackle the problem of productionizing models, one of the solutions explored was the usage
of &lt;a class="link" href="https://www.docker.com/" title="Docker"
 target="_blank" rel="noopener"
 &gt;Docker&lt;/a&gt; containers to package both the dependencies and the actual model as lightweight
components that can be easily shared through a public repository. This approach greatly
democratized the deployment of DL models to common hosting providers such as the public clouds
or in-house servers.&lt;/p&gt;
&lt;p&gt;The natural progression in using Docker containers meant the inclusion of shell scripts,
cron jobs and triggers that allow the automation of the entire ML pipeline. Docker-based
workflows gave developers access to version controlled resources locally on their laptops
and globally across different time zones.&lt;/p&gt;
&lt;h2 id="components-based-workflow"&gt;Components-Based Workflow
&lt;/h2&gt;&lt;p&gt;For organizations that need to scale to millions of containers in production, the
de facto solution is a container orchestration platform such as
&lt;a class="link" href="https://kubernetes.io/" title="Kubernetes"
 target="_blank" rel="noopener"
 &gt;Kubernetes&lt;/a&gt;. The platform allows hundreds or
thousands of engineers to collaborate on different levels of a complex ML system. This
ranges from the low-level implementation of hardware drivers to the high-level design of
user interfaces such as click-and-drag block diagrams.&lt;/p&gt;
&lt;p&gt;The low-code or no-code approach is an industry effort to lower the cognitive strain
in designing complex ML models. The design and implementation of mission-critical models
requires non-trivial engineering efforts, so why should their deployment be unnecessarily complex?&lt;/p&gt;
&lt;p&gt;Behind the scenes of the components-based workflow lie Kubernetes applications such as
&lt;a class="link" href="https://argoproj.github.io/argo-workflows/" title="Argo Workflows"
 target="_blank" rel="noopener"
 &gt;Argo Workflows&lt;/a&gt;,
&lt;a class="link" href="https://tekton.dev/" title="Tekton"
 target="_blank" rel="noopener"
 &gt;Tekton&lt;/a&gt; as well as many others. These applications specify
steps in an ML pipeline as containers that can be spun up sequentially or in parallel. These
steps can be expressed as a directed acyclic graph (DAG), which can be version controlled
and compiled for export to different hardware architectures.&lt;/p&gt;
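&lt;p&gt;As a rough illustration of how steps expressed as a DAG can be scheduled sequentially or in parallel, here is a minimal Python sketch. The step names and the &lt;code&gt;execution_waves&lt;/code&gt; helper are hypothetical and are not taken from Argo Workflows or Tekton. The sketch assumes only the standard library &lt;code&gt;graphlib&lt;/code&gt; module (Python 3.9 or later) and groups steps into waves whose members could run side by side.&lt;/p&gt;

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each step maps to the set of steps it depends on.
PIPELINE = {
    "ingest": set(),
    "validate": {"ingest"},
    "transform": {"ingest"},
    "train": {"validate", "transform"},
    "evaluate": {"train"},
}


def execution_waves(dag):
    """Group steps into waves; steps in the same wave do not depend on each other."""
    sorter = TopologicalSorter(dag)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not acyclic
    waves = []
    while sorter.is_active():
        # Every step whose dependencies are all done is ready to run now.
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        # Mark the whole wave as finished so that dependent steps unlock.
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(execution_waves(PIPELINE), start=1):
        print(f"wave {number}: {', '.join(wave)}")
```

&lt;p&gt;Each wave only starts once the previous one has finished, so independent steps such as validation and transformation can share a wave while training waits for both.&lt;/p&gt;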
&lt;p&gt;Initially, we had manually designed, hand-tuned and hand-crafted models without A/B testing
because deployment of new models simply could not keep up with the development of a core
application (4 to 6 week cycle). Now we can churn out dozens of models daily in parallel, set to trigger on
arrival of new data or based on adjacent/overlapping time windows. The models that
pass evaluation are then uploaded to a model repository for further downstream
processes.&lt;/p&gt;
&lt;h2 id="cautionary-tales"&gt;Cautionary Tales
&lt;/h2&gt;&lt;p&gt;A majority of Kubernetes applications are rather new to the scene, and many more are emerging
to solve critical issues pertaining to storage, security, networking and other
peripherals. Choosing the right software stack requires an in-depth technical review of
existing solutions with respect to correctness, latency and cost.&lt;/p&gt;
&lt;p&gt;At the SME scale, a single competent ML engineer is the bare minimum for a
sufficiently complex ML system, serving requests up to the number of CPU cores procured with
default settings.&lt;/p&gt;
&lt;p&gt;At the enterprise scale, ML engineering is not well suited to be a one-man job. It is
instead spread across different teams, each being a subject matter expert in its own domain.&lt;/p&gt;
&lt;h2 id="whats-next"&gt;What&amp;rsquo;s Next
&lt;/h2&gt;&lt;p&gt;I am currently working on a regression pipeline targeting TensorFlow.js models to be deployed
in a Flutter application hosted by Firebase. The pipeline is designed to be agnostic to
regression problem domains. Future regression tasks include cryptocurrency market size,
health monitoring, renewable energy forecasts and EV tank-to-wheel
efficiency (70 to 90%).&lt;/p&gt;
&lt;p&gt;Other pipelines include tasks under the pillars of ML:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;classification&lt;/li&gt;
&lt;li&gt;density-estimation&lt;/li&gt;
&lt;li&gt;dimensionality-reduction&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Pipelines for generative models are on the roadmap as well.&lt;/p&gt;
&lt;p&gt;Accelerators such as GPUs or TPUs will also be incorporated into the pipeline to further
parallelize existing workflows.&lt;/p&gt;
&lt;hr&gt;</description></item></channel></rss>