by Dr. RGS Asthana, Senior Member IEEE
Figure 1: Revolutionizing of digital enterprises by ML [7]
Summary
Machine Learning (ML) and Artificial Intelligence (AI) – Part Two covers the distinction between ML and AI, followed by some popular software projects in the field. It gives a very brief survey of some ML and deep learning libraries and frameworks, and describes common use cases in areas such as data security, personal security, financial trading, the healthcare industry, fraud detection, recommender apps, online search, NLP, smart cars and robotics. The Way Forward section covers the possible elimination of jobs through ML- and AI-based automation in real-world scenarios and briefly mentions new products likely to arrive soon.
Keywords
Machine Learning (ML) Tools, Artificial Intelligence (AI), Neural Networks, Reinforcement Learning, Supervised and Unsupervised Learning, Internet of Things (IoT)
Prelude
In Part Two, ML and deep learning projects are surveyed. A central question is how the implementation of ML and deep learning solutions will change the way enterprises view customer value today [7] (Figure 1).
Four years ago, an Oxford University study predicted that 47% of jobs could be automated by 2033 [20]. Even the near-term outlook has been quite negative: a 2016 report by the Organization for Economic Cooperation and Development (OECD) said 9% of jobs in the 21 OECD countries it examined could be automated, and in January 2017 McKinsey’s research arm estimated AI-driven job losses at 5%. Only time will tell what really happens.
Figure 2 shows how businesses use AI. One needs to identify AI applications that not only eliminate jobs but also deliver big benefits.
Figure 2: Use of AI by businesses [20]
A RunBook [30] (see Figure 2) is a compilation of the repetitive actions and operations that a system administrator or operator carries out. All the relevant entries in a RunBook can be executed using software tools, so AI can largely automate this process.
Distinction between ML and AI
ML is, in fact, a subset of AI. Machines that can fully copy or even surpass all human cognitive tasks are still science fiction; ML is the practical technology behind today’s AI, and it is available now. ML-based systems attempt to imitate the functions of the human cognitive system and solve problems on that basis. These systems have the capacity to handle and analyze massive data sets with an accuracy that is way beyond human skills.
An ML-based computer does not need to be explicitly programmed for each task, as it can learn from data and identify patterns to solve a specific problem. Computers can now handle tasks that previously only humans could do: they win games such as chess, Go and poker, identify images accurately, transcribe spoken words into text accurately, and translate between over a hundred languages.
With the growing popularity of unsupervised learning (USL), one AI machine can now do multiple jobs. ML methods exploit USL techniques such as neural nets and their variations, pattern discovery in big data, and cluster analysis, to name a few.
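To make the idea of cluster analysis concrete, here is a minimal sketch of k-means clustering on one-dimensional data, written in plain Python. The function name kmeans_1d, its parameters and the sample data are illustrative assumptions and do not come from any library discussed in this article. The point is only that groups are found without any labels being supplied.

```python
import random


def kmeans_1d(values, k, iterations=20, seed=0):
    """Group 1-D values into k clusters with plain k-means (no labels needed)."""
    rng = random.Random(seed)
    # Hypothetical initialisation choice: start from k randomly picked data points.
    centroids = rng.sample(values, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        # Assignment step: attach each value to its nearest centroid.
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return sorted(centroids)


# Made-up sample data with two obvious groups, around 1 and around 10.
data = [1.0, 1.2, 0.8, 9.9, 10.1, 10.0]
centers = kmeans_1d(data, k=2)
print(centers)
```

In practice one would normally use a library implementation rather than this hand-written loop; the sketch is meant only to show the assign-and-update cycle.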
Some Popular Software Projects of ML and Deep Learning [21]
Data visualization [28] is an important step in data analysis as well as in ML. There are many data visualization tools, including Google Charts, Matplotlib, MATLAB, ggplot and other language-native tools. Microsoft ML Studio creates canned plots covering all the usual types of visualization.
Given below are very brief details of software tools, languages and frameworks useful for ML:
Python
Python is increasingly used for analytics in data mining, data science and ML projects. Python offers many good options for data visualization, including Matplotlib, Seaborn, and Plotly.
Apache Software Foundation
There are many Apache projects with ML capabilities. Of these, Spark has the most users, active contributors, commits, and lines of code added.
Caffe
Caffe [19] is a deep learning library (DLL) that has Python and MATLAB bindings, though it is written in C++ with CUDA acceleration support. It is a general-purpose DLL for deploying convolutional networks, as well as other architectures, in vision, speech, and other applications. If you wish, you can use its built-in models, for which it offers the following:
· Model definitions
· Optimization settings
· Pre-trained weights
So you can start working immediately.
Microsoft Distributed Machine Learning Toolkit (DMTK)
The DMTK framework handles the problem of distributing various kinds of ML jobs across a cluster of systems. DMTK is billed as a framework rather than a full-blown out-of-the-box solution, so the number of actual algorithms included with it is small.
Keras
Keras is a Python DLL that can run on top of either TensorFlow or Theano, arguably two of the most popular deep learning research libraries currently in existence. This makes it a truly high-level library for building neural nets.
The R Project
The R Consortium is a collaborative project of the Linux Foundation. R’s user community and functionality continued to grow in 2016. The new consortium members included IBM and ESRI, which joined Alteryx, Avant, DataCamp, Google, Ketchum Trading, Mango Solutions, Microsoft, Oracle, RStudio, and TIBCO as part of the R community.
Apache Spark
Spark’s ML capabilities include additional algorithms in the DataFrames-based API, in PySpark and in SparkR, as well as support for saving and loading ML models. The DataFrames-based API is now the primary interface for ML in Spark. Third parties also added close to 25 ML packages to Spark Packages in 2016. The major public cloud service providers (viz. AWS, Microsoft, IBM and Google) deliver Spark services as well as value-added managed services for data scientists.
ML and/or Deep Learning Frameworks
Table 1.0 below gives a brief list of frameworks for ML and/or
deep learning engineers.
Table 1.0: ML and/or Deep Learning Frameworks
Platform | Purpose | Models supported | Built-in layers or languages supported | Visualization tools
Apache Singa | A distributed, open source deep learning platform for training big deep learning models over large datasets. It is good for solving NLP and image recognition problems. | Feed-forward models (e.g., CNN), energy models (e.g., RBM) and recurrent neural nets (RNN). Models can be trained synchronously (one after the other) or asynchronously (side by side), depending on whatever works best for the given problem. It can easily be set up to work with a cluster of machines. It is slow and too complicated for solving simple problems. | Many built-in models; provides an improved Python binding and contains more deep learning models such as VGG and ResNet [32]. |
Amazon ML | Connects to data stored in Amazon S3, Redshift, or RDS, and can run on that data to create a model. | Supports three types of models: binary classification, multi-class classification and regression [32]. | | Has many visualization tools and wizards.
Azure ML Studio | It is slow, taking seconds if not minutes; to properly understand ML, it is important to iterate often. I use Azure ML a lot, but mainly for production and for maintaining models. It also comes at a cost. | Has a wide range of algorithms, courtesy of both Microsoft and third parties. Azure ML Studio has monthly, hourly, and free-tier versions. It allows users to create and train models, then turn them into APIs that can be consumed by other services. Users get up to 10GB of storage per account for model data, although they can also connect their own Azure storage to the service for larger models. | |
Torch | A scientific computing framework with wide support for ML algorithms that puts GPUs first. | The scripting language is LuaJIT, with a core C/CUDA implementation that gives maximum flexibility and speed in building scientific algorithms. It has support for ML, computer vision, signal processing, parallel processing, image, video, audio and networking, among others, and builds on top of the Lua community. | |
H2O | Makes it easy to apply math and predictive analytics to solve today’s most challenging business problems. | Best-of-breed open source technology, an easy-to-use web UI and familiar interfaces, and data-agnostic support for all common database and file types. You can work with your existing languages and tools, including Hadoop environments. | |
Massive Online Analysis (MOA) | An open source framework for data stream mining. | Has ML algorithms for classification, regression, clustering, outlier detection, concept drift detection and recommender systems. | Written in Java |
MLlib (Spark) | Apache Spark’s ML library, with a pipeline wrapper over MLlib. Its main distinguishing feature is that it runs on Spark, which lets you do parallel processing on a massive scale. | Consists of common learning algorithms and utilities, including classification, regression, clustering, collaborative filtering (including Alternating Least Squares, ALS) and dimensionality reduction, as well as lower-level optimization primitives such as stochastic gradient descent and limited-memory BFGS, and higher-level pipeline APIs. Also provides summary statistics, correlations, hypothesis testing and random data generation. | Python | One can use any third-party tool.
Pattern | A web mining module. | Provides tools for data mining (Google, Twitter and Wikipedia APIs, a web crawler, an HTML DOM parser), NLP (part-of-speech taggers, sentiment analysis, WordNet), ML (vector space model, clustering, SVM) and network analysis. | Python | Yes
TensorFlow | An open source software library for numerical computation using data flow graphs, in which batches of data (“tensors”) are processed by a series of algorithms described by a graph. The key shortcomings are: 1. it needs a lot of processing power, and 2. the user needs a good knowledge of neural networks. | The movements of data through the system are called “flows”, hence the name. Graphs can be assembled with C++ or Python and can be processed on CPUs or GPUs. It is a low-level library, but if you want a higher-level interface, you can wrap Keras over it. It is aimed at deep learning, while Scikit-learn excels in traditional ML, so the two complement each other nicely. It is designed to scale across multiple nodes. | C++ and Python | Users can write code and see visualizations and data flow graphs.
Scikit-learn | One can implement ML pipelines with 5–10 lines of code. In my experience, it is one of the easiest ML libraries to work with. | The kit is available under a BSD license, so it is fully open and reusable. It includes tools for many of the standard ML tasks (such as clustering, classification and regression). | Python |
Veles (Samsung) | A distributed platform for deep learning applications. Python is used to automate and coordinate between nodes. Datasets are analyzed and normalized before being passed on to the cluster. | A REST API allows the trained model to be used in production immediately. It has few hard-coded entities and enables training of all the widely known topologies, such as fully connected nets, convolutional nets and recurrent nets. | Like TensorFlow and DMTK, it is written in C++ and uses Python for automation and coordination between nodes. | The data visualization and analysis tool can visualize and publish results from a Veles cluster.
If you really want to learn ML, it is advisable to:
1. Write at least one ML algorithm from scratch. It need not be more complex than linear regression with gradient descent.
2. Experiment with Scikit-learn. It is fast, easy and free. It lets you iterate often and understand what you are doing.
3. After you have a good grasp of the basics, expand to other libraries and APIs.
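As an illustration of the first piece of advice, the following is a minimal sketch of linear regression trained with batch gradient descent in plain Python. The function name fit_line, the learning rate, the number of epochs and the toy data are all assumptions chosen for this example, not values taken from any of the cited sources.

```python
def fit_line(xs, ys, lr=0.05, epochs=2000):
    """Fit y = w*x + b by batch gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of (1/n) * sum((w*x + b - y)^2) with respect to w and b.
        grad_w = (2.0 / n) * sum((w * x + b - y) * x for x, y in zip(xs, ys))
        grad_b = (2.0 / n) * sum((w * x + b - y) for x, y in zip(xs, ys))
        # Step against the gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 2x + 1, so the fit should approach w = 2, b = 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line(xs, ys)
print(round(w, 3), round(b, 3))
```

The learning rate matters: if it is set too high for the scale of the inputs the updates diverge, which is itself a useful thing to observe when writing the algorithm by hand.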
Use Cases [7, 8]
Data Security
According to Deep Instinct [9], each piece of new malware has almost the same code as previous versions, with only 2–10% variation. Their ML model can easily handle these 2–10% variations and can identify which files are malware with very good accuracy.
Personal Security
If you have flown on an airplane or attended a big public event such as a T20 cricket match, you have certainly had to wait in long security screening lines. ML is proving that it can help eliminate false alarms and spot things human screeners might miss in security screenings at airports, stadiums, concerts, and other venues. ML techniques can significantly accelerate the screening process and make events safer.
California-based Avata Intelligence [33] has been using AI with game theory to predict when terrorists or other threats will strike a target. The US Coast Guard uses Avata Intelligence’s AI software for port security in New York, Boston and Los Angeles. It draws on data sources and generates a schedule that makes it hard for a terrorist to predict the timing of increased police presence.
Financial Trading
AI and ML are probably the perfect tools for the financial market, which uses forecasts to make vital trading decisions. Financial success depends heavily on predicting where the market is heading. AI is predictive by nature and can analyze big data sets with incredible speed and accuracy [15]. ML algorithms are getting closer to predicting trends in the stock market. Many businesses rely on probabilities, and even a trade with a relatively low probability of success, carried out at a high enough volume or speed, can result in huge profits for the firms.
Healthcare Industry [16]
Robots and computers will possibly never fully replace doctors and nurses, but machine learning, deep learning and AI are changing the healthcare industry, with successful outcomes. ML algorithms can process more information and spot more patterns than their human counterparts, even in the big data sets that are common in healthcare. Computers with ML algorithms have been particularly successful in detecting breast cancer and other cancers. IBM’s Watson [17] is helping oncologists make the best care decisions for their patients. Hospitals, clinics and individual doctors can now rent time with Watson over the cloud: they send it information on a patient and, after seconds (or at most minutes), Watson returns a series of suggested treatment options. Watson is equipped with NLP capability, so it can handle a natural language query and answer in natural language.
Fraud Detection [18]
ML with deep learning is also getting better at spotting potential cases of fraud across many different fields. Fraud detection always falls short of complete automatic detection because of the false positive rate and the need for at least some human intervention, typically on a case-by-case basis. Neural nets with deep learning techniques are becoming increasingly useful under the unsupervised learning paradigm.
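The combination of unsupervised flagging and case-by-case human review can be sketched very simply. The example below is not a deep learning model; it deliberately swaps in a much simpler technique, a z-score outlier test over transaction amounts, to show the workflow. The function name flag_for_review, the threshold and the amounts are made-up assumptions.

```python
from statistics import mean, stdev


def flag_for_review(amounts, threshold=2.0):
    """Return indexes of amounts lying more than `threshold` standard deviations from the mean."""
    mu = mean(amounts)
    sigma = stdev(amounts)
    if sigma == 0:
        return []  # all amounts identical, nothing stands out
    return [i for i, a in enumerate(amounts) if abs(a - mu) / sigma > threshold]


# Hypothetical transaction amounts; the last one is far outside the usual range.
amounts = [20.0, 22.5, 19.0, 21.0, 23.0, 20.5, 18.5, 950.0]
suspects = flag_for_review(amounts)
print(suspects)  # indexes to hand to a human analyst, not a final verdict
```

Lowering the threshold catches more suspicious cases but also raises the false positive rate, which is exactly the trade-off that keeps a human in the loop.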
Recommender Apps
Services such as Amazon and Netflix use recommender apps. ML algorithms analyze your activity and compare it with that of millions of other customers in the database to determine what you might like to buy next, and then recommend that product to you. Another interesting area of successful use of ML and AI algorithms is fitness: data is collected from users through smart sensors, similar users are tracked based on the fitness data captured in the database, and after analysis fitness products are recommended to the users [11].
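Comparing one customer's activity with that of other customers is the core of user-based collaborative filtering. The sketch below shows one simple way to do it in plain Python, using cosine similarity between rating dictionaries. It is not a description of how Amazon or Netflix actually implement their recommenders; the function names, user names and ratings are hypothetical.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two users' rating dicts (0.0 if nothing is shared)."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def recommend(target, ratings, top_n=1):
    """Rank items the target has not rated by similarity-weighted ratings of other users."""
    scores = {}
    for user, items in ratings.items():
        if user == target:
            continue
        sim = cosine(ratings[target], items)
        for item, rating in items.items():
            if item not in ratings[target]:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Made-up ratings: bob's tastes overlap with alice's, carol's do not.
ratings = {
    "alice": {"book": 5, "camera": 4},
    "bob": {"book": 5, "camera": 5, "tripod": 4},
    "carol": {"kettle": 5, "toaster": 4},
}
suggestion = recommend("alice", ratings)
print(suggestion)
```

Because bob is the only user similar to alice, the item he rated that she has not seen ranks first. At the scale of millions of customers, approaches such as matrix factorization are typically used instead of this pairwise comparison.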
Online Search
ML algorithms are exploited by Google and its competitors to improve the search operation, i.e., what the search engine understands [12] from your search input. RankBrain [4], the new AI/ML algorithm, is now used by Google to rank search results. This algorithm falls in the Artificial Narrow Intelligence (ANI) category. RankBrain decides what mixture of core algorithms is to be used to get the best search results. For instance, in certain search results, RankBrain might learn that the most important signal is the META Title; giving more weight to the META Title matching algorithm might then lead to a better searcher experience. But in another search result, this very same signal might correlate horribly with a good searcher experience, so in that other vertical another algorithm, maybe PageRank, is used by Google. This implies that, in each search result, Google uses a completely different mix of algorithms. RankBrain is, in fact, only a computer program used to sort through the billions of pages and select the ones it finds most relevant for a particular query. The overall search algorithm of Google is called ‘Hummingbird’. As per Google [13], the gradual rollout of RankBrain began in early 2015.
Google’s use of ML for page ranking makes life difficult for the SEO industry, which is trying to catch up with Google and modify websites to get better page rankings. Thus Google’s ML/AI tool, RankBrain, has really changed the future of the SEO industry. RankBrain and other forms of AI will keep improving with time and may at some point surpass the human brain. At this point, nobody knows where this technology will lead us.
Natural Language Processing (NLP)
As per Wikipedia, natural language processing (NLP) is a field of computer science, AI, and computational linguistics concerned with the interactions between computers and human (natural) languages and, in particular, with programming computers to fruitfully process large natural language corpora. NLP is being used in all sorts of exciting applications across disciplines. ML algorithms with natural language capability can stand in for customer service agents; examples are Siri and Cortana, from Apple Inc. and Microsoft Corp. respectively.
Smart Cars [3]
Besides vacuum cleaners and smart thermostat solutions such as those of Nest Labs [25], the Internet of Things [10] is also being incorporated into automotive technology in the form of smart cars. In a smart car, AI-based systems learn about the owner and the environment: the car may adjust internal settings such as temperature, audio and seat position automatically based on the driver, report and even fix problems, and also drive itself.
We are already seeing trials of driverless cars from large companies such as Audi, Tesla and Google, to name a few, while a number of other enterprises, viz. GM, Fiat Chrysler and Ford, are developing new solutions and want to put their cars on show in less than 5 years. Apple also seems to have some thoughts on such a project. Self-driving cars (SDCs) are likely to be more efficient and safer than conventional cars driven by people. Moreover, SDCs are likely to reduce congestion as well as emissions.
The first SDCs are expected to become available to the public in 2018. Why should one own a car today, especially in urban areas? This question comes to mind particularly when you can call a car with your phone and it will show up at your location and drive you to your destination. You don’t have to worry about parking, you pay only for the distance driven, and you are free to do anything you like during the ride.
Robotics
The debate over using AI to control lethal weapons in warfare is more complex than it seems. An open letter calling for a ban on lethal weapons controlled by AI machines was signed by thousands of scientists and technologists at the International Joint Conference on Artificial Intelligence, held from July 25 to 31, 2015 in Buenos Aires, Argentina (see [1]). The letter states:
“Artificial Intelligence (AI)
technology has reached a point where the deployment of such systems
is—practically if not legally—feasible within years not decades, and the stakes
are high: autonomous weapons have been described as the third revolution in
warfare, after gunpowder and nuclear arms.”
At present, all robots are owned. Should we build only robots that work as machine slaves [27]? Will future AI robots or AI machines have sentiments too? Is it ethical to design and develop such robots? Are developments in ML and AI so alarming that they need to be controlled, and is it right to do so? Clear answers to these questions are not available at present, but hopefully they will evolve with time.
Way forward
Although many CEOs want their companies to be called AI- or ML-powered, only a few companies have actually made significant investments in AI. These include Amazon, Baidu (a Chinese company) [31], Google, IBM, Microsoft, Tesla Motors, Facebook and NVIDIA.
Automation and AI will be the main cause of the elimination of some jobs, a fact we should all be ready to face very soon [20]. The number of chatbots [29] in the market for customer service is increasing day by day, and we do see real robots on the factory floor as companies find them useful and a source of savings. But we believe companies would be wise to use AI first where there is computer-to-computer communication. So, for the time being, companies will be busy collecting the low-hanging fruit.
Christmas 2017 [22] may see the introduction of new AI-based smart toys and gadgets for children and adults. These toys could converse with your kids and learn to adapt to their speech patterns and areas of interest. Voice may become the de facto interface for man-machine interaction. A robot may take your job one day, but that time has not yet come.
ML is essentially a different way to develop a model on a computer, one that requires training the model with a lot of sample data. Experience has clearly borne this out. General AI [1, 4], which we also refer to as super-intelligence, still remains a distant goal, depending on the specific domain of the “intelligence” being learned. To be sure, computers trained using ML hold great potential, as well as the possibility of huge disruption in the industry.
AI- and ML-based startups have only recently begun popping up in India; about a dozen came up during 2016 alone [23]. These startups will hopefully solve some real-world problems and put India on the ML and AI map. Designing technology with the best intentions can still lead to disaster (see [24]); therefore, one needs to be extremely careful while designing and/or implementing ML- and AI-based apps [1].
References
[1] Progress and Perils of Artificial Intelligence (AI)
[2] Invited Chapter 6 - Evolutionary Algorithms and Neural Networks, pages 111-136, R.G.S. Asthana, in Soft Computing and Intelligent Systems (Theory and Applications), Academic Press Series in Engineering, edited by Naresh K. Sinha, Madan M. Gupta and Lotfi A. Zadeh, ISBN: 978-0-12-646490-0, http://www.sciencedirect.com/science/book/9780126464900
[3] Future 2030, by Dr. RGS Asthana, Senior Member IEEE
[4] Machine Learning (ML) and Artificial Intelligence (AI) – Part 1, by Dr. RGS Asthana, Senior Member IEEE
[5] The year in ML (Part two)
[6] How are businesses using artificial intelligence? 13 enterprise uses for AI and machine learning
[7] How machine learning is revolutionizing digital enterprises
[8] The Top 10 AI And Machine Learning Use Cases Everyone Should Know About
[9] Deep Instinct
[10] Internet of Things (IoT)
[11] Big Data & ML: Case study of a Fitness Product Recommender application
[12] How Google uses machine learning in its search algorithms, http://searchengineland.com/google-uses-machine-learning-search-algorithms-261158
[13] FAQ: All about the Google RankBrain algorithm
[14] Natural Language Processing
[15] Disruption: FinTech – Artificial Intelligence in Financial Trading
[16] How Machine Learning, Big Data And AI Are Changing Healthcare Forever
[17] IBM's Watson is better at diagnosing cancer than human, http://www.wired.co.uk/article/ibm-watson-medical-doctor
[18] Fraud Detection Using Deep Learning ML Techniques at Paypal
[19] Top 10 Deep Learning Projects on Github
[20] How Companies Are Already Using AI
[21] 13 frameworks for mastering machine learning
[22] Five Things to Watch in AI and ML in 2017
[23] 10 super exciting Data Science / Machine Learning / Artificial Intelligence based startups in India
[24] What Frankenstein can teach Engineers, by G. Pascal Zachary, in Opinion Section of IEEE Spectrum, Feb. 2017 issue, page 6.
[25] AI plus the Internet of Things (IoT) – 3 Examples worth Learning from
[27] Do we have to build Robots that need rights? By Susan Hassler, in Opinion Section of IEEE Spectrum, Mar. 2017 issue, page 6.
[28] Machine Learning: What are best tools/softwares for data visualization for machine learning applications? https://www.quora.com/Machine-Learning-What-are-best-tools-softwares-for-data-visualization-for-machine-learning-applications
[29] The complete Beginner’s guide to Chatbots
[30] Run Book
[32] The Best Open Source Machine Learning Frameworks