Wednesday, 19 April 2017

Machine Learning (ML) and Artificial Intelligence (AI) – Part Two by Dr. RGS Asthana Senior Member IEEE


Figure 1: Revolutionizing of digital enterprises by ML [7]
Summary
Machine Learning (ML) and Artificial Intelligence (AI) – Part Two covers the distinction between ML and AI, followed by some popular software projects in the field.
A very brief survey of some ML and deep learning libraries and frameworks follows. Common use cases are described in areas such as data security, personal security, financial trading, the healthcare industry, fraud detection, recommender apps, online search, NLP, smart cars and robotics. The way forward covers the possible elimination of jobs through ML- and AI-based automation in real-world scenarios and briefly mentions new products likely to arrive soon.
Keywords

Prelude
In Part Two, ML and deep learning projects are surveyed. A central question is how the implementation of ML and deep learning solutions will change the way enterprises view customer value today [7] (Figure 1).
Four years ago, an Oxford University study predicted that 47% of jobs could be automated by 2033 [20]. Even the near-term outlook has been quite negative: a 2016 report by the Organization for Economic Cooperation and Development (OECD) said 9% of jobs in the 21 OECD countries it examined could be automated. And in January 2017, McKinsey’s research arm estimated AI-driven job losses at 5%. Only time will tell what will really happen.
Figure 2 is self-explanatory and shows how businesses use AI. One needs to identify AI applications that not only eliminate jobs but also deliver big benefits.

Figure 2: Use of AI by businesses [20]
A RunBook [30] (see Figure 2) is a compilation of the repetitive actions and operations that a system administrator or operator carries out. All the relevant entries in a RunBook can be carried out using software tools, and AI can largely automate this process.
Distinction between ML and AI
ML is, in fact, a subset of AI. Machines that can fully copy or even surpass all human cognitive tasks are still science fiction; ML, in reality, is what lies behind today's AI, and it is available now. ML-based systems attempt to imitate the functions of the human cognitive system and solve problems on that basis. These systems have the capacity to handle and analyze massive amounts of data with an accuracy that is way beyond human skills.
There is no need to explicitly program an ML-based computer, as it can learn from data and identify patterns to solve a specific problem. Computers can now handle tasks that previously only humans could do: for example, they can win games like chess, Go and poker, identify images accurately, transcribe spoken words into text accurately, and translate between, say, over a hundred languages.
With the growing popularity of unsupervised learning (USL), one AI machine can now do multiple jobs. The methods exploited in ML use USL techniques such as neural nets or their variations, big data pattern discovery, or cluster analysis, to name a few.
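As a rough illustration of cluster analysis, one of the unsupervised techniques mentioned here, the following minimal sketch implements plain k-means in Python. The function name kmeans, the sample points and the parameter defaults are assumptions chosen for illustration; they do not come from any of the cited projects.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Cluster 2-D points into k groups with plain k-means.

    Returns the final centroids and the clusters from the last assignment step.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for x, y in points:
            distances = [(x - cx) ** 2 + (y - cy) ** 2 for cx, cy in centroids]
            clusters[distances.index(min(distances))].append((x, y))
        # Update step: move each centroid to the mean of its cluster.
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return centroids, clusters


# Hypothetical data: two visibly separate groups of points, no labels given.
points = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.8), (8.0, 8.1), (7.9, 8.3), (8.2, 7.9)]
centroids, clusters = kmeans(points, k=2)
print(centroids)
```

No labels are supplied; the grouping emerges from the data alone, which is what makes the method unsupervised.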
Some Popular Software Projects of ML and Deep Learning [21]
Data visualization [28] is an important step in data analysis as well as in ML. There are many data visualization tools, including Google Charts, Matplotlib, MATLAB, ggplot and other language-native tools. Microsoft ML Studio creates canned plots covering all the usual types of visualization.
Given below are very brief details of software tools, languages and frameworks useful for ML:
Python
Python is increasingly used for analytics in data mining, data science or ML projects. In Python, there are many good options for data visualization including Matplotlib, Seaborn, and Plotly. 
Apache Software Foundation
There are many Apache projects with ML capabilities. Of these, Spark has the most users, active contributors, commits, and lines of code added. 
Caffe
Caffe [19] is a deep learning library (DLL) with Python and MATLAB bindings, though it is written in C++ with CUDA acceleration support. It is, in fact, a general-purpose DLL for deploying convolutional networks, as well as other architectures, in vision, speech, and other applications. If you wish, you can use its built-in models, for which it offers the following:
·         Model definitions
·         Optimization settings
·         Pre-defined weights
So you can start working immediately.
Microsoft Distributed Machine Learning Toolkit (DMTK)
The DMTK framework handles the problem of distributing various kinds of ML jobs across a cluster of systems.
DMTK is billed as a framework rather than a full-blown out-of-the-box solution, so the number of actual algorithms included with it is small.
Keras 
Keras is a Python DLL that leverages both TensorFlow and Theano, meaning that it can run on top of either of what are arguably two of the most popular deep learning research libraries currently in existence. This makes it a truly high-level library for building neural nets.
The R Project
The R Project, supported by the R Consortium (a collaborative project of the Linux Foundation), saw its user community as well as its functionality continue to grow in 2016. New consortium members included IBM and ESRI, joining Alteryx, Avant, DataCamp, Google, Ketchum Trading, Mango Solutions, Microsoft, Oracle, RStudio, and TIBCO in the R community.
Apache Spark
Spark’s ML capabilities include additional algorithms in the DataFrames-based API, in PySpark and in SparkR, as well as support for saving and loading ML models. The DataFrames-based API is now the primary interface for ML in Spark. Third parties also added close to 25 ML packages to Spark Packages in 2016. The major public cloud service providers (viz. AWS, Microsoft, IBM and Google) deliver Spark services as well as value-added managed services for data scientists.

ML and/or Deep Learning Frameworks

Table 1.0 below gives a brief list of frameworks for ML and/or deep learning engineers.

Table 1.0: ML and/or Deep Learning Frameworks
Platform | Purpose | Models supported | Built-in layers or languages supported | Visualization tools
Apache Singa | A distributed, open source deep learning platform that can train big deep learning models over large datasets. It is good for solving NLP and image recognition problems. | Feed-forward models (e.g., CNNs), RBMs and recurrent neural nets (RNNs). Models can be trained synchronously (one after the other) or asynchronously (side by side), depending on whatever works best for the given problem. It can easily be programmed to work with a cluster of machines. It is slow and too complicated for solving simple problems. | Many models; provides an improved Python binding and contains more deep learning models, such as VGG and ResNet [32]. | -
Amazon ML | Connects to data stored in Amazon S3, Redshift, or RDS, and can run on that data to create a model. | Supports three types of models: binary classification, multi-class classification and regression [32]. | - | Has many visualization tools and wizards.
Azure ML Studio | It is slow: operations take seconds, if not minutes. To properly understand ML, it is important to iterate often. I use Azure ML a lot, but mainly for production and maintaining models. It also comes at a cost. | Has a wide range of algorithms, courtesy of both Microsoft and third parties. Azure ML Studio has monthly, hourly, and free-tier versions. It allows users to create and train models, then turn them into APIs that can be consumed by other services. Users get up to 10GB of storage per account for model data, although you can also connect your own Azure storage to the service for larger models. | - | -
Torch | A scientific computing framework with wide support for ML algorithms that puts GPUs first. | The scripting language is LuaJIT, on top of an underlying C/CUDA implementation, giving maximum flexibility and speed in building scientific algorithms. It has support for ML, computer vision, signal processing, parallel processing, image, video, audio and networking, among others, and builds on top of the Lua community. | - | -
H2O | Lets you easily apply math and predictive analytics to solve today’s most challenging business problems. | Best-of-breed open source technology, an easy-to-use web UI and familiar interfaces, and data-agnostic support for all common database and file types. You can work with your existing languages and tools, including Hadoop environments. | - | -
Massive Online Analysis (MOA) | An open source framework for data stream mining. | Has ML algorithms for classification, regression, clustering, outlier detection, concept drift detection and recommender systems. | Written in Java | -
MLlib (Spark) | Apache Spark’s ML library, with a higher-level pipeline API layered over MLlib. Its main distinguishing feature is that it runs on Spark, which lets you do parallel processing on a massive scale. | It consists of common learning algorithms and utilities, including classification, regression, clustering, collaborative filtering (including Alternating Least Squares, ALS) and dimensionality reduction, as well as lower-level optimization primitives such as stochastic gradient descent and limited-memory BFGS (L-BFGS) and higher-level pipeline APIs. It also covers summary statistics, correlations, hypothesis testing and random data generation. | Python | One can use any third-party tool.
Pattern | A web mining module. | It offers tools for data mining (Google, Twitter and Wikipedia APIs, a web crawler, an HTML DOM parser), NLP (part-of-speech taggers, sentiment analysis, WordNet), ML (vector space model, clustering, SVM) and network analysis. | Python | Yes
TensorFlow | An open source software library for numerical computation using data flow graphs, in which batches of data (“tensors”) are processed by a series of algorithms described by a graph. The movements of data through the system are called “flows”, hence the name. Graphs can be assembled with C++ or Python and can be processed on CPUs or GPUs. The key shortcomings are: 1. it needs a lot of processing power, and 2. the user needs a good knowledge of neural networks. | It is low-level, but if you want a higher-level interface, you can wrap Keras over it. It is meant for deep learning, while Scikit-learn excels in traditional ML, so the two complement each other nicely. It is designed to scale across multiple nodes. | C++ and Python | Users can write code and see visualizations and data flow graphs.
Scikit-learn | One can implement ML pipelines with 5–10 lines of code. In my experience, it is one of the easiest ML libraries to work with. | Available under a BSD license, it is fully open and reusable. It includes tools for many of the standard ML tasks (such as clustering, classification and regression). | Python | -
Veles (Samsung) | A distributed platform for deep learning applications. Python is used to automate and coordinate between nodes. Datasets are analyzed and normalized before being passed on to the cluster. | A REST API allows the trained model to be used in production immediately. It has few hard-coded entities and enables training of all the widely known topologies, such as fully connected nets, convolutional nets and recurrent nets. | Like TensorFlow and DMTK, it is written in C++ and uses Python to perform automation and coordination between nodes. | Its data visualization and analysis tool can visualize and publish results from a Veles cluster.
If you really want to learn ML, it is advisable to:
1.   Write at least one ML algorithm from scratch. It need not be more complex than linear regression with gradient descent.
2.   Experiment with Scikit-learn. It’s fast, easy and free. It lets you iterate often and understand what you’re doing.
3.   After you have a good grasp of the basics, expand to other libraries and APIs.
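The first suggestion above can be sketched in a few lines of plain Python. This is a minimal illustration of linear regression fitted by batch gradient descent, not a reference implementation; the function name fit_linear_regression, the learning rate, the epoch count and the sample data are all assumptions chosen for the example.

```python
def fit_linear_regression(xs, ys, learning_rate=0.01, epochs=5000):
    """Fit y = w * x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction errors under the current parameters.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of MSE = (1/n) * sum(error^2) with respect to w and b.
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / n) * sum(errors)
        # Step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Hypothetical noise-free data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_linear_regression(xs, ys)
print(round(w, 3), round(b, 3))
```

On this data the fitted parameters converge towards w = 2 and b = 1. Changing the learning rate or the number of epochs and watching how the result moves is a quick way to build intuition before switching to a library.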
Use Cases [7 and 8]
Data Security
As per Deep Instinct [9], each piece of new malware has almost the same code as previous versions, with only 2 to 10% variation. Their ML model can easily handle these 2–10% variations and can identify which files are malware with very good accuracy.
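To make the idea of "almost the same code" concrete, the sketch below measures how much two byte sequences overlap using the Jaccard similarity of their byte n-grams. This is only an illustrative stand-in: it is not Deep Instinct's model, and the function names, the n-gram length of 4 and the synthetic byte strings are all assumptions.

```python
def byte_ngrams(data: bytes, n: int = 4) -> set:
    """Return the set of all length-n byte substrings of data."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def jaccard_similarity(a: bytes, b: bytes, n: int = 4) -> float:
    """Jaccard similarity of the two n-gram sets: |A & B| / |A | B|."""
    grams_a, grams_b = byte_ngrams(a, n), byte_ngrams(b, n)
    if not grams_a and not grams_b:
        return 1.0  # two inputs too short to have n-grams count as identical
    return len(grams_a & grams_b) / len(grams_a | grams_b)


# Synthetic stand-ins for file contents (not real malware samples).
known = bytes(range(256))
variant = bytearray(known)
variant[100:105] = b"\x00" * 5              # a small local modification
unrelated = bytes(reversed(range(256)))

print(jaccard_similarity(known, bytes(variant)))  # high, but below 1.0
print(jaccard_similarity(known, unrelated))       # no shared 4-grams
```

A lightly modified variant keeps most of its n-grams, so it scores close to the known sample, while an unrelated file does not. A learned model can be expected to use far richer features than this single score.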
Personal Security
If you’ve flown on an airplane or attended a big public event like a T20 cricket match, you have certainly had to wait in long security screening lines. ML is proving that it can help eliminate false alarms and spot things human screeners might miss in security screenings at airports, stadiums, concerts, and other venues. ML techniques can significantly accelerate the screening process and help make events safer.
California-based Avata Intelligence [33] has been using AI with game theory to predict when terrorists or other threats will strike a target. The Coast Guard uses Avata Intelligence AI software for port security at New York, Boston and Los Angeles in the USA. It draws on data sources and generates a schedule that makes it hard for a terrorist to predict the timing of increased police presence.
Financial Trading
AI and ML are probably the perfect tools for the financial market, using forecasts to make vital trading decisions. Financial success depends heavily on predicting where the market is heading. AI is predictive by nature and can analyze big data sets with incredible speed and accuracy [15]. ML algorithms are getting closer to predicting trends in the stock market. Many businesses rely on probabilities, and a trade carried out with a relatively low probability of failure, at a high enough volume or speed, can result in huge profits for a firm.
Healthcare Industry [16]
Robots and computers will possibly never fully replace doctors and nurses, although machine learning, deep learning and AI are changing the healthcare industry, with successful outcomes. ML algorithms can process more information and spot more patterns than their human counterparts, even in the big data that healthcare often involves. Computers with ML algorithms have been particularly successful in detecting breast cancer and other cancers. IBM’s Watson [17] is helping oncologists make the best care decisions for their patients. Hospitals, clinics and individual doctors can now rent time with Watson over the cloud: they send it information on a patient and, after seconds (or at most minutes), Watson returns a series of suggested treatment options. Watson is equipped with NLP capability, so it can handle a natural language query and answer it in natural language.
Fraud Detection [18]
ML with deep learning is also getting better at spotting potential cases of fraud across many different fields. Fraud detection still falls short of fully automatic detection because of the false positive rate and the need for at least some human intervention, typically on a case-by-case basis. Neural nets with deep learning techniques are becoming increasingly useful under the unsupervised learning paradigm.
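The role of the false positive rate can be illustrated with a small sketch. Assuming a model has already produced a fraud score between 0 and 1 for each transaction, the code below shows how the choice of alert threshold trades false positives against missed fraud. The scores, labels, thresholds and function names are hypothetical and are not taken from any system cited here.

```python
def confusion_counts(scores, labels, threshold):
    """Count outcomes when every score >= threshold is flagged as fraud."""
    tp = fp = tn = fn = 0
    for score, is_fraud in zip(scores, labels):
        flagged = score >= threshold
        if flagged and is_fraud:
            tp += 1   # fraud correctly flagged
        elif flagged:
            fp += 1   # legitimate transaction flagged (false alarm)
        elif is_fraud:
            fn += 1   # fraud that slipped through
        else:
            tn += 1   # legitimate transaction left alone
    return tp, fp, tn, fn


def rates(scores, labels, threshold):
    """Return (false positive rate, recall) at the given threshold."""
    tp, fp, tn, fn = confusion_counts(scores, labels, threshold)
    fpr = fp / (fp + tn) if fp + tn else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return fpr, recall


# Hypothetical model scores and ground-truth labels (True = fraud).
scores = [0.05, 0.10, 0.20, 0.35, 0.60, 0.80, 0.90, 0.95]
labels = [False, False, False, False, True, False, True, True]
for threshold in (0.5, 0.85):
    fpr, recall = rates(scores, labels, threshold)
    print(f"threshold={threshold}: false positive rate={fpr:.2f}, recall={recall:.2f}")
```

In this toy data the lower threshold catches every fraud case but raises a false alarm, while the higher threshold raises none but misses one case. That tension is why a human reviewer usually stays in the loop.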
Recommender Apps
Services such as Amazon and Netflix use such apps. ML algorithms analyze your activity and compare it with that of millions of other customers in the database to determine what you might like to buy next, and then recommend that product to you. Another interesting and successful use of ML and AI algorithms is collecting fitness data from users through smart sensors, finding similar users based on the fitness data captured in the database, and then recommending fitness products to those users [11].
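The "compare with similar users" idea can be sketched as a tiny user-based collaborative filter. This is a simplified illustration under stated assumptions, not how Amazon, Netflix or the cited fitness application work internally; the user names, product names, ratings and function names are all hypothetical.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity of two {item: rating} dicts; unrated items count as 0."""
    dot = sum(rating * b[item] for item, rating in a.items() if item in b)
    norm_a = sqrt(sum(r * r for r in a.values()))
    norm_b = sqrt(sum(r * r for r in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(target, ratings, top_n=3):
    """Rank items the target has not rated by similarity-weighted rating sums."""
    seen = ratings[target]
    scores = {}
    for user, user_ratings in ratings.items():
        if user == target:
            continue
        similarity = cosine_similarity(seen, user_ratings)
        for item, rating in user_ratings.items():
            if item not in seen:
                # Ratings from more similar users contribute more to the score.
                scores[item] = scores.get(item, 0.0) + similarity * rating
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Hypothetical ratings of fitness products on a 1-5 scale.
ratings = {
    "ann": {"tracker": 5, "shoes": 4, "yoga_mat": 5},
    "bob": {"tracker": 5, "shoes": 5, "bike": 4},
    "cy": {"kettlebell": 5, "rope": 4, "shoes": 1},
    "dee": {"tracker": 4, "shoes": 5},
}
print(recommend("dee", ratings, top_n=2))
```

Items liked by the users most similar to "dee" rise to the top, while items from a user with different tastes contribute little. Production recommenders add many refinements, such as normalizing for rating scale and handling millions of users efficiently.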
Online Search
ML algorithms are exploited by Google and its competitors to improve search, i.e., what the search engine understands from your search input [12]. RankBrain [4], Google's new AI/ML algorithm, is now used in ranking search results and therefore matters for SEO. This algorithm falls in the Artificial Narrow Intelligence (ANI) category. RankBrain decides what mixture of core algorithms should be used to get the best search results. For instance, for certain search results, RankBrain might learn that the most important signal is the META title, so giving more weight to the META title matching algorithm might lead to a better searcher experience. But for another search result, this very same signal might correlate terribly with a good searcher experience, so in that other vertical Google may rely on another algorithm, perhaps PageRank. This implies that, for each search result, Google uses a completely different mix of algorithms.
RankBrain is, in fact, only a computer program used to sort through billions of pages and select the ones it finds most relevant for a particular query. Google's overall search algorithm is called ‘Hummingbird’. As per Google [13], the gradual rollout of RankBrain began in early 2015.
Google's use of ML for page ranking makes life difficult for the SEO industry, which is trying to catch up with Google and modify websites to get better page rankings. Thus Google's use of the ML/AI tool RankBrain has really changed the future of the SEO industry. RankBrain and other forms of AI will keep improving with time and may at some point surpass the human brain. At this point, nobody knows where this technology will lead us.
Natural Language Processing (NLP)
As per Wikipedia, natural language processing (NLP) is a field of computer science, AI, and computational linguistics concerned with the interactions between computers and human (natural) languages and, in particular, concerned with programming computers to fruitfully process large natural language corpora. NLP is being used in all sorts of exciting applications across disciplines. ML algorithms with natural language can stand in for customer service agents, as with Siri and Cortana from Apple Inc. and Microsoft Corp. respectively.
Smart Cars [3]
The Internet of Things [10] is finding its way into automotive technology as well as into vacuum cleaners and smart thermostat solutions such as those of Nest Labs [25]. In smart cars, AI-based systems learn about the owner and the environment: the car may automatically adjust internal settings such as temperature, audio and seat position based on the driver, report and even fix problems, and also drive itself.
We are already seeing trials of driverless cars from large companies such as Audi, Tesla and Google, to name a few, while a number of other enterprises, viz. GM, Fiat Chrysler and Ford, are developing new solutions and want to put their cars on show in less than 5 years. Apple also seems to have some thoughts on such a project. Self-driving cars (SDCs) are likely to be more efficient and safer than conventional cars driven by people. Moreover, SDCs are likely to reduce congestion as well as emissions.
The first SDCs are expected to become available to the public in 2018. Why should one own a car today, especially in urban areas? This question comes to mind particularly when you can call a car with your phone and have it show up at your location and drive you to your destination. You don’t have to worry about parking it, you pay only for the distance driven, and you are free to do anything you like during the ride.
Robotics
The debate over using AI to control lethal weapons in warfare is more complex than it seems. An open letter calling for a ban on lethal weapons controlled by AI machines was signed by thousands of scientists and technologists at the International Joint Conference on Artificial Intelligence, held from July 25 to 31, 2015 in Buenos Aires, Argentina (see [1]). The letter states:
“Artificial Intelligence (AI) technology has reached a point where the deployment of such systems is—practically if not legally—feasible within years not decades, and the stakes are high: autonomous weapons have been described as the third revolution in warfare, after gunpowder and nuclear arms.”
At present, all robots are owned. Should we build only robots that work as machine slaves [27]? Will future AI robots or AI machines have sentiments too? Is it ethical to design and develop such robots? Are developments in ML and AI so alarming that they need to be controlled, and is it right to do so? Clear answers to these questions are not available at present, but hopefully they will evolve with time.
Way forward
Although many CEOs want their companies to be called AI- or ML-powered, only a few companies have actually made significant investments in AI. These include Amazon, Baidu (a Chinese company) [31], Google, IBM, Microsoft, Tesla Motors, Facebook and NVIDIA.
Automation and AI will be the main cause of the elimination of some jobs, a fact we should all be ready to face very soon [20]. The number of chatbots [29] in the market for customer service is increasing day by day, and we do see real robots on the factory floor, as companies find them useful because they result in savings. But we believe companies would be wise to use AI first where there is computer-to-computer communication. So companies will, for the time being, be busy collecting the low-hanging fruit.

Christmas 2017 [22] may see the introduction of new AI-based smart toys and gadgets for children and adults. These toys could converse with your kids and learn to adapt to their speech patterns and areas of interest. Voice may become the de facto interface for man-machine interaction. A robot may take your job, but that time has not come yet.

ML is essentially a different way of developing a model on a computer, one that requires training the model with a lot of sample data; experience has clearly borne this out. General AI [1, 4], which we also refer to as super-intelligence, still remains a distant goal, depending on the specific domain of the “intelligence” being learned. To be sure, computers trained using ML hold great potential, as well as the possibility of huge disruption in industry.
It has not been long since AI- and ML-based startups began popping up in India. About a dozen came up during 2016 alone [23]. These startups will hopefully solve some real-world problems and put India on the ML and AI map. Designing technology with the best intentions can still lead to disaster (see [24]); therefore, one needs to be extremely careful while designing and/or implementing ML- and AI-based apps [1].
References
[1] Progress and Perils of Artificial Intelligence (AI) 

[2] Invited Chapter 6 - Evolutionary Algorithms and Neural Networks, Pages 111-136, R.G.S. Asthana, in book Soft Computing and Intelligent Systems (Theory and Applications), Academic Press Series in Engineering, Edited by: Naresh K. Sinha, Madan M. Gupta and Lotfi A. Zadeh, ISBN: 978-0-12-646490-0

http://www.sciencedirect.com/science/book/9780126464900

[3] Future 2030 by Dr. RGS Asthana, Senior Member IEEE

[4] Machine Learning (ML) and Artificial Intelligence (AI) – Part 1, by Dr. RGS Asthana, Senior Member IEEE

[5] The year in ML (Part two)

[6] How are businesses using artificial intelligence? 13 enterprise uses for AI and machine learning

[7] How machine learning is revolutionizing digital enterprises

[8] The Top 10 AI And Machine Learning Use Cases Everyone Should Know About
[9] Deep Instinct
[10] Internet of Things (IoT)
[11] Big Data & ML: Case study of a Fitness Product Recommender application

[12] How Google uses machine learning in its search algorithms

http://searchengineland.com/google-uses-machine-learning-search-algorithms-261158

[13] FAQ: All about the Google RankBrain algorithm
[14] Natural Language Processing
[15] Disruption: FinTech – Artificial Intelligence in Financial Trading
[16] How Machine Learning, Big Data And AI Are Changing Healthcare Forever
[17] IBM's Watson is better at diagnosing cancer than human http://www.wired.co.uk/article/ibm-watson-medical-doctor
[18] Fraud Detection Using Deep Learning ML Techniques at Paypal
[19] Top 10 Deep Learning Projects on Github

[20] How Companies Are Already Using AI

[21] 13 frameworks for mastering machine learning

[22] Five Things to Watch in AI and ML in 2017

[23] 10 super exciting Data Science / Machine Learning / Artificial Intelligence based startups in India
[24] What Frankenstein can teach Engineers by G. Pascal Zachary, in Opinion Section of IEEE Spectrum, Feb. 2017 issue, page 6.
[25] AI plus the Internet of Things (IoT) – 3 Examples worth Learning from
[26] Military Robots: Armed, but How Dangerous?
[27] Do we have to build Robots that need rights? by Susan Hassler, in Opinion Section of IEEE Spectrum, Mar. 2017 issue, page 6.

[28] Machine Learning: What are best tools/softwares for data visualization for machine learning applications?

https://www.quora.com/Machine-Learning-What-are-best-tools-softwares-for-data-visualization-for-machine-learning-applications

[29] The complete Beginner’s guide to Chatbots
[30] Run Book
[32] The Best Open Source Machine Learning Frameworks

[33] Artificial Intelligence and Security: Current Applications and Tomorrow’s Potentials

https://www.techemergence.com/artificial-intelligence-and-security-applications/
