Wednesday, 12 April 2017


Machine Learning (ML) and Artificial Intelligence (AI) – Part 1  
by
Dr. RGS Asthana
Senior Member IEEE
Summary
There is a difference between AI and machine learning (ML), although today many use these terms interchangeably. As per [40], ML is where a computer teaches itself how to do something, rather than being taught by humans or following detailed programming. Similarly, AI is used to refer to computer systems that are designed to learn and make connections. I leave it to the reader to decide how sharp the distinction really is, since many researchers use the terms interchangeably.
We describe techniques and a few popular algorithms in AI and ML. We also take up a few ways ML is (or can be) implemented.
In the end, we describe the way forward for ML systems and what ML and AI may look like a few years from now.
Keywords: Artificial Intelligence (AI), Machine Learning (ML), Deep Learning, Neural Networks, Reinforcement Learning, Generative Adversarial Networks (GANs)
Prelude
AI is a branch of computer science that attempts to build machines capable of intelligent behavior. Stanford University defines machine learning as the science of getting computers to act without being explicitly programmed. AI [1] researchers can build smart machines, but ML experts make them truly intelligent.
As per Nidhi Chappell, head of machine learning at Intel, the main difference between AI and ML is: "AI is basically the intelligence – how we make machines intelligent, while machine learning is the implementation of the compute methods that support it. The way I think of it is: AI is the science and machine learning is the algorithms that make the machines smarter. So the enabler for AI is machine learning."
Classification of AI
AI is generally classified into three types.
·         Artificial Narrow Intelligence (ANI)  
This involves specialization in one particular thing, i.e., narrow AI: for example, beating the world champion in chess, Google's RankBrain [29, 40], the machine learning services currently running at Google such as Google Translate, IBM's Watson, the ML feature on Amazon that shows you products "recommended for you", and self-driving cars.

·         Artificial General Intelligence (AGI)   
A single AI system or algorithm that can perform any task. If an AI can perform like a human, we consider it AGI. If an AI is going to be as intelligent as the human brain, one crucial thing has to happen: the AI [30] needs to equal the brain's raw computing capacity. One way to express this capacity is in the total calculations per second the brain can manage. Think of a system that learns to drive by watching a human driver. It is difficult to believe, but it is true [42]. The result seems to match the responses you'd expect from a human driver. But what if one day it did something unexpected: crashed into a tree, sat unmoving at a green light, or hit a pedestrian crossing the road at a zebra line? This is a burning issue with AI systems: the system does the right thing, but there is no explanation of why or how it arrives at its decisions. This should not happen, but it will, unless we find ways of making techniques like deep learning more understandable to their creators and answerable to their users. Otherwise it will be hard to predict when failures might occur, and it is unavoidable that they will. Complex ML methods that can fully automate decision-making are already being used for investment and financial decisions, military decisions, and medical decisions, even though the process is altogether indecipherable. This is the cause of the concern many people have expressed about AI and ML [1].
   
How can we get to AGI? One way we know of is obvious from the following quote [30]:
"We'd build a computer whose two major skills would be doing research on AI and coding changes into itself, allowing it to not only learn but to improve its own architecture. We'd teach computers to be computer scientists so they could bootstrap their own development."
It is possible that we may achieve true AGI level around 2050.
·         Artificial Super-intelligence (ASI)
It is very likely that ASI [30] will achieve a level of intelligence that makes it smarter than all of humanity combined; it will be something entirely different from the intelligent machines we are comfortable with. The goals of an ASI system are given by its creators: e.g., your GPS's goal is to give you the most efficient driving directions, and the Watson computer's aim [31] is to understand questions posed in natural language and answer them accurately through its NLP interface. The main aim of an ASI system is to fulfill its assigned goals.
AI and Machine learning techniques
Supervised learning [33] uses an associated target for every input; the aim is to reduce the error in reaching that target. In an unsupervised task there is no associated target, so there is no credit or blame to be used in learning. In ML [34], such solutions are called targets or outputs, and situations are called inputs or unlabeled data. A situation and a solution in combination are called labeled data.
Reinforcement learning (RL)
As per Wikipedia, RL [13] is an area of ML inspired by behaviorist psychology, concerned with how software agents take actions in their surroundings with the aim of maximizing some notion of a cumulative reward function. RL is close to the way animals learn. The idea is to interpret how certain behaviors tend to result in a positive or negative outcome. Using this method, a machine can navigate a maze by trial and error and then associate the positive outcome (exiting the maze) with the actions that led up to it. This lets a machine learn without instruction or even explicit examples. By definition, RL allows machines and software agents to automatically determine the ideal behavior within a specific context, in order to maximize their performance. Simple reward feedback is used as the reinforcement signal for the agent to learn its behavior.
The machine, in fact, picks an action or a sequence of actions and gets a reward. This is used when teaching machines to play and win games, but the main drawback of the method is that it needs a large number of trials to learn even modest or simple tasks. RL appears in many disciplines, e.g., game theory, control theory, operations research, information theory, simulation-based optimization, multi-agent systems, swarm intelligence, statistics, and genetic or evolutionary [2] algorithms. In the operations research and control literature, RL methods are, in fact, close to dynamic programming.
In ML, the environment is modeled as a Markov decision process (MDP) [25], and many reinforcement learning algorithms in this setting utilize dynamic programming techniques.
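As an illustration, below is a minimal sketch of tabular Q-learning, a classic RL algorithm, in Python. The one-dimensional "maze" environment, its states, and its rewards are hypothetical placeholders invented for this sketch, not from any library.

# A minimal sketch of tabular Q-learning; the maze environment is hypothetical.
import random

N_STATES = 6            # states 0..5; state 5 is the exit of the "maze"
ACTIONS = [-1, +1]      # move left or right
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2

# Q-table: estimated cumulative reward for each (state, action) pair
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    # Hypothetical environment: reward 1 for reaching the exit, else 0.
    next_state = max(0, min(N_STATES - 1, state + action))
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

for episode in range(500):           # many trials are needed, as noted above
    state = 0
    while state != N_STATES - 1:
        # Epsilon-greedy: mostly exploit what was learned, sometimes explore
        if random.random() < EPSILON:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: Q[(state, a)])
        next_state, reward = step(state, action)
        # Q-learning update: move Q toward reward + discounted best future value
        best_next = max(Q[(next_state, a)] for a in ACTIONS)
        Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
        state = next_state

After enough episodes, following the highest-valued action in each state leads straight to the exit, which is exactly the trial-and-error maze behavior described above.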
Supervised learning (SL)
In this case, an expert communicates to the machine the correct answer for each input: for example, they show it an image of a dog and tell the machine the correct answer is "dog". It is the most common method for training neural networks [2] and other machine learning systems. In supervised learning (SL), we give all known features of the problem to the AI machine.
So, if you are training your ML system with a corresponding target for every input, it is SL [33], and after sufficient training the system will be able to provide a target for any new input. In fact, the learning algorithm seeks a function from inputs to the respective targets. If the targets are expressed as a set of classes, it is called a classification problem. Alternatively, if the target space is continuous, it is called a regression problem.
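A minimal sketch of both flavors in Python, assuming the scikit-learn library is available; the toy feature vectors and targets below are invented for illustration:

# Minimal supervised learning sketch using scikit-learn; the data is invented.
from sklearn.linear_model import LogisticRegression, LinearRegression

X_train = [[0.2, 0.7], [0.9, 0.1], [0.1, 0.8], [0.8, 0.2]]   # input features

# Classification: targets are discrete classes (0 = "cat", 1 = "dog")
y_class = [0, 1, 0, 1]
clf = LogisticRegression().fit(X_train, y_class)
print(clf.predict([[0.15, 0.75]]))    # predicted class for a new input

# Regression: targets are continuous values
y_cont = [1.4, 3.8, 1.1, 3.5]
reg = LinearRegression().fit(X_train, y_cont)
print(reg.predict([[0.15, 0.75]]))    # predicted value for a new input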
Unsupervised learning or predictive learning (USL)
Humans and animals follow this method of learning: both carefully observe how the world works. However, we do not yet know well and precisely how to inculcate this level of learning in machines. In USL, we think carefully about the organization of the data and construct a model that reproduces that arrangement of the data. Clustering is a good example of USL: it creates different clusters of inputs and is able to put any new input into the appropriate cluster.
Other than clustering, USL techniques include anomaly detection [36], Hebbian learning [35], and latent variable models [37], a class of statistical models.
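As an illustration of clustering, here is a minimal k-means sketch in Python with scikit-learn; the data points are invented. No targets are given: the algorithm discovers structure in the inputs alone.

# Minimal clustering sketch (USL) using scikit-learn's k-means; data is invented.
from sklearn.cluster import KMeans

X = [[1.0, 1.1], [0.9, 1.0], [8.0, 8.2], [7.9, 8.1]]
kmeans = KMeans(n_clusters=2, n_init=10).fit(X)
print(kmeans.labels_)                  # cluster assignment for each input
print(kmeans.predict([[0.95, 1.05]]))  # a new input is placed in a cluster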
A few Popular Algorithms in AI and Machine Learning
Neural Net Algorithms
The development of neural networks (NN) has been crucial to teaching computers to think and understand the world the way we do, while retaining inherent benefits such as speed, accuracy, and lack of bias. A neural network is a system intended to classify information in the same way a human brain does. It can be taught to recognize, for example, images, and classify them according to the elements they contain. Learning is inculcated through a feedback loop that informs the NN whether its decisions are right or wrong; based on the feedback, the NN modifies the approach it takes in the future. The most popular artificial neural network (ANN) algorithms [2] are listed below (a minimal perceptron sketch follows the list):
·         Perceptron
·         Back-Propagation
·         Hopfield Network
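The following is a minimal sketch of the perceptron learning rule in Python with NumPy, learning the logical AND function, a standard textbook example (the data and hyperparameters are illustrative):

# Minimal perceptron sketch (NumPy): learning the AND function.
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])   # inputs
y = np.array([0, 0, 0, 1])                       # targets (logical AND)

w = np.zeros(2)      # weights
b = 0.0              # bias
lr = 0.1             # learning rate

for epoch in range(20):
    for xi, target in zip(X, y):
        output = 1 if np.dot(w, xi) + b > 0 else 0   # step activation
        # Perceptron rule: adjust weights in proportion to the error
        error = target - output
        w += lr * error * xi
        b += lr * error

print([1 if np.dot(w, xi) + b > 0 else 0 for xi in X])   # expect [0, 0, 0, 1]

This is the feedback loop mentioned above in its simplest form: each wrong decision nudges the weights, and the network's future behavior changes accordingly.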
Deep Learning Algorithms
The most popular deep learning algorithms are:
Deep Belief Networks (DBN): In ML, a DBN is a generative graphical model, or alternatively a type of deep neural network, composed of multiple layers of latent variables ("hidden units"), with connections between the layers but not between units within each layer. "Neural networks" (NNs) is a term often used to refer to feed-forward neural networks, and Deep Neural Networks (DNNs) are feed-forward neural networks with many layers. It may be noted that a DBN is not the same as a DNN: a DBN has undirected connections between some layers, so the topologies of a DBN and a DNN differ by definition.
Hinton [4] discovered that better results could be obtained in deeper architectures when each layer is pre-trained with an unsupervised learning algorithm. The network can then be trained in a supervised way using back-propagation in order to "fine-tune" the weights. An RBM, or restricted Boltzmann machine, is a Boltzmann machine whose nodes must form a bipartite graph (i.e., there are no connections between nodes within the same layer). A DBM is simply a deep Boltzmann machine. A DBN basically comprises stacked RBMs which are trained layer-wise (see Figure 1).
Figure 1: DBN Structure [6]
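To make the layer-wise idea concrete, below is a minimal sketch of one contrastive-divergence (CD-1) update for a single RBM layer in Python with NumPy; stacking several such layers, each trained on the hidden activations of the previous one, yields a DBN. The sizes and data are placeholders, and bias terms are omitted for brevity.

# Minimal sketch of one CD-1 training step for an RBM (NumPy); data is invented.
import numpy as np

rng = np.random.default_rng(0)
n_visible, n_hidden, lr = 6, 3, 0.1
W = rng.normal(0, 0.01, size=(n_visible, n_hidden))  # weights of the bipartite graph

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

v0 = rng.integers(0, 2, size=n_visible).astype(float)  # a visible training vector

# Positive phase: sample hidden units given the data
h_prob0 = sigmoid(v0 @ W)
h0 = (rng.random(n_hidden) < h_prob0).astype(float)

# Negative phase: reconstruct visibles, then recompute hiddens (one Gibbs step)
v_prob1 = sigmoid(h0 @ W.T)
h_prob1 = sigmoid(v_prob1 @ W)

# CD-1 update: difference of data-driven and reconstruction-driven correlations
W += lr * (np.outer(v0, h_prob0) - np.outer(v_prob1, h_prob1))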
A bipartite graph (see Figure 2), also called a bigraph, is a graph whose vertices can be decomposed into two disjoint sets such that no two vertices within the same set are adjacent.
Figure 2: Examples of Bipartite graph [5]
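This property is easy to check programmatically; below is a minimal two-coloring sketch in Python (the example graph is invented):

# Minimal bipartiteness check by two-coloring (BFS); the graph is invented.
from collections import deque

graph = {0: [2, 3], 1: [2], 2: [0, 1], 3: [0]}   # adjacency lists

def is_bipartite(graph):
    color = {}
    for start in graph:
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v not in color:
                    color[v] = 1 - color[u]   # neighbor goes in the other set
                    queue.append(v)
                elif color[v] == color[u]:    # two adjacent vertices in one set
                    return False
    return True

print(is_bipartite(graph))   # True for this example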
Convolutional Neural Network (CNN): In ML, a CNN is a feed-forward artificial neural network in which the connectivity between neurons is inspired by the animal visual cortex. Individual cortical neurons respond to stimuli only within their receptive fields. The receptive fields of different neurons partially overlap such that they tile the visual field. The response of an individual neuron to stimuli within its receptive field can be approximated mathematically by a convolution operation. Convolutional networks were inspired by biological processes and are variations of the multilayer perceptron, or MLP [7] (as per Wikipedia, an MLP is a feed-forward artificial neural network (ANN) model that maps sets of input data onto a set of appropriate outputs). A CNN is designed to use minimal amounts of preprocessing. CNN applications include image and video recognition, NLP, and more.
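Below is a minimal CNN definition in PyTorch (a library introduced later in this article); the layer sizes and the 28x28 input are illustrative choices, not from any particular paper:

# Minimal CNN sketch in PyTorch; layer sizes are illustrative.
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    def __init__(self, n_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1),  # small local receptive fields
            nn.ReLU(),
            nn.MaxPool2d(2),                            # 28x28 -> 14x14
            nn.Conv2d(8, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),                            # 14x14 -> 7x7
        )
        self.classifier = nn.Linear(16 * 7 * 7, n_classes)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))

out = SmallCNN()(torch.randn(1, 1, 28, 28))   # one fake 28x28 grayscale image
print(out.shape)                              # torch.Size([1, 10])

Each convolution slides the same small filter over the image, which is exactly the overlapping-receptive-field idea described above.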
Generative Adversarial Networks (GANs) [19]
Neural Information Processing Systems (NIPS) is an annual event, and at NIPS 2016, "reinforcement learning" (RL) [13, 17] and "generative adversarial networks" (GANs) [19] were the in thing in AI. The idea of RL has been around for decades, but combining it with large neural networks provides the power needed to make it work on really complex problems (like the game of Go, which originated in China more than 2,500 years ago). AlphaGo [15] worked out for itself, by trial and error, how to play the game at an expert level.
Work on GANs was initiated by Ian Goodfellow [14]. A GAN is a system consisting of two independent networks. One network, referred to as 'G' (the generator), generates new data after learning from a given (real) training set. The other network, referred to as 'D' (the discriminator), tries to differentiate between real and fake data. This approach could be used to generate video-game scenery, de-blur pixelated video footage, or apply stylistic changes to computer-generated designs. In fact, both GANs and RL improve the performance of unsupervised machines (neural networks [2, 15, 19]).
For example, consider an image discriminator network 'D' which identifies a series of images depicting a set of animals. Now consider an adversary network 'G' whose mission is to fool 'D' using judiciously created images that look very nearly right. This can be done by picking a genuine sample randomly from the given training set and synthesizing a new image by randomly altering its known features. For instance, 'G' can fetch the image of a dog and add an extra eye, converting it into a false sample. The result is an image very similar to a normal dog, with the exception of the number of eyes. Note that false samples are created only from the original samples, and 'D' and 'G' are totally independent nets. The aim of network 'G' is to fool network 'D' to the maximum extent (see Figure 3). In the ultimate case, both 'D' and 'G' improve their performance over time until 'G' becomes a "master forger" and 'D' is at a loss, i.e., it is "unable to differentiate between the two distributions" [29].
During training, 'D' is presented with a random mix of genuine images from the training data and fake images generated by 'G'; 'D's task is to distinguish the genuine images from the fakes. Based on the outcome, both machines fine-tune their parameters and become better at what they do. If 'D' makes the right prediction, 'G' updates its parameters in order to generate better fake samples to confuse 'D'. If 'D's prediction is incorrect, it tries to learn from its mistake to avoid similar mistakes in the future. The reward for net 'D' is the number of right predictions, and the reward for 'G' is the number of errors 'D' commits. This process continues until an equilibrium is reached, which optimizes 'D's training.
Goodfellow [14] has shown that network 'G' performs a form of USL on the original dataset. Further, Yann LeCun [27], Director of AI Research at Facebook and Founding Director of the NYU Center for Data Science, has stated that USL is the "cake" of true AI. This powerful technique (the GAN) is claimed to be easily programmed using PyTorch [21, 26] in under 50 lines of code.
Figure 3: GAN [15]
PyTorch [21] is a Python-based machine learning library with GPU support. It can be used as easily as NumPy and is built upon the famous Torch library. Its main feature is that neural networks can be built dynamically, making way for learning more advanced and complex AI tasks.
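Here is a heavily condensed sketch of the GAN training loop described above, in PyTorch. The "real" data (a toy Gaussian) and the network sizes are invented for illustration; a practical GAN would use images and much larger networks.

# Condensed GAN training-loop sketch in PyTorch; data and sizes are toy choices.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))                # generator
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())  # discriminator
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCELoss()

for step in range(1000):
    real = torch.randn(32, 1) * 1.5 + 4.0        # "real" data: a toy Gaussian
    fake = G(torch.randn(32, 8))                 # 'G' synthesizes fake samples

    # Train 'D': reward right predictions on a mix of real and fake samples
    opt_d.zero_grad()
    loss_d = bce(D(real), torch.ones(32, 1)) + bce(D(fake.detach()), torch.zeros(32, 1))
    loss_d.backward()
    opt_d.step()

    # Train 'G': its reward is the errors 'D' commits on fake samples
    opt_g.zero_grad()
    loss_g = bce(D(fake), torch.ones(32, 1))     # try to fool 'D' into saying "real"
    loss_g.backward()
    opt_g.step()

Over many steps, 'G's samples drift toward the real distribution while 'D' keeps trying to tell them apart, which is the adversarial dynamic described above.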
A few ways ML is (or can be) implemented 
1.           ML for edge scenarios, where network accessibility is not guaranteed at all times, or the network is unavailable or irregular, e.g., automated cars or Internet of Things (IoT) [38] devices. Here one may like to use a model that has already been trained, and this model may become a core part of the application. If you hire training services, say from Google, it could be an expensive proposition that not all can afford. One also needs feedback during the training process, say in the form of visualization, which may or may not be available in all cases or from all libraries.

2.           ML as a service [8], where an app has access to a network, such as a website or games site. You can then publish the model as part of Amazon, Microsoft, or Google ML offerings. Training in this case may take place using cloud compute power, mainly from Amazon Web Services, Microsoft Azure, or Google Cloud; other cloud players include DigitalOcean and VMware Cloud. This scenario works well if the required training infrastructure is hard to conjure up and you want to derive full benefit from state-of-the-art ML libraries like TensorFlow [10], which is a pretty low-level library.

This also makes referencing the model itself easier. You enjoy full liberty to write your model in an ML-friendly language like R or Python, which would otherwise be tough to incorporate into an app written in C#, C, C++, or Java, per se. Given the option to publish your model as a cloud service, the interface between your app and the model becomes a REST API [9], and that problem is solved easily (a minimal sketch of such a call appears after this list). Building RESTful web services is part art and part science, and building REST APIs for ML is at present in its initial stages only; therefore, best methods are slowly emerging. Google Cloud ML Engine is a managed service that enables you to easily build ML models that work on any type of data, of any size. You can develop your model with the powerful TensorFlow framework [10, 24] from Google that powers many Google products, including Google Photos and Google Cloud Speech. Your trained model is immediately available for use with a global prediction platform that can support thousands of users and TBs of data. The service is integrated with Google Cloud Dataflow for pre-processing, allowing you to access data from Google Cloud Storage [12], Google BigQuery [11], and others. However, the catch is the cost, which depends on how many times one needs to train the model.
 
3.           NLP applications attempt to understand natural human communication, either written or spoken, and respond to us in similar natural language. ML is used here to help machines understand the vast distinctions, variations, and semantics (to be more precise, lexical semantics) in human language, and to learn to respond in a way that a particular audience is likely to understand. Cognitive Services [23] offer pre-trained models to developers for common scenarios, like natural language processing (NLP), recommendation engines, and semantic search, by making use of ML. Use of these services makes applying ML to applications easier. The first two approaches require a data science background; one can, however, use cognitive services as a third approach, leveraging canned models for common workflows.
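As mentioned in point 2 above, once a model is published as a cloud service, the app consumes it over a REST API. Below is a minimal sketch in Python using the requests library; the endpoint URL, API key, and JSON shape are hypothetical, as each provider defines its own.

# Minimal sketch of calling a cloud-hosted model over REST (Python + requests).
# The endpoint URL, API key, and JSON schema below are hypothetical.
import requests

ENDPOINT = "https://example.com/v1/models/my-model:predict"   # hypothetical
API_KEY = "YOUR_API_KEY"                                      # hypothetical

payload = {"instances": [{"feature_a": 0.42, "feature_b": "red"}]}
response = requests.post(
    ENDPOINT,
    json=payload,
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=10,
)
response.raise_for_status()
print(response.json())   # e.g. {"predictions": [...]}, per the provider's schema

Because the interface is plain HTTP and JSON, the same call can be made from an app written in C#, C++, Java, or any other language, which is the point made in item 2.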
Way forward
AI's scientific challenge [18] comprises providing new heuristic-based computational models that deliver the wide range of capabilities credited to the human brain and are thought to be beneficial even for nonhuman intelligence. Common logical foundations help in understanding knowledge representation, planning, problem solving, reasoning, and some aspects of NLP, whereas economic ideas and the mathematics of Markov decision processes help unify probabilistic forecasting, fault diagnosis and repair, reinforcement learning, robot control, and many aspects of speech recognition and image processing. Many of these efforts cross disciplinary boundaries and lead to integration with other fields. AI has for some time included logic, philosophy, psychology, and linguistics in its domain; the inclusion of economics, decision theory, control theory, and operations research has served as a focus for more recent efforts. The next area is the integration of big data results with AI: to see and analyze how these systems can derive new heuristics from the large amounts of data processed using big data methods, instead of heavy, time-consuming mathematical computations [41]. AI may be useful even in handling Big Data [3, 20].
In this study [22], ML and data science skills were examined along with languages like C, Java, C++, C#, and JavaScript. Python and R were included as they are known to be popular languages for ML and data science. The study also included languages like Scala and Julia. Figure 4 shows some of the interesting results that were observed.

Figure 4: Percentage of matching job postings based on languages [22]
An engineering team includes ML engineers and data scientists. Their roles [41] are given below:
·         Machine learning engineers build, implement, and maintain production machine learning systems.
·         Data scientists conduct research to generate ideas about machine learning projects, and perform analysis to understand the metrics impact of machine learning systems.
A few specific use cases of ML and AI [28] pertain to data and personal security, financial trading, healthcare, personalization in marketing, fraud detection, NLP, search, and smart cars.
In fact, humans are limited by slow biological evolution, could not compete with ever-evolving AI, and would in all likelihood be superseded one day, though nobody knows when. This trend is shown in Figure 5 [29]. AI- and ML-based automation technologies, including big data, self-driving cars, and robotics [3, 39], now play a noticeable role in everyday life. Their prospective effect on the workplace has, naturally, not only become a main focus of research but has also attracted a lot of public apprehension. The main concern is to guess which jobs will or won't be replaced by machines [39].
Figure 5: Growth of AI and ML [29]
RankBrain [29, 40] from Google, a new AI/ML algorithm, is now used by Google for search ranking; it falls in the ANI category. The RankBrain algorithm decides what mixture of core algorithms should be used to get the best search results. For instance, in certain search results, RankBrain might learn that the most important signal is the META title, and adding more significance to the META-title-matching algorithm might lead to a better searcher experience. But in another search result, this very same signal might have a horrible correlation with a good searcher experience, so in that other vertical another algorithm, maybe PageRank, is used by Google. This implies that, in each search result, Google uses a completely different mix of algorithms. RankBrain, in fact, is only a computer program used to sort through billions of pages and select the ones it finds most relevant for a particular query; the overall search algorithm of Google is called 'Hummingbird'. As per Google [40], the gradual rollout of RankBrain began in early 2015. Google's approach of using ML for page ranking makes life difficult for the SEO industry, which is trying to catch up with Google and modify websites to get better page rankings. Thus the use of Google's ML/AI tool called RankBrain has really changed the future of the SEO industry. RankBrain and other forms of AI will keep improving with time and at some point surpass the human brain, and at that point, nobody knows where this technology will lead us.
Just a few months back, AI bots were given the task of learning how to talk to each other and developed their own language. The bots created a way of communicating with each other, but they did not use words the way humans think of them; rather, the bots generated sets of numbers, which researchers later labeled with English words [32] for easy understanding.
It is expected that a fully grown AGI level will be achieved by around the year 2050, and some feel that it may take another 10 to 50 years thereafter to achieve a fully grown ASI level.
References
[1] Progress and Perils of Artificial Intelligence (AI) 

[2] Invited Chapter 6 - Evolutionary Algorithms and Neural Networks, Pages 111-136, R.G.S. Asthana, in Soft Computing and Intelligent Systems (Theory and Applications), Academic Press Series in Engineering, Edited by: Naresh K. Sinha, Madan M. Gupta and Lotfi A. Zadeh, ISBN: 978-0-12-646490-0

http://www.sciencedirect.com/science/book/9780126464900
[3] Future 2030 by Dr. RGS Asthana, Senior Member IEEE
[4] 'A Fast Learning Algorithm for Deep Belief Nets', by Geoffrey E. Hinton, Simon Osindero and Yee-Whye Teh, posted online May 17, 2006 (doi:10.1162/neco.2006.18.7.1527), © 2006 Massachusetts Institute of Technology
[5] Wolfram MathWorld: Bipartite Graph
[6] DBN Structure
[7] Multi-layer Perceptron
[8] Cloud machine learning engine: ML on any data, of any size

[9] Google cloud platform: Learn REST: A RESTful Tutorial

[10] TensorFlow
[11] BigQuery
[12] Cloud Storage
[13] Reinforcement Learning
[14] Generative Adversarial Networks – Hot Topic in Machine Learning

[15] Explore the AlphaGo Games

[16] Neural Networks

[17] Reinforcement Learning

http://reinforcementlearning.ai-depot.com/
[18] Strategic Directions in Artificial Intelligence
[19] Generative Adversarial Networks (GANs) in 50 lines of code (PyTorch)
[20] Big Data Supporting Deep Learning, AI and More in 2017
[21] PyTorch — the future of ML Frameworks?
[22] Most Popular Programming Languages for ML and Data Science
https://fossbytes.com/popular-top-programming-languages-machine-learning-data-science/
[23] Microsoft Azure: Cognitive Services
[24] Google releases TensorFlow 1.0 with new machine learning tools
[25] Markov Decision Process
[26] PyTorch
[27] Yann LeCun
[28] The Top 10 AI And Machine Learning Use Cases Everyone Should Know About
[29] Artificial intelligence is changing SEO faster than you think
[30] AI Revolution
[31] Watson computer
[32] These AI bots created their own language to talk to each other
[33] Supervised Learning
[34] What is the difference between supervised and unsupervised learning algorithms?
[35] Hebbian theory
[36] Cross validated: What algorithm should I use to detect anomalies on time Series
[37] Latent variable model
[38] Internet of Things (IoT)
[39] Where machines could replace humans—and where they can’t (yet)
http://www.mckinsey.com/business-functions/digital-mckinsey/our-insights/where-machines-could-replace-humans-and-where-they-cant-yet?cid=eml-web
[40] FAQ: All about the Google RankBrain algorithm
[41] Top 15 Frameworks for Machine Learning Experts

[42] The Dark Secret at the Heart of AI