Ai Algorithms PDF
H3 TRENDS IN AI ALGORITHMS:
THE INFOSYS WAY
Abstract
Artificial Intelligence algorithms are the wheels of AI. Realizing the art of the possible in AI applications requires a deep understanding of these algorithms. This paper offers a perspective on the landscape of AI algorithms that will shape key advancements across industries.
Today, technology adoption is influenced by business and technology uncertainties. These uncertainties drive organisations to evaluate technology adoptions based on risks and returns. Broadly, technology-led disruptions can be classified into Horizon 1, Horizon 2 and Horizon 3. Horizon 1 or H1 technologies are those that are in mainstream client adoption and have steady business transactions, while H2 and H3 are those that are yet to become mainstream but have started to open up interesting possibilities and potential returns in the future.

At Infosys Center for Emerging Technology Solutions (iCETS), we continuously look at H2 and H3 technologies and their impact on client landscapes. These H2 and H3 technologies are important to monitor as they have the potential to transform or disrupt existing well-oiled business models, hence fetching large returns. However, there are also associated risks from adoption that need to be monitored, as some of those can have a higher negative impact on compliance, safety and so on.

With the emergence and availability of several open datasets, the computational thrust from GPU availability and the maturity of Artificial Intelligence (AI) algorithms, AI is making strong inroads into current and future IT ecosystems. Today, AI plays an integral role in IT strategy by driving new experiences and creating a new art of the possible. In this paper, we look at important AI algorithms that are shaping various H3 AI possibilities. While we do that, here is a chart representing the broader AI algorithm landscape in the context of this paper.
[Figure: AI algorithm landscape chart. Recoverable labels: Emerging Investment; Business Uncertainty; Differentiate, Diversify, Deploy. Mainstream algorithms: Logistic Regression, Naive Bayes, Random Forest, Support Vector Machines (SVM). Mainstream uses: Recommendations, Prediction, Document and image classification.]
Figure 2.0: Feature Visualisation. Source: Olah, et al. 2017 (CC-BY 4.0)
Network dissection helps in associating these established units to concepts. They learn from labeled concepts during supervised training stages, along with how and to what magnitude these are influenced by channel activations.

Several frameworks are currently evolving to improve the explainability of models. Two known frameworks in this space are LIME and SHAP.

LIME (Local Interpretable Model-agnostic Explanations): It treats the model as a black box and creates a surrogate model in which explainability is supported or feasible, such as SVM, Random Forest or Logistic Regression. In LIME as published, the surrogate is typically a simple linear model fitted locally around the prediction being explained. The surrogate model is then used to evaluate different components of the image by perturbing the inputs and evaluating the impact on the result, thereby deciding which parts of the image are most important in arriving at the result. Since the original model does not participate directly, the approach is model independent. The challenge with this approach is that even when the surrogate-based explanations are relevant to the model they are used on, they may not generalize precisely or map one-to-one to the original model all the time.
Figure 3.0: Explaining a Prediction with LIME. Source: Pol Ferrando, Understanding how LIME explains predictions
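The perturb-and-fit idea behind LIME can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the LIME library itself: it uses numeric features rather than image regions, a Gaussian proximity kernel, and a weighted linear surrogate. The names `explain_locally`, `solve_linear_system` and `opaque_model` are hypothetical.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def explain_locally(black_box, instance, n_samples=500, scale=0.5, width=1.0, seed=0):
    """Fit a proximity-weighted linear surrogate around `instance`.

    Returns one coefficient per input feature; a larger magnitude means the
    feature matters more to the black-box output near this instance.
    """
    rng = random.Random(seed)
    d = len(instance)
    # Accumulate the weighted normal equations (X^T W X) beta = X^T W y,
    # where each row of X is [1, x_1, ..., x_d].
    xtwx = [[0.0] * (d + 1) for _ in range(d + 1)]
    xtwy = [0.0] * (d + 1)
    for _ in range(n_samples):
        # Perturb the input and query the black box; its internals are never used.
        sample = [v + rng.gauss(0.0, scale) for v in instance]
        y = black_box(sample)
        # Samples closer to the instance get a larger weight.
        dist2 = sum((s - v) ** 2 for s, v in zip(sample, instance))
        w = math.exp(-dist2 / (width ** 2))
        row = [1.0] + sample
        for i in range(d + 1):
            xtwy[i] += w * row[i] * y
            for j in range(d + 1):
                xtwx[i][j] += w * row[i] * row[j]
    beta = solve_linear_system(xtwx, xtwy)
    return beta[1:]  # drop the intercept


def opaque_model(x):
    # Hypothetical black box: depends strongly on x[0], weakly on x[1], not on x[2].
    return 3.0 * x[0] - 0.5 * x[1] + 0.0 * x[2]


importance = explain_locally(opaque_model, [1.0, 2.0, 3.0])
```

Because the stand-in black box here happens to be linear, the surrogate recovers its coefficients almost exactly. For a genuinely non-linear model the coefficients would only describe behaviour near the chosen instance, which is the generalization limit noted above.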
Generative AI
Generative AI will have a potentially strong role in creative work, be it writing articles, creating completely new images from the existing set of trained models, improving image or video quality, merging images for artistic creations, creating music or improving datasets through data generation. As it matures, Generative AI will augment many jobs in the near term and will potentially replace many in the future.

Generative Networks consist of two deep neural networks, a generative network and a discriminative network; this pairing is known as a Generative Adversarial Network (GAN). They work together to provide high-level simulation of conceptual tasks.

To train a Generative model, we first collect a large amount of data in some domain (e.g., think millions of images, sentences, or sounds, etc.) and then train the model to generate similar data. The Generative Network generates data to fool the Discriminative Network, while the Discriminative Network learns by identifying real vs fake data received from the Generative Network.

The generator trains with an objective function based on whether it can fool the discriminator network, whereas the discriminator trains on its ability to not be fooled and to correctly identify real vs fake. Both networks learn through backpropagation. The generator is typically a deconvolutional neural network, and the discriminator is a convolutional neural network.

Generative Networks can be of multiple types depending on the objective they are designed for, for example:

Neural Style Transfer (NST)

Neural Style Transfer (NST) is one of the Generative AI techniques in deep learning. As seen below, it merges two images, namely a "content" image (C) and a "style" image (S), to create a "generated" image (G). The generated image G combines the "content" of image C with the "style" of image S.
Figure 4.0 : Novel Artistic Images through Neural Style Transfer. Source: Fisseha Berhane, Deep Learning & Art: Neural Style Transfer
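For the generator and discriminator training described earlier in this section, the two competing objectives can be written as simple loss functions over the discriminator's output probabilities. This is a minimal sketch that assumes binary cross-entropy losses and the commonly used non-saturating generator loss; the function names are hypothetical and no network or backpropagation step is shown.

```python
import math


def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy: push D(real) toward 1 and D(fake) toward 0."""
    real_term = -sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake):
    """Non-saturating generator loss: push D(fake) toward 1, i.e. fool D."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)


# Discriminator outputs (probability of "real") for a batch of real samples,
# and for generated samples in two hypothetical situations.
d_on_real = [0.9, 0.8]
d_on_fake_caught = [0.1, 0.2]   # discriminator spots the fakes
d_on_fake_fooled = [0.9, 0.8]   # discriminator is fooled

loss_d_good = discriminator_loss(d_on_real, d_on_fake_caught)
loss_d_fooled = discriminator_loss(d_on_real, d_on_fake_fooled)
loss_g_caught = generator_loss(d_on_fake_caught)
loss_g_fooled = generator_loss(d_on_fake_fooled)
```

The generator's loss falls when the discriminator is fooled, while the discriminator's loss rises in the same situation, which is the tug of war that training alternates between.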
Some of the other GAN variations that are popular are

· Super Resolution GAN (SRGAN) that helps improve quality of images.
· Stack-GAN that generates realistic looking photographs from textual descriptions of simple objects like birds and flowers.
· Sketch-GAN, a Generative model for vector drawings, which is a Recurrent Neural Network (RNN) and is able to construct stroke-based drawings of common objects. The model is trained on a dataset of human-drawn images representing many different classes.
· eGANs (Evolutionary Generative Adversarial Networks) that generate photographs of faces with different ages, from young to old.
· IcGAN, to reconstruct photographs of faces with specific features, such as changes in hair color, style, facial expression, and even gender.
In Fine Grained Classification, the progression through the 8-layer CNN can be thought of as a progression from low to mid to high-level features. The later layers aggregate more complex structural information across larger scales: sequences of convolutional layers interleaved with max-pooling can capture deformable parts, and fully connected layers can capture complex co-occurrence statistics.

Bird recognition is one of the major examples in fine grained classification. In the image below, given a test image, groups of detected key points are used to compute multiple warped image regions that are aligned with prototypical models. Each region is fed through a deep convolutional network, and features are extracted from multiple layers, after which they are concatenated and fed to a classifier.
Figure 5.0: Bird Recognition Pipeline Overview. Source: Branson, Van Hoen et al.: Bird Species Categorization
Figure 6.0: Car Detection System. Source: Learning Features and Parts for Fine-Grained Recognition
Capsule Network
Convolutional Networks are so far the de facto and well accepted algorithms for working with image based datasets. They work on the pixels of images using filters of various sizes (channels), convolving and then using pooling techniques to bubble up the stronger features, deriving colors, textures, edges and shapes and establishing structures from the lowest to the highest layers.

Given the face of a person, a CNN identifies the face by establishing the eyes, ears, eyebrows, lips, chin and other components of the face. However, if the facial image is provided with incorrect position and alignment of eyes and eyebrows, or say the eyebrows are swapped with the lips and the ears are placed on the forehead, the same trained CNN would still go on and detect this as a human face. This is a major drawback of the CNN algorithm and happens due to its inability to store information on the relative position of various objects.

Capsule Network, invented by Geoffrey Hinton, addresses exactly this problem of CNN by storing the spatial relationships of various parts.

Capsule Networks, like CNNs, are multi layered neural networks. They consist of several capsules, and each capsule consists of several neurons. Capsules in lower layers are called primary capsules and are trained to detect an object (e.g. triangle, circle) within a given region of the image. Each outputs a vector that has two properties: Length and Orientation. Length represents the probability of the presence of the object, and Orientation represents the pose parameters of the object, such as coordinates, rotation angle, etc.

Capsules in higher layers, called routing capsules, detect larger and more complex objects, such as eyes, ears, etc.
Figure 8.0: Capsule Network for House or Boat classification. Source: Beginners’ Guide to Capsule Networks
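To make the "length as probability, orientation as pose" idea concrete, the sketch below applies a squashing non-linearity to a capsule's raw output vector. The squashing formula is the one commonly used in dynamic-routing capsule networks and is an assumption here, since this paper does not name a specific function; the helper names are hypothetical.

```python
import math


def squash(vector):
    """Shrink a capsule's raw output so its length lies in [0, 1), keeping direction."""
    norm_sq = sum(v * v for v in vector)
    if norm_sq == 0.0:
        return [0.0 for _ in vector]
    norm = math.sqrt(norm_sq)
    scale = norm_sq / (1.0 + norm_sq) / norm
    return [scale * v for v in vector]


def length(vector):
    """Euclidean length, read as the probability that the object is present."""
    return math.sqrt(sum(v * v for v in vector))


strong = squash([3.0, 4.0])    # raw length 5: object very likely present
weak = squash([0.03, 0.04])    # raw length 0.05: object very likely absent
```

Here `length(strong)` is 25/26 (about 0.96) and `length(weak)` is close to zero, while both vectors keep the 3:4 direction of their inputs, so the pose information survives the rescaling.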
Routing by Agreement

Unlike CNN, which primarily bubbles up higher order features using max. or avg. pooling, Capsule Network bubbles up features using routing by agreement, where every capsule participates in choosing the shape by voting (democratic election way).

In the figure given above,

· Lower level corresponds to rectangles, triangles and circles.
· High level corresponds to houses, boats, and cars.

If there is an image of a house, the capsules corresponding to rectangles and triangles will have large activation vectors. Their relative positions (coded in their instantiation parameters) will bet on the presence of high-level objects. Since they will agree on the presence of a house, the output vector of the house capsule will become large. This, in turn, will make the predictions by the rectangle and the triangle capsules larger. This cycle will repeat 4-5 times, after which the bets on the presence of a house will be considerably larger than the bets on the presence of a boat or a car.

Advantage over CNN

· Less data for training - Capsule Networks need far less data for training (almost 10%) as compared to CNN.
· Fewer parameters - The connections between layers require fewer parameters as a capsule groups neurons, resulting in relatively less computational bandwidth.
· Preserve pose and position - They preserve pose and position information, as against CNN.
· High accuracy - Capsule Networks have higher accuracy as compared to CNNs.
· Reconstruction vs mere classification - CNN helps you to classify the images, but not reconstruct the same image, whereas Capsule Networks help you to reconstruct the exact image.
· Information retention vs loss - With CNN, during edge detection, a kernel for edge detection works only on a specific angle, and each angle requires a corresponding kernel. When dealing with edges CNN works well, because there are very few ways to describe an edge. Once we get up to the level of shapes, we do not want to have a kernel for every angle of rectangles, ovals, triangles, and so on. It would get unwieldy, and would become even worse when dealing with more complicated shapes that have 3 dimensional rotations and features like lighting, which is the reason why traditional neural nets do not handle unseen rotations effectively.

Capsule Networks are best suited for object detection and image segmentation, as they help better model hierarchical relationships and provide high accuracy. However, Capsule Networks are still under research, relatively new, and mostly tested and benchmarked on the MNIST dataset, but they will be the future in working with the massive use cases emerging from Vision datasets.
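The house-versus-boat voting in the routing by agreement discussion can be sketched as a small loop. This is an illustrative toy, assuming two lower capsules (rectangle, triangle), two higher capsules (house, boat), hand-written 2-dimensional vote vectors, and the squashing function commonly used with dynamic routing; none of these numbers come from a trained network.

```python
import math


def squash(vector):
    """Rescale a vector so its length lies in [0, 1) while keeping its direction."""
    norm_sq = sum(v * v for v in vector)
    if norm_sq == 0.0:
        return [0.0 for _ in vector]
    return [v * norm_sq / (1.0 + norm_sq) / math.sqrt(norm_sq) for v in vector]


def softmax(values):
    top = max(values)
    exps = [math.exp(v - top) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def length(vector):
    return math.sqrt(sum(v * v for v in vector))


def route_by_agreement(votes, iterations=4):
    """votes[i][j] is lower capsule i's predicted output vector for higher capsule j."""
    n_lower, n_upper, dim = len(votes), len(votes[0]), len(votes[0][0])
    logits = [[0.0] * n_upper for _ in range(n_lower)]
    outputs = []
    for _ in range(iterations):
        # Each lower capsule splits its vote across the higher capsules.
        coupling = [softmax(row) for row in logits]
        outputs = []
        for j in range(n_upper):
            total = [sum(coupling[i][j] * votes[i][j][k] for i in range(n_lower))
                     for k in range(dim)]
            outputs.append(squash(total))
        # Votes that agree with a higher capsule's output gain weight next round.
        for i in range(n_lower):
            for j in range(n_upper):
                logits[i][j] += sum(a * b for a, b in zip(votes[i][j], outputs[j]))
    return outputs


# Hypothetical votes: both parts agree on the house pose, but disagree on the boat pose.
votes = [
    [[1.0, 0.0], [1.0, 0.0]],    # rectangle capsule: vote for house, vote for boat
    [[0.9, 0.1], [-1.0, 0.0]],   # triangle capsule: vote for house, vote for boat
]
house, boat = route_by_agreement(votes)
```

After a few iterations the house capsule's output vector is much longer than the boat capsule's, because agreeing votes reinforce each other while conflicting votes largely cancel.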
Figure 9.0: Transfer Learning Layers. Source: John Cherrie, Training Deep Learning Models with Transfer Learning.
[Figure: In Q learning, a state and an action are looked up in a Q table to give a Q value. In deep Q learning, a state is fed to a deep Q neural network, which outputs a Q value for each action (action 1, action 2, action 3).]
Figure 11.0: Schema inspired by the Q learning notebook by Udacity
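The Q table side of the schema can be sketched with tabular Q-learning on a toy problem. The environment below (five states on a line, with a reward only at the right end), the hyperparameters and all function names are assumptions made for illustration; a deep Q network would replace the table with a neural network that outputs one Q value per action.

```python
import random

N_STATES = 5          # states 0..4 on a line; state 4 is the goal
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Hypothetical toy environment: reward 1.0 only when the goal is reached."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward, next_state == N_STATES - 1


def choose_action(q_row, epsilon, rng):
    """Epsilon-greedy choice with random tie-breaking."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(q_row)
    return rng.choice([a for a in ACTIONS if q_row[a] == best])


def train_q_table(episodes=300, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q_table = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = choose_action(q_table[state], epsilon, rng)
            next_state, reward, done = step(state, action)
            # Q-learning update: move Q(s, a) toward reward + gamma * max Q(s', a').
            target = reward if done else reward + gamma * max(q_table[next_state])
            q_table[state][action] += alpha * (target - q_table[state][action])
            state = next_state
    return q_table


q_table = train_q_table()
# Greedy policy read off the table for the non-terminal states.
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
```

In this toy setting the learned table prefers "move right" in every non-terminal state, and the Q values shrink by the discount factor with each extra step away from the goal.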
[Equation: the value function, annotated as the expected discounted reward given that state.]

The agent will use this value function to select which state to choose at each step.

Policy Based

There are two types of policies:

1. Deterministic: A policy which at a given state will always return the same action.
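The value-based and policy-based ideas above can be illustrated with a short sketch. It assumes a discount factor of 0.9, made-up state values, and a hand-written lookup table as the deterministic policy; the state names, actions and function names are hypothetical.

```python
def discounted_return(rewards, gamma=0.9):
    """Sum of rewards in which a reward k steps ahead is scaled by gamma**k."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Hypothetical state values. A value-based agent moves to the reachable
# state with the highest value.
state_values = {"A": 0.2, "B": 0.7, "C": 0.5}


def pick_next_state(reachable):
    return max(reachable, key=lambda s: state_values[s])


# A deterministic policy is a fixed lookup: the same state always yields the same action.
deterministic_policy = {"A": "right", "B": "up", "C": "left"}

example_return = discounted_return([1.0, 1.0, 1.0])   # 1 + 0.9 + 0.81
chosen = pick_next_state(["A", "C"])
```

The value function in the text is the expectation of such discounted returns when starting from a given state; the sketch computes the return for a single, fixed reward sequence only.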
Auto ML (AML)
Designing a machine learning solution involves several steps, such as collecting data; understanding, cleansing and normalizing data; doing feature engineering; selecting or designing the algorithm; selecting the model architecture; selecting and tuning the model’s hyper-parameters; evaluating the model’s performance; deploying and monitoring the machine learning system in an online system; and so on. Such machine learning solution design requires an expert Data Scientist to complete the pipeline.

As the complexity of these and other tasks can easily get overwhelming, the rapid growth of machine learning applications has created a demand for off-the-shelf machine learning methods that can be used easily and without expert knowledge. The AI research area that encompasses progressive automation of machine learning pipeline tasks is called AutoML (Automatic Machine Learning).

Google CEO Sundar Pichai wrote, “Designing neural nets is extremely time intensive, and requires an expertise that limits its use to a smaller community of scientists and engineers. That’s why we’ve created an approach called AutoML, showing that it’s possible for neural nets to design neural nets”, while Google’s Head of AI, Jeff Dean, suggested that 100x computational power could replace the need for machine learning expertise.

AutoML Vision relies on two core techniques: transfer learning and neural architecture search.
[Figure: AutoML system. Recoverable labels: Bayesian Optimization, Hand-crafted portfolio, ML Pipeline.]
Figure 12.0: An example of Auto sklearn pipeline. Source: André Biedenkapp, We did it Again: World Champions in AutoML
[Figure: AutoML, Hyperparameter Optimization, NAS.]
Figure 13.0: Source: Liam Li, Ameet Talwalkar, What is neural architecture search
Components of NAS
[Figure: Search Space, Optimization Method, Evaluation Method.]
Figure 14.0: Components of NAS. Source: Liam Li, Ameet Talwalkar, What is neural architecture search.
Search space: The search space provides the boundary within which the specific architecture needs to be searched. Computer Vision (captioning the scene, or product identification) based use cases would need a different neural network architecture style, as against Speech (speech transcription, or speaker classification) or unstructured Text (topic extraction, intent mining) based use cases. The search space tries to provide available catalogs of best in class architectures based on other domain data and performance. These are also usually hand crafted by expert data scientists.

Optimization method: This is responsible for providing the mechanism to search for the best architecture. It could be searched and applied randomly, or using certain statistical or Machine Learning approaches such as Bayesian methods or reinforcement learning methods.

Evaluation method: This has the role of evaluating the quality of the architecture considered by the optimization method. It could be done using a full training approach, or by applying certain specialized methods such as partial training or early stopping, weight sharing, network morphism, etc.

For selected problem spaces, NAS has outperformed manual methods and is showing definite promise for the future. However, it is still evolving and not ready for production usage, as several architectures need to be established and evaluated depending on the problem space.
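The three NAS components described above can be sketched as a toy random search. Everything here is an assumption for illustration: the search space entries, the use of plain random search as the optimization method, and especially the `evaluate` function, which is a made-up scoring formula standing in for real (full or partial) training and validation.

```python
import random

# Search space: the boundary within which candidate architectures are drawn.
SEARCH_SPACE = {
    "num_layers": [2, 4, 8],
    "units": [32, 64, 128],
    "activation": ["relu", "tanh"],
}


def sample_architecture(rng):
    """Optimization method (here: plain random search) proposes a candidate."""
    return {name: rng.choice(options) for name, options in SEARCH_SPACE.items()}


def evaluate(architecture):
    """Evaluation method placeholder.

    A real system would train the candidate, fully or partially, and return a
    validation metric. This made-up score only keeps the sketch runnable.
    """
    score = 0.5 + 0.03 * architecture["num_layers"] + 0.001 * architecture["units"]
    if architecture["activation"] == "relu":
        score += 0.05
    return score


def random_search(trials=20, seed=0):
    rng = random.Random(seed)
    best_arch, best_score = None, float("-inf")
    for _ in range(trials):
        candidate = sample_architecture(rng)
        score = evaluate(candidate)
        if score > best_score:
            best_arch, best_score = candidate, score
    return best_arch, best_score


best_arch, best_score = random_search()
```

Swapping `sample_architecture` for a Bayesian or reinforcement learning proposer changes only the optimization method, and swapping `evaluate` for partial training with early stopping changes only the evaluation method, which is why the three components are usually discussed separately.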
2 | Generative AI, Neural Style Transfer (NST) | Art Generation, Sketch Generation, Image or Video Resolution Improvements, Data Generation/Augmentation, Music Generation
3 | Fine Grained Classification | Vehicle Classification, Type of Tumor Detection
4 | Capsule Networks | Image Re-construction, Image Comparison/Matching
8 | Deep Reinforcement Learning (RL) | Intelligent Agents, Robots, Driverless cars, Traffic Light Monitoring, Continuous Learning scenarios for document review and corrections
To know more about our work on the H3 trends in AI, write to icets@infosys.com.
© 2019 Infosys Limited, Bengaluru, India. All Rights Reserved. Infosys believes the information in this document is accurate as of its publication date; such information is subject to change without notice. Infosys
acknowledges the proprietary rights of other companies to the trademarks, product names and such other intellectual property rights mentioned in this document. Except as expressly permitted, neither this
documentation nor any part of it may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, printing, photocopying, recording or otherwise, without the
prior permission of Infosys Limited and/ or any named intellectual property rights holders under this document.