by Sajan Mathew
What is this Guide?
This guide demystifies AI and democratizes AI knowledge on how it creates
and delivers value. It provides an essential understanding of AI to anyone
with varying technical knowledge, curiosity, and interest in the technology.
Why was this book written?
Thanks to the internet, knowledge today is available in abundance and at the
click of a button. This applies even to niche domains such as AI. There
are numerous articles, books and other material on AI that are easily
available on the world wide web, but the challenge is that they are either
highly technical or philosophical in nature, or they exist in fragments in the
form of long posts and blogs. Being an AI enthusiast, I believe that to build a
preferable AI future, it is important that we all have a uniform and shared
understanding of AI. I spent time educating myself on AI concepts
through several resources and have captured here some of the finest and
simplest explanations of AI technology that can help anyone learn and explain
AI to others.
I understand AI. How do I find opportunities to apply AI technology?
I am also creating an AI strategy canvas, similar to the Business Model
Canvas, which helps you think critically and creatively about opportunities for
AI application: why a problem is a fit for AI, what building blocks of AI are
needed, how to apply them, and how they will benefit the business and customers.
"Just as electricity
transformed almost
everything 100 years ago,
today I actually have a hard
time thinking of an industry
that I don’t think AI will
transform in the next
several years."
Description source: https://bit.ly/2I5pSpQ
AI today for the future
Artificial Intelligence
- Categories of Artificial Intelligence
- Types of Artificial Intelligence
Machine Learning
- Machine Learning Algorithm Types
Deep Learning
- How Deep Learning works?
- Deep Learning models
AI enabled companies
AI case studies: AI at India’s top eCom firms
- Flipkart's Project Mira
- Myntra’s Rapid platform and AI initiatives
- Amazon India's AI enabled Smart eCom
AI Today for the future
AI is not a technology that has developed in the last decade; it has been
taking shape in laboratories since the 1950s. The technology remained
with scientists and researchers in the lab until recent advances in
computing power and growth in data fueled the development and application
of AI for commercial use. AI can bring tremendous economic value to
organizations because it can perform tasks that were earlier unthinkable
for a computer, and some that are beyond human ability. In the last five
years, AI has garnered a lot of attention and has become an industry
buzzword, much as big data did, attracting enormous investments from
angels, VCs, and corporations. AI technology has developed faster than
comparable promising technologies. Unlike AR/VR,
which is still on the hype curve, AI has started to move into the mainstream.
It now lives everywhere, from smartphones to refrigerators. It has evolved
significantly in the last few years but is still far from achieving human-level
intelligence or the state of Singularity.
As AI technology matures and evolves every day, it is also unlocking
several possibilities. From beating the world champion at Go (AlphaGo) to
developing superhuman skills by playing against itself over and over, AI
technology is continuously showing that it is gaining the ability to think like
a human mind and, in some tasks, to perform better than humans. Companies are
getting closer to creating general purpose AI that can intelligently tackle
various challenges, from designing new drugs to helping farmers increase their
crop yield. With every successful AI experiment, a better future is
promised. However, the growing ability of AI to succeed at human skills is
also raising concerns and worries, the most common fear being that AI and
robots will take jobs away from humans. Michio Kaku says the jobs that are
going to be safe from automation are the ones that robots can't do, such as
non-repetitive jobs and jobs that require common sense, creativity, and
imagination. Some proponents of AI believe that if robots and automation
take away jobs, they will also create different kinds of jobs, as the
industrial revolution did in the past.
Hope you enjoy the read!
- Sajan Mathew
What is Artificial Intelligence, Machine Learning, and Deep Learning?
AI, Machine Learning and Deep Learning are very hot
buzzwords today, creating both excitement and fear.
Even though they are not quite the same thing, they
are often used interchangeably or all in the same
context. To leverage the benefits of AI technology, it
is important to understand what AI, ML and DL mean,
what they can do and what they can’t do.
Artificial Intelligence
The term artificial intelligence was coined in 1956 by Dartmouth Assistant
Professor John McCarthy. At a high level, AI could be defined as "the ability
of a machine/computer to think intelligently like humans."
Basic ‘AI’ has existed for decades, via rules-based programs that deliver
rudimentary displays of ‘intelligence’ in specific contexts. Progress, however,
has been limited, because algorithms to tackle many real-world problems,
such as predicting machine failures or identifying objects in images, are too
complex for people to program by hand.
What if we could transfer these complex activities from the programmer to
the program? This is the promise of modern artificial intelligence.
AI research has focused on five fields of enquiry:
1. Reasoning: the ability to solve problems through logical deduction.
e.g. Legal assessment; financial asset management
2. Knowledge: the ability to represent knowledge about the world.
e.g. Medical diagnosis; drug creation; media recommendation
3. Planning: the ability to set and achieve goals.
e.g. Logistics; scheduling; navigation; predictive maintenance
4. Communication: the ability to understand written and spoken language.
e.g. Voice control; intelligent agents, assistants and customer support
5. Perception: the ability to deduce things about the world from visual
images, sounds and other sensory inputs.
e.g. Autonomous vehicles; medical diagnosis; surveillance
Why is AI rising now?
Improved algorithms: There has been a huge evolution in the algorithms
that are used to provide intelligence to the machines.
Specialised hardware: The advancement in the computational and
processing hardware has slashed the time required to train a machine.
Extensive data: Data creation and availability have grown exponentially in the
last two decades, which is fueling the development of AI systems.
Interest and entrepreneurship: Interest in and awareness of AI have
increased in the last five years among big companies and startups alike, which
has attracted investment and talent, catalyzing progress.
Description source: https://bit.ly/2hfiauX
Categories of AI
AI is the branch of computer science that gives machines/computers the
ability to mimic human decision-making processes or intelligence and carry
out tasks in human ways. Even though AI is a very broad term, experts
categorize AI development into Artificial Narrow Intelligence, Artificial
General Intelligence and Artificial Super Intelligence.
ANI: Artificial Narrow Intelligence
AI that specializes in optimizing a specific area/task, such as playing chess or
recommending songs on Spotify, and that’s the only thing it does.
AGI: Artificial General Intelligence
AI that has reached or passed human intelligence and has the ability to
“reason, plan, solve problems, think abstractly, comprehend complex ideas,
learn quickly, and learn from experience.” It can handle tasks from different
areas and apply experience gathered in one area to a different area. In
comparison to a narrow AI, a general AI has all the necessary knowledge
and abilities to improve not only tomato growth in a greenhouse but
cucumber, eggplant, peppers, radishes and kohlrabi as well. Thus, a general
AI is a system that can handle more than just one specific task.
ASI: Artificial Super Intelligence
AI that achieves a level of intelligence smarter than all of humanity combined
— “ranging from just a little smarter … to one trillion times smarter.”
To date humans have been able to achieve ANI through hard-coded and
sophisticated algorithms and programs, and it now exists everywhere, from
Google search to airplanes. Currently these ANI systems don’t pose any
existential threat, but badly programmed or poorly tested AI could cause
losses. However, every advancement in AI research is maturing ANI and
bringing us closer to AGI/ASI.
Types of AI
Type I AI: Reactive machines
The most basic types of AI systems are purely reactive, and have the ability
neither to form memories nor to use past experiences to inform current
decisions. E.g. Deep Blue, IBM’s chess-playing supercomputer. Deep Blue
can identify the pieces on a chess board and know how each moves. It can
make predictions about what moves might be next for it and its opponent.
And it can choose the most optimal moves from among the possibilities. But
it doesn’t remember the past moves, or what has happened before.
Type II AI: Limited memory
This Type II class contains machines that can look into the past. E.g. Self-driving
cars. They observe other cars’ speed and direction and identify specific
objects and monitor them over time. These observations are added to the
self-driving cars’ preprogrammed representations of the world, but aren’t
saved as part of the car’s library of experience it can learn from, the way
human drivers compile experience over years behind the wheel.
Type III AI: Theory of mind
Machines in the next, more advanced, class not only form representations
about the world, but also about other agents or entities in the world. In
psychology, this is called “theory of mind” – the understanding that people,
creatures and objects in the world can have thoughts and emotions that
affect their own behavior. If AI systems are indeed ever to walk among us,
they’ll have to be able to understand that each of us has thoughts and
feelings and expectations for how we’ll be treated. And they’ll have to adjust
their behavior accordingly.
Type IV AI: Self-awareness
The final step of AI development is to build systems that can form
representations about themselves. This is, in a sense, an extension of Type
III artificial intelligence. Consciousness is also called “self-awareness”.
Conscious beings are aware of themselves, know about their internal states,
and are able to predict the feelings of others. While we are probably far from
creating machines that are self-aware, we should focus our efforts toward
understanding memory, learning and the ability to base decisions on past
experiences.
Description source: https://bit.ly/2fyeKmL
Artificial Intelligence: Anything that enables computers to think and
behave like a human
Machine Learning: A subset of Artificial Intelligence that deals with the
extraction of patterns from data sets
Deep Learning: A specific class of Machine Learning algorithms that
use complex neural networks
Machine Learning
As teaching computers became challenging and complex, Arthur Samuel in
1959 developed the idea of teaching computers to learn for themselves and
coined the term Machine Learning, a sub-field of AI. The emergence of the
internet and large streams of data made machine learning a more
efficient way to teach computers to think like humans.
At the basic level, Machine Learning is the practice of using algorithms to
parse data, learn from it, and then make a recommendation or prediction about
something in the world, find associations between events, or cluster by a
condition. Instead of hand-coding complex software routines to accomplish a
particular task, the machine is “trained” using large amounts of data and
algorithms that give it the ability to learn how to perform the task.
Labeled data: A dataset that has been tagged with one or more labels, an
input, and the desired output value.
Unlabeled data: Samples of natural or human-created artifacts that can be
obtained relatively easily from the world but lack meaningful tags that are
informative. e.g. photos, audio recordings, videos, news articles, tweets etc.
A machine learning algorithm is initially provided with examples whose outputs
are known. It notes the difference between its predictions and the correct
outputs, and tunes the weightings of the inputs to improve the accuracy of its
predictions until they are optimized, e.g. finding the probability of a person
enjoying a film in the future based on the films the person has watched in the
past. The defining characteristic of machine learning algorithms, therefore, is
that the quality of their predictions improves with experience. The more data
we provide, the better the prediction engines we can create.
First, the "training data" must be labeled and then it is "classified." When
features of the object in question are labeled and put into the system with a
set of rules it leads to a prediction. e.g. "red" and "round" are inputs into the
system that leads to the output: Apple. Similarly, a learning algorithm could
also be left alone to create its own rules that will apply when it is provided
with a large set of the object—like a group of apples, and the machine figures
out that they have properties like "round" and "red" in common.
Description source: https://tek.io/2npWvp0
ML Algorithms Types
Machine learning algorithms can be fundamentally categorized in two ways:
1. By the learning style
2. By similarity in form or function
Algorithms Grouped by Learning Style
There are different ways an algorithm can model a problem based on its
interaction with the experience or environment. It is suggested to first
consider the learning styles that an algorithm can adopt because it forces
you to think about the roles of the input data and the model preparation
process and select one that is the most appropriate for your problem in order
to get the best result.
Learning styles in machine learning algorithms:
Supervised Learning
Supervised learning is mainly used in predictive
modeling. A predictive model is basically a model
constructed from a machine learning algorithm and
features or attributes from training data such that we
can predict a value using the other values obtained
from the input data. Supervised learning algorithms
try to model relationships and dependencies between
the target prediction output and the input features
such that we can predict the output values for new
data based on those relationships which it learned
from the previous datasets.
Most of the time we are not able to figure out the true function that always
makes the correct predictions, so the algorithm relies on assumptions
made by humans about how the computer should learn, and these
assumptions introduce bias. Here the human expert acts as the teacher:
we feed the computer input data, called training data, which
has a known label or result (output) such as spam/not-spam or a stock price
at a time. A model is prepared through a training process in which it is
required to make predictions and is corrected when those predictions are
wrong. The training process continues until the model achieves a desired
level of accuracy on the training data.
Description source: https://bit.ly/2yIMqem
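The training process described above (predict, get corrected when wrong, repeat until the model is accurate on the training data) can be sketched with a perceptron, one of the simplest supervised learners. This is only an illustrative sketch: the perceptron is chosen here as an example, and the two features and four labeled emails are made up.

```python
# Hypothetical training data: ([contains "free", contains "meeting"], label),
# where label 1 = spam and 0 = not-spam.
TRAINING_DATA = [
    ([1.0, 0.0], 1),
    ([1.0, 1.0], 0),
    ([0.0, 1.0], 0),
    ([0.0, 0.0], 0),
]

def predict(weights, bias, features):
    """Return 1 if the weighted sum of the inputs is positive, else 0."""
    total = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if total > 0 else 0

def train(data, learning_rate=1.0, max_epochs=100):
    """Perceptron rule: nudge the weights only when a prediction is wrong."""
    weights = [0.0] * len(data[0][0])
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for features, label in data:
            error = label - predict(weights, bias, features)
            if error != 0:
                mistakes += 1
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, features)]
                bias += learning_rate * error
        if mistakes == 0:  # desired accuracy on the training data reached
            break
    return weights, bias

weights, bias = train(TRAINING_DATA)
print([predict(weights, bias, f) for f, _ in TRAINING_DATA])
```

The loop stops as soon as a full pass over the training data produces no corrections, which mirrors the "continue until the desired accuracy is reached" idea in the text.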
The main types of supervised learning algorithms include:
Classification algorithms: These algorithms build predictive models from
training data which have features and class labels. These predictive models
in turn use the features learned from training data on new, previously unseen
data to predict their class labels. The output classes are discrete. Types of
classification algorithms include decision trees, random forests, Support
Vector Machines (SVM), neural networks, etc.
Regression algorithms: These algorithms are used to predict output values
based on some input features obtained from the data. To do this, the
algorithm builds a model based on features and output values of the training
data and this model is used to predict values for new data. The output
values, in this case, are continuous and not discrete. Types of regression
algorithms include linear regression, multivariate regression, regression
trees, and lasso regression, among many others. Logistic regression, despite
its name, predicts class probabilities and is normally used for classification.
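As a small illustration of a regression algorithm, the sketch below fits a straight line to training data with ordinary least squares and then predicts a continuous value for a new input. The data (sizes and prices) is invented for the example and deliberately follows price = 2 * size + 1.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one input feature: y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical training data (input feature and known output value).
sizes = [1.0, 2.0, 3.0, 4.0]
prices = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(sizes, prices)
predicted_price = slope * 5.0 + intercept  # prediction for unseen input 5.0
print(slope, intercept, predicted_price)
```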
Examples of Supervised learning applications:
1. Predict creditworthiness of credit card holders: Build an ML model to look for
delinquency attributes by providing it with data on delinquent and non-delinquent
customers.
2. Predict patient readmission rates: Build a regression model by providing data
on the patient treatment regime and readmissions to show variables that best
correlate with readmissions.
3. Analyze products customers buy together: Build a model to
identify frequent item sets and association rules from transactional data.
Description source: https://bit.ly/2uoRZgr
Unsupervised Learning
Unsupervised learning algorithms are the family
of machine learning algorithms which are mainly
used in pattern detection and descriptive modeling.
Here, however, there are no output categories or labels
on which the algorithm can try to model
relationships. These algorithms apply techniques
to the input data to mine for rules, detect patterns,
and summarize and group the data points, which helps
in deriving meaningful insights and describing the data
better to the users.
The main types of unsupervised learning algorithms include:
Clustering algorithms: The main objective of these algorithms is to cluster
or group input data points into different classes or categories using just the
features derived from the input data alone and no other external information.
Unlike classification, the output labels are not known beforehand in
clustering. There are different approaches to build clustering models, such as
by using means, medoids, hierarchies, and many more. Some popular
clustering algorithms include k-means, k-medoids, and hierarchical
clustering.
Association rule learning algorithms: These algorithms are used to mine
and extract rules and patterns from data sets. These rules explain
relationships between different variables and attributes, and also depict
frequent item sets and patterns which occur in the data. These rules in turn
help discover useful insights for any business or organization from their huge
data repositories. Popular algorithms include Apriori and FP Growth.
Examples of Unsupervised learning applications:
1. Segment customers by behavioral characteristics: Survey prospects and
customers to develop multiple segments using clustering.
2. Categorize MRI data by normal or abnormal images: Use deep learning
techniques to build a model that learns different features of images to
recognize different patterns.
3. Recommend products to customers based on past purchases: Build a
collaborative filtering model based on past purchases by similar customers.
Description source: https://bit.ly/2uoRZgr
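To make the clustering idea in this section concrete, here is a minimal k-means sketch. It groups points using only their features, with no labels. The customer data and the naive choice of the first k points as starting centroids are assumptions made for the example; real implementations usually pick starting centroids more carefully.

```python
def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def k_means(points, k, iterations=10):
    """Alternate between assigning points to the nearest centroid
    and moving each centroid to the mean of its assigned points."""
    centroids = [list(p) for p in points[:k]]  # naive initialisation (assumption)
    assignments = []
    for _ in range(iterations):
        assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assignments

# Hypothetical customers: (visits per month, average basket size).
customers = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.5), (1.0, 0.5), (9.5, 8.0)]
centroids, segments = k_means(customers, k=2)
print(segments)
```

The output is a segment number per customer; what each segment means (for example "occasional" versus "frequent" shoppers) is left for a person to interpret, since the algorithm never sees labels.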
Semi-Supervised Learning
In the previous two types, labels are either present
for all the observations in the dataset or absent
for all of them. Semi-supervised
learning falls in between these two. In many practical
situations, the cost of labeling is quite high, since it
requires skilled human experts. So, in
cases where labels are absent in the majority of the
data but present in a few observations, semi-supervised
algorithms are the best candidates for model building.
These methods exploit the idea that even though the group memberships of
the unlabeled data are unknown, this data carries important information
about the group parameters.
Input data is a mixture of labeled and unlabeled examples. There is a
desired prediction problem, but the model must learn the structures to
organize the data as well as make predictions.
Semi-supervised learning can be applied to classification and regression
problems.
Example algorithms are extensions to other flexible methods that make
assumptions about how to model the unlabeled data.
Description source: https://bit.ly/2uoRZgr
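One simple way to use a few labels together with many unlabeled points is self-training: fit a model on the labeled data, pseudo-label the unlabeled points it is confident about, and refit. The sketch below is an illustration of that idea only (the text does not name a specific algorithm); the one-dimensional data, the "low"/"high" labels and the `margin` parameter are all hypothetical.

```python
def centroid(values):
    return sum(values) / len(values)

def self_train(labeled, unlabeled, margin=1.0, rounds=5):
    """labeled: list of (value, label) pairs with labels "low" or "high".
    Unlabeled values that are clearly closer to one class centroid are
    pseudo-labeled and added to the labeled set; unclear ones are left alone."""
    labeled = list(labeled)
    remaining = list(unlabeled)
    for _ in range(rounds):
        low_c = centroid([v for v, lab in labeled if lab == "low"])
        high_c = centroid([v for v, lab in labeled if lab == "high"])
        confident, unsure = [], []
        for v in remaining:
            d_low, d_high = abs(v - low_c), abs(v - high_c)
            if abs(d_low - d_high) >= margin:
                confident.append((v, "low" if d_low < d_high else "high"))
            else:
                unsure.append(v)
        if not confident:
            break
        labeled.extend(confident)
        remaining = unsure
    return labeled, remaining

# Two expert-provided labels and five unlabeled observations (made up).
final_labels, still_unlabeled = self_train(
    [(1.0, "low"), (10.0, "high")], [2.0, 3.0, 9.0, 8.0, 5.4]
)
print(final_labels, still_unlabeled)
```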
Reinforcement Learning
This method aims at using observations
gathered from the interaction with the
environment to take actions that would
maximize the reward or minimize the risk.
It allows machines and software agents to automatically determine the ideal
behavior within a specific context, in order to maximize their performance.
Simple reward feedback is required for the agent to learn its behavior; this is
known as the reinforcement signal.
There are many different algorithms that tackle this issue. As a matter of fact,
Reinforcement Learning is defined by a specific type of problem, and all of its
solutions are classed as Reinforcement Learning algorithms.
In order to produce intelligent programs (also called agents), reinforcement
learning goes through the following steps:
1. Input state is observed by the agent.
2. Decision making function is used to make the agent perform an action.
3. After the action is performed, the agent receives reward or reinforcement
from the environment.
4. The state-action pair information about the reward is stored.
Example algorithms include: Q-Learning, Temporal Difference (TD), Deep
Adversarial Networks
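The four steps above can be sketched with tabular Q-learning, one of the algorithms just listed. The environment here is a made-up five-cell corridor in which the agent starts at the left end and is rewarded only on reaching the right end; the learning rate, discount and exploration values are illustrative assumptions, not recommended settings.

```python
import random

N_STATES = 5        # states 0..4 in a row; state 4 is the goal (toy world)
ACTIONS = (-1, 1)   # action 0 = step left, action 1 = step right

def step(state, action_index):
    """Environment: move, stay inside the corridor, reward 1 at the goal."""
    next_state = min(max(state + ACTIONS[action_index], 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

def choose_action(q_row, epsilon, rng):
    """Decision function: explore at random sometimes (and on ties), else exploit."""
    if rng.random() < epsilon or q_row[0] == q_row[1]:
        return rng.randrange(len(ACTIONS))
    return 0 if q_row[0] > q_row[1] else 1

def q_learning(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # stored state-action values
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            action = choose_action(q[state], epsilon, rng)   # steps 1 and 2
            next_state, reward = step(state, action)         # step 3
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])  # step 4
            state = next_state
    return q

q_table = q_learning()
policy = ["left" if row[0] > row[1] else "right" for row in q_table[:-1]]
print(policy)
```

After training, the stored values favour stepping right in every non-goal state, which is the behavior that maximizes the reward in this toy world.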
Examples of reinforcement learning applications:
1. Create a 'next best offer' model for the call center group: Build a predictive
model that learns over time as users accept or reject offers made by the
sales staff.
2. Allocate scarce medical resources to handle different types of ER cases:
Build a Markov Decision Process that learns treatment strategies for each
type of ER case.
3. Reduce excess stock with dynamic pricing: Build a dynamic pricing model
that adjusts the price based on customers' response to offers.
Description source: https://bit.ly/2uoRZgr
A reinforcement learning algorithm (called the agent) continuously learns from
the environment in an iterative fashion. In the process, the agent learns from
its experiences of the environment until it explores the full range of possible
states.
Algorithms Grouped by Similarity
Algorithms are grouped by similarity in terms of their function (how they
work). This is not an exhaustive list but a list of the popular machine learning
algorithms.
Regression Algorithms
Regression is concerned with modeling the
relationship between variables, which is
iteratively refined using a measure of error in the
predictions made by the model.
Regression methods are a workhorse of statistics and
have been co-opted into statistical machine learning.
The term can refer both to the class of problem and
to the class of algorithm; strictly speaking,
regression is a process.
E.g. Linear Regression, Logistic Regression,
Stepwise Regression, Multivariate Adaptive
Regression Splines (MARS), Locally Estimated
Scatterplot Smoothing (LOESS)
Instance-based Algorithms
Instance-based learning models a decision problem
using instances or examples of training data that are
deemed important or required by the model.
Such methods typically build up a database of
example data and compare new data to the database
using a similarity measure in order to find the best
match and make a prediction. For this reason,
instance-based methods are also called winner-take-all
methods and memory-based learning. Focus is
put on the representation of the stored instances and
the similarity measures used between instances.
E.g. k-Nearest Neighbor (kNN), Learning Vector
Quantization (LVQ), Locally Weighted Learning
Description source: https://bit.ly/2yIMqem
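A minimal sketch of the instance-based idea, using k-Nearest Neighbor: the "model" is just the stored examples, and a prediction is a vote among the stored instances most similar to the new one. The fruit examples and the two features (redness and roundness on a 0 to 1 scale) are invented for illustration.

```python
from collections import Counter

def squared_distance(a, b):
    """Similarity measure: smaller distance means more similar."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def knn_predict(examples, query, k=3):
    """Compare the query with every stored example and let the k closest vote."""
    nearest = sorted(examples, key=lambda ex: squared_distance(ex[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical database of examples: ((redness, roundness), label).
EXAMPLES = [
    ((0.90, 0.90), "apple"),
    ((0.80, 1.00), "apple"),
    ((0.85, 0.80), "apple"),
    ((0.20, 0.30), "banana"),
    ((0.10, 0.20), "banana"),
    ((0.30, 0.10), "banana"),
]

print(knn_predict(EXAMPLES, (0.7, 0.9)))
print(knn_predict(EXAMPLES, (0.2, 0.2)))
```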
Decision Tree Algorithms
Decision tree methods construct a model of decisions
made based on the actual values of attributes in the
data.
Decisions fork in tree structures until a prediction
decision is made for a given record. Decision trees
are trained on data for classification and regression
problems. Decision trees are often fast and accurate
and a big favorite in machine learning.
E.g. Classification and Regression Tree (CART),
Conditional Decision Trees
Bayesian Algorithms
Bayesian methods are those that explicitly apply
Bayes’ Theorem for problems such as classification
and regression.
E.g. Naive Bayes, Gaussian Naive Bayes,
Multinomial Naive Bayes, Bayesian Network (BN)
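As a rough illustration of a Bayesian method, the sketch below implements a tiny word-count Naive Bayes classifier. It applies Bayes' Theorem in log form, scoring each label by log P(label) plus the sum of log P(word | label), with add-one smoothing so unseen words do not zero out a score. The four training messages and the "spam"/"ham" labels are made up.

```python
import math
from collections import Counter

def train_naive_bayes(documents):
    """documents: list of (list_of_words, label). Returns the counts needed later."""
    label_counts = Counter(label for _, label in documents)
    word_counts = {label: Counter() for label in label_counts}
    for words, label in documents:
        word_counts[label].update(words)
    vocabulary = {w for words, _ in documents for w in words}
    return label_counts, word_counts, vocabulary

def classify(model, words):
    """Pick the label with the highest posterior score for the given words."""
    label_counts, word_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total_docs)              # prior
        total_words = sum(word_counts[label].values())
        for w in words:                                   # smoothed likelihoods
            score += math.log((word_counts[label][w] + 1)
                              / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical labeled messages.
MESSAGES = [
    ("win free prize".split(), "spam"),
    ("free money now".split(), "spam"),
    ("meeting at noon".split(), "ham"),
    ("project meeting notes".split(), "ham"),
]
model = train_naive_bayes(MESSAGES)
print(classify(model, ["free", "prize"]), classify(model, ["meeting", "notes"]))
```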
Clustering Algorithms
Clustering, like regression, describes both a class of
problem and a class of methods.
Clustering methods are typically organized by the
modeling approaches such as centroid-based and
hierarchical. All methods are concerned with using the
inherent structures in the data to best organize the
data into groups of maximum commonality.
E.g. k-Means, k-Medians, Expectation Maximisation
(EM), Hierarchical Clustering
Description source: https://bit.ly/2yIMqem
Association Rule Learning Algorithms
Association rule learning methods extract rules that
best explain observed relationships between
variables in data.
These rules can discover important and commercially
useful associations in large multidimensional datasets
that can be exploited by an organization.
E.g. Apriori algorithm, Eclat algorithm
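The kind of rule these methods extract can be shown with a brute-force sketch over pairs of items. Note that this is not the Apriori or Eclat algorithm, which prune the search far more cleverly; it only illustrates the two measures such rules are usually judged by, support (how often items appear together) and confidence (how often the right-hand item appears when the left-hand item does). The baskets and thresholds are invented.

```python
from itertools import combinations

def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def pair_rules(transactions, min_support=0.5, min_confidence=0.7):
    """Return (lhs, rhs, support, confidence) for every rule lhs -> rhs
    between two single items that clears both thresholds."""
    items = sorted({i for t in transactions for i in t})
    found = []
    for a, b in combinations(items, 2):
        pair_support = support(transactions, {a, b})
        if pair_support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = pair_support / support(transactions, {lhs})
            if confidence >= min_confidence:
                found.append((lhs, rhs, pair_support, confidence))
    return found

# Hypothetical shopping baskets.
BASKETS = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"butter", "bread"},
    {"milk"},
]
for lhs, rhs, sup, conf in pair_rules(BASKETS):
    print(f"{lhs} -> {rhs}: support={sup:.2f}, confidence={conf:.2f}")
```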
Dimensionality Reduction Algorithms
Like clustering methods, dimensionality reduction
seeks and exploits the inherent structure of the data,
in this case in an unsupervised manner, in order to
summarize or describe data using less information.
This can be useful to visualize high-dimensional data or to
simplify data which can then be used in a supervised
learning method. Many of these methods can be
adapted for use in classification and regression.
E.g. Principal Component Analysis (PCA), Principal
Component Regression (PCR), Sammon Mapping
Ensemble Algorithms
Ensemble methods are models composed of multiple
weaker models that are independently trained and
whose predictions are combined in some way to
make the overall prediction.
Effort is put into what types of weak learners to
combine and the ways in which to combine them.
This is a very powerful class of techniques and as
such is very popular.
E.g. Boosting, Bootstrapped Aggregation (Bagging),
AdaBoost, Stacked Generalization (blending),
Gradient Boosting Machines (GBM), Random Forest
Description source: https://bit.ly/2yIMqem
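The combining step of an ensemble can be sketched with a simple majority vote. To keep the example short, the three weak models below are hand-written rules rather than independently trained models, so this illustrates only how predictions are combined, not how bagging or boosting train their members. The rules and emails are hypothetical.

```python
from collections import Counter

def majority_vote(models, sample):
    """Ask every weak model for a prediction and return the most common answer."""
    votes = Counter(model(sample) for model in models)
    return votes.most_common(1)[0][0]

# Three hypothetical weak rules for "is this email spam?".
def rule_free(email):
    return "spam" if "free" in email else "ham"

def rule_exclaim(email):
    return "spam" if email.count("!") >= 2 else "ham"

def rule_link(email):
    return "spam" if "http" in email else "ham"

WEAK_MODELS = [rule_free, rule_exclaim, rule_link]

print(majority_vote(WEAK_MODELS, "free money!! click http://example.com"))
print(majority_vote(WEAK_MODELS, "free lunch at the team meeting"))
```

In the second email one rule is fooled by the word "free", but the other two outvote it, which is the basic reason combining weak learners can beat any single one.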
Image source: https://bit.ly/2uoRZgr
The 7 Steps of Machine Learning
Deep Learning
Deep learning, a subset of machine learning, has revolutionized the world of
artificial intelligence. All deep learning is machine learning, but not all
machine learning is deep learning. Deep learning is useful because it avoids
the programmer having to undertake the tasks of feature specification
(defining the features to analyze from the data) or optimization (how to weigh
the data to deliver an accurate prediction) — the algorithm does both.
The breakthrough in deep learning is to model the brain, not the world. Our
own brains learn to do difficult things — including understanding speech and
recognizing objects — not by processing exhaustive rules but through
practice and feedback.
Deep learning uses the same approach. Artificial, software-based calculators
that approximate the function of neurons in a brain are connected together.
They form a ‘neural network’ which receives an input; analyses it; makes a
determination about it and is informed if its determination is correct. If the
output is wrong, the connections between the neurons are adjusted by the
algorithm, which will change future predictions. Initially the network will be
wrong many times. But as we feed in millions of examples, the connections
between neurons will be tuned so the neural network makes correct
determinations on almost all occasions. Using this process, with increasing
effectiveness we can now recognize elements in pictures, translate between
languages in real-time, detect tumours in medical images; and more.
Deep learning is not well suited to every problem, as it typically requires large
datasets for training and extensive processing power to train and run a
neural network. And it has an ‘explainability’ problem — it can be difficult to
know how a neural network developed its predictions. But by freeing
programmers from complex feature specification, deep learning has
delivered successful prediction engines for a range of important problems. As
a result, it has become a powerful tool in the AI developer’s toolkit.
Description source: https://bit.ly/2hfiauX
How does DL work?
Deep learning involves using an artificial ‘neural network’ — a collection of
‘neurons’ (software-based calculators) connected together. An artificial
neuron has one or more inputs. It performs a mathematical calculation based
on these to deliver an output. The output will depend on both the ‘weights’ of
each input and the configuration of ‘input-output function’ in the neuron. The
input-output function can vary.
A neuron may be a linear unit (the output is proportional to the total weighted
input); a threshold unit (the output is set to one of two levels, depending on
whether the total input is above a specified value); or a sigmoid unit (the
output varies continuously, but not linearly, as the input changes). A neural
network is created when neurons are connected to one another; the output of
one neuron becomes an input for another.
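The three kinds of neuron described above can be written as small functions that differ only in their input-output function. This is a sketch under simple assumptions: the inputs, the weights and the threshold value of 0 are chosen arbitrarily for illustration.

```python
import math

def weighted_input(inputs, weights):
    """Total weighted input: each input multiplied by its weight, then summed."""
    return sum(x * w for x, w in zip(inputs, weights))

def linear_unit(inputs, weights):
    """Output is proportional to the total weighted input."""
    return weighted_input(inputs, weights)

def threshold_unit(inputs, weights, threshold=0.0):
    """Output is one of two levels, depending on whether the total exceeds the threshold."""
    return 1.0 if weighted_input(inputs, weights) > threshold else 0.0

def sigmoid_unit(inputs, weights):
    """Output varies continuously between 0 and 1, but not linearly."""
    return 1.0 / (1.0 + math.exp(-weighted_input(inputs, weights)))

inputs, weights = [1.0, 2.0], [0.5, 0.25]   # total weighted input = 1.0
print(linear_unit(inputs, weights))
print(threshold_unit(inputs, weights))
print(sigmoid_unit(inputs, weights))
```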
Neural networks are organized into multiple layers of neurons (hence ‘deep’
learning). The ‘input layer’ receives information the network will process —
for example, a set of pictures. The ‘output layer’ provides the results.
Between the input and output layers are ‘hidden layers’ where most activity
occurs. Typically, the output of each neuron on one level of the neural
network serves as one of the inputs for each of the neurons in the next layer.
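The layered structure can be sketched as a forward pass: the outputs of one layer become the inputs of the next. The network below (2 inputs, a hidden layer of 3 sigmoid neurons, 1 output neuron) and all of its weights are made-up values for illustration; a real network would learn them during training.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(inputs, weights, biases):
    """One fully connected layer: every neuron receives every input."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]

def network_forward(inputs, layers):
    """Feed the outputs of each layer into the next one."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations

# Hypothetical weights: hidden layer (3 neurons x 2 inputs), output layer (1 x 3).
LAYERS = [
    ([[0.5, -0.5], [1.0, 1.0], [-1.0, 0.5]], [0.0, 0.0, 0.0]),
    ([[1.0, -1.0, 0.5]], [0.0]),
]

output = network_forward([1.0, 2.0], LAYERS)
print(output)
```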
Typically, neural networks are trained by exposing them to a large number of
labeled examples. Errors are detected and the weights of the connections
between the neurons are tuned by the algorithm to improve results. The
optimization process is repeated extensively, after which the system is
deployed and unlabelled images are assessed.
Description source: https://bit.ly/2hfiauX
Deep Learning Models
Convolutional Neural Networks
ConvNets or CNNs are a category of neural networks that have a different
architecture from regular neural networks. Regular neural networks
transform unstructured data, e.g. images, by putting it through a series
of connected hidden layers, each made up of a set of neurons, and produce a
prediction as the output.
CNNs organize the layers in 3 dimensions: width, height and depth. Further,
the neurons in one layer do not connect to all the neurons in the next layer
but only to a small region of it. Lastly, the final output will be reduced to a
single vector of probability scores, organized along the depth dimension.
CNNs have two components:
The Hidden layers/Feature extraction part
In this part, the network will perform a series of convolutions and pooling
operations during which the features are detected. If you had a picture of a
zebra, this is the part where the network would recognise its stripes, two
ears, and four legs.
The Classification part
Here, the fully connected layers will serve as a classifier on top of these
extracted features. They will assign a probability for the object on the image
being what the algorithm predicts it is.
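The feature extraction part can be sketched with one convolution followed by one pooling operation. The tiny image and the edge-detecting filter below are hypothetical values chosen for illustration; a real CNN learns its filters during training and stacks many such steps.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image and sum the element-wise products.

    (Strictly speaking this is cross-correlation, which is what deep learning
    libraries usually call convolution.)
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(kernel[a][b] * image[i + a][j + b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def max_pool(feature_map, size=2):
    """Keep only the strongest response in each size x size block."""
    return [
        [
            max(feature_map[i + a][j + b]
                for a in range(size) for b in range(size))
            for j in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for i in range(0, len(feature_map) - size + 1, size)
    ]

# Hypothetical 5x5 image: dark on the left (0), bright on the right (1).
image = [[0, 0, 1, 1, 1] for _ in range(5)]

# Hypothetical filter that responds to a dark-to-bright vertical edge.
kernel = [[-1, 1],
          [-1, 1]]

feature_map = convolve2d(image, kernel)
pooled = max_pool(feature_map)
print(feature_map)  # [[0, 2, 0, 0], [0, 2, 0, 0], [0, 2, 0, 0], [0, 2, 0, 0]]
print(pooled)       # [[2, 0], [2, 0]]
```

The feature map is non-zero only where the edge is, and pooling shrinks it while keeping that signal. In the zebra example, filters like this one are what would pick up the stripes.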
The following are a few applications of CNNs:
1. Diagnose diseases from medical scans
2. Understand customer brand perception and usage through images
3. Detect a defective product on a production line through images
Description source: https://bit.ly/2KGHgFT
Recurrent Neural Network (RNN)
A recurrent neural network (RNN) is a class of artificial neural network that
stores information in context nodes so that it can process sequences of inputs
and provide an output based on the input sequence.
Unlike in other neural networks, all the inputs in an RNN are related to each
other. For example, to predict the next word in a given sentence, the relation
among all the previous words helps in predicting a better output. The RNN
remembers these relations while training itself.
To achieve this, the RNN creates networks with loops in them, which allow
information to persist. This loop structure allows the neural network to take
a sequence of inputs.
In the unrolled version of the network, it first takes x(0) from the
input sequence and outputs h(0). Then h(0), together with x(1), is the
input for the next step. Similarly, h(1) from that step is the input, together
with x(2), for the step after it, and so on. This way, the network keeps
remembering the context while training.
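The unrolled steps can be sketched as a loop in which each hidden state h(t) is computed from the current input x(t) and the previous hidden state h(t-1). To keep the sketch small, the inputs and hidden state are single numbers, and the weights `w_x` and `w_h` and the tanh function are assumptions made for illustration.

```python
import math

def rnn_step(x, h_prev, w_x=0.5, w_h=0.8):
    """Combine the current input with the previous hidden state."""
    return math.tanh(w_x * x + w_h * h_prev)

def run_rnn(sequence, h_initial=0.0):
    """Process the sequence one element at a time, carrying the hidden state forward."""
    hidden_states = []
    h = h_initial
    for x in sequence:
        h = rnn_step(x, h)        # h(t) depends on x(t) and h(t-1)
        hidden_states.append(h)
    return hidden_states

# Only the first input is non-zero, yet later hidden states are still non-zero:
# the network "remembers" the earlier input through the loop.
states = run_rnn([1.0, 0.0, 0.0])
print(states)
```

Because each step reuses the previous hidden state, the effect of x(0) is still visible in h(1) and h(2), which is the context the text refers to.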
The following are a few applications of RNNs:
1. Next word prediction
2. Music composition
3. Image captioning
4. Speech recognition
5. Time series anomaly detection
6. Stock market prediction
Description source: https://bit.ly/2xfR4NK
Description source: https://bit.ly/2I5pSpQ
AI enabled Companies
Today organizations are implementing/adopting AI at different levels
depending on the organization’s vision, understanding, structure, agility, and
ability to create opportunities for both creativity and disruption. These
organizations can be classified into three broad categories:
Applied AI companies
Applied AI companies use AI to optimize, personalize and/or automate
existing processes, products and services to make people, businesses and
organizations more productive. Today, most companies adopting AI would fit
in this category.
AI First
AI First companies distinguish themselves by developing applications,
services and products that are built from the ground up with AI at
their core. They use every interaction with customers or users to feed and train
the AI algorithms, so the quality of the application, product or service
improves with each interaction, enabling new experiences, new business
models and increasingly strong competitive moats. AI First
solutions will change the way we interact with software and machines from a
master-slave relationship (where we tell the machine what to do and the
machine executes) to more of a peer-to-peer relationship (where the
machine anticipates our needs and makes suggestions as we interact with it)
and eventually to a slave-master relationship (where the machine tells us
what to do, when and how, based on a series of inputs and desired outcomes).
The AI Stack/Machine
As mentioned above, the world’s most dominant companies over the past
five years, including Google, Facebook, Amazon, Microsoft, Apple and
Baidu, are making massive investments in AI with the ambition of becoming
the platform (the Machine) that gets used to run our lives, our businesses
and our societies. Companies that are building the different components of
the Machine and understand how to best interact with it to create value for
themselves and their customers are also of great interest. This includes AI
technology such as new frameworks, software infrastructure (distributed,
centralized, local/edge, hybrid) required to run AI powered solutions,
hardware platforms specialized for AI (distributed, centralized, local/edge,
hybrid), connected devices, etc.
Description source: https://bit.ly/2xAqI9k
Artificial Intelligence at India’s Top eCommerce Firms
Flipkart's Project Mira
Flipkart's Project Mira is an artificial intelligence initiative focused on
understanding customers better to improve the product search experience and
reduce product returns. Project Mira was piloted through a conversational search
experience that guides users with relevant questions, conversational
filters, shopping ideas, offers, and trending collections.
Flipkart's marketplace processes more than 400,000 shipments a day, of
which customers return 10-11%. One in four fashion products, such as
clothing and accessories, is returned for reasons such as incorrect
fit or because customers change their minds about a particular style. When
Flipkart's team reviewed product returns data for the shoes and lifestyle
categories, they observed a mismatch in customers' expectations
regarding size and fit. Flipkart’s team of experts started
brainstorming attributes that buyers could be prompted for instead of
having them narrow the search results using filters. For example, if a Flipkart
customer is searching for an air conditioner, Mira would ask the customer
what kind of AC they want, the tonnage, room size, brand, etc.
Flipkart also uses Mira to streamline backend processes such as
classifying products accurately, writing accurate product descriptions and
avoiding duplication. Flipkart adds more than 10 million products from around
20,000 sellers every month. Because the data provided by sellers is
unstructured, it is often hugely difficult to classify a product accurately in
an automated catalog. Mira can classify the product into the specific vertical
based on the provided image. For verticals with similar images (say
shampoos and body lotions), it uses the product description to classify
products with 95% accuracy.
Mira can also detect incorrect and morphed images and identify
duplicate products. Sellers intentionally or unintentionally post duplicate
products, which increases the effort users need to scan to the desired set of
products. Project Mira is still in its infancy, but it has been expanded to
several verticals to solve issues such as product returns and quicker delivery.
For example, it can anticipate whether a delivery is likely to lead to a return
and for what reasons, and estimate whether the product can be delivered in two
days' time to avoid midway customer cancellations.
Description source: https://bit.ly/2xfGQgV
Myntra’s Rapid platform and Other AI initiatives
What's Rapid Platform?
Fashion e-tailer Myntra’s AI initiatives are centered around three key pillars:
product, experience, and logistics.
Myntra is focussing on building intelligent fast fashion through its AI platform,
known as Rapid. “Fast fashion” is a term used by retailers to describe the
speeding up of production processes to get new trends to the market as
quickly and cheaply as possible.
This can dramatically reduce the time taken to create a fashion product to
a few weeks from the typically long lifecycle of 9-14 months. Using the
available sales data, best-selling attributes can be identified, and designers
can start producing fashion items based on them. This has helped
Myntra to quickly uncover fashion trends.
Myntra is using a technique called Generative Adversarial Networks
(GANs) for design, which creates products that are similar to existing ones but
not the same. Myntra has also launched T-shirts with fully machine-generated
designs. It is also using Rapid to intelligently select what to sell on the
platform.
Myntra is using machine learning to improve the payment acceptance rates
of online transactions, an issue which is particularly prevalent in India,
where failure rates are high. Online payment transactions typically fail in
India for two distinct sets of reasons:
1. User abandonment, which could be due to anything from a patchy internet
connection or a clunky interface to a simple loss of interest.
2. Banks, which provide acquiring services in payment transactions,
tend to have poor IT systems.
Using machine learning, the system figures out the best payment gateway
to route each payment through. This is done by detecting and
analysing thousands of success and failure patterns and then sending the
payment through the most optimized route. Myntra also enhances the user
experience by giving the right recommendations based on what a customer
has seen or bought in the past. It uses “collaborative filtering”, which
recommends products to one person based on what another person has
just bought, and also helps match which fashion items go well together.
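A very small sketch of collaborative filtering is shown below. It is not Myntra's system: the customers and products are invented, and the similarity measure (the share of purchased items two customers have in common) is just one simple choice among many.

```python
def jaccard(a, b):
    """Share of items two customers have in common (0 = nothing, 1 = identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend(target, purchases, top_n=3):
    """Suggest items bought by customers whose purchases resemble the target's."""
    target_items = purchases[target]
    scores = {}
    for other, items in purchases.items():
        if other == target:
            continue
        similarity = jaccard(target_items, items)
        for item in items - target_items:  # only items the target does not own yet
            scores[item] = scores.get(item, 0.0) + similarity
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, score in ranked[:top_n] if score > 0]

# Hypothetical purchase histories, invented for this example.
purchases = {
    "asha": {"denim jacket", "white sneakers"},
    "ravi": {"denim jacket", "white sneakers", "graphic tee"},
    "meera": {"silk saree", "gold earrings"},
}
print(recommend("asha", purchases))  # ['graphic tee']
```

Asha and Ravi bought similar things, so Ravi's extra purchase is recommended to Asha. Meera's purchases have nothing in common with Asha's, so they are ignored.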
As customers often complain about late refunds, Myntra created ‘Sabre’, an
AI-based returns system that enables faster refunds for customers who
have demonstrated good buying and returning behaviour. Myntra wants to make its
returns policy more efficient as it believes that returns are an integral part of
the fashion industry, which relies on sizes, fits, and tastes that make
returns more common than in other sectors (differentiated goods like apparel
tend to have higher return rates than undifferentiated goods).
By analysing a customer’s past returns patterns, Myntra claims that ‘Sabre’
is able to detect which customers are genuinely returning the shipments and
which ones are attempting fraudulent activities.
Myntra is also aiming to reduce its rate of return to origin (RTO). A higher
RTO translates into higher losses, as many cash on delivery (COD) orders
are not delivered for various reasons, such as customers not being present
or not having cash at that point in time. Myntra can now, to a great extent,
predict whether an order is going to result in an RTO.
Description source: https://bit.ly/2KQLY0a
Amazon India's AI enabled Smart eCommerce
Amazon India has used machine learning and AI in a number of areas.
Correcting Addresses
Addresses in India are not well structured and often users enter incorrect
addresses (e.g. wrong pin code or city name) or addresses with missing
information (e.g. missing street name). Wrong addresses cause packages
to miss delivery dates and lead to failed deliveries. The company has been
using machine learning techniques to detect junk addresses, compute
address quality scores, correct city-pin code mismatches, and provide
suggestions to users to correct wrong addresses.
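The city and pin code check can be illustrated with a simple lookup. This sketch swaps the machine learning described above for a fixed table, so it only shows the idea of detecting a mismatch and suggesting a correction. The table `PIN_TO_CITY` and the function name are assumptions made for this example.

```python
# Illustrative lookup table; a real system would learn from far more data.
PIN_TO_CITY = {
    "560001": "Bengaluru",
    "110001": "New Delhi",
    "400001": "Mumbai",
}

def check_address(city, pin_code):
    """Compare the city a user typed with the city expected for the pin code."""
    expected = PIN_TO_CITY.get(pin_code)
    if expected is None:
        return {"status": "unknown_pin", "suggested_city": None}
    if city.strip().lower() == expected.lower():
        return {"status": "ok", "suggested_city": expected}
    return {"status": "mismatch", "suggested_city": expected}

print(check_address("Mumbai", "560001"))
# {'status': 'mismatch', 'suggested_city': 'Bengaluru'}
```

When the status is a mismatch, the suggested city can be shown to the user as a correction before the order is placed.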
Catalog Quality
Product catalog defects such as missing attributes like brand, color or poor-
quality images/titles can adversely impact customer experience. The
company is using AI and machine learning to extract missing attribute
information like brand or color from product titles and images.
Product Size Recommendations
In categories such as shoes and other apparel, different brands often have
different size conventions. For example, a catalog size 6 may correspond to
a physical size of 15 cm for Reebok while for Nike a catalog size 6 may
correspond to a physical size of 16 cm. Amazon uses machine
learning to recommend the product size that would best fit a customer when
the customer visits a product page, based on past customer purchase and
returns data (e.g. product size was too large/small).
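The size mismatch between brands can be illustrated with a small lookup. This is a simplified stand-in for the machine learning approach described above: it takes the physical size of items a customer kept in the past and picks the closest catalog size in another brand. Only the two size 6 entries come from the example in the text; every other number in `SIZE_CHART_CM` is made up for illustration.

```python
# Catalog size -> physical size in cm. Only the size 6 rows follow the example
# in the text; the other rows are hypothetical.
SIZE_CHART_CM = {
    "Reebok": {5: 14.0, 6: 15.0, 7: 16.0},
    "Nike": {5: 15.0, 6: 16.0, 7: 17.0},
}

def preferred_length(kept_purchases):
    """Average physical size of the items the customer bought and did not return."""
    lengths = [SIZE_CHART_CM[brand][size] for brand, size in kept_purchases]
    return sum(lengths) / len(lengths)

def recommend_size(brand, preferred_length_cm):
    """Pick the catalog size whose physical size is closest to the preferred one."""
    chart = SIZE_CHART_CM[brand]
    return min(chart, key=lambda size: abs(chart[size] - preferred_length_cm))

# A customer who kept a Nike size 6 (16 cm) is now looking at a Reebok product.
kept = [("Nike", 6)]
print(recommend_size("Reebok", preferred_length(kept)))  # 7
```

The customer wears size 6 in Nike, but the closest physical match in Reebok is size 7, which is the kind of correction a size recommendation is meant to make.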
Deals for Events
Machine learning is used to identify relevant products for specific events
such as Diwali, Christmas, etc. These are typically products that are in high
demand during the event or get high volumes of search queries and review
mentions during the event period. Machine Learning algorithms also predict
the deals and discounts to offer on the products to achieve a certain sales
forecast that helps in better planning.
By training a machine learning system on past holiday purchase data and
current purchase activity, a system may be able to calibrate demand more
accurately in order to sell products at the right prices to either (a) move
certain items at high volume, or (b) maximize profit margins by matching the
highest margin products to users during the holiday season.
Description source: https://bit.ly/2KQLY0a
I hope you enjoyed reading this guide and it met your
expectations regarding learning about AI.
There is one
more thing....
Check out some cool AI experiments by coders here

