MACHINE LEARNING (ACSML0601)
Unit 1: INTRODUCTION TO MACHINE LEARNING
B Tech 6th Sem, CSE (AI)
Department of Computer Science and Engineering (Artificial Intelligence), Greater Noida
Dr. Roop Singh, Associate Professor
Course objective: To introduce the fundamental concepts of machine learning and popular machine learning algorithms, and to understand the standard and most widely used supervised learning algorithms.
Pre-requisites: Basic knowledge of machine learning.
Course Contents / Syllabus
UNIT-I INTRODUCTION TO MACHINE LEARNING 8 Hours
INTRODUCTION – Learning, Types of Learning, Well-defined Learning Problems, Designing a Learning System, History of ML, Introduction of Machine Learning Approaches, Introduction to Model Building, Sensitivity Analysis, Underfitting and Overfitting, Bias and Variance, Concept Learning Task, Find-S Algorithm, Version Space and Candidate Elimination Algorithm, Inductive Bias, Issues in Machine Learning and Data Science vs Machine Learning.
Ensemble methods: Bagging & Boosting, C5.0 boosting, Random Forest, Gradient Boosting Machines, and XGBoost.
Text books:
1) Marco Gori, Machine Learning: A Constraint-Based Approach, Morgan Kaufmann, 2017.
2) Bishop, Christopher, Neural Networks for Pattern Recognition, New York, NY: Oxford University Press, 1995.
Reference Books:
1) Ryszard, S., Michalski, J. G. Carbonell and Tom M. Mitchell, Machine Learning: An Artificial Intelligence Approach, Volume 1, Elsevier, 2014.
2) Stephen Marsland, Machine Learning: An Algorithmic Perspective, Taylor & Francis, 2009.
3) Ethem Alpaydin, Introduction to Machine Learning (Adaptive Computation and Machine Learning), The MIT Press, 2004.
4) John D. Kelleher, Fundamentals of Machine Learning for Predictive Data Analytics: Algorithms, Worked Examples, and Case Studies, 1st Edition.
Matrix of CO/PO:
PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12
ACSML0601.1 3 2 2 1 2 2 - - - 1 - -
ACSML0601.2 3 2 2 3 2 2 1 - 2 1 1 2
ACSML0601.3 2 2 2 2 2 2 2 1 1 - 1 3
ACSML0601.4 3 3 1 3 1 1 2 - 2 1 1 2
ACSML0601.5 3 2 1 2 1 2 1 1 2 1 1 1
AVG 2.8 2.2 1.6 2.2 1.6 1.8 1.2 0.4 1.4 0.8 0.8 1.6
Matrix of CO/PSO:
PSO1 PSO2 PSO3
ACSML0601.1 3 2 3
ACSML0601.2 3 2 2
ACSML0601.3 3 2 3
ACSML0601.4 2 1 1
ACSML0601.5 2 2 1
Prerequisites:
• Statistics.
• Linear Algebra.
• Calculus.
• Probability.
• Programming Languages.
https://www.youtube.com/watch?v=PPLop4L2eGk&list=PLLssT5z_DsK-h9vYZkQkYNWcItqhlRJLN
Unit 1 Content:
INTRODUCTION – Learning,
Types of Learning,
Well defined learning problems,
Designing a Learning System,
History of ML,
Introduction of Machine Learning Approaches,
Introduction to Model Building,
Sensitivity Analysis,
Underfitting and Overfitting,
Bias and Variance,
Concept Learning Task,
Issues in Machine Learning and Data Science Vs Machine Learning.
9/28/24 Dr. Roop Singh ACSML0601 Machine Learning Unit 1 24
THE CONCEPT LEARNING TASK
Unit Objective: Types of Learning
Top minds in machine learning predict where AI is going in 2021 and Beyond
AI is no longer poised to change the world someday; it’s changing the world now.
Soumith Chintala
Director, principal engineer, and creator of PyTorch
• “I actually don’t think we’ve had a groundbreaking thing … since Transformer, basically.
We had ConvNets in 2012 that reached prime time, and Transformer in 2017 or
something. That’s my personal opinion,”
• Chintala also believes the evolution of machine learning frameworks like PyTorch and
Google’s TensorFlow — the overwhelming favorites among ML practitioners today —
have changed how researchers explore ideas and do their jobs.
• Depending on how you gauge it, PyTorch is the most popular
machine learning framework in the world today.
• A derivative of the Torch open source framework introduced in 2002, PyTorch became available in 2016 and is growing steadily in extensions and libraries.
• Celeste Kidd
• Developmental psychologist at the University of California, Berkeley
• Celeste Kidd is director of Kidd Lab at the University of California, Berkeley, where
she and her team explore how kids learn.
• “Human babies don’t get tagged data sets, yet they manage just fine, and it’s
important for us to understand how that happens,” she said.
• Last month, Kidd delivered the opening keynote address at the Neural Information
Processing Systems (NeurIPS) conference, the largest annual AI research
conference in the world. Her talk focused on how human brains hold onto
stubborn beliefs, attention systems, and Bayesian statistics.
INTRODUCTION – THE CONCEPT LEARNING TASK (CO1)
• Jeff Dean
• Google AI chief
• Dean has led Google AI for nearly two years now, but he’s been at Google for two decades
and is the architect of many of the company’s early search and distributed network
algorithms and an early member of Google Brain.
• Dean spoke with VentureBeat last month at NeurIPS, where he delivered talks on machine
learning for ASIC semiconductor design and ways the AI community can address climate
change, which he said is the most important issue of our time. In his talk about climate
change, Dean discussed the idea that AI can strive to become a zero-carbon industry and that
AI can be used to help change human behavior.
• Anima Anandkumar
• Nvidia machine learning research director
• Anandkumar sees numerous challenges for the AI community in 2020, like the
need to create models made especially for specific industries in tandem with
domain experts. Policymakers, individuals, and the AI community will also
need to grapple with issues of representation and the challenge of ensuring
data sets used to train models account for different groups of people.
• Dario Gil
• IBM Research director
• Gil heads a group of researchers actively advising the White House and enterprises
around the world. He believes major leaps forward in 2019 include progress
around generative models and the increasing quality with which plausible
language can be generated.
LEARNING – THE CONCEPT LEARNING TASK (CO1,2,3,4)
Learning:
EXAMPLE:
• A checkers system can learn from:
• Direct Training:
– A database of examples consisting of individual checkers board states and the correct move for each.
• Indirect Training:
– A database of examples consisting of move sequences and the final outcomes of the games played.
EXAMPLES:
• The teacher selects informative board states and provides the correct move for each; these are passed to the learner (teacher to learner).
• The learner might itself propose board states that it finds particularly confusing and ask the teacher for the correct move (learner to teacher).
• The learner might have complete control over both the board states and the (indirect) training, finding solutions on its own (no teacher required).
• The learner may choose to experiment with novel board states it has not yet considered (learner experiments with variations).
3. The third attribute of the training experience is how well it represents the distribution of examples over which the final system performance P must be measured.
1. Learning is most reliable when the training examples follow a distribution similar to that of future test examples.
2. In the checkers learning scenario, the performance metric P is the percentage of games the system wins in the world tournament.
3. If the training experience consists only of games the system plays against itself, that experience may not cover unexpected situations.
Unexpected situations include: different checkers champions playing with different board states or moves.
• ChooseMove accepts as input a board b from the set of legal board states B and produces as output some move m from the set of legal moves M.
• A simple hand-coded scoring scheme might assign each move a value (for example, +5 when the opponent's piece is captured and -5 when one of your own pieces is captured) and sum these scores over the moves: F(x) = M1 + M2 + M3 + M4 + ...
• The learned alternative is a target function over board states: F(x) = V(b'), the value of the board state b' that results from the move.
• For the checkers game, the ideal target function V(b) is not efficiently computable, because it does not give an exact answer in realistic time. It is therefore called a non-operational definition.
• The solution in this case is an operational description of V, which helps the checkers-playing program select the best moves within realistic time bounds.
• Hence the learning algorithm uses an approximation to the function, called function approximation; the learned approximation of the ideal function V is written V̂.
• Training values are estimated from successor board states:
Vtrain(b) ← V̂(Successor(b))
• The learning algorithm must choose the weights wi that best fit the set of training examples (b, Vtrain(b)).
HISTORY OF ML
• 1834: Charles Babbage, the father of the computer, conceived a device that could be programmed with punch cards. The machine was never built, but all modern computers rely on its logical structure.
• 1936: Alan Turing gave a theory of how a machine can determine and execute a set of instructions.
• The era of stored-program computers:
• 1945: ENIAC, the first electronic general-purpose computer, was completed; it had to be programmed manually. Stored-program computers such as EDSAC in 1949 and EDVAC in 1951 followed.
• 1943: A neural network was modeled with an electrical circuit (the McCulloch-Pitts neuron). In 1950, scientists started applying this idea and analyzed how human neurons might work.
THE CONCEPT LEARNING TASK (CO1)
• Machine learning research has now advanced greatly, and it is present everywhere around us: self-driving cars, Amazon Alexa, chatbots, recommender systems, and much more. It includes supervised, unsupervised, and reinforcement learning, with clustering, classification, decision tree, and SVM algorithms, among others.
A Neural Network
• A neural network is a processing device, either an algorithm or actual hardware, whose design was inspired by the design and functioning of animal brains and their components.
• Neural networks have the ability to learn by example, which makes them very flexible and powerful.
• These networks are also well suited for real-time systems because of their fast response and computation times, which stem from their parallel architecture.
• To depict the basic operation of a neural net, consider a set of neurons, say X1
and X2, transmitting signals to another neuron, Y.
• Here X1 and X2 are input neurons, which transmit signals, and Y is the output
neuron, which receives signals.
• Input neurons X1 and X2 are connected to the output neuron Y over weighted interconnection links (W1 and W2).
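The X1, X2 → Y net described above can be sketched directly. The weight values and threshold below are illustrative choices, not values from the slides.

```python
# Inputs X1, X2 reach output neuron Y over weighted links W1, W2.
# Y fires (outputs 1) when the net input y_in = x1*w1 + x2*w2
# reaches a threshold.

def net_input(x1, x2, w1, w2):
    return x1 * w1 + x2 * w2

def output(x1, x2, w1, w2, threshold=1.0):
    return 1 if net_input(x1, x2, w1, w2) >= threshold else 0

print(output(1, 1, 0.6, 0.6))  # 1: both inputs on, 1.2 >= 1.0
print(output(1, 0, 0.6, 0.6))  # 0: one input on, 0.6 < 1.0
```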
Activation Function
• The central nervous system (which includes the brain and spinal cord) is made up
of two basic types of cells:
• neurons (1) and glia (4) & (6).
• Glia outnumber neurons in some parts of the brain, but neurons are the key
players in the brain.
• Neurons are information messengers.
• They use electrical impulses and chemical signals to transmit information between
different areas of the brain, and between the brain and the rest of the nervous
system.
• Everything we think and feel and do would be impossible without the work of
neurons and their support cells, the glial cells called astrocytes (4) and
oligodendrocytes (6).
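In an artificial neuron, the electrical "firing" of a biological neuron is abstracted into an activation function applied to the net input. A short sketch of two common choices, the binary step and the sigmoid:

```python
import math

def step(x, threshold=0.0):
    # Binary step: fire (1) once the net input reaches the threshold.
    return 1 if x >= threshold else 0

def sigmoid(x):
    # Sigmoid: squashes any real net input into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

print(step(0.4))               # 1
print(round(sigmoid(0.0), 2))  # 0.5
```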
Characteristics of ANN:
Introduction to Clustering
• Clustering methods are one of the most useful unsupervised
ML methods. These methods are used to find similarity as
well as the relationship patterns among data samples and
then cluster those samples into groups having similarity based
on features.
• Clustering is important because it determines the intrinsic grouping among the unlabeled data at hand. Clustering methods make assumptions about which data points constitute similar groups, and each assumption constructs different but equally valid clusters.
• Clusters are not necessarily spherical. The following are some other cluster formation methods:
Density-based
• In these methods, clusters are formed as dense regions. The advantage of these methods is good accuracy as well as a good ability to merge two clusters. Examples: Density-Based Spatial Clustering of Applications with Noise (DBSCAN), Ordering Points To Identify the Clustering Structure (OPTICS).
Hierarchical-based
• In these methods, clusters are formed as a tree-type structure based on a hierarchy. They have two categories: agglomerative (bottom-up approach) and divisive (top-down approach). Examples: Clustering Using Representatives (CURE), Balanced Iterative Reducing and Clustering using Hierarchies (BIRCH).
Partitioning
• In these methods, clusters are formed by partitioning the objects into k clusters; the number of clusters equals the number of partitions. Examples: K-means, Clustering Large Applications based upon Randomized Search (CLARANS).
Grid
• In these methods, clusters are formed in a grid-like structure. The advantage of these methods is that all clustering operations on these grids are fast and independent of the number of data objects. Examples: Statistical Information Grid (STING), Clustering In QUEst (CLIQUE).
INTRODUCTION OF MACHINE LEARNING APPROACHES (CO1,2,3,4)
Types of ML Clustering Algorithms
The following are the most important and useful ML clustering algorithms:
• K-means Clustering
• This clustering algorithm computes the centroids and iterates until it finds the optimal centroids. It assumes that the number of clusters is already known. It is also called a flat clustering algorithm. The number of clusters identified from the data by the algorithm is represented by 'K' in K-means.
• Mean-Shift Algorithm
• Another powerful clustering algorithm used in unsupervised learning. Unlike K-means clustering, it does not make assumptions about the number of clusters; hence it is a non-parametric algorithm.
• Hierarchical Clustering
• It is another unsupervised learning algorithm that is used to group together the
unlabeled data points having similar characteristics.
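The K-means loop described above (assume K is known, then alternate assignment and centroid-update steps) can be sketched in pure Python. The data points and the choice of the first K points as initial centroids are simplifications for illustration; real implementations seed centroids randomly or with k-means++.

```python
def kmeans(points, k, iters=10):
    # Seed centroids with the first k points (a simplification).
    centroids = list(points[:k])
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for px, py in points:
            i = min(range(k),
                    key=lambda c: (px - centroids[c][0]) ** 2
                                + (py - centroids[c][1]) ** 2)
            clusters[i].append((px, py))
        # Update step: move each centroid to its cluster's mean.
        for i, cl in enumerate(clusters):
            if cl:
                centroids[i] = (sum(p[0] for p in cl) / len(cl),
                                sum(p[1] for p in cl) / len(cl))
    return centroids, clusters

data = [(1, 1), (1.5, 2), (1, 1.5), (8, 8), (9, 9), (8, 9)]
centroids, clusters = kmeans(data, k=2)
print(sorted(len(c) for c in clusters))  # [3, 3]: two well-separated groups
```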
• Data summarization and compression − Clustering is widely used in the areas where
we require data summarization, compression and reduction as well. The examples are
image processing and vector quantization.
• Collaborative systems and customer segmentation − Since clustering can be used to
find similar products or same kind of users, it can be used in the area of collaborative
systems and customer segmentation.
• Serve as a key intermediate step for other data mining tasks − Cluster analysis can
generate a compact summary of data for classification, testing, hypothesis generation;
hence, it serves as a key intermediate step for other data mining tasks also.
• Trend detection in dynamic data − Clustering can also be used for trend detection in dynamic data by forming clusters of similar trends.
• Social network analysis − Clustering can be used in social network analysis, for example to generate sequences in images, videos, or audio.
• Biological data analysis − Clustering can also be used to form clusters of images and videos, so it can successfully be used in biological data analysis.
Reinforcement learning
• Example: We have an agent and a reward, with many hurdles in between. The agent is supposed to find the best possible path to reach the reward.
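A toy version of this agent-and-reward setup can be sketched with tabular Q-learning on a 1-D corridor. The corridor, reward placement, and hyperparameters below are all illustrative assumptions, not details from the slides.

```python
import random

N, GOAL = 6, 5                 # corridor cells 0..5, reward in the last cell
ACTIONS = (-1, +1)             # step left / step right
Q = {(s, a): 0.0 for s in range(N) for a in ACTIONS}
alpha, gamma, eps = 0.5, 0.9, 0.2
rng = random.Random(42)

for episode in range(500):
    s = 0
    while s != GOAL:
        # Epsilon-greedy action selection.
        if rng.random() < eps:
            a = rng.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s2 = min(max(s + a, 0), N - 1)   # hitting a wall keeps you in place
        r = 1.0 if s2 == GOAL else 0.0   # reward only at the goal
        # Q-learning update toward r + gamma * max_a' Q(s', a').
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in ACTIONS)
                              - Q[(s, a)])
        s = s2

# The greedy policy should now step right in every non-goal cell.
policy = [max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(GOAL)]
print(policy)
```

After training, every non-goal cell's greedy action is +1 (step right): the agent has found the best path to the reward without ever being told it explicitly.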
Decision Tree:
• Decision Tree is a Supervised learning technique that can be used for both
classification and Regression problems, but mostly it is preferred for solving
Classification problems. It is a tree-structured classifier, where internal nodes
represent the features of a dataset, branches represent the decision rules and
each leaf node represents the outcome.
• In a decision tree, there are two types of nodes: the decision node and the leaf node. Decision nodes are used to make decisions and have multiple branches, whereas leaf nodes are the outputs of those decisions and do not contain any further branches.
• The decisions or the test are performed on the basis of features of the given
dataset.
• Root Node: Root node is from where the decision tree starts. It represents the
entire dataset, which further gets divided into two or more homogeneous sets.
• Leaf Node: Leaf nodes are the final output node, and the tree cannot be
segregated further after getting a leaf node.
• Splitting: Splitting is the process of dividing the decision node/root node into sub-
nodes according to the given conditions.
• Branch/Sub Tree: A tree formed by splitting the tree.
• Pruning: Pruning is the process of removing the unwanted branches from the tree.
• Parent/Child node: The root node of the tree is called the parent node, and other
nodes are called the child nodes.
• For the next node, the algorithm again compares the attribute value with the other sub-nodes and moves further. It continues this process until it reaches a leaf node of the tree. The complete process can be better understood using the algorithm below:
• Step-1: Begin the tree with the root node, say S, which contains the complete dataset.
• Step-2: Find the best attribute in the dataset using an Attribute Selection Measure (ASM).
• Step-3: Divide S into subsets that contain the possible values of the best attribute.
• Step-4: Generate the decision tree node that contains the best attribute.
• Step-5: Recursively make new decision trees using the subsets of the dataset created in Step-3. Continue this process until a stage is reached where the nodes cannot be classified further; these final nodes are the leaf nodes.
• Bayesian networks:
• Bayesian networks are a type of Probabilistic Graphical Model that can be used to build models from data and/or expert opinion.
• They are also commonly referred to as Bayes nets, belief networks, and sometimes causal networks.
• Probabilistic
• Bayesian networks are probabilistic (conditional probability) because they are
built from probability distributions and also use the laws of probability for
prediction and anomaly detection, for reasoning and diagnostics, decision making
under uncertainty and time series prediction.
• Nodes
• In many Bayesian networks, each node represents a Variable such as someone's height, age
or gender. A variable might be discrete, such as Gender = {Female, Male} or might be
continuous such as someone's age.
• The nodes and links form the structure of the Bayesian network, and we call this the
structural specification.
• Discrete
• A discrete variable is one with a set of mutually exclusive states such as Gender = {Female,
Male}.
• Continuous
• Bayes Server supports continuous variables with Conditional Linear Gaussian (CLG) distributions. This simply means that continuous distributions can depend on each other (are multivariate) and can also depend on one or more discrete variables.
• Links
• Links are added between nodes to indicate that one node directly influences the other. When
a link does not exist between two nodes, this does not mean that they are completely
independent, as they may be connected via other nodes. They may however become
dependent or independent depending on the evidence that is set on other nodes.
• Structural learning
• Bayes Server includes a structural learning algorithm for Bayesian networks, which can automatically determine the required links from data.
• Feature selection
• Bayes Server supports a Feature selection algorithm which can help determine which
variables are most likely to influence another. This can be helpful when determining the
structure of a model.
• https://www.bayesserver.com/docs/introduction/bayesian-networks
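A minimal two-node network (Rain → GrassWet, with made-up probabilities) shows the kind of diagnostic reasoning described above. This is a hand-rolled sketch by enumeration, not the Bayes Server API.

```python
# Structure: Rain -> GrassWet. The probability tables are illustrative.
P_rain = {True: 0.2, False: 0.8}
P_wet_given_rain = {True: 0.9, False: 0.1}   # P(wet | rain state)

def joint(rain, wet):
    # Chain rule for this structure: P(rain, wet) = P(rain) * P(wet | rain).
    p_wet = P_wet_given_rain[rain]
    return P_rain[rain] * (p_wet if wet else 1 - p_wet)

# Diagnostic query: P(rain | grass is wet), by enumeration + Bayes' rule.
p_wet = joint(True, True) + joint(False, True)   # 0.18 + 0.08 = 0.26
p_rain_given_wet = joint(True, True) / p_wet     # 0.18 / 0.26
print(round(p_rain_given_wet, 3))  # 0.692
```

Observing wet grass raises the belief in rain from the prior 0.2 to about 0.69; this is the reasoning-under-uncertainty behavior the slides attribute to Bayes nets.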
• Introduction to SVM
• Support vector machines (SVMs) are powerful yet flexible supervised machine learning algorithms used for both classification and regression.
• Generally, however, they are used for classification problems.
• SVMs were first introduced in the 1960s and later refined in the 1990s.
• SVMs have a unique way of implementation compared to other machine learning algorithms.
• Lately, they have been extremely popular because of their ability to handle multiple continuous and categorical variables.
• The core idea of SVM is to find a maximum marginal hyperplane (MMH) that best divides the dataset into classes while minimizing error.
• Support Vectors
• Support vectors are the data points, which are closest to the hyperplane. These
points will define the separating line better by calculating margins. These points
are more relevant to the construction of the classifier.
• Hyperplane
• A hyperplane is a decision plane that separates a set of objects having different class memberships.
• Margin
• A margin is the gap between the two lines on the closest class points, calculated as the perpendicular distance from the line to the support vectors or closest points. A large margin between the classes is considered a good margin; a small margin is a bad margin.
• In such situations, SVM uses a kernel trick to transform the input space into a higher-dimensional space. The data points are plotted on the x-axis and z-axis, where z is the squared sum of x and y (z = x^2 + y^2). Now you can easily segregate these points using linear separation.
• SVM Kernels
• The SVM algorithm is implemented in practice using a kernel.
• A kernel transforms an input data space into the required
form.
• SVM uses a technique called the kernel trick. Here, the kernel
takes a low-dimensional input space and transforms it into a
higher dimensional space.
• In other words, you can say that it converts a non-separable problem into a separable problem by adding more dimensions to it.
• It is most useful in non-linear separation problems.
• Radial Basis Function Kernel: The radial basis function kernel is a popular kernel function commonly used in support vector machine classification. RBF can map an input space into an infinite-dimensional space.
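The RBF kernel can be written down directly as k(x, z) = exp(-gamma * ||x - z||^2). A small sketch follows; gamma is a tunable parameter, and the value here is an arbitrary choice for illustration.

```python
import math

def rbf_kernel(x, z, gamma=0.5):
    # Similarity in the implicit infinite-dimensional feature space:
    # identical points score 1, distant points score near 0.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

print(rbf_kernel((1.0, 2.0), (1.0, 2.0)))   # 1.0 for identical points
print(rbf_kernel((0.0, 0.0), (3.0, 4.0)))   # near 0 for distant points
```

An SVM never computes the infinite-dimensional mapping itself; it only ever needs these pairwise kernel values, which is the point of the kernel trick.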
Advantages of GAs
Limitations of GAs
• GAs are not suited for all problems, especially problems which
are simple and for which derivative information is available.
• Fitness value is calculated repeatedly which might be
computationally expensive for some problems.
• Being stochastic, there are no guarantees on the optimality or
the quality of the solution.
• If not implemented properly, the GA may not converge to the
optimal solution.
1. Define machine learning. Discuss with examples why machine learning is important.
2. Discuss with examples some useful applications of machine learning.
3. Explain some areas/disciplines that have influenced machine learning.
4. What do you mean by a well-posed learning problem? Explain the important features that are required to well-define a learning problem.
5. Define a learning program for a given problem. Describe the following problems with respect to Task, Performance, and Experience:
a. Checkers learning problem
b. Handwritten recognition problem
YouTube videos:
•https://www.youtube.com/watch?v=PDYfCkLY_DE
•https://www.youtube.com/watch?v=ncOirIPHTOw
•https://www.youtube.com/watch?v=cW03t3aZkmE
Assignment 1
•Describe machine learning with suitable examples. [CO1]
•Briefly describe well-defined learning problems. [CO1]
•Define the process of designing a learning system. [CO1]
•Explain different perspectives and issues in machine learning. [CO1]
•Define hypotheses. [CO1]
•Analyze the concept learning task and the general-to-specific ordering of hypotheses. [CO1]
•Illustrate the Find-S algorithm. [CO1]
•Describe the List-Then-Eliminate algorithm. [CO1]
•Define the Candidate Elimination algorithm with an example. [CO1]
•Briefly explain inductive bias. [CO1]