University Institute of Engineering Department of Computer Science and Engg

UNIVERSITY INSTITUTE OF ENGINEERING
DEPARTMENT OF COMPUTER SCIENCE AND ENGG.
Bachelor of Engineering (Computer Science & Engineering)
Artificial Intelligence and Machine Learning (21CSH-316)
Prepared by:
Sitaram Patel (E13285)
DISCOVER . LEARN . EMPOWER

www.cuchd.in Computer Science and Engineering Department
Course Outcomes
CO1 - Understand the fundamental concepts and techniques of
artificial intelligence and machine learning.
Course Objectives
• To study learning processes: supervised and unsupervised, deterministic and statistical
knowledge of Machine learners, and ensemble learning.
• To provide a comprehensive foundation to Machine Learning and Optimization methodology
with applications t.
• To understand the history and development of Machine Learning.
• To understand modern techniques and practical trends of Machine Learning.
Syllabus
• UNIT-I
• Introduction to Machine Learning: Probability, Statistics, and Linear Algebra for
Machine Learning, convex optimization, data visualization, hypothesis function and
testing, data distributions, data preprocessing, data augmentation, normalizing data
sets
Probability, Statistics, and Linear Algebra for Machine Learning
• Topic: Probability in Machine Learning

Definition: Probability is the measure of the likelihood of an event occurring.
Importance in ML:
• Probabilistic models are widely used in ML for making predictions and estimating uncertainties.
• Bayesian inference provides a framework for updating beliefs based on new evidence.
• Probability distributions model uncertainty and variability in data.
Key Probability Concepts:
• Random Variables: Variables whose values depend on the outcome of a random event.
• Probability Distributions: Mathematical functions that describe the likelihood of different
outcomes.
• Bayes' Theorem: A fundamental principle for updating beliefs based on new evidence.
• Expectation and Variance: Measures of the central tendency and spread of a probability
distribution.
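As a small illustration of the concepts above, the sketch below applies Bayes' theorem to a made-up diagnostic test and computes the expectation and variance of a fair die. The numbers (prevalence, sensitivity, false-positive rate) and the function names are illustrative assumptions, not part of the course material.

```python
from fractions import Fraction


def bayes_posterior(prior, likelihood, false_positive_rate):
    """P(H | E) via Bayes' theorem: P(E | H) * P(H) / P(E)."""
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence


def expectation(pmf):
    """E[X] for a discrete random variable given as {value: probability}."""
    return sum(x * p for x, p in pmf.items())


def variance(pmf):
    """Var[X] = E[(X - E[X])^2]."""
    mu = expectation(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())


# Hypothetical test: 1% prevalence, 95% sensitivity, 5% false-positive rate.
posterior = bayes_posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)

# A fair six-sided die as a probability distribution over the faces 1..6.
die = {face: Fraction(1, 6) for face in range(1, 7)}
die_mean = expectation(die)
die_var = variance(die)

print(f"P(disease | positive) = {posterior:.3f}")
print(f"E[die] = {die_mean}, Var[die] = {die_var}")
```

With these assumed numbers the updated belief stays near 16% even after a positive result, because the prior probability is small.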
• Statistics in Machine Learning
• Definition: Statistics involves collecting, analyzing, interpreting, presenting, and
organizing data.
• Importance in ML:
• Statistical techniques help us make inferences from data and validate ML models.
• Hypothesis testing assesses the significance of relationships between variables.
• Sampling techniques enable efficient data collection and analysis.
Key Statistical Concepts:
• Descriptive Statistics: Summarizing and visualizing data using measures such as mean,
median, mode, and standard deviation.
• Inferential Statistics: Making predictions or drawing conclusions about a population
based on a sample.
• Hypothesis Testing: Assessing how compatible the sample data are with a stated (null)
hypothesis.
• Correlation and Regression: Analyzing the relationship between variables and making
predictions.
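The statistical concepts listed above can be sketched in a few lines with Python's standard `statistics` module. The data set (study hours and exam scores) is invented for illustration, and `pearson_r` and `fit_line` are hypothetical helper names.

```python
import statistics

# Hypothetical sample: hours studied and exam scores for five students.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 55.0, 61.0, 64.0, 68.0]

# Descriptive statistics.
mean_score = statistics.mean(scores)
median_score = statistics.median(scores)
stdev_score = statistics.stdev(scores)  # sample standard deviation


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx


r = pearson_r(hours, scores)
slope, intercept = fit_line(hours, scores)
predicted = slope * 6.0 + intercept  # prediction for 6 hours of study

print(f"mean={mean_score}, median={median_score}, stdev={stdev_score:.2f}")
print(f"r={r:.3f}, score = {slope:.2f} * hours + {intercept:.2f}")
```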
• Linear Algebra in Machine Learning
• Definition: Linear algebra deals with vectors, matrices, and linear transformations.
• Importance in ML:
• ML models often represent data as matrices and use linear algebra operations for
optimization and analysis.
• Linear algebra provides the foundation for understanding and implementing
various ML algorithms.
• Linear Algebra Concepts:
• Vectors and Matrices: Representing and manipulating multi-dimensional data structures.
• Matrix Operations: Addition, subtraction, multiplication, and transposition.
• Eigenvalues and Eigenvectors: Important concepts for dimensionality reduction
and feature extraction.
• Singular Value Decomposition (SVD): A matrix factorization technique used in
various ML algorithms.
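To make the matrix operations and eigenvalue ideas concrete, here is a minimal pure-Python sketch. It uses power iteration, one simple way to approximate the dominant eigenvalue, and relies on the fact that the largest singular value of B is the square root of the largest eigenvalue of BᵀB. The matrices and helper names are illustrative assumptions; in practice a numerical library would normally be used.

```python
def transpose(a):
    """Swap rows and columns of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Matrix product a @ b."""
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def matvec(a, v):
    """Matrix-vector product a @ v."""
    return [sum(x * y for x, y in zip(row, v)) for row in a]


def power_iteration(a, steps=100):
    """Approximate the dominant eigenvalue and eigenvector of a square matrix."""
    v = [1.0] * len(a)  # assumes this start is not orthogonal to the answer
    for _ in range(steps):
        w = matvec(a, v)
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Rayleigh quotient v^T A v for the unit vector v.
    eigenvalue = sum(x * y for x, y in zip(v, matvec(a, v)))
    return eigenvalue, v


# Symmetric example matrix; its eigenvalues are (5 +/- sqrt(5)) / 2.
A = [[2.0, 1.0], [1.0, 3.0]]
eigenvalue, vector = power_iteration(A)

# Largest singular value of a non-square matrix B via B^T B.
B = [[3.0, 0.0], [0.0, 4.0], [0.0, 0.0]]
sigma_max = power_iteration(matmul(transpose(B), B))[0] ** 0.5

print(f"dominant eigenvalue of A: {eigenvalue:.6f}")
print(f"largest singular value of B: {sigma_max:.6f}")
```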
Convex Optimization, Data Visualization, Hypothesis Function and Testing, Data Distributions
• Convex Optimization
• Definition: Convex Optimization refers to the minimization of convex
functions over convex sets (equivalently, the maximization of concave
functions). It involves finding the global minimum of a convex function,
subject to a set of constraints.
• Applications: Parameter estimation, Support Vector Machines, Linear
Regression, etc. Neural network training is generally non-convex, but it
borrows many of the same techniques.
• Topic: Key Concepts in Convex Optimization
• Convex Sets: Sets that satisfy the condition that a line segment connecting any two
points in the set lies entirely within the set.
• Convex Functions: Functions for which the line segment connecting any two points
on the graph of the function lies on or above the graph.
• Global Minima/Maxima: Points at which a function attains its lowest or
highest value, respectively. For a convex function, every local minimum is also a
global minimum.
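The following sketch illustrates the two ideas on a one-dimensional example: it numerically checks the chord condition that defines a convex function, then runs plain gradient descent to find the minimum. The objective `f`, the learning rate, and the iteration count are assumptions chosen for illustration.

```python
def f(w):
    """Convex objective with its minimum at w = 3 (chosen for illustration)."""
    return (w - 3.0) ** 2 + 1.0


def grad_f(w):
    """Derivative of f."""
    return 2.0 * (w - 3.0)


def satisfies_chord_condition(func, a, b, steps=10):
    """Check func(t*a + (1-t)*b) <= t*func(a) + (1-t)*func(b) on a grid of t."""
    for i in range(steps + 1):
        t = i / steps
        point = t * a + (1 - t) * b
        chord = t * func(a) + (1 - t) * func(b)
        if func(point) > chord + 1e-12:
            return False
    return True


def gradient_descent(grad, w0, learning_rate=0.1, iterations=200):
    """Repeatedly step against the gradient, starting from w0."""
    w = w0
    for _ in range(iterations):
        w -= learning_rate * grad(w)
    return w


chord_ok = satisfies_chord_condition(f, -5.0, 8.0)
w_min = gradient_descent(grad_f, w0=-10.0)

print(f"chord condition holds on [-5, 8]: {chord_ok}")
print(f"minimizer ~ {w_min:.6f}, minimum value ~ {f(w_min):.6f}")
```

Because the objective is convex, the point gradient descent converges to is the global minimum regardless of the starting value.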
Data Visualization
Definition: Data Visualization is the graphical representation of data to uncover patterns, trends, and relationships that
are not immediately evident in raw data.
Importance: Helps in understanding complex datasets, communicating insights effectively, and aiding decision-
making processes.
Techniques: Scatter plots, line graphs, bar charts, histograms, heatmaps, etc.
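As a rough illustration of what a histogram computes, the sketch below bins a small invented data set and renders the counts as text. A plotting library would normally draw this; the helper names and the data are assumptions made for the example.

```python
def histogram_counts(values, bins, lo, hi):
    """Count how many values fall into each of `bins` equal-width intervals."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)  # v == hi goes in last bin
        counts[idx] += 1
    return counts


def render(counts, lo, width):
    """Draw one text bar per bin, one '#' per observation."""
    lines = []
    for i, count in enumerate(counts):
        left = lo + i * width
        lines.append(f"[{left:4.1f}, {left + width:4.1f}) " + "#" * count)
    return "\n".join(lines)


data = [1.2, 1.9, 2.1, 2.4, 2.5, 2.8, 3.1, 3.3, 3.9, 4.6]
counts = histogram_counts(data, bins=4, lo=1.0, hi=5.0)
print(render(counts, lo=1.0, width=1.0))
```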
Topic: Hypothesis Function and Testing
Definition: Hypothesis Function is a function that maps input variables to predicted output values. Hypothesis Testing
involves evaluating the validity of a hypothesis or a claim about a population based on sample data.
Steps in Hypothesis Testing:
• Formulate null and alternative hypotheses.
• Collect sample data.
• Determine a statistical test and significance level.
• Calculate the test statistic and p-value.
• Make a decision based on the p-value and the chosen significance level.
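The steps above can be sketched with a two-sided one-sample z-test. This is only one possible test, and it assumes the population standard deviation is known; the sample values, the hypothesized mean, and `sigma` are invented for illustration.

```python
from statistics import NormalDist, mean


def one_sample_z_test(sample, mu0, sigma, alpha=0.05):
    """Two-sided z-test of H0: population mean == mu0, with known sigma."""
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / n ** 0.5)      # test statistic
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))       # two-sided p-value
    return z, p_value, p_value < alpha                 # True means reject H0


# Step 1: H0: mean = 5.0, H1: mean != 5.0 (hypothetical claim).
# Step 2: collected sample (made-up measurements).
sample = [5.2, 5.8, 5.5, 5.1, 5.9, 5.4, 5.6, 5.3, 5.7]
# Steps 3 to 5: choose the test and alpha, compute z and p, then decide.
z, p_value, reject_null = one_sample_z_test(sample, mu0=5.0, sigma=0.6, alpha=0.05)

print(f"z = {z:.2f}, p = {p_value:.4f}, reject H0: {reject_null}")
```

When the population standard deviation is not known, a t-test on the sample standard deviation is the usual alternative.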
• Topic: Data Distributions
• Definition: Data Distributions describe the possible values and their probabilities in a dataset.
• Common Distributions:
• Normal Distribution (Gaussian)
• Uniform Distribution
• Binomial Distribution
• Exponential Distribution
• Poisson Distribution
• Log-Normal Distribution
• Importance: Understanding data distributions helps in making assumptions, selecting appropriate
statistical tests, and generating realistic synthetic data.
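One way to get a feel for these distributions is to generate synthetic samples and compare the sample mean with the theoretical mean. The sketch below does this with Python's `random` module; the parameter values are arbitrary assumptions, and the binomial and Poisson samplers are simple hand-written helpers.

```python
import math
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the run is repeatable


def binomial(n, p):
    """Number of successes in n independent trials with success probability p."""
    return sum(rng.random() < p for _ in range(n))


def poisson(lam):
    """Poisson sample using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


# name -> (sampler, theoretical mean); all parameters are illustrative.
distributions = {
    "normal": (lambda: rng.gauss(10.0, 2.0), 10.0),
    "uniform": (lambda: rng.uniform(0.0, 1.0), 0.5),
    "binomial": (lambda: binomial(10, 0.3), 3.0),
    "exponential": (lambda: rng.expovariate(2.0), 0.5),
    "poisson": (lambda: poisson(4.0), 4.0),
    "log-normal": (lambda: rng.lognormvariate(0.0, 0.5), math.exp(0.125)),
}

results = {}
for name, (sampler, theoretical_mean) in distributions.items():
    samples = [sampler() for _ in range(20000)]
    results[name] = (mean(samples), theoretical_mean)
    print(f"{name:12s} sample mean {results[name][0]:7.3f}  theory {theoretical_mean:7.3f}")
```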
• Topic: Examples of Data Distributions
• Show visual examples of different data distributions, highlighting their shapes and characteristics.
• Summary: Convex Optimization provides a powerful framework for optimization problems. Data
Visualization enables us to gain insights from complex data. Understanding Hypothesis Function and
Testing is essential for making statistical inferences. Knowledge of Data Distributions helps in
understanding and analyzing datasets effectively.
Data Preprocessing, Data Augmentation, Normalizing Data Sets
• Topic: Introduction to Data Preprocessing

• Definition: Data preprocessing is a crucial step in preparing data for
machine learning models by transforming raw data into a clean and
structured format.
• Importance: Data preprocessing helps improve the quality and reliability of
the data, enhances model performance, and mitigates the impact of noisy
or irrelevant information.
• Topic: Common Data Preprocessing Techniques
• Data Cleaning: Handling missing values, correcting errors, and removing outliers.
• Data Integration: Combining multiple data sources into a consistent format.
• Data Transformation: Scaling, encoding categorical variables, and feature
engineering.
• Dimensionality Reduction: Reducing the number of features while preserving
important information.
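Two of the techniques above, handling missing values (data cleaning) and encoding categorical variables (data transformation), can be sketched as follows. The records, column names, and helper functions are hypothetical, and mean imputation is only one of several possible strategies.

```python
from statistics import mean

# Hypothetical raw records with a missing numeric value.
rows = [
    {"age": 25.0, "city": "Delhi"},
    {"age": None, "city": "Mumbai"},
    {"age": 35.0, "city": "Delhi"},
]


def impute_mean(records, column):
    """Replace missing (None) entries in `column` with the mean of observed values."""
    observed = [r[column] for r in records if r[column] is not None]
    fill = mean(observed)
    return [{**r, column: fill if r[column] is None else r[column]} for r in records]


def one_hot(records, column):
    """Replace a categorical column with one 0/1 indicator column per category."""
    categories = sorted({r[column] for r in records})
    encoded = []
    for r in records:
        new_row = {k: v for k, v in r.items() if k != column}
        for category in categories:
            new_row[f"{column}={category}"] = 1 if r[column] == category else 0
        encoded.append(new_row)
    return encoded


clean = one_hot(impute_mean(rows, "age"), "city")
for row in clean:
    print(row)
```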
• Data Augmentation
• Definition: Data augmentation is a technique used to increase the size and diversity of
the training dataset by applying various transformations to the existing data.
• Benefits: Data augmentation helps prevent overfitting, improves model generalization,
and increases the robustness of the model to variations in the input data.
• Examples of Data Augmentation Techniques: Image rotation, flipping, cropping,
zooming, adding noise, etc.
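A minimal sketch of a few of these transformations, treating an image as a small grid of pixel values stored as nested lists. The tiny 2x3 "image", the noise scale, and the function names are assumptions for illustration; real pipelines typically use an image library.

```python
import random


def flip_horizontal(img):
    """Mirror each row left to right."""
    return [row[::-1] for row in img]


def rotate_90_clockwise(img):
    """Rotate the grid by 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def crop(img, top, left, height, width):
    """Cut out a height x width window starting at (top, left)."""
    return [row[left:left + width] for row in img[top:top + height]]


def add_noise(img, rng, scale=0.05):
    """Add zero-mean Gaussian noise to every pixel."""
    return [[px + rng.gauss(0.0, scale) for px in row] for row in img]


rng = random.Random(0)
image = [[1, 2, 3],
         [4, 5, 6]]

flipped = flip_horizontal(image)
rotated = rotate_90_clockwise(image)
cropped = crop(image, top=0, left=1, height=2, width=2)
noisy = add_noise(image, rng)

# The original plus its transformed copies form the augmented training set.
augmented = [image, flipped, rotated, cropped, noisy]
print(len(augmented), "samples from 1 original")
```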
• Topic: Normalizing Data Sets
• Definition: Normalization is the process of rescaling numerical features in the dataset to
a common scale, usually the range 0 to 1 or -1 to 1 (or, for standardization, zero mean
and unit variance).
• Purpose: Normalization ensures that features with different scales or units contribute
equally to the learning process, prevents dominance of certain features, and aids
convergence during model training.
• Popular Normalization Techniques: Min-Max Scaling, Z-score Standardization, Decimal
Scaling, etc.
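The three techniques named above can be sketched on a single feature column as follows. The sample values are invented, and the z-score here uses the population standard deviation, which is one common convention.

```python
from statistics import mean, pstdev


def min_max_scale(values):
    """Rescale to [0, 1]: (v - min) / (max - min)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values):
    """Standardize to zero mean and unit variance: (v - mean) / std."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def decimal_scale(values):
    """Divide by the smallest power of 10 that brings every |v| below 1."""
    max_abs = max(abs(v) for v in values)
    j = 0
    while max_abs / 10 ** j >= 1:
        j += 1
    return [v / 10 ** j for v in values]


feature = [200.0, 400.0, 600.0, 800.0]  # hypothetical feature column

scaled = min_max_scale(feature)
standardized = z_score(feature)
decimal = decimal_scale(feature)

print("min-max:", [round(v, 3) for v in scaled])
print("z-score:", [round(v, 3) for v in standardized])
print("decimal:", decimal)
```

Min-max scaling is sensitive to outliers because it depends on the extreme values, while z-score standardization does not bound the result to a fixed range.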
• Topic: Data Preprocessing Workflow
• Visual representation of the sequential steps involved in data preprocessing, including
data cleaning, integration, transformation, and dimensionality reduction.
• Topic: Data Augmentation Workflow
• Visual representation of the process of data augmentation, illustrating various
transformations applied to the original data to create augmented samples.
• Topic: Normalization Techniques Comparison
• Comparative analysis of different normalization techniques, highlighting their
advantages, disadvantages, and suitable use cases.
• Summary: Data preprocessing, data augmentation, and normalization are essential
techniques in machine learning that contribute to improving model performance,
generalization, and robustness.
• Key Takeaways: Preprocessing ensures clean and structured data, augmentation
increases data diversity, and normalization standardizes feature scales.
THANK YOU
