What Is Machine Learning
Goal:
The primary goal of AI is to create systems that can perform tasks that normally require human
intelligence. These tasks include reasoning, learning, problem-solving, perception, and language
understanding.
Methods:
AI involves a variety of techniques such as machine learning, neural networks, natural language
processing, robotics, and expert systems.
Applications:
AI is used in a wide range of applications including autonomous vehicles, speech and image recognition,
recommendation systems, virtual assistants, and game playing.
Focus:
AI focuses on creating intelligent agents that can autonomously perform tasks and make decisions.
Example:
Developing a self-driving car that can navigate through traffic, make decisions based on its surroundings,
and learn from new driving scenarios.
Goal:
The primary goal of Data Science is to extract meaningful insights and knowledge from data. It involves
analyzing and interpreting complex data to help inform decision-making.
Methods:
Data Science uses statistical analysis, data mining, machine learning, data visualization, and big data
technologies to analyze data.
Applications:
Data Science is used in business intelligence, healthcare analytics, financial forecasting, market analysis,
and scientific research.
Focus:
Data Science focuses on data manipulation, analysis, and visualization to derive actionable insights.
Example:
Analyzing customer purchase data to identify trends, preferences, and patterns that can inform
marketing strategies and business decisions.
AI Project Framework:
Data Science involves pre-processing, processing, and post-processing, which are also known as data
preparation, modeling, and deployment, respectively.
1. Problem Definition
2. Data
Pre-processing: Clean and transform the data to make it suitable for analysis.
3. Evaluation Criteria
4. Features
5. Modelling
Select and train appropriate machine learning or statistical models.
Validate and fine-tune the models.
6. Iteration
Continuously iterate on the process by refining the data, features, and models based on
evaluation results.
7. Post-processing
8. Deployment
Deploy the model to the production environment.
2. Unsupervised Learning
Clustering: Groups similar data points together (e.g., customer segmentation, topic
modeling).
3. Transfer Learning
Fine-tuning an existing trained model to adapt it to a new, but related, task (e.g., using a pre-
trained image recognition model for medical image analysis).
4. Reinforcement Learning
Learning through trial and error to maximize a reward (e.g., AlphaGo, robotics).
Characteristics: Agent learns by interacting with the environment and receiving feedback in
the form of rewards or penalties.
2. Step 2 - Data
Types:
Unstructured Data: Data that doesn’t have a predefined format (e.g., audio files).
Time Series Data: Data that is indexed in time order (e.g., stock market data).
Tools:
3. Step 3 - Evaluation
Accuracy: For classification tasks, e.g., 95% accuracy for medical treatment.
Note that evaluation criteria can vary depending on the model and task.
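As a minimal sketch of the accuracy metric mentioned above, the function below counts how many predictions match the true labels. The function name accuracy and the label lists are made up for illustration; they are not from the original notes.

```python
def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

# Hypothetical labels: 1 = treatment worked, 0 = it did not
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1, 1, 1]

acc = accuracy(y_true, y_pred)
print(f"Accuracy: {acc:.0%}")
```

Other tasks would typically use other criteria, so this is only one possible choice of metric.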
4. Step 4 - Features
Identify and engineer features that will be used in the model.
Understand the features of the data. For example, a table may consist of:
Id: Identifier
Ensure that at least 10% of the data is in derived variables; otherwise, it may be considered useless.
5. Step 5 - Modelling
Basic components for modelling.
On the other hand, if preparation (training) is not thorough and performance on the test is also poor,
accuracy is low on both, and this is known as underfitting.
Data Validation
100 patients: 70 for training, 15 for tuning, and 15 for testing.
Let's say the model achieves 98% accuracy on training data (MACC) and 92% accuracy on test data
(TACC) -> This indicates good performance.
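The 70/15/15 split described above can be sketched in plain Python as follows. The helper name split_dataset, its parameters, and the use of integers as stand-ins for patient records are assumptions made for this example.

```python
import random

def split_dataset(records, train_frac=0.70, tune_frac=0.15, seed=42):
    """Shuffle records and split them into training, tuning, and test sets."""
    shuffled = records[:]                      # copy so the input is not modified
    random.Random(seed).shuffle(shuffled)      # fixed seed keeps the split repeatable
    n_train = round(len(shuffled) * train_frac)
    n_tune = round(len(shuffled) * tune_frac)
    train = shuffled[:n_train]
    tune = shuffled[n_train:n_train + n_tune]
    test = shuffled[n_train + n_tune:]         # whatever remains is the test set
    return train, tune, test

patients = list(range(100))                    # stand-in for 100 patient records
train, tune, test = split_dataset(patients)
print(len(train), len(tune), len(test))
```

Shuffling before splitting is a common precaution so that any ordering in the original data does not leak into one of the subsets.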
6. Step 6 - Iterations
Iteratively refine the model to balance computational cost and accuracy.
Adjust hyperparameters and re-train the model to improve performance.
Use cross-validation to assess the model's generalizability.
Continuously monitor and evaluate the model's performance on validation and test data.
Aim to achieve an optimal balance between high accuracy and manageable computational
cost.
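To illustrate the cross-validation step, here is a rough sketch of how k-fold splitting could generate the index sets, assuming the samples are simply numbered 0 to n-1. The generator name k_fold_indices is hypothetical; libraries such as scikit-learn provide their own implementations.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, val
        start += size

folds = list(k_fold_indices(10, 5))
for train_idx, val_idx in folds:
    print("validate on", val_idx, "train on", train_idx)
```

Each sample is used for validation exactly once, which is why the averaged score tends to be a fairer estimate of generalizability than a single split.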
4. Which type of machine learning is used for predicting a continuous output variable?
Classification
Clustering
Regression
Classification & Regression
6. Which type of machine learning is used for grouping similar data points together?
Classification
Clustering
Regression
Classification & Regression
Structured data is organized into tables or spreadsheets, while unstructured data is not
Structured data is easy to analyze, while unstructured data is difficult to analyze
There is no difference between structured and unstructured data
None
8. What is the main difference between artificial intelligence and machine learning?
9. Which statistical technique is used to determine the relationship between two variables?
Regression analysis
Classification analysis
Clustering analysis
Both Classification analysis & Clustering analysis
10. What is the main difference between reinforcement learning and supervised learning?
Reinforcement learning is used for classification tasks, while supervised learning is used for
regression tasks
Reinforcement learning uses labeled data, while supervised learning uses unlabeled data
Reinforcement learning learns by trial and error, while supervised learning learns from labeled
data
None
An interpreter translates human-written code into a form the computer can execute (ultimately binary
instructions), and it processes the code line by line.
Data Types:
1. Boolean
2. Numeric types (int, float, complex)
3. Ordered sequences (str, list, tuple)
4. Unordered collections (dict, set)
80/20 Rule: 20% of the effort produces 80% of the results -> spend 20% of the time learning and 80% practicing.
Variables:
Naming Rules for variables:
1. Must not start with a digit
2. Must not start with a special character (an underscore is allowed)
3. Must not be the same as a Python keyword
4. Must not contain spaces between words
String Concatenation:
message = "Hello"
message2 = "world"
print(message + message2)
Type Conversion:
val = str(2)
number = int(input("Enter a number: "))
String Formatting:
print("Hello " + name + " Welcome to Application") can be written more cleanly as
print('Hello {} Welcome to Application {}'.format(name, name2))
A newer and easier method is the f-string:
print(f'Hello {name} Welcome to Application {name2}')
Immutability
Immutable objects are those whose values, once created, cannot be changed. They form the
cornerstone of Python programming, offering a sense of predictability and stability.
# Strings are immutable in Python, but lists are mutable
s = "sksks"
s[0] = "x"
# Raises TypeError at runtime: 'str' object does not support item assignment
Built-in Functions: Globally available, called without an object, apply to multiple data
types.
Methods: Associated with objects, called on instances using dot notation, specific to the
object's class.
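A short sketch of the difference, using only standard Python names; the variable names are chosen for this example.

```python
text = "hello"
numbers = [3, 1, 2]

# Built-in functions: called on their own and work with several data types
text_length = len(text)
list_length = len(numbers)

# Methods: called on an object with dot notation and tied to that object's class
shouted = text.upper()      # str method, returns a new string
numbers.sort()              # list method, sorts the list in place

print(text_length, list_length, shouted, numbers)
```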
Boolean Data type
True, False, bool() (treats 0 as False and any other number as True)
Exercise
message = input('Please Enter your message: ')[::-1]
print(f'Hackers is reading {message}')
print(f'My friend is reading {message[::-1]}')
Output:
Please Enter your message: Kamran
Hackers is reading narmaK
My friend is reading Kamran
List
list_1 = [1,2,3,4,5,6,7,8,9,10]
print(list_1)
print(list_2[7][1])
print(list_3)
print(list_4)
Output:
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
[True, False]
Lists Continued
Just like a string, a list is an ordered sequence data type, but it is mutable.
name = 'JohnDoe'
name[0] = 'j'
print(name)
ERROR: TypeError: 'str' object does not support item assignment
print(matrix)
List Methods
append, extend, remove, pop, clear, count, sort, insert, etc.
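The methods listed above can be tried out with a small example. The fruit names are arbitrary sample data.

```python
fruits = ["apple", "banana"]

fruits.append("cherry")              # add one item to the end
fruits.extend(["date", "apple"])     # add every item from another list
fruits.insert(1, "avocado")          # insert at index 1
fruits.remove("banana")              # remove the first matching value
apple_count = fruits.count("apple")  # how many times "apple" occurs
last = fruits.pop()                  # remove and return the last item
fruits.sort()                        # sort in place

snapshot = fruits.copy()             # keep a copy before emptying the list
print(snapshot, last, apple_count)

fruits.clear()                       # remove everything
```

All of these except count and copy change the list in place, which is possible only because lists are mutable.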
Dictionary
Keys must be immutable (hashable); values can be of any type.
Methods: popitem, get, items
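A brief sketch of the three methods named above, plus what happens when a mutable object is used as a key. The dictionary contents are invented for illustration.

```python
ages = {"alice": 30, "bob": 25}

alice_age = ages.get("alice")         # value for an existing key
missing = ages.get("carol", 0)        # default instead of a KeyError
pairs = list(ages.items())            # (key, value) tuples

ages["carol"] = 41
last_key, last_value = ages.popitem() # removes the most recently inserted pair (Python 3.7+)

# A list is mutable, so it cannot be used as a key
try:
    ages[["dave"]] = 50
except TypeError as exc:
    key_error = str(exc)

print(alice_age, missing, pairs, last_key, last_value, key_error)
```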
Tuple Data Types
An ordered sequence like a list, but immutable; written in round brackets.
Items cannot be reassigned, although the variable itself can be rebound to a new tuple.
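A small sketch of tuple immutability, with made-up coordinate values.

```python
point = (3, 4)
x, y = point                 # tuples can be unpacked into variables

try:
    point[0] = 10            # items cannot be reassigned
except TypeError:
    reassigned = False
else:
    reassigned = True

point = (10, 4)              # rebinding the name to a new tuple is allowed
print(x, y, reassigned, point)
```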
Conditional Statements
if-else, if-elif-else, and, or
Logical Conditions
Equal, Not Equal, Greater than, Less than, Greater than or equal to, less than or equal to
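The statements and comparison operators above can be combined as in the following sketch. The function grade and its thresholds are hypothetical and serve only as an example.

```python
def grade(score):
    """Map a numeric score to a letter using if-elif-else with and/or."""
    if score < 0 or score > 100:         # less than, greater than, or
        return "invalid"
    elif score >= 90:                    # greater than or equal to
        return "A"
    elif score >= 75 and score < 90:     # and
        return "B"
    else:
        return "C"

print(grade(95), grade(80), grade(10), grade(150))
```

Conditions are checked from top to bottom and only the first matching branch runs, so the range check is placed first.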
Identity Operator
print([1,2,3] == [1,2,3])  # == compares values
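For contrast, the identity operator is checks whether two names refer to the same object rather than to equal values. A minimal sketch with example variable names:

```python
a = [1, 2, 3]
b = [1, 2, 3]
c = a                     # c is another name for the same list object

same_value = a == b       # equal contents
same_object = a is b      # two separate list objects
alias = a is c            # same object

print(same_value, same_object, alias)
```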
Even: print(list(range(0, 10, 2)))
Odd: print(list(range(1, 10, 2)))
While Loop
Repeats as long as a condition is true; if the condition never becomes false, the loop is infinite.
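Two small sketches of a while loop: one that stops when its condition becomes false, and one that would run forever without a break. The variable names are illustrative.

```python
# Runs while the condition holds
count = 0
visited = []
while count < 5:
    visited.append(count)
    count += 1            # without this update the loop would never end

# An intentionally endless loop that is left with break
total = 0
while True:
    total += 10
    if total >= 30:
        break

print(visited, total)
```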
As the name suggests pass statement simply does nothing. The pass statement in Python is used when a
statement is required syntactically but you do not want any command or code to execute. It is like a null
operation, as nothing will happen if it is executed. Pass statements can also be used for writing empty
loops. Pass is also used for empty control statements, functions, and classes.
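The uses of pass described above look like this in practice. The function and class names are placeholders invented for the example.

```python
def not_implemented_yet():
    pass                  # placeholder body; the function returns None

class EmptyModel:
    pass                  # empty class to be filled in later

for _ in range(3):
    pass                  # empty loop body

result = not_implemented_yet()
print(result)
```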
Functions
DRY (Don’t Repeat yourself)
Why of functions:
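#1 Wrapper (encapsulate)
#2 Compartmentalize
#3 DRY rule
Parameter vs Argument
A parameter is the name listed in the function definition; an argument is the value passed in the call.
Calling, invoking, and executing a function all mean the same thing.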
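A short sketch of the parameter/argument distinction. The function greet and its default value are assumptions made for this example.

```python
def greet(name, greeting="Hello"):        # name and greeting are parameters
    return f"{greeting}, {name}!"

message = greet("Kamran")                        # "Kamran" is an argument
custom = greet("Kamran", greeting="Welcome")     # keyword argument overrides the default

print(message)
print(custom)
```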
Doc String
A Python docstring is a string used to document a Python module, class, function or method, so
programmers can understand what it does without having to read the details of the implementation.
Also, it is a common practice to generate online (html) documentation automatically from docstrings.
"""_summary_
Args:
a (int): _description_
b (str): _description_
c (bool, optional): _description_. Defaults to True.
Returns:
bool: _description_
"""
if a == c:
return True
else:
return False
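The template above is missing its function definition, so here is a complete sketch of how such a docstring might sit inside a function. The name matches_flag and the descriptions are hypothetical; only the argument names, types, and body follow the template.

```python
def matches_flag(a, b, c=True):
    """Check whether ``a`` equals ``c``.

    Args:
        a (int): Value to compare.
        b (str): Label; unused in this sketch, kept to mirror the template.
        c (bool, optional): Value to compare against. Defaults to True.

    Returns:
        bool: True if ``a == c``, otherwise False.
    """
    if a == c:
        return True
    else:
        return False

# The docstring is stored on the function and is what help() displays
print(matches_flag.__doc__)
```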
Good Programming Practices
Simplifying code is good programming practice. Reducing the number of lines of code is
recommended where it keeps the code readable.
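As one possible illustration, an if/else that only returns True or False can usually be collapsed into a single return of the comparison itself. Both function names below are invented for the comparison.

```python
def is_equal_verbose(a, c):
    if a == c:
        return True
    else:
        return False

def is_equal(a, c):
    return a == c         # the comparison already produces True or False

samples = [(1, 1), (1, 2), ("x", "x"), (0, False)]
results = [is_equal(a, c) for a, c in samples]
print(results)
```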
*args (arguments) allows you to pass a variable number of positional arguments to a function.
**kwargs (keyword arguments) allows you to pass a variable number of keyword arguments
(key-value pairs) to a function.
def adder(*args):
    return sum(args)

print(adder(1, 2, 3, 4))
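A complementary sketch for **kwargs, where the function describe and the keyword names unit and source are made up for illustration.

```python
def describe(*args, **kwargs):
    """Sum the positional arguments and list the keyword arguments."""
    total = sum(args)                     # args is a tuple of positional values
    labels = ", ".join(f"{key}={value}" for key, value in kwargs.items())  # kwargs is a dict
    return total, labels

total, labels = describe(1, 2, 3, unit="kg", source="scale")
print(total, labels)
```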
Exercise
Scope of Function
A variable created inside a function belongs to the local scope of that function, and can only be used
inside that function.
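A minimal sketch of local scope, with illustrative names: the variable width exists only while compute_area runs, so looking it up outside the function raises NameError.

```python
def compute_area():
    width = 4             # local to compute_area
    height = 5
    return width * height

area = compute_area()

try:
    width                 # not defined outside the function
except NameError:
    width_visible = False
else:
    width_visible = True

print(area, width_visible)
```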
Scope Rules:
Scope rules determine whether an entity (i.e., a variable, parameter, or function) is visible or
accessible at a given place in the program. The places where an entity can be accessed are referred to
as the scope of that entity, which depends on where it is declared.
The names or objects which are accessible are called in-scope. The names or objects which are not
accessible are called out-of-scope. The Python scope concept follows the LEGB (Local, Enclosing, Global,
and Built-in) rule. The scope concept helps to avoid name collisions and the overuse of global names
across programs.
LEGB:
In Python, the `global` and `nonlocal` keywords are used to modify the behavior of variable scope within
functions. The `global` keyword allows you to access and modify a variable defined at the global
(module) scope from within a function, ensuring that changes to the variable inside the function affect
the global variable. On the other hand, the `nonlocal` keyword is used to modify variables in the nearest
enclosing scope that is not global, typically within nested functions. This allows you to work with
variables in an outer function from within an inner function, enabling changes to persist outside the
inner function but still within the outer function’s scope. Both keywords help manage variable scope in
complex functions and nested structures.
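The behaviour of both keywords can be sketched as follows, assuming the code runs at module level. The names counter, increment, and make_accumulator are chosen for this example.

```python
counter = 0                       # global (module) scope

def increment():
    global counter                # refer to the module-level variable
    counter += 1

def make_accumulator():
    total = 0                     # enclosing scope for add()
    def add(amount):
        nonlocal total            # refer to total in make_accumulator
        total += amount
        return total
    return add

increment()
increment()

add = make_accumulator()
add(5)
running_total = add(10)

print(counter, running_total)
```

Without the global or nonlocal declaration, the += lines would raise UnboundLocalError, because assignment makes a name local by default.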
Quality of Code:
#map
#Filter
#Zip
#Reduce
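A compact sketch of the four functions named above, using small sample lists. Note that reduce has to be imported from functools.

```python
from functools import reduce

numbers = [1, 2, 3, 4, 5]
names = ["a", "b", "c"]

squares = list(map(lambda n: n * n, numbers))          # apply a function to every item
evens = list(filter(lambda n: n % 2 == 0, numbers))    # keep items that pass a test
pairs = list(zip(names, numbers))                      # pair items; stops at the shorter input
product = reduce(lambda acc, n: acc * n, numbers)      # fold the list into one value

print(squares, evens, pairs, product)
```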
List Comprehension
Consists of an expression, a loop, and an optional condition.
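The three parts can be seen in this sketch, shown next to the equivalent for loop. The variable names are illustrative.

```python
# [expression  loop                 condition]
even_squares = [n * n for n in range(10) if n % 2 == 0]

# Equivalent for loop
even_squares_loop = []
for n in range(10):
    if n % 2 == 0:
        even_squares_loop.append(n * n)

print(even_squares)
```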
Sets and Dictionary Comprehension
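Set and dictionary comprehensions follow the same pattern with curly braces; a dictionary comprehension additionally needs a key: value expression. A brief sketch with sample words:

```python
words = ["apple", "banana", "avocado", "blueberry"]

# Set comprehension: duplicates collapse automatically
first_letters = {word[0] for word in words}

# Dictionary comprehension: key: value pairs, here with a condition
lengths = {word: len(word) for word in words if len(word) > 5}

print(first_letters, lengths)
```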
Python Modules
Python packages
Pandas
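Add a column, remove a column, change data, replace NaN values (fillna()), and shuffle or sample rows
(sample(frac=1) returns all rows in random order, frac=0.5 returns half of them, frac=0 returns none).
reset_index(drop=True, inplace=True) discards the old index instead of keeping it as an extra column;
alternatively, remove that column afterwards with variable.drop('index', axis=1, inplace=True). Here
axis=1 refers to columns and axis=0 refers to rows; inplace=True applies the change to the original
DataFrame instead of returning a new one.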
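The operations above can be sketched on a tiny DataFrame. This assumes pandas is installed; the column names and values are invented for illustration.

```python
import pandas as pd

df = pd.DataFrame({"name": ["a", "b", "c", "d"],
                   "score": [1.0, None, 3.0, 4.0]})

df["passed"] = df["score"] > 2           # add a column
df = df.drop("passed", axis=1)           # remove a column (axis=1 means columns)
df.loc[0, "score"] = 1.5                 # change a single value
df["score"] = df["score"].fillna(0)      # replace NaN values

# Shuffle all rows, then discard the old index instead of keeping it as a column
shuffled = df.sample(frac=1, random_state=0)
shuffled = shuffled.reset_index(drop=True)

print(shuffled)
```

Assigning the result back (as done here) is an alternative to inplace=True and leaves it explicit which variable holds the changed data.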
Resources:
https://ml-cheatsheet.readthedocs.io/en/latest/
https://ml-playground.com/
https://replit.com/