What Is Machine Learning

Uploaded by

Sehrish Saddique
Copyright
© All Rights Reserved

Contents

What is Machine Learning?
    1.1 AI and Data Science
    AI Project Framework:
        1. Step1- Problem Definition
        2. Step2- Data
        3. Step3- Evaluation
        4. Step4- Features
        5. Step5- Modelling
        6. Step6- Iterations
Quiz1:
What is Programming Language?
    Data Types:
        Exercise
    Resources:

What is Machine Learning?
Machine Learning is the capability of a computer to learn and recognize patterns. For example, consider
a shopkeeper who needs to record the daily loan data of his customers. Recording hundreds of records
manually is challenging, and hiring someone would come with its own limitations (such as computational
capacity, time management, and employee availability). Instead, he could buy a computer system and
train a model to recognize people and their loan amounts. This approach would not only make things
easier for him but also help accomplish tasks in a timely and efficient manner.

1.1 AI and Data Science


Data science combines statistical tools, methods, and technology to generate meaning from data.
Artificial Intelligence takes this one step further and uses the data to solve cognitive problems commonly
associated with human intelligence, such as learning, pattern recognition, and human-like expression.

1.3 Artificial Intelligence (AI)

Goal:

The primary goal of AI is to create systems that can perform tasks that normally require human
intelligence. These tasks include reasoning, learning, problem-solving, perception, and language
understanding.

Methods:

AI involves a variety of techniques such as machine learning, neural networks, natural language
processing, robotics, and expert systems.

Applications:

AI is used in a wide range of applications including autonomous vehicles, speech and image recognition,
recommendation systems, virtual assistants, and game playing.

Focus:

AI focuses on creating intelligent agents that can autonomously perform tasks and make decisions.

Example:

Developing a self-driving car that can navigate through traffic, make decisions based on its surroundings,
and learn from new driving scenarios.

1.4 Data Science

Goal:

The primary goal of Data Science is to extract meaningful insights and knowledge from data. It involves
analyzing and interpreting complex data to help inform decision-making.

Methods:

Data Science uses statistical analysis, data mining, machine learning, data visualization, and big data
technologies to analyze data.
Applications:

Data Science is used in business intelligence, healthcare analytics, financial forecasting, market analysis,
and scientific research.

Focus:

Data Science focuses on data manipulation, analysis, and visualization to derive actionable insights.

Example:

Analyzing customer purchase data to identify trends, preferences, and patterns that can inform
marketing strategies and business decisions.

AI Project Framework:
Data Science involves pre-processing, processing, and post-processing, which are also known as data
preparation, modeling, and deployment, respectively.

1. Problem Definition

• Clearly define the problem you are trying to solve.

2. Data

• Types: Structured, Unstructured, Static, and Time-series data.

• Acquisition: Collect relevant data from various sources.

• Pre-processing: Clean and transform the data to make it suitable for analysis.

3. Evaluation Criteria

• Define metrics and criteria to evaluate the model's performance.

4. Features

• Identify and engineer features that will be used in the model.

5. Modelling

• Select and train appropriate machine learning or statistical models.

• Validate and fine-tune the models.

6. Iteration

• Continuously iterate on the process by refining the data, features, and models based on
evaluation results.

7. Post-processing

• Prepare the final model and results for deployment.

• Interpret and present the results.

8. Deployment

• Deploy the model to the production environment.

• Monitor the model's performance and make necessary updates.

1. Step1- Problem Definition


Clearly define the problem you are trying to solve. Determine if the problem requires a machine learning
solution. Not all problems need machine learning; some may be addressed with simpler statistical
methods or rule-based systems. Consider factors such as data availability, the complexity of the task, and
the potential benefits of using machine learning over traditional methods. Outline the specific goals and
objectives of the project to ensure that machine learning is the appropriate tool for the task.

Machine Learning Types:


1. Supervised Learning

• Classification: Predicts discrete labels (e.g., spam detection, image classification).

• Regression: Predicts continuous values (e.g., house prices, temperature forecasting).

• Characteristics: Known input and output data.

2. Unsupervised Learning

• Clustering: Groups similar data points together (e.g., customer segmentation, topic
modeling).

• Characteristics: Known input data, but the output is not pre-defined.

3. Transfer Learning

• Fine-tuning an existing trained model to adapt it to a new, but related, task (e.g., using a
pre-trained image recognition model for medical image analysis).

4. Reinforcement Learning

• Learning through trial and error to maximize a reward (e.g., AlphaGo, robotics).

• Characteristics: Agent learns by interacting with the environment and receiving feedback in
the form of rewards or penalties.

2. Step2- Data
Types:

• Structured Data: Data in a tabular format (e.g., .csv files).

• Unstructured Data: Data that doesn't have a predefined format (e.g., audio files).

• Static Data: Data that doesn't change over time.

• Time Series Data: Data that is indexed in time order (e.g., stock market data).

Tools:

• Jupyter Notebook: For interactive data analysis and documentation.

• Pandas: For data manipulation and analysis.

• Matplotlib: For data visualization.

• Scikit-Learn: For applying machine learning algorithms.

3. Step3- Evaluation

Define metrics and criteria to evaluate the model's performance.

• Accuracy: For classification tasks, e.g., 95% accuracy for medical treatment.

• Mean Absolute Error (MAE): For regression tasks.

• Root Mean Square Error (RMSE): For regression tasks.

Note that evaluation criteria can vary depending on the model and task.
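
The three metrics above can be computed by hand. The following is a minimal sketch in plain Python; the function names and the sample values are illustrative assumptions, not part of the original notes (libraries such as Scikit-Learn provide their own versions).

```python
import math

def accuracy(y_true, y_pred):
    """Fraction of predictions that exactly match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

def mean_absolute_error(y_true, y_pred):
    """Average of the absolute differences between truth and prediction."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def root_mean_square_error(y_true, y_pred):
    """Square root of the average squared difference."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)

# Classification example (made-up labels): 3 of 4 predictions are right.
labels_true = [1, 0, 1, 1]
labels_pred = [1, 0, 0, 1]
print(accuracy(labels_true, labels_pred))  # 0.75

# Regression example (made-up prices).
prices_true = [100.0, 200.0, 300.0]
prices_pred = [110.0, 190.0, 330.0]
print(mean_absolute_error(prices_true, prices_pred))
print(root_mean_square_error(prices_true, prices_pred))
```

RMSE squares each error before averaging, so it penalises large errors more heavily than MAE does.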

4. Step4- Features
Identify and engineer features that will be used in the model.

Understand the features of the data. For example, a table may consist of:

• Id: Identifier

• Weight: Numeric variable

• Sex: Categorical variable

• Heart Rate: Numeric variable

• Disease: Target variable

• Smoke: Derived variable

Ensure that at least 10% of the data is in derived variables; otherwise, it may be considered useless.
5. Step5- Modelling
Modelling has three basic components, each using its own share of the data:

1. Choose and train a model (70%) -> training data

2. Tune the model (15%) -> validation data

3. Compare models (15%) -> test data

Training data (70%) versus held-out data (30%):

• Consider preparing for an exam: if you practise on the exact exam paper, you will score 100%
without really preparing. A model evaluated on the data it was trained on looks equally
perfect, but the result does not show its true capability. A model that memorises its training
data and then performs poorly on unseen data is overfitting.

• On the other hand, if preparation is not thorough, the student performs poorly in the exam as
well. A model whose accuracy is low for this reason is underfitting.

Tuning the Model:

• Tuning is an important step. If accuracy on the validation data is not good enough, the
hyperparameters (i.e., settings that control the training process) are adjusted and the model
is retrained to improve accuracy.

Data Validation
100 patients: 70 for training, 15 for tuning, and 15 for testing.
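
A 70/15/15 split like the one above can be sketched in plain Python as follows. The function name `split_dataset`, the fixed seed, and the use of integers as stand-ins for patient records are assumptions made for illustration.

```python
import random

def split_dataset(records, train_frac=0.70, val_frac=0.15, seed=42):
    """Shuffle the records and cut them into train / validation / test parts."""
    shuffled = list(records)               # copy, so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]      # whatever remains is the test set
    return train, validation, test

patients = list(range(100))                # stand-in for 100 patient records
train, validation, test = split_dataset(patients)
print(len(train), len(validation), len(test))  # 70 15 15
```

Shuffling before cutting matters: if the records are sorted (for example by date or by diagnosis), an unshuffled split would give the three parts different distributions.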

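Let's say the model achieves 98% accuracy on training data (MACC) and 92% accuracy on test data
(TACC) -> This indicates good performance.

• If MACC >> TACC, it indicates overfitting: the model has memorised the training data and does
not generalise.

• If both MACC and TACC are low, it indicates underfitting: the model has not learned the
patterns in the data. (If MACC << TACC, the usual suspect is a problem with the data split
rather than the model itself.)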
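
As a rough sketch, the comparison of training and test accuracy can be written as a small helper. The function name `diagnose_fit` and both thresholds (`max_gap`, `min_acc`) are hypothetical values chosen for illustration; sensible cut-offs depend on the task.

```python
def diagnose_fit(train_acc, test_acc, max_gap=0.10, min_acc=0.70):
    """Label a model from its training and test accuracy (both in 0..1)."""
    if train_acc - test_acc > max_gap:
        return "overfitting"       # much better on seen data than on unseen data
    if train_acc < min_acc and test_acc < min_acc:
        return "underfitting"      # poor everywhere: the patterns were not learned
    return "reasonable fit"

print(diagnose_fit(0.98, 0.92))  # reasonable fit
print(diagnose_fit(0.99, 0.70))  # overfitting
print(diagnose_fit(0.60, 0.58))  # underfitting
```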

6. Step6- Iterations

• Iteratively refine the model to balance computational cost and accuracy.

• Adjust hyperparameters and re-train the model to improve performance.

• Use cross-validation to assess the model's generalizability.

• Continuously monitor and evaluate the model's performance on validation and test data.

• Aim to achieve an optimal balance between high accuracy and manageable computational
cost.
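
Cross-validation, mentioned in the list above, splits the data into k folds and lets each fold serve once as the held-out part. Below is a minimal sketch of how the fold indices can be generated in plain Python; the function name `k_fold_indices` is an assumption, and in practice Scikit-Learn offers ready-made tools for this.

```python
def k_fold_indices(n_samples, k):
    """Return k (train_indices, test_indices) pairs covering all samples."""
    indices = list(range(n_samples))
    fold_size, remainder = divmod(n_samples, k)
    folds = []
    start = 0
    for fold in range(k):
        # The first `remainder` folds get one extra sample each.
        stop = start + fold_size + (1 if fold < remainder else 0)
        test_idx = indices[start:stop]
        train_idx = indices[:start] + indices[stop:]
        folds.append((train_idx, test_idx))
        start = stop
    return folds

for train_idx, test_idx in k_fold_indices(10, 5):
    print("test on", test_idx, "train on", len(train_idx), "samples")
```

The model is trained and scored once per fold, and the k scores are averaged to estimate how well it generalises.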

Tools: Anaconda (with Jupyter Notebook as the IDE)


Quiz1:
1. What is the main goal of machine learning?

• To make computers more intelligent
• To create intelligent machines that can learn from data and improve over time
• To mimic human-like thinking
• None of the above

2. Which of the following is NOT a type of machine learning?

• Supervised learning
• Unsupervised learning
• Semi-supervised learning
• Reinforcement learning

3. Which of the following is an application of machine learning in the healthcare industry?

• Image recognition in radiology
• Fraud detection in banking
• Customer segmentation in marketing
• Fraud detection in banking & Customer segmentation in marketing

4. Which type of machine learning is used for predicting a continuous output variable?

• Classification
• Clustering
• Regression
• Classification & Regression

5. Which of the following is NOT a step in the data science process?

• Data collection
• Data cleaning
• Data visualization
• Data storage

6. Which type of machine learning is used for grouping similar data points together?

• Classification
• Clustering
• Regression
• Classification & Regression

7. What is the main difference between structured and unstructured data?

• Structured data is organized into tables or spreadsheets, while unstructured data is not
• Structured data is easy to analyze, while unstructured data is difficult to analyze
• There is no difference between structured and unstructured data
• None

8. What is the main difference between artificial intelligence and machine learning?

• Artificial intelligence is a subset of machine learning
• Artificial intelligence refers to machines that can think and reason like humans, while machine
learning refers to machines that can learn from data
• There is no difference between artificial intelligence and machine learning
• None

9. Which statistical technique is used to determine the relationship between two variables?

• Regression analysis
• Classification analysis
• Clustering analysis
• Both Classification analysis & Clustering analysis

10. What is the main difference between reinforcement learning and supervised learning?

• Reinforcement learning is used for classification tasks, while supervised learning is used for
regression tasks
• Reinforcement learning uses labeled data, while supervised learning uses unlabeled data
• Reinforcement learning learns by trial and error, while supervised learning learns from labeled
data
• None

What is Programming Language?


A programming language is a formal language in which a programmer writes instructions that tell
the computer how to perform a task.

An interpreter translates human-written code into a form the computer can execute (ultimately
binary), and it does so line by line as the program runs.
Data Types:
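1. Boolean (bool)
2. Numeric types (int, float; Python has no separate double type, its float is already double precision)
3. Ordered sequences (str, list, tuple; Python has no separate char type, a character is a string of length 1)
4. Unordered collections (dict, set; a dict is accessed by key rather than by position)

Built-in functions (type, round, abs, etc.)

80/20 Rule: 20% of the effort produces 80% of the result -> spend 20% of the time learning and 80% practising.

Operations: +, -, /, *, % (modulus), // (floor division, keeps the whole part), ** (power)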

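Order of Precedence (PEMDAS)

1. () Parentheses
2. ** Exponent
3. * / // % (multiplication and division, evaluated left to right)
4. + - (addition and subtraction, evaluated left to right)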
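
The precedence rules can be checked directly in the interpreter. The short sketch below uses illustrative variable names; the values in the comments follow from the rules listed above.

```python
result_a = 2 + 3 * 4      # multiplication first: 2 + 12 = 14
result_b = (2 + 3) * 4    # parentheses first: 5 * 4 = 20
result_c = 2 ** 3 * 2     # exponent first: 8 * 2 = 16
result_d = 17 // 5        # floor division keeps the whole part: 3
result_e = 17 % 5         # modulus keeps the remainder: 2
result_f = 2 ** 3 ** 2    # ** groups right to left: 2 ** 9 = 512

print(result_a, result_b, result_c, result_d, result_e, result_f)
```

When in doubt, adding parentheses makes the intended order explicit and costs nothing.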

Variables:
Naming rules for variables:
1. Must not start with a digit
2. Must not start with a special character (an underscore is allowed)
3. Must not be a Python keyword
4. Must not contain spaces between words

Statement and Expressions:


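2/7        -> expression (evaluates to a value)
A = 2/7    -> statement (assigns the value of the expression to A)
print(A)   -> statement (performs an action)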

Augmented Assignment Operator:


A *= 10
A += 10

String Concatenation:
message = "Hello"
message2 = "world"
print(message + message2)   # prints Helloworld (no space is added automatically)

Type Conversion:
val = str(2)   # the integer 2 becomes the string "2"
number = int(input("Enter a number: "))   # input() returns a string; int() converts it
String Formatting:
print("Hello " + name + " Welcome to Application") can be written more cleanly as
print('Hello {} Welcome to Application {}'.format(name, name2))
A more recent and easier method is the f-string:
print(f'Hello {name} Welcome to Application {name2}')

Indexing and Ordered Sequence:


Indexing:
Indexing refers to accessing individual elements within a sequence using their position
number. In most programming languages, including Python, indexing starts at 0. This means
the first element has an index of 0, the second element has an index of 1, and so on.
Ordering:
Ordering refers to the arrangement of elements in a sequence. In the context of a string,
list, or array, the order is the sequence in which the elements appear. For instance, the
string 'abcdefgh' has a specific order from 'a' to 'h'.
Example:
name3 ='abcdefgh'
# 01234567
# [start: stop: skip]
print(name3[0:7:6])
# Output: ag
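
A few more slices on the same kind of string show how the start, stop, and step values interact. This is a small sketch with an illustrative variable name; omitted values fall back to the beginning, the end, and a step of 1.

```python
letters = 'abcdefgh'
#          01234567

first_three = letters[:3]     # 'abc'      (stop index 3 is excluded)
middle = letters[2:5]         # 'cde'
every_second = letters[::2]   # 'aceg'
last_char = letters[-1]       # 'h'        (negative indexes count from the end)
reversed_text = letters[::-1] # 'hgfedcba' (a step of -1 walks backwards)

print(first_three, middle, every_second, last_char, reversed_text)
```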

Immutability
Immutable objects are those whose values cannot be changed once they are created. Several core
Python types (such as strings and tuples) are immutable, which makes their behaviour predictable.
# strings are immutable in Python, but lists are mutable
s = "sksks"
s[0] = "x"
# raises TypeError at runtime: 'str' object does not support item assignment

Built in Functions and Methods


print(string.capitalize())
The dot (.) indicates that a method is being called on the object before it: capitalize() is a
method of the string, while print is a function.

• Built-in Functions: Globally available, called without an object, apply to multiple data
types.
• Methods: Associated with objects, called on instances using dot notation, specific to the
object's class.
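
The difference is easy to see side by side. In this sketch (variable names are illustrative), `len` and `abs` are built-in functions that accept different kinds of objects, while `capitalize` and `upper` exist only on strings.

```python
text = "machine learning"

# Built-in functions: called on their own, the object is passed as an argument.
length_of_text = len(text)         # 16
length_of_list = len([1, 2, 3])    # 3, the same function works on a list
distance = abs(-5)                 # 5

# Methods: called on an object with dot notation.
capitalized = text.capitalize()    # 'Machine learning'
shouted = text.upper()             # 'MACHINE LEARNING'

print(length_of_text, length_of_list, distance, capitalized, shouted)
```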
Boolean Data type
The values are True and False. bool() treats 0 as False and any other number as True.
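
A quick sketch of how `bool()` converts a few common values. Beyond numbers, Python also treats empty containers and empty strings as False, which the examples below include.

```python
zero_is_false = bool(0)              # False
negative_is_true = bool(-1)          # True, any non-zero number is true
empty_string_is_false = bool("")     # False
text_false_is_true = bool("False")   # True, a non-empty string is always true
empty_list_is_false = bool([])       # False

print(zero_is_false, negative_is_true, empty_string_is_false,
      text_false_is_true, empty_list_is_false)
```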

Exercise
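message = input('Please Enter your message: ')[::-1]   # [::-1] reverses the typed text
print(f'Hackers is reading {message}')
print(f'My friend is reading {message[::-1]}')   # reversing again restores the original
Output (when the user types Kamran):
Please Enter your message: Kamran
Hackers is reading narmaK
My friend is reading Kamran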

List
list_1 = [1,2,3,4,5,6,7,8,9,10]

list_2 = ['a','b','c','d','e','f','g', [1,3]]

list_3 = [True, False]

list_4 = [1,2,3,'a', False]

print(list_1)

print(list_2[7][1])

print(list_3)

print(list_4)

Output:

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

3

[True, False]

[1, 2, 3, 'a', False]

Lists Continued
Just like a string, a list is an ordered sequence data type, but unlike a string it is mutable.

name = 'JohnDoe'
name[0] = 'j'
print(name)
ERROR: TypeError: 'str' object does not support item assignment

But for list


list_items = ['John', 'Doe', 'is', 'a', 'coder']
list_items [0] = 'Jane'
print(list_items)
Output: ['Jane', 'Doe', 'is', 'a', 'coder']
If we want to keep the original list unchanged and make the change in a new list, we can copy
the list with a full slice first, as in this example:

list_items = ['John', 'Doe', 'is', 'a', 'coder']


new_list = list_items[::]
new_list [0] = 'Jane'
print(list_items)
print(new_list)
Output:

['John', 'Doe', 'is', 'a', 'coder']


['Jane', 'Doe', 'is', 'a', 'coder']
Matrix
matrix = [[1, 2, 3],[4, 5, 6], [7, 8, 9] ]

print(matrix)

Output: [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

List Methods
Append, extend, remove, pop, clear, count, sort, insert etc.

print(numbers.index(1))  -> returns the index of the first occurrence of the element 1 in the list

print(numbers.index(1, 1, 3))  -> searches for the element 1 only from index 1 up to (but not including) index 3
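
The list methods named above can be tried on a small list. This is a minimal sketch; the list contents are made up, and the comments show the state of the list after each call.

```python
numbers = [3, 1, 2]

numbers.append(4)           # add one item at the end      -> [3, 1, 2, 4]
numbers.extend([1, 5])      # add every item of a list     -> [3, 1, 2, 4, 1, 5]
numbers.insert(0, 9)        # insert 9 at index 0          -> [9, 3, 1, 2, 4, 1, 5]
numbers.remove(9)           # remove first matching value  -> [3, 1, 2, 4, 1, 5]
last = numbers.pop()        # remove and return last item  -> last is 5

ones = numbers.count(1)               # how many times 1 occurs: 2
first_one = numbers.index(1)          # index of the first 1: 1
later_one = numbers.index(1, 2, 5)    # first 1 between index 2 and 4: 4

numbers.sort()              # sort in place                -> [1, 1, 2, 3, 4]
sorted_copy = list(numbers)
numbers.clear()             # remove everything            -> []

print(last, ones, first_one, later_one, sorted_copy, numbers)
```

Note that `index` raises a ValueError if the element is not found in the searched range.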

Creating list programmatically


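number = [21,2,3,4,5,26,7,8,9,10]
# print(list(range(1,100)))
a, b, *r = number   # a and b take the first two items; *r collects the rest into a list
print("A: ",a,"::B ",b,"::R ",r)
Output:
A:  21 ::B  2 ::R  [3, 4, 5, 26, 7, 8, 9, 10]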

Dictionary
Keys must be immutable (hashable) objects such as strings, numbers, or tuples.
Methods: popitem, get, items
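
A short sketch of the dictionary methods listed above. The `student` dictionary and its values are made up for illustration.

```python
student = {"name": "Ali", "age": 21}

age = student.get("age")                 # 21
city = student.get("city", "unknown")    # key is missing, so the default is returned
pairs = list(student.items())            # [('name', 'Ali'), ('age', 21)]
removed = student.popitem()              # removes and returns the last pair: ('age', 21)

# A list is mutable, so it cannot be used as a key.
list_key_allowed = True
try:
    bad = {[1, 2]: "list as key"}
except TypeError:
    list_key_allowed = False

print(age, city, pairs, removed, student, list_key_allowed)
```

Using `get` instead of `student["city"]` avoids a KeyError when the key may be absent.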
Tuple Data Types
A tuple is an ordered sequence like a list, but it is immutable and is written in round brackets.
Its items cannot be reassigned after the tuple is created.
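
The following sketch shows a tuple being read and unpacked, and what happens when code tries to reassign one of its items. The variable names are illustrative.

```python
point = (3, 4)

x, y = point          # unpacking works just like with a list
first = point[0]      # indexing also works: 3

# Reassigning an item is not allowed and raises a TypeError.
try:
    point[0] = 10
except TypeError as error:
    error_message = str(error)

print(x, y, first, error_message)
```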

Sets Data Types


Union, Intersection, difference
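
The three set operations can be written with operators or with the equivalent methods. A small sketch with made-up values:

```python
a = {1, 2, 3, 4}
b = {3, 4, 5}

union = a | b           # everything in either set: {1, 2, 3, 4, 5}
intersection = a & b    # only what both share: {3, 4}
difference = a - b      # in a but not in b: {1, 2}

# The method forms give the same results.
same_union = a.union(b)

# Sets drop duplicates automatically.
unique = set([1, 1, 2, 2, 3])

print(union, intersection, difference, unique)
```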

Conditional Statements
if-else, if-elif-else, and the logical operators and, or

Logical Conditions
Equal, Not Equal, Greater than, Less than, Greater than or equal to, less than or equal to
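
The sketch below combines if-elif-else with the comparison and logical operators listed above. The function name `classify_score` and the grade boundaries are hypothetical.

```python
def classify_score(score):
    """Turn a numeric score into a label using if / elif / else."""
    if score < 0 or score > 100:      # 'or': true if either side is true
        return "invalid"
    elif score >= 80:
        return "distinction"
    elif score >= 50:
        return "pass"
    else:
        return "fail"

age = 16
is_teen = age >= 13 and age <= 19     # 'and': true only if both sides are true

print(classify_score(85), classify_score(50), classify_score(20),
      classify_score(120), is_teen)
```

Conditions are checked from top to bottom, and only the first branch whose condition is true is executed.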

Identity Operator
print([1,2,3] == [1,2,3])  -> True: == compares values

print([1,2,3] is [1,2,3])  -> False: is compares identity (memory location), and these are two separate list objects

For loop and Iterables


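Mylist = ['Apples','Oranges','Grapes']
print(Mylist)
print('x'*20)
for newlist in Mylist:
    print(newlist,"Price TAG")
Output:
['Apples', 'Oranges', 'Grapes']
xxxxxxxxxxxxxxxxxxxx
Apples Price TAG
Oranges Price TAG
Grapes Price TAG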

Nested For loop
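
A nested for loop places one loop inside another, so the inner loop runs in full for every step of the outer loop. As a minimal sketch, the code below walks through a list of lists like the one in the Matrix section; the variable names are illustrative.

```python
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

flat = []
for row in matrix:           # outer loop: one row at a time
    for value in row:        # inner loop: every value inside that row
        flat.append(value)

print(flat)  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```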


Range Function
range(10) is equivalent to range(0, 10): it yields the integers 0 to 9.

Even -> print(list(range(0,10,2)))   # [0, 2, 4, 6, 8]

Odd -> print(list(range(1,10,2)))   # [1, 3, 5, 7, 9]

While Loop
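A while loop repeats its body as long as its condition is true. If the condition never becomes false, the loop runs forever (an infinite loop).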
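
A minimal sketch of a while loop that is guaranteed to stop, because the body changes the variable used in the condition. The names are illustrative.

```python
countdown = 3
ticks = []

while countdown > 0:         # checked before every pass through the body
    ticks.append(countdown)
    countdown -= 1           # without this line the loop would never end

print(ticks)  # [3, 2, 1]
```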

Continue, Break, Pass keywords


function/ condition / loop:
pass

As the name suggests pass statement simply does nothing. The pass statement in Python is used when a
statement is required syntactically but you do not want any command or code to execute. It is like a null
operation, as nothing will happen if it is executed. Pass statements can also be used for writing empty
loops. Pass is also used for empty control statements, functions, and classes.
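
The three keywords behave quite differently inside a loop. In this sketch (names are illustrative), `continue` skips the rest of the current pass, `break` leaves the loop entirely, and `pass` fills a body that is intentionally empty.

```python
collected = []
for n in range(10):
    if n % 2 == 0:
        continue             # skip even numbers, go straight to the next n
    if n > 7:
        break                # stop the whole loop once n exceeds 7
    collected.append(n)

def not_implemented_yet():
    pass                     # placeholder: syntactically required, does nothing

print(collected)              # [1, 3, 5, 7]
print(not_implemented_yet())  # None
```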

Functions
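DRY (Don't Repeat Yourself)
Why use functions:

#1 Wrap and encapsulate a piece of logic

#2 Compartmentalize the program into separate parts

#3 Follow the DRY rule

Parameter vs Argument
A parameter is the name listed in the function definition; an argument is the actual value passed when the function is called.
Calling, invoking, and executing a function all mean the same thing.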
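
A small sketch of the parameter/argument distinction. The function `greet` and its default value are made up for illustration.

```python
def greet(name, greeting="Hello"):     # name and greeting are parameters
    return f"{greeting}, {name}!"

default_call = greet("Sara")                       # "Sara" is a positional argument
keyword_call = greet("Sara", greeting="Welcome")   # greeting is passed as a keyword argument

print(default_call)   # Hello, Sara!
print(keyword_call)   # Welcome, Sara!
```

Writing the greeting logic once and calling it twice is also a small instance of the DRY rule.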

Doc String
A Python docstring is a string used to document a Python module, class, function or method, so
programmers can understand what it does without having to read the details of the implementation.
Also, it is a common practice to generate online (html) documentation automatically from docstrings.

def function(a: int, b: str, c = True) -> bool:
    """_summary_

    Args:
        a (int): _description_
        b (str): _description_
        c (bool, optional): _description_. Defaults to True.

    Returns:
        bool: _description_
    """
    if a == c:
        return True
    else:
        return False
Good Programming Practices
Simplifying the code is good programming practice. Reducing the number of lines is recommended,
as long as the code stays readable.

Args and kwargs


Argument vs keyword argument.

*args and **kwargs allow functions to accept a variable number of arguments:

• *args (arguments) allows you to pass a variable number of positional arguments to a function.

• **kwargs (keyword arguments) allows you to pass a variable number of keyword arguments
(key-value pairs) to a function.

def adder(*args):
    return sum(args)

print(adder(1, 2, 3, 4))   # 10

Keyword arguments: accepting a variable number of named arguments as well

def adder(*args, **kwargs):
    # args is a tuple of positional values; kwargs is a dict of named values
    return sum(args) + sum(kwargs.values())

print(adder(1, 2, 3, 4, bonus=5))   # 15

Exercise
Scope of Function
A variable created inside a function belongs to the local scope of that function, and can only be used
inside that function.

Scope Rules:

The scope of an entity (i.e., a variable, parameter, or function) is the part of the program in
which that entity is visible and can be accessed. It is determined by where the entity is
declared.

The names or objects which are accessible are called in-scope. The names or objects which are not
accessible are called out-of-scope. The Python scope concept follows the LEGB (Local, Enclosing,
Global, and Built-in) rule. Scoping helps to avoid name collisions and limits the use of global
names across a program.

LEGB:

1. Python starts with the local scope.

2. Then it looks in the enclosing (parent) function's local scope.

3. Then it looks in the global scope.

4. Finally it looks in the built-in names.
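
The lookup order can be observed by giving the same name a different value at each level. This is a minimal sketch; all function and variable names are illustrative.

```python
label = "global"

def outer():
    label = "enclosing"

    def inner_with_local():
        label = "local"
        return label          # found in the Local scope

    def inner_without_local():
        return label          # not local, so found in the Enclosing scope

    return inner_with_local(), inner_without_local()

def uses_global():
    return label              # not local or enclosing, so found in the Global scope

def uses_builtin():
    return len("abc")         # len is defined nowhere in this file: Built-in scope

print(outer())          # ('local', 'enclosing')
print(uses_global())    # global
print(uses_builtin())   # 3
```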

Global and nonlocal keywords


The nonlocal keyword makes a name refer to a variable in the parent (enclosing) function's local scope.

The global keyword makes a name refer to a variable in the global scope.

In Python, the `global` and `nonlocal` keywords are used to modify the behavior of variable scope within
functions. The `global` keyword allows you to access and modify a variable defined at the global
(module) scope from within a function, ensuring that changes to the variable inside the function affect
the global variable. On the other hand, the `nonlocal` keyword is used to modify variables in the nearest
enclosing scope that is not global, typically within nested functions. This allows you to work with
variables in an outer function from within an inner function, enabling changes to persist outside the
inner function but still within the outer function’s scope. Both keywords help manage variable scope in
complex functions and nested structures.
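
Both keywords are shown in the sketch below. The names `counter`, `increment_global`, and `make_counter` are illustrative; without the `global` or `nonlocal` line, each `+= 1` would raise an UnboundLocalError because Python would treat the name as a new local variable.

```python
counter = 0

def increment_global():
    global counter            # refer to the module-level variable
    counter += 1

def make_counter():
    count = 0                 # lives in the enclosing function's scope

    def step():
        nonlocal count        # refer to count in make_counter, not a new local
        count += 1
        return count

    return step

increment_global()
increment_global()

step = make_counter()
first = step()
second = step()

print(counter, first, second)  # 2 1 2
```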

Quality of Code:

1. Always return the expected output

2. Avoid side effects
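
A side effect is a change a function makes outside of its own return value, such as modifying a list that was passed in. The sketch below contrasts the two styles; both function names are hypothetical.

```python
def append_total_in_place(values):
    values.append(sum(values))    # side effect: the caller's list is changed
    return values

def with_total(values):
    return values + [sum(values)] # no side effect: a new list is returned

data = [1, 2, 3]

result = with_total(data)
data_after_pure_call = list(data)       # still [1, 2, 3]

mutated = append_total_in_place(data)   # data itself is now [1, 2, 3, 6]

print(result, data_after_pure_call, data)
```

Functions without side effects are easier to test and reuse, because calling them twice with the same input always gives the same result.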

Special Functions

#map

#filter

#zip

#reduce
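
Each of the four functions can be tried on a short list. In this sketch the data is made up; `map`, `filter`, and `zip` are built in, while `reduce` has to be imported from `functools`.

```python
from functools import reduce

numbers = [1, 2, 3, 4]

squares = list(map(lambda n: n * n, numbers))          # apply a function to every item
evens = list(filter(lambda n: n % 2 == 0, numbers))    # keep items that pass a test
pairs = list(zip(numbers, "abcd"))                     # pair up items from two iterables
total = reduce(lambda acc, n: acc + n, numbers, 0)     # fold all items into one value

print(squares)  # [1, 4, 9, 16]
print(evens)    # [2, 4]
print(pairs)    # [(1, 'a'), (2, 'b'), (3, 'c'), (4, 'd')]
print(total)    # 10
```

`map`, `filter`, and `zip` return lazy iterators, which is why they are wrapped in `list()` before printing.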

List Comprehension
A list comprehension consists of an expression, a loop, and an optional condition.
Sets and Dictionary Comprehension
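
The same expression, loop, and condition pattern works for lists, sets, and dictionaries; only the brackets change. A minimal sketch with illustrative names:

```python
numbers = range(1, 11)

# List comprehension: [expression for item in iterable if condition]
even_squares = [n * n for n in numbers if n % 2 == 0]

# Set comprehension: curly braces, duplicates are dropped
remainders = {n % 3 for n in numbers}

# Dictionary comprehension: curly braces with key: value
square_by_number = {n: n * n for n in range(1, 4)}

print(even_squares)       # [4, 16, 36, 64, 100]
print(remainders)         # {0, 1, 2}
print(square_by_number)   # {1: 1, 2: 4, 3: 9}
```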

Python Modules
Python packages
Pandas
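Common operations: adding a column, removing a column, changing data, replacing NaN values
(fillna()), and sampling or shuffling rows with the frac parameter (for example frac=1, 0, or
0.5; in pandas, frac is a parameter of sample()). reset_index(drop=True, inplace=True) resets the
index without keeping the old one as an extra column. Alternatively, an extra 'index' column can
be removed with variable.drop('index', axis=1, inplace=True). Here axis=1 refers to columns and
axis=0 refers to rows, and inplace=True applies the change to the original DataFrame instead of
returning a copy.

File.apply(lambda x: x/1000) applies a function to every value; in this example, dividing by 1000 converts KB to MB.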

Resources:
https://ml-cheatsheet.readthedocs.io/en/latest/

https://ml-playground.com/

Online Interpreter: https://glot.io/

https://replit.com/
