Machine learning assignment (3)

The document is an individual assignment by Solomon Abrha from MIT, discussing key concepts in machine learning, including input and output variables, the difference between models and algorithms, and the distinctions between parametric and non-parametric algorithms. It also covers overfitting and underfitting, the process of using data in ML, and the importance of feature engineering. The assignment emphasizes the significance of data preparation and feature selection in building effective machine learning models.

Uploaded by

selemunabrha276


MU -- MIT

MACHINE LEARNING - INDIVIDUAL ASSIGNMENT

NAME SOLOMON ABRHA
ID - MIT/UR/122/12

1. What is meant by input and output variables in ML? Which is the dependent and which is the
independent variable?

Input Variables

• Also known as features, predictors, or independent variables.
• These are the variables or attributes used to predict the output variable.
• They are the input to the model and represent the information or data you provide to the algorithm for analysis or prediction.
• Example: In a house price prediction problem, the input variables could be:
o Number of bedrooms
o Square footage
o Location of the house

Output Variable

• Also known as the target variable, label, or dependent variable.
• This is the variable the model is trying to predict or explain.
• It is called dependent because its value is modeled as a function of the input variables.
• Example: In the same house price prediction problem, the output variable is:
o Price of the house

• The input variables are the independent variables: they are taken as given, and the model does not try to predict them from the other variables in the dataset.
• The output variable is the dependent variable: it depends on the independent variables. In ML, the goal is to find a relationship or function that maps the independent variables to the dependent variable.
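
As a minimal sketch of the house price example above, the records below are separated into input variables (features) and the output variable (target). The field names and all numbers are invented for illustration and are not real data.

```python
# Hypothetical house records; every value here is invented for illustration.
houses = [
    {"bedrooms": 2, "square_feet": 850, "location": "suburb", "price": 120_000},
    {"bedrooms": 3, "square_feet": 1200, "location": "city", "price": 210_000},
    {"bedrooms": 4, "square_feet": 1600, "location": "city", "price": 275_000},
]

INPUT_NAMES = ["bedrooms", "square_feet", "location"]  # independent variables
OUTPUT_NAME = "price"                                   # dependent variable

# X holds one row of input values per house; y holds the value to predict.
X = [[house[name] for name in INPUT_NAMES] for house in houses]
y = [house[OUTPUT_NAME] for house in houses]

print(X[0])  # inputs of the first house
print(y[0])  # price of the first house
```

A learning algorithm would then look for a function that maps each row of X to the matching entry of y.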

2. Write the difference between a model and an algorithm.

Model

• A model is a mathematical representation of a real-world process. It is the result of applying an algorithm to data and is used to make predictions or decisions on new data.
• A model can make predictions on unseen data based on the patterns it has learned.
• A model is the final output of an ML algorithm after it has been trained on data. It represents the learned patterns, relationships, or rules that the algorithm discovered in the training data.
• Examples include linear regression models, decision trees, neural networks, etc.

Algorithm

• An algorithm is a set of rules or instructions for solving a problem or performing a task. In the context of machine learning, it refers to the procedure used to learn the model from data.
• An algorithm is a set of mathematical instructions or a procedure that is used to train a model by finding patterns in data.
• It is used to process data, optimize the model parameters, and derive the model.
• Examples include gradient descent and the training procedures for random forests and support vector machines.
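
The distinction can be illustrated with a small sketch: gradient descent is the algorithm, and the pair of numbers it returns is the model. The function names, learning rate, step count, and toy data are assumptions chosen for this illustration.

```python
def gradient_descent(xs, ys, learning_rate=0.05, steps=2000):
    """Algorithm: repeatedly adjust w and b to reduce the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b  # the learned parameters are the model


def predict(model, x):
    """Model: apply the learned parameters to a new input."""
    w, b = model
    return w * x + b


# Toy data generated from y = 2x + 1 (invented for illustration).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

model = gradient_descent(xs, ys)
print(predict(model, 5.0))  # close to 11
```

Once training is finished, only the model (w and b) is needed to make predictions; the algorithm is no longer involved.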

3. Write the difference between parametric and non-parametric ML algorithms.

Parametric ML Algorithms

• A parametric algorithm has a fixed number of parameters, independent of the amount of training data.
• Parametric methods make strong assumptions about the mapping of the input variables to the output variable.
• Parametric machine learning algorithms simplify the mapping to a known functional form.
• Model Structure: Defined by a fixed number of parameters (e.g., coefficients in linear regression).
• Training Speed: Generally faster to train because they require estimating only a limited number of parameters.

Non-Parametric ML Algorithms

• These algorithms do not make strong assumptions about the data and have no fixed number of parameters. The complexity of the model can grow with the amount of data.
• Non-parametric algorithms use a flexible number of parameters, and the number of parameters often grows as the algorithm learns from more data.
• Model Structure: The model complexity can grow with the amount of data (e.g., the number of training samples).
• Training Speed: Often slower to train or to predict with on large datasets, since they may involve storing the entire dataset or a large portion of it.
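
A rough sketch of the contrast, using a least-squares line as the parametric example and k-nearest neighbours as the non-parametric one. Both choices, the function names, and the toy data are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Parametric: least-squares line; always exactly two parameters."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


def fit_knn(xs, ys):
    """Non-parametric: k-nearest neighbours simply stores the training data."""
    return list(zip(xs, ys))


def predict_knn(stored, x, k=2):
    """Average the targets of the k stored points closest to x."""
    nearest = sorted(stored, key=lambda pair: abs(pair[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


small_x, small_y = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
large_x = [float(i) for i in range(1, 11)]
large_y = [2.0 * x for x in large_x]

# The line keeps 2 parameters regardless of how much data it sees.
print(len(fit_line(small_x, small_y)), len(fit_line(large_x, large_y)))
# The k-NN "model" grows with the number of training samples.
print(len(fit_knn(small_x, small_y)), len(fit_knn(large_x, large_y)))
```

Note that fitting k-NN is trivial here; the cost shows up at prediction time, when every stored point has to be compared with the query.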

4. Write the difference between overfitting and underfitting. Explain the causes of
overfitting and underfitting.

Overfitting

• Overfitting refers to a model that learns the training data too well and fails to generalize to new data.
• Symptom: High accuracy on the training dataset but poor performance on the validation/test dataset.
• Cause: The model is overly complex, often having too many parameters relative to the amount of training data.
• It reflects a situation where the model memorizes the training data instead of learning general patterns.
• Overfitting occurs when a model learns the details and noise in the training data to the extent that this harms its performance on new data. In essence, the model becomes too complex and captures patterns that do not generalize.

Underfitting

• Underfitting refers to a model that can neither fit the training data well nor generalize to new data. It fails to learn the problem sufficiently from the training data.
• Cause, Too Simple Models: Using overly simplistic models (e.g., a linear model for a non-linear relationship) that cannot capture the data's complexity.
• Cause, Insufficient Training: Not training the model long enough for it to learn from the training data effectively.
• Cause, Excessive Feature Reduction: Removing too many features can lead to the loss of information necessary for making accurate predictions.
• Underfitting occurs when a model is too simple to capture the underlying structure of the data. It fails to learn the relationships in the data, leading to poor performance on both the training and test datasets.
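
The two failure modes can be sketched on a tiny hand-made dataset. The three predictors below (a constant, a 1-nearest-neighbour lookup, and a least-squares line) and all data values are assumptions picked only to make the contrast visible.

```python
def mse(predict, xs, ys):
    """Mean squared error of a prediction function on a dataset."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Invented data: roughly y = 2x, with noise added by hand.
train_x = [0.0, 2.0, 4.0, 6.0, 8.0]
train_y = [1.0, 3.0, 9.0, 11.0, 17.0]
test_x = [1.5, 3.5, 5.5, 7.5]
test_y = [3.5, 6.5, 11.5, 14.5]

mean_x = sum(train_x) / len(train_x)
mean_y = sum(train_y) / len(train_y)


def constant_model(x):
    """Underfitting: ignores the input and always predicts the mean target."""
    return mean_y


def memorizer(x):
    """Overfitting: returns the target of the closest training point, noise included."""
    _, nearest_y = min(zip(train_x, train_y), key=lambda p: abs(p[0] - x))
    return nearest_y


slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(train_x, train_y)) / sum(
    (x - mean_x) ** 2 for x in train_x
)
intercept = mean_y - slope * mean_x


def line_model(x):
    """A reasonable fit: least-squares straight line."""
    return slope * x + intercept


for name, model in [("constant", constant_model), ("memorizer", memorizer), ("line", line_model)]:
    print(name, round(mse(model, train_x, train_y), 2), round(mse(model, test_x, test_y), 2))
```

On this data the memorizer has zero training error but a larger test error than the line, while the constant predictor does poorly on both sets.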

5. Explain the way to use data in ML. Describe an attribute or feature.

• Using data effectively in machine learning (ML) is crucial for building models that generalize well to unseen data. The process involves the following steps:

1. Problem Understanding and Data Collection:

• Gather data from various sources, which can include databases, external APIs, web scraping, or existing datasets.
• Ensure that the data is relevant to the problem you're trying to solve.

2. Data Understanding:

• Explore and analyze the data to understand its structure and characteristics.

3. Data Cleaning:

• Handle missing values, duplicates, and outliers.
• Correct inconsistencies and format the data properly to ensure quality inputs for the model.
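
A minimal sketch of the cleaning step, assuming a small list of records with a hypothetical square_feet field. The plausibility range used for outliers and the choice of the median for filling gaps are illustrative assumptions, not fixed rules.

```python
from statistics import median

# Hypothetical raw records; None marks a missing value. All numbers are invented.
raw = [
    {"id": 1, "square_feet": 850},
    {"id": 2, "square_feet": None},
    {"id": 2, "square_feet": None},   # exact duplicate of the previous row
    {"id": 3, "square_feet": 1200},
    {"id": 4, "square_feet": 1000},
    {"id": 5, "square_feet": 95000},  # implausible value, likely a typo
]

# 1. Remove exact duplicates while keeping the original order.
seen, rows = set(), []
for row in raw:
    key = tuple(sorted(row.items()))
    if key not in seen:
        seen.add(key)
        rows.append(dict(row))

# 2. Drop outliers using a simple plausibility range (an assumed rule).
MIN_SQFT, MAX_SQFT = 100, 20_000
rows = [
    r for r in rows
    if r["square_feet"] is None or MIN_SQFT <= r["square_feet"] <= MAX_SQFT
]

# 3. Fill missing values with the median of the observed values.
observed = [r["square_feet"] for r in rows if r["square_feet"] is not None]
fill_value = median(observed)
for r in rows:
    if r["square_feet"] is None:
        r["square_feet"] = fill_value

print(rows)
```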

4. Feature Selection and Engineering:

• Select relevant features that contribute to the prediction of the target variable.
• Create new features from existing ones (feature engineering) to improve model performance. This might involve combining, transforming, or encoding variables.

5. Data Splitting:

• Divide the dataset into training, validation, and test sets. A common split is 70% for training, 15% for validation, and 15% for testing.
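
The 70/15/15 split mentioned above could be sketched as follows. The function name, the fixed seed, and the use of plain integers as stand-ins for data rows are assumptions for illustration.

```python
import random


def split_dataset(rows, train_frac=0.70, val_frac=0.15, seed=0):
    """Shuffle a copy of rows and cut it into train, validation, and test lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains
    return train, val, test


rows = list(range(100))  # stand-in for 100 data rows
train, val, test = split_dataset(rows)
print(len(train), len(val), len(test))
```

Shuffling before cutting avoids a split that follows the order in which the data was collected; for time-ordered data a chronological split is usually preferred instead.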

6. Model Training:

• Choose an appropriate algorithm and train the model using the training data.
• Adjust the model's parameters to minimize prediction errors.

7. Model Evaluation:

• Evaluate the trained model on data it has not seen during training: the validation set while tuning, and the test set for the final check.
• Use performance metrics (for classification, e.g., accuracy, precision, recall, F1-score) to evaluate how well the model predicts on unseen data.
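
For a binary classifier, the metrics named above can be computed from the true and predicted labels as in this sketch. The function name and the example labels are assumptions for illustration.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for a binary classification task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Invented labels: 3 true positives, 1 false negative, 1 false positive, 3 true negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Precision answers "of the items predicted positive, how many were right", while recall answers "of the items that are positive, how many were found".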

8. Model Tuning:

• Fine-tune the model's hyperparameters, structure, or features based on the evaluation results.
• This may involve techniques such as cross-validation.
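
One common form of cross-validation is k-fold: the data is cut into k parts, and each part serves once as the validation set while the rest is used for training. The sketch below only generates the index sets; the function name is an assumption, and it does not shuffle the data first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, val
        start += size


for train_idx, val_idx in k_fold_indices(10, 5):
    print("validate on", val_idx, "train on", train_idx)
```

For each candidate hyperparameter setting, the model would be trained and scored once per fold, and the average score used to compare settings.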

9. Deployment:

• Once the model is trained and evaluated, it can be deployed to make predictions on new data in a production environment.

10. Monitoring and Maintenance:

• Continuously monitor the model's performance over time.
• Update the model and data as necessary to ensure it remains relevant and accurate.

Attributes or features are individual measurable properties or characteristics of the data used in a machine learning model. They are the input variables that the model uses to make predictions.

Types of Features:
• Numerical Features: Numeric values (e.g., age, temperature, salary) that can be further categorized into:
o Continuous: Values can take on any real number in a range (e.g., height, price).
o Discrete: Countable values (e.g., number of children, number of cars).
• Categorical Features: Represent discrete categories or groups (e.g., gender, color, city). They can be further divided into:
o Nominal: No inherent order (e.g., red, blue, green).
o Ordinal: There is an order or ranking (e.g., ratings from 1 to 5).
• Binary Features: A specific type of categorical feature that has only two values (e.g., yes/no, true/false).
• Other types also exist (e.g., text or date/time features).
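
Most algorithms expect numbers, so each feature type is usually turned into numeric columns in its own way. The sketch below shows one possible encoding per type; the record, the category lists, and the rating labels are assumptions for illustration.

```python
def one_hot(value, categories):
    """Nominal feature: one 0/1 column per category, with no implied order."""
    return [1 if value == c else 0 for c in categories]


COLORS = ["red", "blue", "green"]                  # nominal categories
RATING_ORDER = {"poor": 1, "fair": 2, "good": 3}   # ordinal labels (assumed)

# A hypothetical record with one feature of each type.
record = {"age": 34, "children": 2, "color": "blue", "rating": "good", "owner": "yes"}

features = (
    [float(record["age"])]                      # continuous numerical: used as-is
    + [record["children"]]                      # discrete numerical: used as-is
    + one_hot(record["color"], COLORS)          # nominal: one-hot columns
    + [RATING_ORDER[record["rating"]]]          # ordinal: mapped to its rank
    + [1 if record["owner"] == "yes" else 0]    # binary: mapped to 0/1
)
print(features)
```

One-hot encoding is used for the nominal feature because mapping red, blue, and green to 1, 2, and 3 would suggest an order that does not exist.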

6. What is feature engineering?

Traditional ML algorithms require carefully handcrafted features; producing them is called feature engineering. It relies on feature extraction steps that are external to the learning algorithm, and the extracted features depend on the extraction method used.

Feature engineering is a crucial step in the machine learning (ML) process that involves creating, selecting, and transforming features (attributes) from raw data to improve the performance of machine learning models. The goal of feature engineering is to provide the models with the most informative and relevant data, enabling them to make better predictions or classifications.

Feature engineering is an iterative and creative process that requires domain knowledge, analytical skills, and a deep understanding of the data. It plays an essential role in building effective machine learning models and is often what distinguishes successful models from those that fail to perform well.
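
As a small sketch of creating features from raw data, the function below derives three new features for a house record. The year_built field, the derived feature names, and the values are hypothetical and chosen only for illustration.

```python
def engineer_features(house, current_year):
    """Return new features derived from a raw house record."""
    return {
        # Transforming: a build year becomes an age.
        "house_age": current_year - house["year_built"],
        # Combining: two raw columns become one ratio.
        "sqft_per_bedroom": house["square_feet"] / house["bedrooms"],
        # Encoding: a text category becomes a 0/1 flag.
        "is_city": 1 if house["location"] == "city" else 0,
    }


# Hypothetical raw record; all values are invented.
house = {"bedrooms": 3, "square_feet": 1200, "location": "city", "year_built": 2005}
engineered = engineer_features(house, current_year=2025)
print(engineered)
```

Whether such derived features help depends on the problem, which is why the process is iterative: candidate features are added, the model is re-evaluated, and unhelpful ones are dropped.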

TO INSTRUCTOR SIMON H.
DUE DATE DECEMBER 20
