Machine Learning Lab Manual
Dr. Rashmi M
Faculty in Computer Science,
GFGC, T.Dasarahalli, Bengaluru.
NEP- BCA Sixth Semester LAB- (CA-C29L)
(NEP Scheme) 2023-2024
Lab Program 1: Install and set up Python and essential libraries like NumPy and pandas.
Installation of Python
Type “Python download” in the Google search bar and press the Enter key. In the list of links
shown, select the very first link or click on the official website link:
https://www.python.org/downloads/
Choose the correct link for your device from the options provided: either Windows installer
(64-bit) or Windows installer (32-bit) and proceed to download the executable file.
Once you have downloaded the installer, open the .exe file, such as python-3.11.5-amd64.exe,
by double-clicking it to launch the Python installer. Check the option to install the launcher
for all users, so that all users of the computer can access the Python launcher application.
Check the Add python.exe to PATH checkbox so that Python can be run from the command line.
After you click the Install Now button, the setup will start installing Python on your
Windows system.
After successful installation of Python, close the installation window. You can check whether
the installation was successful by using either the command line or the Integrated
Development and Learning Environment (IDLE), which you may have installed. To access the
command line, click on the Start menu and type “cmd” in the search bar.
Then click on Command Prompt and type the command “python -V” or “python --version”. You
can see the installed version of Python on your system.
Go to the Python Integrated Development and Learning Environment (IDLE). In the Windows
search bar, type IDLE and you can see “IDLE (Python 3.11 64-bit)”. Open IDLE; the IDLE screen
itself shows the version. This confirms the successful installation of Python.
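The same check can also be made from inside Python. The following is a minimal sketch that uses only the standard library; the function name python_version_string is a hypothetical helper introduced here for illustration.

```python
import platform
import sys


def python_version_string():
    """Return the version of the running interpreter, for example '3.11.5'."""
    return platform.python_version()


# Print the version and the location of the interpreter that is running this script.
print("Python version:", python_version_string())
print("Interpreter path:", sys.executable)
```

If the printed version matches the installer that was downloaded, the PATH setting is pointing at the expected installation.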
NumPy is a Python package used for performing various numerical computations and for
processing multidimensional and single-dimensional array elements. Calculations using
NumPy arrays are faster than those using normal Python lists. It is also capable of
handling a vast amount of data and is convenient for matrix multiplication and data reshaping.
The steps to install NumPy are:
Step 4: After successful installation, type the following two commands at the command
prompt. If the Python prompt “>>>” reappears without an error message, the package is successfully installed.
1. C:\Users\DELL>python
2. >>>import numpy
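A common way to install a package such as NumPy, assuming pip was installed together with Python, is to run python -m pip install numpy at the command prompt. The minimal sketch below builds the same command from within Python. The helper names pip_install_command and install are hypothetical and are introduced only for illustration.

```python
import subprocess
import sys


def pip_install_command(package):
    """Build the command that installs `package` using the current interpreter's pip."""
    return [sys.executable, "-m", "pip", "install", package]


def install(package):
    """Run the install command; raises CalledProcessError if pip reports a failure."""
    subprocess.run(pip_install_command(package), check=True)


# Show the command without running it. Call install("numpy") to actually install.
print(" ".join(pip_install_command("numpy")))
```

Using sys.executable ensures that the package is installed for the same Python interpreter that runs the script, which matters when more than one Python version is present.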
Pandas is a very popular library for working with data (its goal is to be the most powerful and
flexible open-source tool, and in our opinion, it has reached that goal). DataFrames are at the
center of pandas. A DataFrame is structured like a table or spreadsheet. The rows and the
columns both have indexes, and you can perform operations on rows or columns separately.
Pandas supports the five significant steps required for processing and analysis of data,
irrespective of the origin of the data: load, manipulate, prepare, model, and analyze.
The steps to install pandas are:
Step 4: After successful installation, type the following two commands at the command
prompt. If the Python prompt “>>>” reappears without an error message, the package is successfully installed.
1. C:\Users\DELL>python
2. >>>import pandas
Write a Python program to show the installed library versions to provide confirmation of
successful installation.
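import numpy
import pandas
print("numpy library version is: ")
print(numpy.__version__)  # please type two underscore symbols on each side
print("numpy library is successfully installed")
print(" ")
# the same check for pandas, which is imported above
print("pandas library version is: ")
print(pandas.__version__)
print("pandas library is successfully installed")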
Program Output:
Scikit-learn (sklearn) is one of the most widely used and robust libraries for data analysis
and statistical modeling in Python. It provides a wide range of efficient tools for
classification, regression, clustering and dimensionality reduction via a consistent
interface in Python. This library, which is largely written in Python, is built upon, and
therefore requires, NumPy and SciPy; it is commonly used together with Pandas and Matplotlib.
Before installing scikit-learn, ensure that you have NumPy and SciPy installed. Scikit-learn
version 1.1 requires Python 3.8 or newer; later releases may require a newer version of Python.
Features of Scikit-learn
Simple and efficient tools for data mining and data analysis. It features various
classification, regression, and clustering algorithms including support vector
machines, random forests, gradient boosting, k-means, etc.
Accessible to everybody and reusable in various contexts.
Built on top of NumPy and SciPy, and commonly used with Pandas and matplotlib (or seaborn).
Hence they are the essential libraries and tools required for working with the Scikit-learn library.
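The consistent interface mentioned above can be illustrated with a minimal sketch: two different algorithms are trained and scored through the same fit and score calls. The choice of the built-in Iris dataset and of these two classifiers is an assumption made for illustration, not part of the lab program.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

# Built-in sample data: 150 flowers, 4 features each, 3 classes.
X, y = load_iris(return_X_y=True)

# Two different algorithms used through the same fit/score calls.
models = [LogisticRegression(max_iter=1000), DecisionTreeClassifier(random_state=0)]

scores = {}
for model in models:
    model.fit(X, y)                       # train on the data
    name = type(model).__name__
    scores[name] = model.score(X, y)      # accuracy on the training data
    print(name, "training accuracy:", round(scores[name], 3))
```

Because every estimator follows the same pattern, swapping one algorithm for another usually needs only a change to the line that creates the model.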
NumPy Library:
NumPy is a Python package used for performing various numerical computations and for
processing multidimensional and single-dimensional array elements. Calculations using
NumPy arrays are faster than those using normal Python lists. It is also capable of
handling a vast amount of data and is convenient for matrix multiplication, linear algebra
and data reshaping. NumPy also provides functions for generating random numbers.
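The operations named above (reshaping, matrix multiplication and random number generation) can be sketched as follows. The array values and the seed are arbitrary choices made for illustration.

```python
import numpy as np

a = np.arange(6)             # 1-D array holding 0, 1, 2, 3, 4, 5
m = a.reshape(2, 3)          # reshape into 2 rows and 3 columns
product = m @ m.T            # matrix multiplication; the result is 2 x 2

rng = np.random.default_rng(seed=0)   # seeded generator, so runs are repeatable
samples = rng.random(3)               # three random floats in the range [0, 1)

print(m)
print(product)
print(samples.shape)
```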
Pandas Library:
Pandas is a very popular library for data manipulation and analysis.
DataFrames and Series are at the center of pandas. A Series is a 1D labelled array data
structure, and a DataFrame is a 2D labelled data structure, like a table or spreadsheet. The
rows and the columns both have indexes, and you can perform operations on rows or columns
separately. Pandas supports the five significant steps required for processing and analysis
of data, irrespective of the origin of the data (CSV, Excel, etc.): load, manipulate, prepare,
model and analyze.
Pandas integrates well with data visualization libraries like Matplotlib and Seaborn, allowing
you to create informative visualizations of your data. These visualizations can be instrumental
in understanding data patterns, identifying trends, and uncovering potential issues.
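A minimal sketch of a Series and a DataFrame, with one operation along the columns and one along the rows, is shown below. The names and marks are made-up sample values used only for illustration.

```python
import pandas as pd

# A Series: one labelled column of values.
marks = pd.Series([72, 85, 90], index=["Asha", "Ravi", "Meena"], name="marks")

# A DataFrame: a labelled table with a row index and column names.
df = pd.DataFrame(
    {"maths": [72, 85, 90], "science": [65, 78, 88]},
    index=["Asha", "Ravi", "Meena"],
)

column_means = df.mean(axis=0)   # one mean per column
row_totals = df.sum(axis=1)      # one total per row

print(marks["Ravi"])
print(column_means)
print(row_totals)
```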
Matplotlib Library:
Matplotlib is a very popular Python library for data visualization and plotting. Like Pandas,
it is not directly related to Machine Learning. It particularly comes in handy when a
programmer wants to visualize the patterns in the data. It is a 2D plotting library used for
creating 2D graphs and plots. A module named pyplot makes plotting easy for programmers, as it
provides features to control line styles, font properties, axis formatting, etc. It provides
various kinds of graphs and plots for data visualization, viz., histograms, error charts, bar
charts, etc.
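The pyplot features mentioned above (line style, font properties and axis formatting) can be sketched as follows. The data points and the output file name line_plot.png are assumptions made for illustration; the non-interactive Agg backend is used so that the sketch also runs without a display.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")            # non-interactive backend; remove to open a window
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 6]

fig, ax = plt.subplots()
ax.plot(x, y, linestyle="--", marker="o", color="tab:blue", label="sample data")
ax.set_xlabel("x value", fontsize=12)     # font property of the axis label
ax.set_ylabel("y value", fontsize=12)
ax.set_xlim(0, 6)                         # axis formatting: visible range
ax.set_title("Line plot with pyplot")
ax.legend()

# Save the figure to a temporary folder (hypothetical file name).
output_path = os.path.join(tempfile.gettempdir(), "line_plot.png")
fig.savefig(output_path)
print("Plot saved to", output_path)
```

In an interactive session, plt.show() can be called instead of savefig to display the plot in a window.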
SciPy Library:
SciPy, an essential library for Scikit-learn, is built upon the NumPy library. It contains
different modules for scientific and technical computing such as optimization, linear algebra,
integration, interpolation and statistics. Its integration with NumPy arrays simplifies
scientific computations, enabling tasks like solving differential equations, signal
processing, and statistical optimization.
SciPy is also very useful for image manipulation. Remember, we also come across the SciPy
stack, which is different from the SciPy library (used to extend the capabilities of NumPy).
The SciPy library is one of the core packages that make up the SciPy stack.
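Two of the modules listed above, integration and optimization, can be tried with a minimal sketch. The functions being integrated and minimized are arbitrary examples chosen because their exact answers are known.

```python
import math

from scipy import integrate, optimize

# Numerical integration: area under sin(x) from 0 to pi (the exact answer is 2).
area, abs_error = integrate.quad(math.sin, 0, math.pi)

# Optimization: minimum of (x - 3)^2 + 1, which lies at x = 3.
result = optimize.minimize_scalar(lambda x: (x - 3) ** 2 + 1)

print("area:", round(area, 6))
print("minimum found at x =", round(result.x, 4))
```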
Lab Program 3: Install and set up scikit-learn and other necessary tools.
PIP is a package manager for Python, which means it allows you to install and manage
libraries and dependencies that are supplemental to the standard library. (A package contains
all the files you need for a module, and modules are Python code libraries that you can include
in your projects.)
PIP3 is the same package manager, invoked for a Python 3 installation. It is useful when both
Python 2 and Python 3 are present on the same system. Recent versions of Python 3.x include
pip, so the pip (or pip3) command can be used for installing Python libraries.
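The packages that pip has installed can also be queried from Python through the standard library module importlib.metadata (available in Python 3.8 and newer). The following is a minimal sketch; the helper name installed_version is hypothetical. Note that pip uses the distribution name scikit-learn, while the name used in an import statement is sklearn.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Distribution names as pip knows them.
for name in ["pip", "numpy", "scikit-learn"]:
    version = installed_version(name)
    print(name, "->", version if version else "not installed")
```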
Scikit-learn is one of the most widely used machine learning libraries. It provides modules
for data analysis and statistical modelling. It provides a wide range of efficient tools for
classification, regression, clustering and dimensionality reduction via a consistent
interface in Python. This library, which is largely written in Python, is built upon NumPy
and SciPy and is commonly used with the Pandas and Matplotlib libraries.
Step 4: After successful installation, type the following two commands at the command
prompt. If the Python prompt “>>>” reappears without an error message, the package is successfully installed.
1. C:\Users\DELL>python
2. >>>import numpy
Step 4: After successful installation, type the following two commands at the command
prompt. If the Python prompt “>>>” reappears without an error message, the package is successfully installed.
1. C:\Users\DELL>python
2. >>>import pandas
Step 4: After successful installation, type the following two commands at the command
prompt. If the Python prompt “>>>” reappears without an error message, the package is successfully installed.
1. C:\Users\DELL>python
2. >>>import scipy
Step 4: After successful installation, type the following two commands at the command
prompt. If the Python prompt “>>>” reappears without an error message, the package is successfully installed.
1. C:\Users\DELL>python
2. >>>import matplotlib
Step 4: After successful installation, type the following two commands at the command
prompt. If the Python prompt “>>>” reappears without an error message, the package is successfully installed.
1. C:\Users\DELL>python
2. >>>import sklearn
Write a Python program to show the versions of Scikit-learn and the essential installed
libraries to provide confirmation of successful installation.
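import numpy
import pandas
import scipy
import matplotlib
import sklearn
print("numpy version:", numpy.__version__)  # two underscores on each side
print("pandas version:", pandas.__version__)
print("scipy version:", scipy.__version__)
print("matplotlib version:", matplotlib.__version__)
print("scikit-learn version:", sklearn.__version__)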
Program Output:
Lab Program 4: Write a program to load and explore datasets from CSV and Excel files
using pandas.
Note: Before executing this program, please check that the pandas and openpyxl libraries are
installed. The “openpyxl” library is used to load the Excel files.
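A minimal sketch of such a program is given below. It is not the original program listing: the sample table, the column names and the file names students.csv and students.xlsx are assumptions made so that the sketch runs without any external dataset. With a real dataset, only the read_csv and read_excel calls and the exploration steps are needed.

```python
import os
import tempfile

import pandas as pd

# Build a small sample table so that the sketch runs without external files.
sample = pd.DataFrame({
    "name": ["Asha", "Ravi", "Meena", "John"],
    "age": [21, 22, 20, 23],
    "marks": [72.5, 85.0, 90.0, None],    # one missing value on purpose
})

folder = tempfile.mkdtemp()
csv_path = os.path.join(folder, "students.csv")      # hypothetical file name
xlsx_path = os.path.join(folder, "students.xlsx")    # hypothetical file name

# CSV: write the sample table, then load it back.
sample.to_csv(csv_path, index=False)
csv_data = pd.read_csv(csv_path)

# Explore the loaded data.
print(csv_data.head())            # first rows
print(csv_data.shape)             # (number of rows, number of columns)
csv_data.info()                   # column types and non-null counts
print(csv_data.describe())        # summary statistics of the numeric columns
print(csv_data.isnull().sum())    # number of missing values per column

# Excel: needs the openpyxl package; skipped with a message if it is absent.
try:
    sample.to_excel(xlsx_path, index=False)
    excel_data = pd.read_excel(xlsx_path)
    print(excel_data.head())
except ImportError:
    excel_data = None
    print("openpyxl is not installed; skipping the Excel part")
```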
Output:
Lab Program 5: Write a program to visualize the dataset to gain insights using Matplotlib or
Seaborn by plotting scatter plots and bar charts.
Note: Before executing this program, please check that the pandas, matplotlib and seaborn
libraries are installed. The “seaborn” library is used for the bar chart and the
“matplotlib” library is used for the scatter plot; both are data visualizations. The “pandas”
library is also needed, since it supplies the data to the visualization libraries.
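A minimal sketch of such a program is given below. It is not the original program listing: the sample data, the column names and the output file name are assumptions made for illustration. The bar chart uses seaborn when it is installed and falls back to Matplotlib otherwise.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")            # non-interactive backend; remove to open a window
import matplotlib.pyplot as plt
import pandas as pd

# Made-up sample data standing in for a real dataset.
data = pd.DataFrame({
    "department": ["BCA", "BCA", "BSc", "BSc", "BCom", "BCom"],
    "hours_studied": [2, 4, 3, 5, 1, 6],
    "marks": [50, 70, 60, 80, 40, 90],
})

fig, (ax_scatter, ax_bar) = plt.subplots(1, 2, figsize=(10, 4))

# Scatter plot with Matplotlib: relationship between two numeric columns.
ax_scatter.scatter(data["hours_studied"], data["marks"])
ax_scatter.set_xlabel("Hours studied")
ax_scatter.set_ylabel("Marks")
ax_scatter.set_title("Scatter plot")

# Bar chart: average marks per department.
avg = data.groupby("department")["marks"].mean()
try:
    import seaborn as sns
    sns.barplot(x=avg.index, y=avg.values, ax=ax_bar)
except ImportError:
    ax_bar.bar(avg.index, avg.values)     # fallback when seaborn is absent
ax_bar.set_xlabel("Department")
ax_bar.set_ylabel("Average marks")
ax_bar.set_title("Bar chart")

fig.tight_layout()
output_path = os.path.join(tempfile.gettempdir(), "visualization.png")
fig.savefig(output_path)
print("Plots saved to", output_path)
```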
Program Output:
Note: Before executing this program, please check that the pandas and sklearn libraries are
installed. The dataset file used in the program is “Dataset.csv”.
CA-C29L: Machine Learning Lab BCA VI Sem
Program Output:
Dataset file = “Dataset.csv”
Note: Before executing this program, please check that the pandas and sklearn libraries are
installed. The dataset file used is “ExampleKNN.csv”.
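The dataset file name suggests that this program concerns the k-nearest neighbours (KNN) classifier. The following is a minimal sketch under that assumption, not the original program listing. Since the contents of “ExampleKNN.csv” are not shown, the built-in Iris dataset is used as a stand-in, and the values of test_size and n_neighbors are arbitrary choices.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Stand-in data: a pandas DataFrame with four feature columns and a "target" column.
data = load_iris(as_frame=True).frame
X = data.drop(columns="target")
y = data["target"]

# Keep 30% of the rows aside for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Classify each test row by a vote among its 3 nearest training rows.
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)
accuracy = knn.score(X_test, y_test)

print("Number of test rows:", len(X_test))
print("Test accuracy:", round(accuracy, 3))
```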
Program Output:
Output:
Output:
Lab Program 10: Write a program to implement K-Means clustering and visualize the clusters.
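A minimal sketch of such a program is given below. It is not the original program listing: the data is generated with make_blobs, and the number of clusters, the random seed and the output file name are assumptions made for illustration.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")            # non-interactive backend; remove to open a window
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Generate 300 two-dimensional points grouped around 3 centres.
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.8, random_state=42)

# Fit K-Means with 3 clusters and get the cluster label of every point.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
labels = kmeans.fit_predict(X)
centers = kmeans.cluster_centers_

# Visualize the clusters: points coloured by label, centroids marked with a red X.
fig, ax = plt.subplots()
ax.scatter(X[:, 0], X[:, 1], c=labels, cmap="viridis", s=20)
ax.scatter(centers[:, 0], centers[:, 1], c="red", marker="X", s=200, label="centroids")
ax.set_xlabel("Feature 1")
ax.set_ylabel("Feature 2")
ax.set_title("K-Means clustering")
ax.legend()

output_path = os.path.join(tempfile.gettempdir(), "kmeans_clusters.png")
fig.savefig(output_path)
print("Cluster sizes:", [int((labels == k).sum()) for k in range(3)])
```

When the number of clusters is not known in advance, the fit can be repeated for several values of n_clusters and the inertia_ attribute compared across them.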
Output: