Internship Report
submitted
in partial fulfillment
Bachelor of Technology
I hereby declare that the work presented in the summer training report entitled
“Machine Learning Internship”, in partial fulfillment of the requirements for the award of the
degree of “Bachelor of Technology” in the Department of Computer Science and Engineering, and
submitted to the Department of Computer Science and Engineering, Amity School of
Engineering & Technology, Amity University, Rajasthan, is a record of my own
training, prepared under the guidance of Dr Sanjay Jain.
Ayush Paul
Computer Science and Engineering
Enrolment No.: A20405220124
Amity University Rajasthan, Jaipur
Counter Signed by
Dr Sanjay Jain
Offer Letter
Course Certificate
TABLE OF CONTENTS
Abstract
Introduction
    About The Company
    About Internship
    Rationale and Goals of Project
Chapter 1: Introduction to Projects
    Project 1
        Approach to the System
        Sections
        Technology Used in the project
        Supported Operating System
Chapter 2: Literature Review
Chapter 3: Coding
Conclusion
Bibliography
Abstract
I worked as an intern trainee at Teachnook. The training took place during the summer
vacation. It started on 1st August 2022 and ran until 30th September, making it an
internship of 8 weeks. The internship focused primarily on Python with Machine
Learning. Its main objective was to gain experience of the Machine Learning
domain as a whole and to learn how to build a career around it.
The internship gave me a lot of hands-on experience with Python and ML. Its main
achievement was learning the discipline of work. The second was gaining a good
knowledge of the Python language. I also saw a real-time (live) project, which gave me an idea of
how to apply theoretical knowledge to something useful, and these are the things that really
matter.
During my internship I worked on two projects. My work was to create a working model of a
clock and to apply Machine Learning to a particular dataset.
About Internship:
Internship Description
My internship was primarily focused on training and implementation on real-life projects.
My course mates and I were first trained in the basics of the Python language and were then
introduced to the ins and outs of Machine Learning.
Profile Requirements
Should have a basic knowledge of a programming language
Should have good mathematical and analytical skills
Rationale and Goals of Project
The project under which I was working can be subdivided into the following goals:
• Choosing and cleaning the dataset provided
• Applying Machine Learning algorithms to the cleaned dataset
Chapter 1: Introduction to Projects
Project 1: Building a Machine Learning Model to Predict Housing Prices in Bengaluru
My project was to build a machine learning model, with at least 75% accuracy, to predict the
price of a house in Bengaluru.
Aim:
• To analyze the data
• To apply measures to clean the data and make it uniform
• To check the accuracy of the model
Approach
In this project, we used the pandas and numpy libraries. The following steps were followed to
build the price-prediction model:
Step 1: Choose the dataset which best suits our needs.
Step 2: Import the necessary libraries (in our case, pandas and numpy).
Step 3: Store the data from the dataset in a dataframe.
Step 4: Display the dataset and determine the required aspects/columns.
Step 5: Clean the data by removing entries with null values, combining certain columns, or
converting some data so that the dataset is consistent and uniform.
Step 6: After cleaning, divide the dataset in an 80-20 ratio, where 80% of the data is used
to train the model and the remaining 20% is used to test it.
Step 7: Create a model to predict the prices.
Step 8: Test the model to check its accuracy.
Step 9: Test other algorithms to see whether the model created provides the best results.
Step 10: Deploy the model.
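The steps above can be sketched in a few lines of pandas and numpy. This is only a minimal illustration and not the code written during the internship: the small generated dataset, the column names (total_sqft, bhk, price), the fixed random seed, and the choice of ordinary least-squares regression scored with R^2 are all assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the housing dataset (column names are assumptions).
rng = np.random.default_rng(0)
sqft = rng.uniform(500, 3000, size=200)
bhk = rng.integers(1, 5, size=200)
price = 0.05 * sqft + 10.0 * bhk + rng.normal(0, 5, size=200)
df = pd.DataFrame({"total_sqft": sqft, "bhk": bhk, "price": price})
df.loc[df.index % 25 == 0, "total_sqft"] = np.nan  # inject some missing values

# Step 5: clean the data by dropping rows with null values.
clean = df.dropna().reset_index(drop=True)

# Step 6: shuffle, then split 80-20 into training and test sets.
shuffled = clean.sample(frac=1.0, random_state=0).reset_index(drop=True)
n_train = int(0.8 * len(shuffled))
train, test = shuffled.iloc[:n_train], shuffled.iloc[n_train:]


def design_matrix(frame):
    """Feature matrix with a leading column of ones for the intercept."""
    features = frame[["total_sqft", "bhk"]].to_numpy(dtype=float)
    return np.column_stack([np.ones(len(frame)), features])


# Step 7: fit a linear model by ordinary least squares.
weights, *_ = np.linalg.lstsq(
    design_matrix(train), train["price"].to_numpy(), rcond=None
)

# Step 8: score the model on the held-out 20% using R^2.
predicted = design_matrix(test) @ weights
actual = test["price"].to_numpy()
r2 = 1.0 - np.sum((actual - predicted) ** 2) / np.sum((actual - actual.mean()) ** 2)
print(f"rows after cleaning: {len(clean)}, test R^2: {r2:.3f}")
```

With a real dataset, the generated frame would be replaced by the data loaded into the dataframe in Step 3, and the cleaning in Step 5 would depend on the columns actually present.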
Functionality:
Windows: This project can easily be configured on the Windows operating system. To
run this project on a Windows system, you will have to install Python 2., PIP, and
Django.
Linux: This project can also be run on all versions of the Linux operating system.
Mac: This project can also easily be configured on the Mac operating system.
CHAPTER 2: LITERATURE REVIEW
2.1 Python: -
Python is an interpreted, high-level programming language for general-purpose programming.
Created by Guido van Rossum and first released in 1991, Python has a design philosophy that
emphasizes code readability, notably using significant whitespace. It provides constructs that
enable clear programming on both small and large scales. In July 2018, Van Rossum stepped
down as the leader of the language community.
Python features a dynamic type system and automatic memory management. It supports multiple
programming paradigms, including object-oriented, imperative, functional and procedural programming.
Python interpreters are available for many operating systems. CPython, the reference
implementation of Python, is open-source software and has a community-based development
model, as do nearly all of Python's other implementations. Python and CPython are managed by
the non-profit Python Software Foundation.
Python has a simple, easy-to-learn syntax that emphasizes readability and hence reduces the cost of
program maintenance. Python also supports modules and packages, which encourages program
modularity and code reuse.
The diverse application of the Python language is a result of the combination of features which
give this language an edge over others. Some of the benefits of programming in Python include
readable syntax, a dynamic type system with automatic memory management, and support for
modules and packages.
2.2 Machine Learning: -
Machine learning approaches are traditionally divided into three broad categories, which
correspond to learning paradigms, depending on the nature of the "signal" or "feedback"
available to the learning system:
• Supervised learning
• Unsupervised learning
• Reinforcement learning
Supervised Learning:
Supervised learning algorithms build a mathematical model of a set of data that contains both
the inputs and the desired outputs. The data is known as training data and consists of a set of
training examples. Each training example has one or more inputs and the desired output, also
known as a supervisory signal.
In the mathematical model, each training example is represented by an array or vector,
sometimes called a feature vector, and the training data is represented by a matrix. Through
iterative optimization of an objective function, supervised learning algorithms learn a function
that can be used to predict the output associated with new inputs. An optimal function will
allow the algorithm to correctly determine the output for inputs that were not a part of the
training data. An algorithm that improves the accuracy of its outputs or predictions over time
is said to have learned to perform that task.
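As a small illustration of this description, the sketch below learns a linear function by iterative optimization of an objective function (mean squared error, minimised by gradient descent). The synthetic training data, the learning rate, and the number of iterations are assumptions chosen only for the example.

```python
import numpy as np

# Training data: each row of X is a feature vector, y holds the desired outputs.
# The values are synthetic, generated from a known rule for illustration.
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 2))
true_w, true_b = np.array([3.0, -2.0]), 0.5
y = X @ true_w + true_b + rng.normal(scale=0.01, size=100)

# Parameters of the function being learned: f(x) = x . w + b
w = np.zeros(2)
b = 0.0
learning_rate = 0.1


def mean_squared_error(weights, bias):
    """Objective function: average squared gap between prediction and target."""
    return float(np.mean((X @ weights + bias - y) ** 2))


initial_loss = mean_squared_error(w, b)
for _ in range(500):
    error = X @ w + b - y
    # Gradient of the mean squared error with respect to w and b.
    w -= learning_rate * 2.0 * (X.T @ error) / len(y)
    b -= learning_rate * 2.0 * error.mean()
final_loss = mean_squared_error(w, b)

# Predict the output for an input that was not part of the training data.
new_input = np.array([1.0, 1.0])
prediction = float(new_input @ w + b)
print(f"loss {initial_loss:.3f} -> {final_loss:.5f}, prediction {prediction:.3f}")
```

The loss falls as the iterations proceed, which is the sense in which the algorithm "learns": its predictions on the training examples, and on the unseen input, move closer to the rule that generated the data.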
Unsupervised Learning:
Unsupervised learning algorithms take a set of data that contains only inputs, and find
structure in the data, like grouping or clustering of data points. The algorithms, therefore,
learn from data that has not been labeled, classified, or categorized. Instead of responding
to feedback, unsupervised learning algorithms identify commonalities in the data and react
based on the presence or absence of such commonalities in each new piece of data. A central
application of unsupervised learning is in the field of density estimation in statistics, such as
finding the probability density function.
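Clustering can be illustrated with a short k-means sketch. This is one common clustering technique chosen for the example, not a method named in the text above. The two synthetic blobs, the farthest-point initialisation, and the fixed number of iterations are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
# Unlabeled inputs: two synthetic blobs; no desired outputs are provided.
points = np.vstack([
    rng.normal(loc=(0.0, 0.0), scale=0.5, size=(50, 2)),
    rng.normal(loc=(5.0, 5.0), scale=0.5, size=(50, 2)),
])


def k_means(data, k, iterations=20):
    """Plain k-means: alternate between assigning points and moving centroids."""
    # Farthest-point initialisation: start from the first point, then keep
    # adding the point that is farthest from all centroids chosen so far.
    chosen = [data[0]]
    while len(chosen) < k:
        gaps = np.min([np.linalg.norm(data - c, axis=1) for c in chosen], axis=0)
        chosen.append(data[gaps.argmax()])
    centroids = np.array(chosen)

    for _ in range(iterations):
        # Distance from every point to every centroid, shape (n_points, k).
        distances = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Move each centroid to the mean of its points (keep it if it has none).
        centroids = np.array([
            data[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
    return labels, centroids


labels, centroids = k_means(points, k=2)
print("cluster sizes:", np.bincount(labels))
print("centroids:", centroids.round(2))
```

No labels are passed to k_means; the grouping comes only from the commonalities (here, distances) between the data points.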
Reinforcement Learning:
Reinforcement learning is an area of machine learning concerned with how software agents
ought to take actions in an environment so as to maximize some notion of cumulative reward.
Due to its generality, the field is studied in many other disciplines, such as game theory,
control theory, operations research, information theory, simulation-based optimization, multi-
agent systems, swarm intelligence, statistics, and genetic algorithms. In machine learning, the
environment is typically represented as a Markov decision process (MDP). Many
reinforcement learning algorithms use dynamic programming techniques. Reinforcement
learning algorithms do not assume knowledge of an exact mathematical model of the MDP,
and are used when exact models are infeasible. Reinforcement learning algorithms are used in
autonomous vehicles or in learning to play a game against a human opponent.
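A minimal sketch of this idea is tabular Q-learning on a tiny, made-up environment. Q-learning is used here as one well-known example of an algorithm that learns from sampled transitions without a model of the MDP. The five-state line world, the reward of 1 at the goal, and the values of the discount factor and learning rate are all assumptions for illustration.

```python
import random

N_STATES = 5          # states 0..4 on a line; state 4 is the goal
ACTIONS = (-1, +1)    # move left or move right
GAMMA, ALPHA = 0.9, 0.5


def step(state, action):
    """Hypothetical environment: reward 1 for reaching the goal, else 0."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


random.seed(0)
q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action index]

for _ in range(500):                       # episodes
    state = 0
    while state != N_STATES - 1:
        a = random.randrange(2)            # explore with a uniformly random action
        next_state, reward = step(state, ACTIONS[a])
        # Q-learning update: move towards reward + discounted best future value.
        target = reward + GAMMA * max(q[next_state])
        q[state][a] += ALPHA * (target - q[state][a])
        state = next_state

# Greedy policy for the non-terminal states, read off the learned table.
policy = ["left" if row[0] > row[1] else "right" for row in q[:-1]]
print(policy)
```

The agent only ever calls step and observes the result; it never reads the transition rule directly, and the cumulative discounted reward is what the learned table estimates.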
Models:
Performing machine learning involves creating a model, which is trained on some training
data and then can process additional data to make predictions. Various types of models have
been used and researched for machine learning systems.
Decision trees
Support-vector machines
Regression analysis
Bayesian networks
Gaussian processes
Genetic algorithms
The process of preparing and labelling the data is usually completed by a data scientist and is
often labour intensive. Unsupervised machine learning models, on the other hand, do not need
labelled data, so the training dataset will just contain input variables or features. In both types of
machine learning the quality of data has a major effect on the overall effectiveness of the model.
The model learns from the data, so poor-quality training data may mean the model is
ineffective once deployed. The data should be checked and cleaned so that it is standardised, any
missing data is identified, and any outliers are detected.
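These three checks can be sketched with pandas as follows. The column name area_sqft, the sample values, and the use of the interquartile-range rule for outliers and z-scores for standardisation are assumptions chosen for the example; other rules are equally possible.

```python
import numpy as np
import pandas as pd

# Hypothetical raw column with one missing value and one extreme value.
raw = pd.DataFrame(
    {"area_sqft": [1000.0, 1200.0, np.nan, 1100.0, 950.0, 1050.0, 9000.0]}
)

# Identify missing data.
missing_rows = raw[raw["area_sqft"].isna()].index.tolist()

# Detect outliers with the interquartile-range (IQR) rule.
values = raw["area_sqft"].dropna()
q1, q3 = values.quantile(0.25), values.quantile(0.75)
iqr = q3 - q1
is_outlier = (values < q1 - 1.5 * iqr) | (values > q3 + 1.5 * iqr)
outlier_rows = values[is_outlier].index.tolist()

# Standardise the remaining values to zero mean and unit variance (z-scores).
kept = values[~is_outlier]
standardised = (kept - kept.mean()) / kept.std(ddof=0)

print("missing rows:", missing_rows)
print("outlier rows:", outlier_rows)
print(standardised.round(2).tolist())
```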
Split the prepared dataset and perform cross validation
The real-world effectiveness of a machine learning model depends on its ability to generalise, to
apply the logic learned from training data to new and unseen data. Models are often at risk of
being overfitted to the training data, which means the algorithm is too closely aligned to the
original training data. The result will be a drop in accuracy or even a loss in function when
encountering new data in a live environment.
To counter this, the prepared data is usually split into training and testing data. The majority of the
dataset is reserved as training data (for example around 80% of the overall dataset), and a subset
of testing data is also created. The model can then be trained and built off the training data, before
being measured on the testing data. The testing data acts as new and unseen data, allowing the
model to be assessed for accuracy and levels of generalisation.
The process is called cross validation in machine learning, as it validates the effectiveness of the
model against unseen data. There is a range of cross validation techniques, categorised as either
exhaustive or non-exhaustive approaches. Exhaustive cross validation techniques will test all
combinations and iterations of a training and testing dataset. Non-exhaustive cross validation
techniques will create a randomised partition of training and testing subsets. The exhaustive
approach will provide more in-depth insight into the dataset but will take much more time and
resources than a non-exhaustive approach.
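The difference in cost between the two families can be seen in a short sketch that only generates the index splits. Here k-fold stands in for the non-exhaustive approach and leave-p-out for the exhaustive one. The function names, the sample count of 10, and the values k=5 and p=2 are assumptions for illustration.

```python
import random
from itertools import combinations


def k_fold_splits(n_samples, k, seed=0):
    """Non-exhaustive: shuffle once, then use each of k folds as the test set."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def leave_p_out_splits(n_samples, p):
    """Exhaustive: every possible choice of p samples is used as a test set."""
    all_indices = set(range(n_samples))
    for test in combinations(range(n_samples), p):
        yield sorted(all_indices - set(test)), list(test)


k_fold = list(k_fold_splits(n_samples=10, k=5))
leave_two_out = list(leave_p_out_splits(n_samples=10, p=2))
print(len(k_fold), "k-fold splits vs", len(leave_two_out), "leave-2-out splits")
```

Each split would require training and scoring the model once, so the number of splits is a direct measure of the extra time and resources the exhaustive approach needs.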
CHAPTER 3: CODING
Project 1: Building a machine learning model
Conclusion
In conclusion, the summer training was time well spent. I learned a lot, in both technical and
non-technical respects. In the company environment I also learned that my strengths lie in
teamwork, analytical thinking and technical skills.
Machine Learning in the Future:
As we head towards an even more tech-driven future, Machine Learning is one of the best
career choices of the 21st century. It offers plenty of job opportunities with high salaries.
Machine Learning is also on its way to making a drastic change in the
world of automation. Further, there is wide scope for Machine Learning in India. Thus, one
can build a lucrative career in the field of Machine Learning and contribute to this growing
digital world.
Bibliography
https://www.expert.ai/
https://www.geeksforgeeks.org/python-programming-language/
https://www.python.org/
open source platforms (GitHub)