Many students nowadays are pursuing their education outside of their home nations. These
international students are primarily interested in the United States of America, Canada, Ireland, and
Germany. India and China account for the majority of international students in the United States. The
number of Indian students pursuing postgraduate education in the United States has surged
dramatically during the last decade. With the growing number of international students studying in
the United States, each candidate must compete fiercely for admission to their preferred university.
In educational institutions, the issue of student admission is critical. This research focuses on using
machine learning models to predict a student's chances of being accepted into a master's degree program.
Students will be able to see ahead of time whether they have a chance of being admitted. The project
predicts a student's admission based on a variety of factors such as the university's rating, the student's
undergraduate GPA, GRE score and research experience. It forecasts whether or not the
student will be admitted to the university of their choice. I employed a variety of methods in this study,
including linear regression, artificial neural networks (ANN), random forest regression, and decision
tree regression. Finally, I deployed the model on a Web-based GUI for checking a student's admission
chances, and it worked as intended.
Admission Prediction Analysis About the Department
2.3 Testing
Testing was done according to the corporate standards. As each component was built, unit
testing was performed to check that it provided the desired functionality, with each
component tested against multiple test cases. The unit-tested components were then integrated
with the existing components and integration testing was performed. Here again, multiple test
cases were run to ensure that the newly built component worked in coordination with the
existing ones. Unit and integration testing were performed iteratively until the complete
product was built. The complete product was then tested again against multiple test cases
covering all of its functionality.
A product that works in the developer’s environment might not necessarily work well in every
environment that users could be using. Hence, the product was also tested under multiple
environments (various operating systems and devices). At every step, if a flaw was observed,
the component was rebuilt to fix the bugs. In this way, testing was done hierarchically and
iteratively.
➢ Python Programming
➢ Artificial Intelligence
➢ Machine Learning Algorithms
This section briefly describes the data used for the research. Data from multiple sources was
used in this project. Most of the data was extracted from the public website Yocket, while data
on the rankings, fees and enrolment of colleges was obtained from The Mentors Circle, a leading
educational consultancy firm in India. Data from both sources was integrated to form a staging
data-set. For predicting the chance of a student being shortlisted by a university, the final
data-set was divided into multiple data-sets, each representing a particular university. For
predicting the list of universities suitable for a student based on their profile, the staging
data-set was updated to contain only the records of students who had successfully secured
admission to the universities. The table below shows the features of the data-sets.
Feature        Description
GRE            Marks scored by the student in the GRE
TOEFL Score    Marks scored by the student in the English proficiency test
Ranking        The university ranking
SOP            Quality of the Statement of Purpose or Statement of Intent
LOR            Quality of the Letter of Recommendation documents
CGPA           Result of the student in their undergraduate course
Research       Relevant experience in the research field
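The two ways of deriving model-specific data from the staging data-set described above (one data-set per
university, and a data-set restricted to admitted students) can be sketched as follows. This is a minimal
illustration, not the project's actual code: the record layout, the field names (university, gre, admitted)
and the sample values are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical staging records; field names and values are illustrative only.
staging = [
    {"university": "University A", "gre": 320, "admitted": True},
    {"university": "University A", "gre": 301, "admitted": False},
    {"university": "University B", "gre": 315, "admitted": True},
]


def split_by_university(records):
    """Group the staging records into one list per university."""
    groups = defaultdict(list)
    for record in records:
        groups[record["university"]].append(record)
    return dict(groups)


def admitted_only(records):
    """Keep only the records of students who secured admission."""
    return [record for record in records if record["admitted"]]


# One data-set per university, for predicting the chance of being shortlisted.
per_university = split_by_university(staging)

# Admitted students only, for suggesting suitable universities.
admitted_records = admitted_only(staging)
```

Keeping the two derivations as separate functions mirrors the text: the same staging data feeds both the
per-university models and the university recommendation step.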
Data on college rankings was collected in .csv format, and the data on students’ profiles was
extracted into .csv files using the data extraction tool provided by Mozenda (n.d.). Because the
data came from a public portal, many records had missing or irrelevant values. Data cleaning
was performed in Microsoft Excel by deleting the records with unwanted or missing values, and
unwanted columns were removed from the data-set. Once the data-set was cleaned, the data was
transformed to be suitable for the model. The TOEFL score was used as the representation of
language proficiency so that the language score had a consistent metric. Similarly, the
undergraduate scores of the students were recorded either as a percentage or as a CGPA; all
percentage records were converted to CGPA by dividing the percentage score by 9.5 (following
the common convention that percentage = CGPA × 9.5).
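The cleaning and conversion steps can be sketched in a few lines of Python. The report states that the
cleaning itself was done in Microsoft Excel, so this is only an equivalent illustration. It assumes the
common convention percentage = CGPA × 9.5, so a percentage is mapped to a 10-point CGPA by dividing by 9.5.
The field names (gre, toefl, ug_score, ug_scale) and the sample records are hypothetical.

```python
# Conversion factor from the text. Applying it by division is an assumption
# based on the common convention percentage = CGPA * 9.5.
PERCENT_PER_CGPA_POINT = 9.5


def percentage_to_cgpa(percentage):
    """Convert a percentage score to a 10-point CGPA."""
    return round(percentage / PERCENT_PER_CGPA_POINT, 2)


def drop_incomplete(records, required_fields):
    """Drop records in which any required field is missing or empty."""
    return [
        record for record in records
        if all(record.get(field) not in (None, "") for field in required_fields)
    ]


def to_cgpa(record):
    """Return a copy of the record with the undergraduate score as CGPA."""
    converted = dict(record)
    if converted["ug_scale"] == "percentage":
        converted["ug_score"] = percentage_to_cgpa(converted["ug_score"])
        converted["ug_scale"] = "cgpa"
    return converted


# Hypothetical raw records; the second one has a missing TOEFL score.
raw_records = [
    {"gre": 320, "toefl": 110, "ug_score": 76.0, "ug_scale": "percentage"},
    {"gre": 310, "toefl": None, "ug_score": 8.1, "ug_scale": "cgpa"},
    {"gre": 305, "toefl": 100, "ug_score": 8.4, "ug_scale": "cgpa"},
]

complete = drop_incomplete(raw_records, ["gre", "toefl", "ug_score"])
cleaned = [to_cgpa(record) for record in complete]
```

Records already expressed as CGPA pass through unchanged, so every remaining record ends up on the same
scale.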
➢ Linear Regression
Linear regression is a supervised machine learning algorithm that performs a regression task:
it models a target prediction value based on independent variables. It is mostly used for
finding the relationship between variables and for forecasting. Regression models differ in
the kind of relationship they assume between the dependent and independent variables and in
the number of independent variables they use.
• Training Phase
As the final task, a main project was developed using machine learning models to predict the
chance of a student being admitted to a master’s program. This will help students know in
advance whether they have a chance of being accepted. The project predicts the admission of a
student based on features including university rating, the student’s undergraduate GPA, GRE
score and research experience. It predicts how likely the student is to be admitted to their
selected university. In this project I used multiple algorithms, including linear regression,
an artificial neural network (ANN), a random forest regressor and a decision tree regressor.
Finally, I deployed the model on a Web-based GUI for checking a student’s admission chances,
and the models are working fine.
4.1 Experience
According to our internship experience, Knowledge Solutions India offers a positive work
culture and courteous personnel at all levels, from staff to management. The instructors are
knowledgeable in their subjects and treat everyone fairly. There are no distinctions made
between new graduates and corporate executives, and everyone is treated equally. Every
activity, no matter how difficult or simple, requires a lot of teamwork, and the mood is always
peaceful and welcoming. Because of the excellent communication and support available, there
is a lot of room for self-improvement. Interns were well treated and educated, and all of our
questions and concerns about the training or the firms were addressed. All in all, Knowledge
Solutions India was a great place for a fresher to start a career and for a corporate
professional to boost theirs. It has been a great experience to be an intern in such a reputed
organization.
Bibodi et al. (n.d.) used multiple machine learning models to create a system that would help
students shortlist the universities suitable for them; a second model was created to help
colleges decide on the enrolment of a student. The Naive Bayes algorithm was used to predict
the likelihood of success of an application, and multiple classification algorithms, such as
Decision Tree, Random Forest, Naive Bayes and SVM, were compared and evaluated on their
accuracy in selecting the best candidates for the college.
The GRADE system was developed by Waters and Miikkulainen (2013) to support the admission
process for graduate students in the University of Texas at Austin Department of Computer
Science. The main objective of the project was to develop a system that could help the
admission committee of the university make better and faster decisions. Logistic regression
and SVM were used to create the model. Both models performed equally well, and the final system
was developed using logistic regression because of its simplicity. The time required by the
admission committee to review the applications was reduced by 74%, but human intervention was
still required to make the final decision on the status of the application. Nandeshwar et al.
(2014) created a similar model to predict the enrolment of a student in the university based
on factors such as SAT score, GPA, residency, race, etc. The model was created using the
multiple logistic regression algorithm and achieved an accuracy of only 67%.
• The existing system relied only on the GRE, TOEFL and undergraduate scores of the student
and did not take into consideration other important factors such as the SOP and LOR.
• The existing system lacked the factor of research work in the related field.
The principal objective of the research is to help students who aspire to pursue their
education in the USA. The Graduate Admissions Prediction system will help them evaluate their
chances of success at any university without depending on an education consultancy firm,
saving a large amount of the time and money spent on the application process. It will also
help students limit the number of applications they make by suggesting the universities where
they have a high chance of securing admission, thereby saving the money they would otherwise
spend applying to universities where, based on their profile, they have little chance of
being admitted.
• The interface makes clear what information the user must enter so that the admission can
be predicted.
• The user interface code will interact with the linear regression, ANN, random forest
regressor and decision tree regressor models to provide users with the required result.
• The ANN and linear regression algorithms will be used to determine the student's chance of
securing admission to a particular university based on his/her profile.
• Once the models have been executed, the result will be provided to the student as the
output on the user interface.
The machine learning models are trained with the given dataset. The models used in this
project are linear regression, an artificial neural network (ANN), a random forest regressor
and a decision tree regressor. Once the models are trained, the student’s profile details are
entered to predict the chances of being admitted to the university.
4.2 Implementation
4.2.1 Modules
2. Data Visualization
Data Visualization: Using data visualization, I summarized the data with graphs, pictures and
maps, so that the human mind has an easier time processing and understanding the given data. Data
visualization plays a significant role in the representation of both small and large data sets, but it is
especially useful when we have large data sets, in which it is impossible to see all of our data, let
alone process and understand it manually.
Training and Testing: In this project, each dataset is split into two subsets. The first subset is
known as the training data: it is the portion of the actual dataset that is fed into the machine
learning model so that it can discover and learn patterns. In this way, it trains the model. The
other subset is known as the testing data; it is held back from training and used to evaluate how
well the trained model performs on unseen data.
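A minimal sketch of such a split is shown below. The report does not state the split ratio or how the
rows were shuffled, so the 80/20 ratio, the fixed seed and the function name split_train_test are
assumptions made for illustration.

```python
import random


def split_train_test(rows, test_fraction=0.2, seed=42):
    """Shuffle the rows reproducibly and split them into (train, test)."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)                  # copy so the input is not modified
    random.Random(seed).shuffle(shuffled)  # seeded shuffle for repeatability
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Stand-in data: ten row identifiers instead of real student profiles.
rows = list(range(10))
train_rows, test_rows = split_train_test(rows)
```

Shuffling before splitting avoids a biased test set when the source file is ordered, for example by
university or by score.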
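Train and Evaluate Linear Regression: Simple linear regression is an approach for predicting
a quantitative response using a single feature (or "predictor" or "input variable"). It takes the
following form: y = β0 + β1x, where y is the response, x is the feature, β0 is the intercept and β1
is the slope (the coefficient of x).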
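The coefficients β0 and β1 of this form can be estimated by ordinary least squares. The sketch below
uses the closed-form solution for a single feature. It is an illustration only: the report does not say
which library or fitting procedure was used, and the toy data is made up so that the result is easy to
check by hand.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (beta0, beta1) minimising the squared error of y = beta0 + beta1 * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta1 = sxy / sxx                # slope
    beta0 = mean_y - beta1 * mean_x  # intercept
    return beta0, beta1


def predict(beta0, beta1, x):
    """Evaluate the fitted line at x."""
    return beta0 + beta1 * x


def r_squared(xs, ys, beta0, beta1):
    """Coefficient of determination of the fitted line on the given data."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - predict(beta0, beta1, x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Toy data lying exactly on y = 1 + 2x (not taken from the project's data-set).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
beta0, beta1 = fit_simple_linear_regression(xs, ys)
score = r_squared(xs, ys, beta0, beta1)
```

In practice the fit would be computed on the training data and the score on the testing data, so that
the evaluation reflects performance on unseen records.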
4.4 Screenshots
The major goal of this study was to create a prototype of a system that students interested in studying in
the United States could use. For this study, several machine learning algorithms were implemented and
compared. When compared to the logistic regression model, linear regression proved to be the best fit
for system development. The programme was designed with a basic user interface to make it interactive
and simple to use for non-technical people.
The ultimate goal of the study was met since the approach allows students to save time and money that
they would otherwise spend on education advisors and application fees for colleges where they have a
lower chance of being accepted. It will also assist students in making better and faster decisions on
university applications.
• Bibodi, J., Vadodaria, A., Rawat, A. and Patel, J. (n.d.). Admission Prediction System
Using Machine Learning.