Movie Recommendations
(NBA Accredited)
2020-2021
SUMMER INTERNSHIP REPORT
A REPORT SUBMITTED IN PARTIAL FULFILLMENT OF THE
BACHELOR OF ENGINEERING
IN
BY
VENNELA(1608-17-735-054)
2020-21
DEPARTMENT OF ELECTRONICS AND COMMUNICATION ENGINEERING
Date:
Certificate
M. Ravi Kumar
Internship Coordinator Head of the Department
Certificate
ACKNOWLEDGEMENT
Firstly, I earnestly thank our Principal, Dr. D. Hanumantha Rao sir, for giving us this
incredible opportunity of doing an internship even during these unprecedented times.
I offer my gratitude to our HOD, Dr. N. Srinivasa Rao sir, for his constructive criticism,
which helped me all through the internship.
It gives me immense pleasure to thank Bapuji sir (the Director of Appleton
Innovations) for guiding us through the crux of machine learning and helping us out with
the projects.
I would like to express my sincere thanks to the Internship Coordinator, P. Ravi Kumar sir, and
all my beloved teachers for their constant support and guidance.
I would also like to thank my fellow scribes Jesseica, Vaishnavi, Preethi Sriya and Divya Sree
for extending their help and bolstering me.
I’d also like to express my indebtedness to my parents and extended family, without whom this
wouldn’t have been possible.
(Vennela)
ABSTRACT/EXECUTIVE SUMMARY
Problem: The advent of movie streaming services has made thousands of movies just a
click away. We now have movies not only from Hollywood, but also from international
cinema, documentaries, indie movies, etc. With so many movies at hand, the consumer
faces the dilemma of what to watch. At the end of the day, people just want to relax and
watch something that resonates with their mood, taste and style.
Solution: Three main approaches are used for our recommender systems. The first is
demographic filtering, which offers generalized recommendations to every user based
on movie popularity or genre. The system recommends the same movies to users with
similar demographic features. Since each user is different, this approach is considered to
be too simple. The basic idea behind this system is that movies that are more popular and
critically acclaimed will have a higher probability of being liked by the average audience.
The second is content-based filtering, where we try to profile the user's interests using
information collected and recommend items based on that profile. The third is
collaborative filtering, where we try to group similar users together and use information
about the group to make recommendations to the user.
INDEX
1. INDUSTRY/ORGANIZATION PROFILE
VISION
“The company’s aim is to empower millions of students across the globe with the skills
required for the industry.”
MISSION
“They intend to create more than 20,000 Young Engineers across the country with
special focus on opportunities in Internet of Things (IoT), solar smart energy systems
and data analytics.”
2. INTERNSHIP OBJECTIVES/LEARNING OBJECTIVES
The main objective of an internship is to expose the intern to a particular job, profession
or industry. While we might have an idea of what a job is like, we won't know whether
it is what we expected until we actually perform it.
And the main objectives for this Internship are:
3. INTRODUCTION
A recommendation system is a type of information filtering system that attempts to
predict the preferences of a user and makes suggestions based on those preferences. There
are a wide variety of applications for recommendation systems. They have become
increasingly popular over the last few years and are now used on most of the online
platforms that we use. The content of such platforms varies from movies, music, books
and videos, to friends and stories on social media platforms, to products on e-commerce
websites, to people on professional and dating websites, to search results
returned on Google. Often, these systems are able to collect information about users'
choices and use this information to improve their suggestions in the future. For example,
Facebook can monitor your interaction with various stories on your feed in order to learn
what types of stories appeal to you. Sometimes, recommender systems can make
improvements based on the activities of a large number of people. For example, if
Amazon observes that a large number of customers who buy the latest Apple MacBook
also buy a USB-C to USB adapter, it can recommend the adapter to a new user who
has just added a MacBook to their cart. Due to the advances in recommender systems,
users constantly expect good recommendations. They have a low tolerance for services
that are not able to make appropriate suggestions. If a music streaming app is not able to
predict and play music that the user likes, then the user will simply stop using it. This has
led tech companies to place a high emphasis on improving their recommendation systems.
However, the problem is more complex than it seems. Every user has different
preferences and likes. In addition, even the taste of a single user can vary depending on a
large number of factors, such as mood, season or the type of activity the user is doing. For
example, the type of music one would like to hear while exercising differs greatly from
the type of music one would listen to when cooking dinner. Another issue that
recommendation systems have to solve is the exploration vs. exploitation problem. They
must explore new domains to discover more about the user while still making the most of
what is already known about the user. Three main approaches are used for our
recommender systems. The first is demographic filtering, which offers generalized
recommendations to every user based on movie popularity or genre. The system
recommends the same movies to users with similar demographic features. Since each
user is different, this approach is considered to be too simple. The basic idea behind this
system is that movies that are more popular and critically acclaimed will have a higher
probability of being liked by the average audience. The second is content-based filtering,
where we try to profile the user's interests using information collected and recommend
items based on that profile. The third is collaborative filtering, where we try to group
similar users together and use information about the group to make recommendations to
the user.
4. TOOLS EMPLOYED
1. Python
5. TYPES OF RECOMMENDATION SYSTEMS
● Sort the scores and recommend the best-rated movies to the users.
We could use the average rating of a movie as its score, but this would not be fair,
since a movie with an average rating of 8.9 from only 3 votes cannot be considered
better than a movie with an average rating of 7.8 from 40 votes. So, we'll be using
IMDB's weighted rating (WR), which is given as:
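The commonly cited form of this weighted rating is WR = (v / (v + m)) * R + (m / (v + m)) * C, where v is the number of votes for the movie, m is the minimum number of votes required to be listed, R is the average rating of the movie and C is the mean rating across all movies. The sketch below applies it to the two movies from the example above. The values chosen for m and C are assumptions made only for illustration and are not taken from the project's data.

```python
def weighted_rating(v, R, m, C):
    """IMDB-style weighted rating.

    v: number of votes for the movie
    R: average rating of the movie
    m: minimum number of votes required to be listed
    C: mean rating across all movies
    """
    return (v / (v + m)) * R + (m / (v + m)) * C


# Hypothetical catalogue-wide values (assumptions, not project data).
m = 10    # assumed vote cutoff
C = 6.0   # assumed mean rating across all movies

# The two movies from the example: 8.9 with 3 votes, 7.8 with 40 votes.
few_votes = weighted_rating(v=3, R=8.9, m=m, C=C)
many_votes = weighted_rating(v=40, R=7.8, m=m, C=C)

print(round(few_votes, 2), round(many_votes, 2))
```

With these assumed values, the movie with 40 votes ends up with the higher score, because a rating backed by few votes is pulled strongly toward the overall mean C.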
Demographic Filtering
5.2 CONTENT-BASED FILTERING SYSTEMS:
In content-based filtering, items are recommended based on comparisons between
the item profile and the user profile. An item profile is the set of keywords (or
features) that describe an item. A user profile is the set of keywords (terms,
features) collected by the algorithm from items that the user found relevant (or
interesting). For example, consider a scenario in which a person goes to a pastry
shop to buy his favorite cake ‘X’. Unfortunately, cake ‘X’ has been sold out, and as a
result the shopkeeper recommends that the person buy cake ‘Y’, which is made
of ingredients similar to those of cake ‘X’. This is an instance of content-based filtering.
We will be using cosine similarity to calculate a numeric quantity that denotes the
similarity between two movies. We use the cosine similarity score since it is
independent of magnitude and is relatively easy and fast to calculate. Mathematically, it
is defined as follows:
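The standard definition is cos(x, y) = (x . y) / (||x|| * ||y||), that is, the dot product of the two feature vectors divided by the product of their lengths. A minimal sketch in plain Python follows. The three short vectors are hypothetical stand-ins for movie feature vectors, and returning 0.0 for an all-zero vector is a convention chosen here, not part of the definition.

```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        # Convention for this sketch: an all-zero vector is similar to nothing.
        return 0.0
    return dot / (norm_x * norm_y)


# Hypothetical feature vectors (for example, keyword counts for three movies).
movie_a = [1, 2, 0]
movie_b = [2, 4, 0]   # same direction as movie_a, twice the magnitude
movie_c = [0, 0, 3]   # shares no features with movie_a

sim_ab = cosine_similarity(movie_a, movie_b)
sim_ac = cosine_similarity(movie_a, movie_c)
print(sim_ab, sim_ac)
```

Because movie_b is just a scaled copy of movie_a, their similarity is 1 up to floating-point rounding, which illustrates the independence from magnitude mentioned above.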
We are now in a good position to define our recommendation function. These
are the steps we'll follow:
● Get the list of cosine similarity scores for that particular movie with all movies.
Convert it into a list of tuples where the first element is the movie's position and the
second is the similarity score.
● Sort the list of tuples in descending order of similarity score; that is, by the
second element.
● Get the top 10 elements of this list, ignoring the first element, as it refers to
the movie itself (the movie most similar to a particular movie is the movie itself).
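The steps above can be sketched as follows. This is an illustrative version in plain Python, not the project's actual code: the titles, the similarity matrix and the name get_recommendations are hypothetical, and the initial lookup of the movie's position from its title is an assumed step that the list above does not spell out.

```python
# Hypothetical catalogue and precomputed pairwise cosine similarities.
titles = ["Movie A", "Movie B", "Movie C", "Movie D"]
cosine_sim = [
    [1.0, 0.8, 0.1, 0.4],
    [0.8, 1.0, 0.2, 0.3],
    [0.1, 0.2, 1.0, 0.6],
    [0.4, 0.3, 0.6, 1.0],
]


def get_recommendations(title, titles, cosine_sim, top_n=10):
    # Assumed first step: find the position of the queried movie.
    idx = titles.index(title)

    # List of (position, similarity score) tuples for this movie.
    sim_scores = list(enumerate(cosine_sim[idx]))

    # Sort by the second element, highest similarity first.
    sim_scores.sort(key=lambda pair: pair[1], reverse=True)

    # Skip the first entry (the movie itself) and keep the next top_n.
    # This relies on the movie being its own closest match.
    sim_scores = sim_scores[1:top_n + 1]

    # Map positions back to titles.
    return [titles[i] for i, _ in sim_scores]


recommended = get_recommendations("Movie A", titles, cosine_sim, top_n=2)
print(recommended)
```

Skipping the first entry assumes no other movie has a similarity of exactly 1 with the queried one; filtering on the position instead would be a more defensive choice.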
5.3 COLLABORATIVE FILTERING BASED SYSTEMS:
Our content-based engine suffers from some severe limitations. It is only capable of
suggesting movies that are close to a certain movie. That is, it is not capable of
capturing tastes and providing recommendations across genres.
Also, the engine that we built is not really personal, in that it doesn't capture the
personal tastes and biases of a user. Anyone querying our engine for
recommendations based on a movie will receive the same recommendations for that
movie, regardless of who they are.
Therefore, in this section, we will use a technique called collaborative filtering to
make recommendations to movie watchers. It is of two types:
User Based Filtering
Item based collaborative system
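As a rough illustration of the user-based variant, the sketch below predicts a rating for a movie the user has not seen by taking a similarity-weighted average of other users' ratings for it. Everything here is an assumption made for illustration: the ratings dictionary, the function names, and the choice of cosine similarity over co-rated movies are not taken from the project's code. The item-based variant follows the same idea with the roles of users and movies swapped.

```python
import math

# Hypothetical ratings: user -> {movie: rating on a 1-5 scale}.
ratings = {
    "user_1": {"Movie A": 5, "Movie B": 4, "Movie C": 1},
    "user_2": {"Movie A": 4, "Movie B": 5, "Movie D": 5},
    "user_3": {"Movie A": 1, "Movie C": 5, "Movie D": 2},
}


def user_similarity(u, v):
    """Cosine similarity over the movies both users have rated."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[m] * v[m] for m in common)
    norm_u = math.sqrt(sum(u[m] ** 2 for m in common))
    norm_v = math.sqrt(sum(v[m] ** 2 for m in common))
    return dot / (norm_u * norm_v)


def predict_rating(user, movie, ratings):
    """Similarity-weighted average of other users' ratings for the movie."""
    weighted_sum = 0.0
    weight_total = 0.0
    for other, other_ratings in ratings.items():
        if other == user or movie not in other_ratings:
            continue
        sim = user_similarity(ratings[user], other_ratings)
        weighted_sum += sim * other_ratings[movie]
        weight_total += sim
    # None signals that no similar user has rated the movie.
    return weighted_sum / weight_total if weight_total else None


predicted = predict_rating("user_1", "Movie D", ratings)
print(round(predicted, 2))
```

In this toy data, user_1 agrees far more with user_2 than with user_3, so the prediction for "Movie D" lands closer to user_2's rating of 5 than to user_3's rating of 2.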
RESULTS
1. Demographic Filtering
Demographic Output
2. Content-based Filtering Systems
3. Collaborative filtering based systems
Collaborative Based Output_2
CONCLUSION
A hybrid approach combining content-based filtering and collaborative filtering is taken to
implement the system. This approach overcomes the drawbacks of each individual
algorithm and improves the performance of the system. Techniques like clustering,
similarity and classification are used to get better recommendations, thus reducing mean
absolute error and increasing precision and accuracy. In future, we can work on a hybrid
recommender using clustering and similarity for better performance. Our approach can be
further extended to other domains to recommend songs, videos, venues, news, books,
tourism, e-commerce products, etc.
In this project, we developed a prototype system for extracting movie features, i.e.
topics. We trained a model on a collection of movie reviews and used the trained model
to find similar movies. Evaluation results show that such an approach gives good results
even with a small movie collection. The results also show that movie topics are efficient
features, as they perform fairly well in capturing movie genre and mood.
Movie plot results are somewhat satisfactory but need descriptive plot information and
better methods that can capture the storyline. Our small movie corpus resulted in
very little overlap between actors. The topics are quite useful as an explanation in movie
recommendation but need to be fine-tuned with the ability to rate individual topics.
User-rated movie topics could be used as feedback to the system. Finally, movie topics
are efficient features for movie recommendation systems, as they represent the semantic
patterns behind movies. With user movie reviews as data, movie topics capture the
essential movie aspects such as genre and mood. Our prototyping approach to feature
extraction has the potential to scale to a large number of movies.
SKILLS ACQUIRED
PROFESSIONAL SKILLS
Leadership qualities
Professionalism
Building relationships
Work ethics
Self confidence
SCIENTIFIC SKILLS
WEEKLY OVERVIEW OF INTERNSHIP ACTIVITIES
Date Day Name of the topic/Module Completed
Week      Date       Day        Name of the topic/Module Completed
4th week  1/06/2020  Monday     Programs using Eigen values. Eigen decomposition.
4th week  2/06/2020  Tuesday    Explanation about singular value decomposition and its applications.
4th week  3/06/2020  Wednesday  Explained probability: (a) covariance and correlation; (b) program for square matrix using machine learning.
4th week  4/06/2020  Thursday   Reformulation. Implementation of linear algebra using inversion method.
4th week  5/06/2020  Friday     Introduction to scikit learn. Model fitting in supervised and unsupervised learning.
11/06/2020 Thursday Converting text to numerical data. Tokenizing.
BIBLIOGRAPHY