Unit 5
Recommendation Evaluation:
Recommendation Validation:
Recommendation engines are advanced data filtering systems that predict which
content, products, or services a customer is likely to consume or engage with. One doesn’t
need to look far to see one in action. Every time someone chooses a TV show using Netflix’s
“You May Also Like…” feature or buys a product Amazon recommends, they’re using
powerful recommendation engines.
Accurate recommendations don’t appear out of thin air. Businesses must invest in
data solutions capable of analyzing a high volume of products and identifying patterns in
customer behavior. Only then can they unlock the true value of their customer data and make
recommendations that positively impact revenue.
Key takeaways
Recommendation engines are advanced data filtering systems that use behavioral data,
machine learning, and statistical modeling to predict the content, products, or services
customers will like.
Recommendation engines are tools that leverage predictive analytics to help companies
anticipate their customers’ wants and needs. The engines use machine learning and statistical
modeling to create advanced algorithms based on a business’s unique historical and
behavioral data. The resulting recommendations are based on some combination of:
Recommendations are most accurate when there’s a great volume of data at a company’s
disposal. The more active users a product has, the more data there is to compare behaviors
and preferences across demographics.
However, not every bit of data collected will be relevant or even reliable. Building
recommendations on bad data results in recommendations that are inaccurate and unhelpful.
The first step in creating a workable recommendation engine is adopting a proper data
management strategy and analytics stack that collects and verifies data before it is put to use.
Not every recommendation engine uses the same methodology to form predictions.
Recommenders typically achieve results using one of three types of data filtering: content-
based, collaborative filtering, or a combination of the two.
Content-based filtering
The recommender would then compare items historically purchased by the user or those
currently in their shopping cart to other similar or linked items. Attributes are weighted by the
number of items in the database that share the tag, with more common tags receiving higher
rankings than uncommon ones. This weighting determines which items appear first in a list of
recommendations.
Content-based filtering doesn’t require the input of other customers to make predictions. It
bases its predictions on similarities within a customer’s own behavioral and historical profile.
A well-designed content-based filtering engine will identify specific quirks and interests that
may not have broad appeal to other customers.
A major drawback of this type of recommendation engine is that it requires a great deal of
maintenance. Attributes must be added and updated constantly to keep recommendations
accurate—a daunting task for businesses with a high volume of product. Additionally, the
attributes themselves must be accurate. Labeling a Honeycrisp apple “red” is easy, but more
complex content may require a dedicated team of subject matter experts to correctly label
each individual product.
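As a rough illustration of the tag weighting described above, the following minimal sketch scores each unseen item by the summed weight of the tags it shares with the user's items, where a tag's weight is the number of catalog items carrying it. The catalog, the tag names, and the function names are all hypothetical, and a production engine would likely use a richer weighting scheme.

```python
from collections import Counter

# Hypothetical catalog (an assumption for illustration): item -> attribute tags.
CATALOG = {
    "Honeycrisp apple": {"fruit", "red", "sweet"},
    "Granny Smith apple": {"fruit", "green", "tart"},
    "Strawberry": {"fruit", "red", "sweet"},
    "Tomato": {"vegetable", "red"},
    "Lime": {"fruit", "green", "tart"},
}

def tag_weights(catalog):
    """Weight each tag by the number of catalog items that carry it."""
    return Counter(tag for tags in catalog.values() for tag in tags)

def recommend_by_content(user_items, catalog, top_n=3):
    """Rank unseen items by the summed weight of tags shared with the user's items."""
    weights = tag_weights(catalog)
    # The user's profile is the union of tags on items they bought or carted.
    profile = set().union(*(catalog[name] for name in user_items))
    scores = {}
    for name, tags in catalog.items():
        if name in user_items:
            continue  # never recommend something the user already has
        shared = tags & profile
        if shared:
            scores[name] = sum(weights[tag] for tag in shared)
    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(scores, key=lambda name: (-scores[name], name))[:top_n]

recommendations = recommend_by_content(["Honeycrisp apple"], CATALOG)
```

Note that no other customer's behavior enters the calculation; only the user's own items and the catalog tags are used, which is why the quality of the labels matters so much.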
Collaborative filtering
This method of filtering is what’s used in “People who watched this show also watched…”
types of recommenders. Collaborative filtering uses behavioral data to determine what a
person will like based on how their preferences compare to other users. Whereas content-
based filtering focuses on linking products to other products, collaborative filtering builds
predictions by linking similar customer profiles.
For example, imagine using a video streaming platform that uses collaborative filtering.
When you go to find a movie, you create data based on a number of behaviors, including:
Movies you watch
Titles you select but ultimately do not watch
Selections you hover over
Searches you make
Rankings you give films
The recommender then effectively builds a user profile for you based on this data set. It then
compares your profile against a cohort of users who behave similarly. The resulting
predictions are based on the movies this cohort has consumed and enjoyed rather than on the
actual content of each film.
Collaborative filtering doesn’t require product feature information. This makes maintenance
less time-consuming than that of a content-based engine. However, a reliance on other
customers’ behaviors can create data gaps. Say no one interacts with your favorite movie on a
streaming service. A movie that’s perfectly suited to your interests won’t be recommended
because the recommendation engine won’t have any behavioral data with which to form a
prediction.
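The cohort idea can be sketched in a few lines. This is only an illustration under simplifying assumptions: the viewing histories are made up, behavior is reduced to "watched or not", and similarity between two users is taken to be the share of titles they have in common. The function names are hypothetical.

```python
from collections import Counter

# Hypothetical viewing histories (an assumption): user -> set of movies watched.
HISTORY = {
    "you":   {"Movie A", "Movie B", "Movie C"},
    "user1": {"Movie A", "Movie B", "Movie D"},
    "user2": {"Movie C", "Movie E"},
    "user3": {"Movie A", "Movie B", "Movie C", "Movie D", "Movie F"},
    "user4": {"Movie G"},
}

def overlap(a, b):
    """Share of titles two users have in common (size of intersection over union)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend_from_cohort(target, history, cohort_size=2, top_n=3):
    """Recommend unseen titles that the most similar users have watched."""
    seen = history[target]
    ranked = sorted(
        ((overlap(seen, titles), user) for user, titles in history.items() if user != target),
        reverse=True,
    )
    # Keep only the closest users, and only those with some shared behavior.
    cohort = [user for score, user in ranked[:cohort_size] if score > 0]
    # Each cohort member "votes" for the titles the target has not seen yet.
    votes = Counter(title for user in cohort for title in history[user] - seen)
    return sorted(votes, key=lambda title: (-votes[title], title))[:top_n]

picks = recommend_from_cohort("you", HISTORY)
```

In this toy data, "Movie G" never surfaces because only a user with no overlapping behavior watched it, and a title that nobody has watched could not be recommended at all. That is the data gap described above.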
Hybrid filtering
Hybrid filtering attempts to address the shortcomings of both content-based filtering and
collaborative filtering by combining the two methods. As such, it is often the most effective
of the three types of recommendation systems, at the cost of added complexity.
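One simple way to combine the two methods is a weighted blend of their scores. The sketch below assumes both recommenders already produce scores normalized to the range 0 to 1; the scores, the `alpha` weight, and the function name are hypothetical, and real hybrid systems may combine the methods in other ways.

```python
def hybrid_scores(content_scores, collaborative_scores, alpha=0.5):
    """Blend two score dicts; alpha is the weight given to the content-based score."""
    items = set(content_scores) | set(collaborative_scores)
    return {
        item: alpha * content_scores.get(item, 0.0)
        + (1 - alpha) * collaborative_scores.get(item, 0.0)
        for item in items
    }

# Hypothetical, already-normalized scores from each recommender.
content = {"Movie D": 0.9, "Movie E": 0.4, "Movie G": 0.8}
collaborative = {"Movie D": 0.7, "Movie E": 0.9}  # no behavioral data for Movie G

blended = hybrid_scores(content, collaborative, alpha=0.5)
ranking = sorted(blended, key=lambda item: (-blended[item], item))
```

Here "Movie G" still receives a score from its content attributes even though no one has interacted with it, which is the kind of gap the hybrid approach is meant to cover.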
Recommendation engines are used across a variety of industries, and have become a popular
means of improving both customer experience and a company’s bottom line.
E-COMMERCE
Recommendations based on things like location, season, price point and similar users are also
common tactics in e-commerce, and are used as a way to incentivize customers to keep
shopping.
SOCIAL MEDIA
Social media platforms like Facebook and Instagram use recommendation engines to suggest
friends or groups based on a user’s existing network, interests and location. They also use
them to show relevant posts and advertisements, depending on a user’s preferences.
For example, YouTube considers a viewer’s watch history and ratings to suggest new videos.
And TikTok considers videos the user has interacted with in the past, accounts and hashtags
they’ve followed, the type of content they create, and their location and language preferences
to determine what videos to show on their For You page.
MEDIA STREAMING
When a user browses movies and TV shows on a streaming platform like Netflix, Hulu or
Max, the recommendation engine analyzes their viewing history, searches and previous
ratings to suggest content they’re likely to watch and enjoy. Once a user finishes watching
that content, the recommendation engine suggests the next title to watch. All of this is a
useful way of keeping users engaged and reducing the time they spend searching for content.
Gaming platforms, like Steam and PlayStation Store, and music streaming services, like
Spotify and SoundCloud, also use recommendation engines to suggest relevant content based
on a user’s preferences and historical data.
Benefits of Recommendation Engines
Recommendation engines can be beneficial both to the companies that deploy them and the
users that encounter them.
A more personalized experience can lead to more satisfied, engaged and loyal customers,
mainly because they are being fed the content or products they want without having to put in
the effort of finding it themselves.
After all, a lack of a recommendation engine creates a “pretty subpar experience” for
customers, as Amplitude’s Thompson put it. Without it, our social media feeds would be full
of content we don’t care about. And we’d have to search for every product, movie, show and
song ourselves, which would be a pretty time-consuming undertaking.
Social media platforms, media streaming services and even news outlets all want people to
spend as much time as possible on their sites. Consistently providing relevant
recommendations of more videos to watch, songs to listen to and articles to read keeps users
hooked. This translates to higher click-through rates, more conversions and, as is often the
case with websites, more dollars.
BOOSTS REVENUE
Perhaps the biggest benefit of recommendation engines — on the business side, at least — is
that they can help platforms make more money. Not only do recommendation engines
incentivize people to make more purchases (a technique known as cross-selling), but they can
also suggest product alternatives and draw attention to items that have been abandoned in a
customer’s online shopping cart.
Even if a company isn’t in the business of selling physical products, recommendation engines
can still do wonders for their bottom line. For example, if Netflix’s recommendation engine
consistently feeds viewers content they enjoy watching, they’re less likely to cancel their
subscription or choose another streaming service, saving Netflix about $1 billion a year,
according to the company.
“If you’re an organization that’s looking to increase revenue, being able to provide tailored
experiences for your customers based off their likelihood to purchase or likelihood to
complete a particular action, drives growth for your business,” Thompson said.
A recommendation engine is only as good as the data it’s fed. If it doesn’t have accurate or
abundant information about users or items, it likely won’t work correctly.
“They’re limited in their knowledge,” Alexander Marmuzevich, founder and CTO of InData
Labs, told Built In. “They can’t propose something which doesn’t exist, they can’t generate
completely new ideas.”
A common example of this is what Alexei Tishurov, a lead data scientist at InData Labs, calls
a “cold start problem.” This is when a recommendation engine struggles to deal with new
users who have not yet provided enough data for the engine to make accurate
recommendations. New items with little or no historical data tied to them can be challenging
for the engine as well.
Like any machine learning system, recommendation engines can produce biased results if
they are based on biased data. This can result in inaccurate or even discriminatory
recommendations, posing both functional and ethical problems.
By extension, recommendation engines may fall victim to popularity bias, where popular
items tend to be suggested more frequently than lesser-known items. This can lead to a lack
of diversity in the recommendations, and prevent users from discovering niche or less popular
items.
Data is the backbone of recommendation engines. But as regulations and policies regarding
the collection and storage of data continue to evolve, acquiring enough accurate customer
data to generate decent recommendations will be an ongoing challenge.
Companies have to be sure they’re compliant with whatever security and privacy regulations
exist within the jurisdictions they’re operating out of. And even then, customers can often opt
out of providing the data recommendation engines need.
“If a customer is not giving you permission to track them or track their behavior while they’re
browsing your website, it’s a lot harder for you to provide those tailored experiences,”
Thompson said. Sites like Netflix and Amazon “can’t operate without being able to use the
models to provide tailored recommendations,” he continued. “It’s a core, business critical
system when it comes to providing their service.”
Comparing the Types of Recommendation Engines
| Recommendation Engine Type | Description | Pros | Cons |
|---|---|---|---|
| Collaborative Filtering | Recommends items based on user-item interactions and similarities with other users. | Effective for recommending items based on user preferences; can handle sparse data | Cold start problem for new users/items; difficulty in handling large datasets; may suffer from the "popularity bias" |
| Content-based Filtering | Recommends items similar to those a user has liked in the past, based on item attributes/features. | No cold start problem as it relies on item features; can provide explanations for recommendations | Limited to recommending items with known features; may suffer from the "over-specialization" problem |
| Hybrid Recommender Systems | Combine collaborative and content-based approaches to leverage the strengths of both. | Can mitigate limitations of individual approaches; provides more accurate recommendations | Increased complexity in implementation; requires more computational resources |
| Knowledge-based Recommender Systems | Recommends items based on explicit knowledge about user preferences and item characteristics, often using rules or knowledge graphs. | Can handle cold start problem effectively; can provide personalized recommendations based on user preferences | Requires significant initial knowledge about users and items; maintenance of knowledge base can be labor-intensive |
| Matrix Factorization | Techniques that decompose user-item interaction matrices to discover latent factors and make recommendations. | Effective in handling sparsity and scalability; can capture complex user-item relationships | Requires careful tuning of parameters; cold start problem for new users/items |
| Deep Learning Recommender Systems | Utilizes deep learning models to capture complex patterns in user-item interactions and make recommendations. | Can handle complex data and relationships; can automatically learn features from raw data | Requires large amounts of data for training; computationally expensive and resource-intensive |
This table outlines the key characteristics, advantages, and limitations of different
recommendation engine types.
Collecting and Manipulating Data
1. Problem Definition
The first step is to clearly define the problem that the recommendation system will solve. For
instance, we want to build an Amazon-like recommendation system that suggests products to
customers based on their past purchases and browsing history.
A well-defined goal helps in determining the data required, selecting the appropriate
machine-learning models, and evaluating the performance of the recommender system.
2. Data Collection and Preprocessing
The next step is to collect data on customer behavior, such as their past purchases, browsing
history, reviews, and ratings. To process large amounts of business data, we can use Apache
Hadoop and Apache Spark. After data collection, the data engineers preprocess and analyze
this data. This step involves cleaning the data, removing duplicates, and handling missing
values. Also, the data engineers transform this data into a format suitable for machine
learning algorithms.
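The cleaning steps named above (removing duplicates and handling missing values) can be sketched with the standard library alone. The records and field names below are hypothetical, and filling missing ratings with the mean is just one common choice, not the only reasonable one.

```python
# Hypothetical raw interaction records; the field names are illustrative only.
raw_records = [
    {"user": "u1", "item": "bread", "rating": 4},
    {"user": "u1", "item": "bread", "rating": 4},     # exact duplicate
    {"user": "u2", "item": "eggs", "rating": None},   # missing rating
    {"user": "u3", "item": "butter", "rating": 5},
    {"user": None, "item": "milk", "rating": 3},      # missing user id
]

def preprocess(records):
    """Drop unusable rows, remove duplicates, and fill missing ratings with the mean."""
    # 1. Drop rows without a user or an item, since they cannot be attributed.
    usable = [r for r in records if r["user"] is not None and r["item"] is not None]
    # 2. Remove exact duplicates while preserving the original order.
    seen, unique = set(), []
    for r in usable:
        key = (r["user"], r["item"], r["rating"])
        if key not in seen:
            seen.add(key)
            unique.append(dict(r))  # copy so the raw data is left untouched
    # 3. Impute missing ratings with the mean of the observed ratings.
    observed = [r["rating"] for r in unique if r["rating"] is not None]
    mean_rating = sum(observed) / len(observed) if observed else 0.0
    for r in unique:
        if r["rating"] is None:
            r["rating"] = mean_rating
    return unique

clean_records = preprocess(raw_records)
```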
Here are some popular Python-based data preprocessing libraries:
3. Exploratory Data Analysis
Exploratory Data Analysis (EDA) helps understand the data distribution and relationships
between variables which can be used to generate better recommendations.
For instance, you can visualize which items sold the most in the last quarter, or which
items sell more when customers purchase a specific item; eggs, for example, may sell more
alongside bread and butter.
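Before plotting anything, the two questions above can be answered with simple counts. The sketch below uses made-up orders to find the best-selling item and to count how often each other item appears in the same order as bread; the data and variable names are assumptions for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets, one list of items per order.
orders = [
    ["bread", "butter", "eggs"],
    ["bread", "eggs"],
    ["milk", "bread", "butter"],
    ["bread", "eggs"],
]

# Number of orders in which each item appears.
item_counts = Counter(item for order in orders for item in set(order))

# Number of orders in which each pair of items appears together.
pair_counts = Counter(
    pair for order in orders for pair in combinations(sorted(set(order)), 2)
)

top_item, top_item_orders = item_counts.most_common(1)[0]

# How often each other item was bought in the same order as bread.
bought_with_bread = Counter()
for (a, b), count in pair_counts.items():
    if "bread" in (a, b):
        bought_with_bread[b if a == "bread" else a] += count
```

These counts are exactly what a bar chart of top sellers or a co-purchase heatmap would display.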
Here are some popular Python libraries for carrying out exploratory data analysis:
4. Feature Engineering
Feature engineering involves selecting the best-suited features to train your machine learning
model. This step involves creating new features or transforming existing ones to make them
more suitable for the recommendation system.
For example, within customer data, features such as product ratings, purchase frequency, and
customer demographics are more relevant for building an accurate recommendation system.
Here are some popular Python libraries for performing feature engineering:
Scikit-learn: Includes tools for feature selection and feature extraction, such as
Principal Component Analysis (PCA) and Feature Agglomeration.
Category Encoders: Provides methods for encoding categorical variables i.e.,
converting categorical variables into numerical features.
5. Model Selection
The goal of model selection is to choose the best machine learning algorithm that can
accurately predict the products that a customer is likely to purchase or a movie they are likely
to watch based on their past behavior.
i. Collaborative Filtering
Collaborative filtering is a popular recommendation technique. It assumes that users who
share similar preferences will most likely buy similar products (the user-based view), or that
products bought by the same customers are related and will appeal to the same kind of
customer (the item-based view).
ii. Content-Based Filtering
This approach involves analyzing the attributes of products, such as the brand, category, or
price, and recommending products that match a user's preferences.
6. Model Training
This step involves dividing the data into training and testing sets and using the most
appropriate algorithm to train the recommender model. Some of the popular recommendation
system training algorithms include:
i. Matrix Factorization
This technique predicts missing values in a sparse matrix. In the context of recommendation
systems, Matrix Factorization predicts the ratings of products that a user has not yet
purchased or rated.
ii. Deep Learning
This technique involves training neural networks to learn complex patterns and relationships
in the data. In recommendation systems, deep learning can learn the factors that influence a
user's preference or behavior.
iii. Association Rule Mining
This data mining technique discovers patterns and relationships between items in a
dataset. In recommendation systems, Association Rule Mining can identify groups of
products that are frequently purchased together and recommend these products to users.
These algorithms can be effectively implemented using libraries such as Surprise, Scikit-
learn, TensorFlow, and PyTorch.
7. Hyperparameter Tuning
8. Model Evaluation
Model evaluation is critical to ensure that the recommendation system is accurate and
effective in generating recommendations. Evaluation metrics such as precision, recall, and F1
score can measure the accuracy and effectiveness of the system.
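The three metrics can be computed directly from a list of recommended items and the set of items the user actually found relevant. The function name and the example lists below are illustrative assumptions.

```python
def precision_recall_f1(recommended, relevant):
    """Compute precision, recall, and F1 for one user's recommendation list."""
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)
    # Precision: share of recommended items that were relevant.
    precision = hits / len(recommended) if recommended else 0.0
    # Recall: share of relevant items that were recommended.
    recall = hits / len(relevant) if relevant else 0.0
    # F1: harmonic mean of precision and recall (0 when there are no hits).
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1

# Hypothetical example: three items recommended, two of which the user actually chose.
p, r, f1 = precision_recall_f1(
    recommended=["Inception", "Interstellar", "Matrix"],
    relevant=["Inception", "Interstellar"],
)
```

In this example precision is 2/3, recall is 1.0, and F1 is 0.8. F1 is useful when a single number is needed that penalizes a system for being strong on one metric but weak on the other.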
9. Model Deployment
Once the recommendation system has been developed and evaluated, the final step is to
deploy it in a production environment and make it available to customers.
Deployment can be done using in-house servers or cloud-based platforms such as Amazon
Web Services (AWS), Microsoft Azure, and Google Cloud.
For instance, AWS provides various services such as Amazon S3, Amazon EC2, and Amazon
Machine Learning, which can be used to deploy and scale the recommendation system.
Regular maintenance and updates should also be performed based on the latest customer data
to ensure the system continues to perform effectively over time.
Similarity:
Definition: Similarity measures how alike two items or users are based on some criteria, such
as their attributes or behavior.
Purpose: It helps identify items or users that are closely related to each other.
Calculation: There are various methods to calculate similarity, including cosine similarity,
Pearson correlation coefficient, and Jaccard similarity, depending on the nature of the data
and the context of the recommendation problem.
Neighborhood:
Definition: Neighborhood refers to a subset of items or users that are most similar to a given
item or user.
Purpose: It helps identify a group of items or users that are likely to be of interest to a
particular user or item.
User-based neighborhood: Consists of users who are most similar to the target user.
Item-based neighborhood: Consists of items that are most similar to the target item.
Size: The size of the neighborhood can vary based on parameters like the number of nearest
neighbors to consider.
In summary, similarity measures quantify how alike items or users are, while neighborhoods
represent subsets of items or users that are most similar to a given item or user. These
concepts are crucial for building effective recommendation systems, especially in
collaborative filtering approaches where recommendations are based on the preferences of
similar users or items.
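The three similarity measures and the neighborhood selection can be sketched as follows. The rating vectors are hypothetical and assume every user has rated the same four items in the same order; real data is usually sparse and needs extra handling for missing ratings.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def pearson_correlation(a, b):
    """Pearson correlation, computed as the cosine similarity of mean-centered vectors."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return cosine_similarity([x - mean_a for x in a], [y - mean_b for y in b])

def jaccard_similarity(a, b):
    """Size of the intersection divided by the size of the union of two sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def neighborhood(target, candidates, k, similarity=cosine_similarity):
    """Return the names of the k candidates most similar to the target vector."""
    ranked = sorted(
        candidates, key=lambda name: similarity(target, candidates[name]), reverse=True
    )
    return ranked[:k]

# Hypothetical ratings of the same four items by the target user and three others.
target_user = [5, 4, 1, 1]
other_users = {"user1": [4, 5, 1, 2], "user2": [1, 1, 5, 4], "user3": [5, 5, 2, 1]}

nearest = neighborhood(target_user, other_users, k=2)
```

The parameter `k` is the neighborhood size mentioned above. The same `neighborhood` function works for an item-based neighborhood if the vectors describe items instead of users.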
Creating a Recommendation Engine
Recommendation engines bring together lots of data and then use machine learning to
recommend the “next best action,” Thompson said, and that could be anything from buying a
product to clicking on a video.
There are two main categories at play in a recommendation engine — users and items,
according to Eugene Medved, an AI developer at recommendation engine provider InData
Labs. “The task itself,” he explained, “is all about ranking the items for a specific user by
probability of the interaction.”
This is accomplished by a standard order of operations, starting with data collection. From
the initial data collection to the final presentation of recommendations, you will understand
how these systems expertly analyze data and transform it into personalized item suggestions.
Phase 1: Data Collection
During this initial phase, the engine collects a wide range of data, including user interactions
(such as clicks, views, or purchases), user demographics (such as age and location), and
detailed item information (such as descriptions and categories). A challenge in this step,
known as the “cold start problem,” occurs when there is insufficient data on new users or
items, making it difficult to provide accurate recommendations initially.
In the data collection phase of a recommendation engine, various methods are used to gather
comprehensive information.
One of the primary tools is the web crawler, an automated program that navigates
the internet to collect data from various websites. Web crawlers are particularly useful for gathering
detailed information about items such as product descriptions, customer reviews, and ratings.
In addition, user information is collected through techniques such as the use of cookies.
Cookies are small files stored on users’ devices that track their visits to and interactions with
websites. This allows the recommendation system to understand user behavior on the site by
tracking actions such as clicks, views, and purchases. Together, these methods provide a rich
data set that forms the basis for generating accurate and personalized recommendations.
User Behavior Data: This includes data on the actions users take, such as the items they
view, purchase, or add to their wishlist. It also tracks the frequency of these actions and the
time spent on each item.
User Demographic Data: This refers to personal information about the user, like age,
gender, location, and possibly income level or educational background.
Item Data: This encompasses details about the products or content available for
recommendation, such as descriptions, categories, price, brand, specifications for products, or
genre and author for books.
Contextual Data: It includes information about the context in which user interactions take
place, such as the time of day, season, or whether the interaction was on a mobile device or a
desktop.
Feedback Data: User ratings, reviews, and preferences explicitly provided by the users are
also vital. This data helps in understanding the user’s satisfaction and preferences more
directly.
Phase 2: Data Processing
The second step in the functioning of a recommendation engine is data processing, a critical
phase in which the collected data is refined and prepared for analysis.
This step is all about ensuring the quality and usability of the data.
First, data cleansing is performed to remove irrelevant, incomplete, or erroneous information.
This may involve filtering out noise or correcting data inconsistencies to ensure that the
remaining data is accurate and reliable.
Next, data transformation is performed to convert the raw data into a structured format
suitable for analysis. This can include normalizing data (scaling it to a certain range),
categorizing unstructured data (such as text or images), and creating user or object profiles.
Another key aspect is data integration, where data from different sources is combined to
create a comprehensive view. For example, users’ demographic data can be merged with their
behavioral data. Finally, feature extraction is critical, where specific attributes or “features”
are identified and extracted from the data.
These features, such as the frequency of item views or the types of products viewed, are what
the recommendation algorithms will later use to make predictions.
Overall, data processing transforms raw, unorganized data into a clean, structured format that
is essential for the recommendation engine to function effectively.
Phase 3: Filtering
By applying specific mathematical recommendation algorithms, the system can predict how
likely a user is to prefer an item, even if they haven’t interacted with it before.
In making these recommendations, the engine strives to balance relevance, user engagement,
and business goals, such as promoting new products or increasing sales in certain categories.
The ultimate goal is to enhance the user experience by providing timely and relevant
suggestions that are tailored to the user’s needs and interests.
What are these types? Let’s look at what many e-commerce sites are doing with their
recommendations:
Best Sellers: These are popular items across the platform, often recommended to new users
or those with limited interaction history. They represent what is trending or most purchased
in a certain category.
Related Items: Often seen as “Customers who viewed this also viewed” suggestions, these
are based on the correlation between products, recommending items that other users have
looked at or purchased in relation to the current item.
New Arrivals: Recommendations focusing on the newest items in a category, useful for
returning users to discover the latest products or content.
Recommending another item typically involves suggesting items that are similar to the ones a
user has interacted with or shown interest in. This is commonly known as item-based
collaborative filtering. In item-based collaborative filtering, recommendations are made
based on the similarity between items. The idea is that if a user likes one item, they are likely
to enjoy similar items. Here's how it works:
1. Calculate Item Similarity:
Given a target item for which we want to make recommendations, find items
that are most similar to it.
Rank the similar items based on their similarity to the target item.
Example:
Let's say we have a dataset of user interactions with items, such as movies watched or
products purchased. We want to recommend another movie to a user based on the movie they
recently watched. Here's how we can do it:
Items in the Dataset: A list of movies with their attributes (e.g., genre, director,
actors).
| Movie | Similarity Score |
|---|---|
| Inception | 0.85 |
| Interstellar | 0.80 |
| Terminator 2 | 0.70 |
In this table, we calculate the similarity scores between "The Matrix" and other movies in the
dataset. For example, "Inception" has a high similarity score of 0.85, indicating it's closely
related to "The Matrix" and thus a good candidate for recommendation.
Recommendation:
Based on the item similarity table, we recommend "Inception" to the user who watched "The
Matrix" as it has the highest similarity score among the other movies.
This approach can be scaled to handle large datasets and can provide personalized
recommendations to users based on their past interactions with items.
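One way such similarity scores could be derived is from co-viewing behavior: two movies are similar when largely the same users watched both. The sketch below uses made-up viewing data, so its scores do not match the table above; the function names are hypothetical.

```python
import math

# Hypothetical implicit feedback: user -> set of movies watched.
WATCHED = {
    "u1": {"The Matrix", "Inception", "Interstellar"},
    "u2": {"The Matrix", "Inception"},
    "u3": {"The Matrix", "Terminator 2"},
    "u4": {"Inception", "Interstellar"},
    "u5": {"Titanic"},
}

def item_similarity(item_a, item_b, watched):
    """Cosine similarity between two items' binary 'watched by user' vectors."""
    users_a = {u for u, items in watched.items() if item_a in items}
    users_b = {u for u, items in watched.items() if item_b in items}
    if not users_a or not users_b:
        return 0.0
    return len(users_a & users_b) / math.sqrt(len(users_a) * len(users_b))

def similar_items(target, watched, top_n=3):
    """Rank all other items by their similarity to the target item."""
    catalog = set().union(*watched.values()) - {target}
    scored = [(item_similarity(target, item, watched), item) for item in catalog]
    scored = [(score, item) for score, item in scored if score > 0]
    # Highest similarity first; ties are broken alphabetically.
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))[:top_n]

ranked = similar_items("The Matrix", WATCHED)
```

The first entry of `ranked` is the item to recommend to someone who has just watched the target movie.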
Build a user profile based on their past interactions, ratings, and preferences.
This could include genre, category, tags, features, and metadata associated
with the items.
3. Algorithm Selection:
4. Generate Recommendations:
Utilize the selected algorithm to generate a list of recommended items for each
user.
This can involve calculating similarity scores between items, predicting user
ratings, or using matrix factorization techniques.
Example:
Let's consider a scenario where we have a dataset of movies and user ratings. We want to
recommend movies to a user based on their preferences and past interactions. Here's how we
can do it:
User Profile: User A has previously rated several movies, and their ratings
indicate a preference for the action and sci-fi genres.
Items in the Dataset: A list of movies with attributes such as genre, director, actors,
and user ratings.
Recommendation Table:
In this table, we have a list of movies with their respective genres, directors, and user ratings.
Recommendations:
Based on User A's preferences for action and sci-fi genres, we can recommend movies such
as "The Matrix," "Terminator 2," and "Inception" as they match the user's preferences and
have high user ratings.
This approach allows for personalized recommendations tailored to the individual user's
preferences and can enhance user engagement and satisfaction with the platform.
Recommending items based on other items, also known as item-based collaborative filtering,
suggests items similar to those a user has interacted with or shown interest in. The
underlying idea is that if a user likes one item, they are likely to enjoy similar items. Here's
how it works:
1. Calculate Item Similarity:
Find items that are most similar to the target item based on the calculated
similarity scores.
Rank the similar items based on their similarity to the target item.
5. Generate Recommendations:
Example:
Let's consider a scenario where we have a dataset of products and user interactions (e.g.,
purchases, views). We want to recommend other products to users based on the products they
have previously interacted with. Here's how we can do it:
User's Interacted Items: User A has previously purchased the item "iPhone 11."
Items in the Dataset: A list of products with attributes such as category, brand, price,
and user interactions.
| Product | Similarity Score |
|---|---|
| iPhone 11 | 1.00 |
| iPhone XR | 0.75 |
| OnePlus 7T | 0.70 |
In this table, we calculate the similarity scores between "iPhone 11" and other products in the
dataset based on their attributes or user interactions.
Recommendations:
Based on the item similarity table, we recommend products such as "Samsung Galaxy S10"
and "Google Pixel 4" to User A, who has previously purchased the "iPhone 11," as they are
the most similar items to the target item.
Divide the dataset into training and test sets to simulate real-world scenarios.
The training set is used to train the recommendation model, while the test set
is used to evaluate its performance.
3. Generate Recommendations:
Evaluate metrics both globally (across all users) and on a per-user basis to
understand performance variations.
Example:
Let's consider a scenario where we have a recommendation system for movies, and we want
to evaluate its performance using precision and recall metrics.
Ground Truth: Actual movies that users have interacted with in the test set.
Predicted Recommendations: Movies recommended by the recommendation system
for each user in the test set.
| User | Ground Truth | Predicted Recommendations | Precision | Recall |
|---|---|---|---|---|
| User 1 | [Matrix, Inception, Interstellar] | [Inception, Interstellar, Avengers] | 0.67 | 0.67 |
| User 3 | [Matrix, Avengers, Titanic] | [Matrix, Inception, Interstellar] | 0.33 | 0.33 |
| User 4 | [Inception, Interstellar] | [Inception, Interstellar, Matrix] | 0.67 | 1.00 |
In this table, we have evaluated the recommendation system's performance for several users.
For each user, we compare the predicted recommendations with the ground truth and
calculate precision and recall. Finally, we compute the average precision and recall across all
users to assess the overall performance of the recommendation system.
Interpretation:
Precision: The proportion of recommended items that are relevant to the user out of
the total recommended items.
Recall: The proportion of relevant items that are successfully recommended out of all
relevant items.
In this example, the recommendation system achieves an average precision of 0.58
and an average recall of 0.67, indicating its effectiveness in generating relevant
recommendations for users.
By evaluating the recommendation system using appropriate metrics, we can gain insights
into its performance and make informed decisions to improve its accuracy and relevance.
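The difference between per-user and global evaluation can be shown with a short sketch. It uses only the rows for Users 1, 3, and 4 from the example above, so its averages are computed over those three users and are not expected to equal the averages quoted in the text. The function names are illustrative.

```python
# Rows taken from the example: user -> (ground truth, predicted recommendations).
EVAL = {
    "User 1": (["Matrix", "Inception", "Interstellar"], ["Inception", "Interstellar", "Avengers"]),
    "User 3": (["Matrix", "Avengers", "Titanic"], ["Matrix", "Inception", "Interstellar"]),
    "User 4": (["Inception", "Interstellar"], ["Inception", "Interstellar", "Matrix"]),
}

def per_user_metrics(evaluation):
    """Return {user: (precision, recall)} computed separately for each user."""
    rows = {}
    for user, (truth, predicted) in evaluation.items():
        hits = len(set(truth) & set(predicted))
        rows[user] = (hits / len(predicted), hits / len(truth))
    return rows

def macro_average(rows):
    """Per-user view: average the per-user precision and recall values."""
    n = len(rows)
    return (sum(p for p, _ in rows.values()) / n, sum(r for _, r in rows.values()) / n)

def micro_average(evaluation):
    """Global view: pool hits and list sizes across all users before dividing."""
    hits = sum(len(set(t) & set(p)) for t, p in evaluation.values())
    n_predicted = sum(len(p) for _, p in evaluation.values())
    n_truth = sum(len(t) for t, _ in evaluation.values())
    return hits / n_predicted, hits / n_truth

rows = per_user_metrics(EVAL)
macro_precision, macro_recall = macro_average(rows)
micro_precision, micro_recall = micro_average(EVAL)
```

The two views can disagree: here the per-user average recall is about 0.67 while the pooled recall is 0.625, because users with longer ground-truth lists carry more weight in the pooled figure.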
Divide the dataset into three separate sets: training, validation, and test sets.
The training set is used to train the recommendation model, the validation set
is used to tune hyperparameters and evaluate performance during training, and
the test set is used to assess the final performance of the model.
Once the recommendation system has been optimized based on the validation
set, evaluate its performance on the test set.
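A three-way split can be sketched as below. The 70/15/15 proportions, the fixed seed, and the made-up interactions are assumptions; a random shuffle is only one option, and some systems split by time instead so that the test set contains the most recent interactions.

```python
import random

def split_interactions(interactions, train=0.7, validation=0.15, seed=42):
    """Shuffle interactions and cut them into training, validation, and test lists."""
    shuffled = list(interactions)          # copy so the input order is preserved
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * validation)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],        # whatever remains becomes the test set
    )

# Hypothetical user-book interactions.
interactions = [(f"user{u}", f"book{b}") for u in range(4) for b in range(5)]
train_set, validation_set, test_set = split_interactions(interactions)
```

Hyperparameters are then tuned against `validation_set` only, and `test_set` is touched once at the end so that the final score is not biased by the tuning.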
Example:
Let's consider a scenario where we have a recommendation system for books, and we want to
validate its performance using precision and recall metrics.
Test Set: A separate dataset containing user-book interactions for final evaluation.
| User | Ground Truth | Predicted Recommendations | Precision | Recall |
|---|---|---|---|---|
| User 2 | [Computer Networking: A Top-Down Approach, Artificial Intelligence: A Modern Approach, Database System Concepts] | [Introduction to Algorithms, Database System Concepts, Operating System Concepts] | 0.33 | 0.33 |
| User 3 | [Operating System Concepts, Database System Concepts, Computer Architecture: A Quantitative Approach] | [Introduction to Algorithms, Computer Networking: A Top-Down Approach, Operating System Concepts] | 0.33 | 0.33 |
In this table, we have evaluated the recommendation system's performance on the validation
set for several users, using real computer science textbook titles. For each user, we compare
the predicted recommendations with the ground truth and calculate precision and recall.
Interpretation:
Precision: The proportion of recommended items that are relevant to the user out of the
total recommended items.
Recall: The proportion of relevant items that are successfully recommended out of all
relevant items.