Mindsight Codex

RECOMMENDER SYSTEM
SHARING POINTS:
● Introduction to Recommender System
● Content based Filtering
● Collaborative Filtering
● Explicit and implicit feedback
● Matrix Factorization
● Hybrid recommendation
● Data pipeline
● Architecture and deployment
INTRODUCTION
RECOMMENDER SYSTEM
Product Recommender System
● A system that predicts and ranks a list of relevant items that users would purchase
● A lot of products to show? Which ones to recommend?
The impact of recommendation and personalization
● Product recommendations drive revenue
● Shoppers buy more (marketplaces), visitors view more (YouTube etc.)
● Shoppers and visitors stay longer
● Upselling/cross-selling
● Better user experience
● Leads to repeat visits
Approaches in Designing Recommender System
CONTENT-BASED RECOMMENDER SYSTEM
CONTENT-BASED RECOMMENDATION
● Uses attributes of the product (movie categories: cartoon, action, drama, romance)
● Uses item features to recommend new items that are similar to what the user has liked in the past
● Doesn't rely on other users' preferences or other user-item interactions
● If User A likes drama and romance, we can use that information to recommend movies belonging to those groups.
ITEM'S ATTRIBUTES OR CHARACTERISTICS
VECTORIZING ATTRIBUTES INTO EMBEDDING MAP
A mapping from the collection of items to some finite-dimensional vector space
VECTORIZING ATTRIBUTES
MOVIE CROSS TABLE
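As a minimal sketch of vectorizing attributes, the snippet below turns each movie's set of genres into a multi-hot vector, which corresponds to one row of a movie cross table. The movie titles and genre assignments are made-up placeholders, not data from the slides.

```python
# Hypothetical catalog: movie title -> set of genres (illustrative data only).
movies = {
    "Movie A": {"drama", "romance"},
    "Movie B": {"action"},
    "Movie C": {"cartoon", "action"},
}

# Fixed, sorted column order so every vector has the same dimensions.
genres = sorted({g for item_genres in movies.values() for g in item_genres})


def vectorize(item_genres):
    """Return a multi-hot vector: 1 if the item has the genre, else 0."""
    return [1 if g in item_genres else 0 for g in genres]


# One row per movie, one column per genre.
cross_table = {title: vectorize(item_genres) for title, item_genres in movies.items()}

print(genres)
for title, vector in cross_table.items():
    print(title, vector)
```

Each row is the item's embedding in a space with one dimension per genre, so any of the similarity measures in this deck can be applied to pairs of rows.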
JACCARD SIMILARITY
JACCARD DISTANCE CALCULATION
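A small sketch of the Jaccard calculation on attribute sets: similarity is the size of the intersection divided by the size of the union, and distance is one minus similarity. The two genre sets are illustrative, and returning 0.0 for two empty sets is a convention chosen here, not something stated in the slides.

```python
def jaccard_similarity(a, b):
    """|A intersect B| / |A union B|; returns 0.0 when both sets are empty."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def jaccard_distance(a, b):
    """Jaccard distance is the complement of Jaccard similarity."""
    return 1.0 - jaccard_similarity(a, b)


# Illustrative genre sets: one shared genre out of three distinct genres.
movie_1 = {"drama", "romance"}
movie_2 = {"drama", "action"}

print(jaccard_similarity(movie_1, movie_2))
print(jaccard_distance(movie_1, movie_2))
```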
SIMILARITY MEASURES
A similarity measure is a metric for items in an embedding space
COSINE SIMILARITY
The closer the documents are by angle, the higher the cosine similarity (cos θ).
SIMILARITY MEASURES WITH DOT PRODUCT
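To make the difference between the two measures concrete, here is a plain-Python sketch with made-up vectors. The dot product grows with vector length, while cosine similarity depends only on the angle between the vectors.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def cosine_similarity(u, v):
    """Dot product divided by the product of the vector norms."""
    norm_u = math.sqrt(dot(u, u))
    norm_v = math.sqrt(dot(v, v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot(u, v) / (norm_u * norm_v)


a = [1, 1, 0]
b = [2, 2, 0]  # same direction as a, twice the length
c = [0, 0, 3]  # orthogonal to a

print(dot(a, b), cosine_similarity(a, b))
print(dot(a, c), cosine_similarity(a, c))
```

Vectors `a` and `b` point the same way, so their cosine similarity is about 1 even though the dot product is larger than for two copies of `a`. Vectors `a` and `c` share no dimension, so both measures are 0.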
TEXT-BASED SIMILARITY
TF-IDF (Term Frequency – Inverse Document Frequency)
TEXT-BASED SIMILARITY
Find similarity with TF-IDF
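The following is a simplified sketch of text similarity with TF-IDF, written in plain Python so each step is visible. It uses the basic weighting tf × log(N / df); libraries such as scikit-learn apply smoothed variants, so their numbers will differ. The three descriptions are invented for illustration.

```python
import math
from collections import Counter

# Hypothetical item descriptions.
docs = {
    "doc1": "romantic drama about love",
    "doc2": "action movie with car chase",
    "doc3": "drama about family love",
}

tokenized = {name: text.split() for name, text in docs.items()}
n_docs = len(tokenized)

# Document frequency: in how many documents each term appears.
doc_freq = Counter(term for tokens in tokenized.values() for term in set(tokens))


def tfidf(tokens):
    """Sparse TF-IDF vector as a dict: term -> tf * log(N / df)."""
    counts = Counter(tokens)
    return {
        term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
        for term, count in counts.items()
    }


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    numerator = sum(u[t] * v[t] for t in set(u) & set(v))
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return numerator / (norm_u * norm_v)


vectors = {name: tfidf(tokens) for name, tokens in tokenized.items()}

print(cosine(vectors["doc1"], vectors["doc3"]))  # share "drama", "about", "love"
print(cosine(vectors["doc1"], vectors["doc2"]))  # share no terms
```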
COLLABORATIVE FILTERING
Collaborative Filtering
● Recommends products based on the history of user behavior and the similarities between users.
● Similar users might like similar items
● E.g. Person 1 purchased items A, B, C. Person 2 purchased items A, B, D. Then Person 1 might also purchase item D
● Involves matrix factorization for large matrices
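The Person 1 / Person 2 example can be sketched as follows. The scoring rule (weight each candidate item by how many purchases the two users share) is a simplifying assumption for illustration, not the only way to rank neighbors' items.

```python
# Purchase history from the example above.
purchases = {
    "person_1": {"A", "B", "C"},
    "person_2": {"A", "B", "D"},
}


def recommend(target, purchases):
    """Suggest items bought by users who overlap with the target user.

    Each candidate item is scored by the number of items its buyer
    shares with the target; higher scores come first.
    """
    owned = purchases[target]
    scores = {}
    for other, items in purchases.items():
        if other == target:
            continue
        overlap = len(owned & items)
        if overlap == 0:
            continue  # no shared purchases, so not a useful neighbor
        for item in items - owned:
            scores[item] = scores.get(item, 0) + overlap
    return sorted(scores, key=lambda item: (-scores[item], item))


print(recommend("person_1", purchases))  # ['D']
print(recommend("person_2", purchases))  # ['C']
```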
Types of Collaborative Filtering
TWO APPROACHES FOR COLLABORATIVE FILTERING
USER-BASED
● Uses the similarities between users
● Build a user-to-item matrix
ITEM-BASED
● Uses the similarities between items
● Build an item-to-user matrix
MEMORY-BASED COLLABORATIVE FILTERING
STEP 1
Creating user-to-item matrix
STEP 2
Calculating similarity between users or items
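The two steps can be sketched in plain Python as below. The purchase log is hypothetical, and quantities are summed per user and item, which mirrors a pivot table with a sum aggregation.

```python
import math

# Hypothetical purchase log: (user, item, quantity).
log = [
    ("u1", "B", 1), ("u1", "D", 2),
    ("u2", "A", 1), ("u2", "B", 1), ("u2", "C", 3),
    ("u3", "A", 2), ("u3", "C", 1),
]

users = sorted({user for user, _, _ in log})
items = sorted({item for _, item, _ in log})

# Step 1: user-to-item matrix, with 0 for pairs that never occurred.
matrix = {user: {item: 0 for item in items} for user in users}
for user, item, quantity in log:
    matrix[user][item] += quantity


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    numerator = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return numerator / (norm_u * norm_v)


def row(user):
    """A user's vector over all items."""
    return [matrix[user][item] for item in items]


def column(item):
    """An item's vector over all users."""
    return [matrix[user][item] for user in users]


# Step 2: similarity between users (rows) or between items (columns).
user_sim = {(a, b): cosine(row(a), row(b)) for a in users for b in users}
item_sim = {(a, b): cosine(column(a), column(b)) for a in items for b in items}

print(user_sim[("u2", "u3")])
print(item_sim[("A", "C")])
```

User-based filtering reads `user_sim`; item-based filtering reads `item_sim`, which is the same calculation on the transposed matrix.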
USER TO ITEM MATRIX
● user 1 has purchased
items B and D
items A, B, C, and E
items A, C, and E
CALCULATE COSINE SIMILARITY BETWEEN USERS (1, 2) AND (2, 4)
USER TO USER | ITEM TO ITEM
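from sklearn.metrics.pairwise import cosine_similarity

# User-to-item matrix: one row per customer, one column per product; missing pairs become 0
matrix = df.pivot_table(index='CustomerID', columns='StockCode', values='Quantity', aggfunc='sum', fill_value=0)
# User-to-user similarity; pass matrix.T instead for item-to-item similarity
cosine_similarity(matrix)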
MATRIX FACTORIZATION
SPARSITY
● A measure of how empty a matrix is, expressed as a percentage
The ratings dataframe is 98.30% empty.
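A sketch of the sparsity calculation: one minus the number of observed ratings divided by the number of possible user-item pairs. The counts in the example call are hypothetical and are not the counts behind the 98.30% figure.

```python
def sparsity(n_ratings, n_users, n_items):
    """Percentage of empty cells in a user-by-item ratings matrix."""
    total_cells = n_users * n_items
    if total_cells == 0:
        raise ValueError("matrix must have at least one user and one item")
    return (1.0 - n_ratings / total_cells) * 100


# Hypothetical counts: 25 ratings spread over 10 users and 50 items.
print(f"The ratings matrix is {sparsity(25, 10, 50):.2f}% empty.")
```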
WHAT MATRIX FACTORIZATION LOOKS LIKE
LATENT FEATURES
SINGULAR VALUE DECOMPOSITION
WHAT SVD DOES
APPLYING SVD
ALS ALGORITHM
ALS ALGORITHM, PREDICTING THE RATING
ALS ALGORITHM, LATENT FEATURES
ALS HYPERPARAMETER
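To show what alternating least squares does with its hyperparameters, here is a deliberately tiny sketch with a single latent feature (rank 1), written in plain Python. The ratings, the regularization value, and the iteration count are illustrative assumptions. Production libraries such as Spark solve the same kind of problem with higher rank and their own regularization scaling, so this is not a reimplementation of any of them.

```python
# Hypothetical observed explicit ratings: (user, item) -> rating.
ratings = {
    (0, 0): 5.0, (0, 1): 3.0,
    (1, 0): 4.0, (1, 2): 1.0,
    (2, 1): 2.0, (2, 2): 1.0,
}
n_users, n_items = 3, 3
reg = 0.1       # regularization strength (plays the role of regParam)
max_iter = 15   # number of alternating sweeps (plays the role of maxIter)

# One latent feature per user and per item (rank = 1).
user_f = [1.0] * n_users
item_f = [1.0] * n_items


def loss():
    """Squared error on observed ratings plus an L2 penalty on the factors."""
    error = sum((r - user_f[u] * item_f[i]) ** 2 for (u, i), r in ratings.items())
    penalty = reg * (sum(x * x for x in user_f) + sum(y * y for y in item_f))
    return error + penalty


history = [loss()]
for _ in range(max_iter):
    # Fix item factors; each user factor has a closed-form least-squares solution.
    for u in range(n_users):
        rated = [(i, r) for (uu, i), r in ratings.items() if uu == u]
        numerator = sum(r * item_f[i] for i, r in rated)
        denominator = sum(item_f[i] ** 2 for i, _ in rated) + reg
        user_f[u] = numerator / denominator
    # Fix user factors; solve each item factor the same way.
    for i in range(n_items):
        rated = [(u, r) for (u, ii), r in ratings.items() if ii == i]
        numerator = sum(r * user_f[u] for u, r in rated)
        denominator = sum(user_f[u] ** 2 for u, _ in rated) + reg
        item_f[i] = numerator / denominator
    history.append(loss())

# Predicted rating for a pair that was never observed: user 0, item 2.
predicted = user_f[0] * item_f[2]
print(history[0], history[-1], predicted)
```

Because each half-step minimizes the loss exactly with the other side held fixed, the loss never increases from one sweep to the next.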
ALS Example with Apache Spark
from pyspark.ml.recommendation import ALS

# Split the ratings dataframe into training and test data
(training_data, test_data) = ratings.randomSplit([0.8, 0.2], seed=42)

# Set the ALS hyperparameters
als = ALS(userCol="userId", itemCol="movieId", ratingCol="rating",
          rank=10, maxIter=15, regParam=0.1,
          coldStartStrategy="drop", nonnegative=True, implicitPrefs=False)

# Fit the model to the training_data
model = als.fit(training_data)

# Generate predictions on the test_data
test_predictions = model.transform(test_data)
test_predictions.show()
IMPLICIT VS EXPLICIT DATA
IMPLICIT FEEDBACK EXPLICIT FEEDBACK

IMPLICIT DATA
Binary Implicit rating
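A binary implicit rating can be derived from behavioral counts, as in the sketch below. The play counts and the threshold of one play are assumptions for illustration; the right signal and threshold depend on the product.

```python
# Hypothetical implicit feedback: (user, item) -> number of plays.
play_counts = {
    ("u1", "movie_a"): 7,
    ("u1", "movie_b"): 0,
    ("u2", "movie_a"): 1,
    ("u2", "movie_c"): 3,
}


def to_binary(counts, threshold=1):
    """Map each count to 1 (interacted) or 0 (did not), using a minimum count."""
    return {pair: 1 if n >= threshold else 0 for pair, n in counts.items()}


binary_ratings = to_binary(play_counts)
for pair, rating in binary_ratings.items():
    print(pair, rating)
```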
Recommendation engine libraries
1. from sklearn.decomposition import TruncatedSVD
2. from sklearn.metrics.pairwise import cosine_similarity
3. from pyspark.ml.recommendation import ALS
4. from surprise import SVD
5. from lightfm import LightFM
6. import implicit
APPLYING ALS WITH SPARK
EXPLORATORY DATA ANALYSIS
COLD START PROBLEM, HYBRID APPROACH, AND ML OPS FOR RECSYS
HYBRID RECOMMENDATION APPROACHES
COLD START EXAMPLE FOR NEW USER
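One common way to handle a new user, sketched here as an assumption rather than as the approach shown on the slide, is to fall back to the most popular items until the user has some history. The `personalized` argument is a hypothetical stand-in for a trained recommendation model.

```python
from collections import Counter

# Hypothetical interaction history per known user.
interactions = {
    "u1": ["A", "B"],
    "u2": ["A", "C"],
    "u3": ["A", "B", "D"],
}


def most_popular(interactions, k):
    """Top-k items by number of interactions; ties broken alphabetically."""
    counts = Counter(item for items in interactions.values() for item in items)
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked[:k]]


def recommend(user, interactions, personalized, k=2):
    """Use the personalized model for known users, popularity for new ones."""
    if not interactions.get(user):
        return most_popular(interactions, k)  # cold start: no history yet
    return personalized(user)[:k]


# A new user gets the popular items; a known user gets the model's output.
print(recommend("new_user", interactions, personalized=lambda user: []))
print(recommend("u1", interactions, personalized=lambda user: ["D", "C", "X"]))
```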
DATA MODELING PROCESS (example)
01 GATHERING DATA
1. Log event history of users playing content
2. User and content metadata
3. CRM data
02 PREPROCESSING DATA
1. Duplication check
2. Inconsistency check
3. Feature engineering
03 CREATE INTERACTION DATA
Make an interaction table of users and items
04 CREATE RECOMMENDATION MODEL
1. Split the dataset into training (80%) and testing (20%)
2. Hyperparameter tuning
3. Create the model with the selected model
05 EVALUATION
1. Precision
2. AUC
06 PREDICTION
1. List of recommendations per user
2. List of recommendations per content
3. List of catalog per user
Model Evaluation (Offline Evaluation)
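The two offline metrics named in the modeling process, precision and AUC, can be sketched for a single user as below. The scores and the set of relevant items are invented, and a real evaluation would average these values over the users in the held-out test split.

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / k


def auc(scores, relevant):
    """Probability that a relevant item is scored above a non-relevant one.

    Ties count as half a win.
    """
    positives = [s for item, s in scores.items() if item in relevant]
    negatives = [s for item, s in scores.items() if item not in relevant]
    if not positives or not negatives:
        raise ValueError("need at least one relevant and one non-relevant item")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical model scores for one user, and the items that user actually liked.
scores = {"A": 0.9, "B": 0.8, "C": 0.4, "D": 0.1}
relevant = {"A", "C"}
ranked = sorted(scores, key=scores.get, reverse=True)

print(precision_at_k(ranked, relevant, k=2))  # 0.5
print(auc(scores, relevant))                  # 0.75
```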
Recommendation Engine Journey
Model Evaluation (Online Evaluation)
Metric Evaluation (Online Evaluation)
RECOMMENDATION ENGINE ARCHITECTURE
MACHINE LEARNING PIPELINE
NEAR REALTIME ARCHITECTURE EXAMPLE
BATCH PROCESS ARCHITECTURE EXAMPLE (DATA ENGINEER TASK)
BATCH PROCESS ARCHITECTURE EXAMPLE (DATA SCIENTIST TASK)
ML AUTOMATION FOR BATCH PROCESSING
PERFORMANCE CONTROL
RECOMMENDATION SYSTEM IMPLEMENTATION USEETV GO
Q&A
