Yelp Vs Zomato Analysis
Madeline Richards
Final Project Report
4/22/2019
1. Goals
● Our goal was to collect data on at least 100 Ann Arbor restaurants from each website
(Yelp and Zomato), including each restaurant's star rating, price range, restaurant
category, and number of reviews.
● Then we wanted to compare the results and trends we observed in the data between
Yelp and Zomato. Specifically, from each database, we wanted to compare the
following:
○ Number of restaurants in each restaurant category
○ Category with the most restaurants
○ Restaurant category with the highest average star rating
○ Price range with the highest average star rating
○ Overall average star rating and price range for all restaurants in the database
● For each database, we wanted to create at least two of the following graphs/charts:
○ Scatterplot of star rating vs. number of reviews for each restaurant
○ Histogram showing the star rating for each restaurant category including the
overall average star rating
○ Histogram showing the number of restaurants in each restaurant category
2. Goals Achieved
● We collected all the data we set out to obtain for 100 restaurants from each website.
● From our original data analysis goals, we accomplished the following:
○ Number of restaurants in each restaurant category
■ Example: Yelp had 2 out of 100 restaurants in the “Italian” category, while
Zomato had 10.
■ Able to find the category with the most restaurants
● Yelp: “Coffee & Tea” (7)
● Zomato: “American” (28)
○ Average star rating based on restaurant category
■ Able to find the restaurant category with the highest average star rating
● Yelp: Breakfast & Brunch (4.75/5.0)
○ “Local Flavor” scored higher (5.0/5.0), but we disregarded it
because it does not really count as a restaurant category
● Zomato: “Cuban” (4.6/5.0)
○ Average star rating based on price range
■ Price range with the highest average star rating
● Yelp: 1 out of 4 / $ out of $$$$
● Zomato: 4 out of 4 / $$$$ out of $$$$
○ Overall average star rating
■ Yelp: 4.075 / 5
■ Zomato: 4.071 / 5
● For our data visualizations, we generated the following:
○ Scatterplot of star rating vs. number of reviews for each restaurant
○ We revised the histogram to display the distribution of star ratings for each
database
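The category calculations listed in this section can be sketched in a few lines of Python. This is only an illustration of the idea, not the code in Data-Analysis.py: the function name `summarize_by_category` and the sample rows are made up, and the real script may well compute these values differently (for example, directly in SQL).

```python
from collections import defaultdict


def summarize_by_category(rows):
    """Group (category, star_rating) pairs by category.

    Returns two dictionaries keyed by category:
    the number of restaurants and the average star rating.
    """
    ratings = defaultdict(list)
    for category, rating in rows:
        ratings[category].append(rating)
    counts = {cat: len(vals) for cat, vals in ratings.items()}
    averages = {cat: sum(vals) / len(vals) for cat, vals in ratings.items()}
    return counts, averages


# Made-up sample rows for illustration only; not the project's data.
sample = [
    ("Italian", 4.0),
    ("Italian", 3.5),
    ("Cuban", 4.5),
    ("American", 4.0),
    ("American", 3.0),
    ("American", 3.5),
]

counts, averages = summarize_by_category(sample)
largest = max(counts, key=counts.get)        # category with the most restaurants
top_rated = max(averages, key=averages.get)  # category with the highest average
```

The same grouping works for price ranges if the first element of each pair is the price range instead of the category.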
3. Issues Faced
● Our first issue was getting city-level information from the Zomato data. For example,
Ann Arbor was listed under the locality of Detroit, and only after carefully studying the
documentation and testing with print statements did we work out how to obtain only
Ann Arbor restaurants.
● The most difficult issue we faced was adding 20 rows at a time without dropping the
table. Our original code dropped the table if it already existed and then used a loop with
offsets to grab 20 restaurants at a time, so the complete code only needed to be run
once to get all 100 rows of data. After realizing this conflicted with the project
requirements, we struggled with how to modify our code to fit them. To solve this issue,
we replaced our offsets with a count system, and we added “INSERT OR IGNORE INTO”
statements, which add each restaurant’s data only if it is not already in the
database.
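The fix described in the last bullet can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not our exact code: the table name `Restaurants`, its columns, the use of the restaurant name as the primary key, and the function name `add_next_batch` are all hypothetical. The key points are that the table is created only if it is missing (never dropped) and that a counter stops each run after 20 new rows.

```python
import sqlite3

BATCH_SIZE = 20  # rows added per run, as described above


def add_next_batch(conn, restaurants, batch_size=BATCH_SIZE):
    """Insert up to batch_size restaurants that are not yet stored.

    `restaurants` is a list of
    (name, rating, price, category, review_count) tuples.
    Returns the number of rows actually added.
    """
    cur = conn.cursor()
    # Create the table once; later runs keep the existing rows.
    cur.execute(
        """CREATE TABLE IF NOT EXISTS Restaurants (
               name TEXT PRIMARY KEY,
               rating REAL,
               price INTEGER,
               category TEXT,
               review_count INTEGER
           )"""
    )
    added = 0
    for row in restaurants:
        if added >= batch_size:
            break
        # A row whose name is already stored is silently skipped.
        cur.execute(
            "INSERT OR IGNORE INTO Restaurants VALUES (?, ?, ?, ?, ?)", row
        )
        added += cur.rowcount  # 1 if inserted, 0 if ignored
    conn.commit()
    return added
```

Under these assumptions, running the function five times against the same list of 100 restaurants fills the table 20 rows at a time, and any further run adds nothing.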
4. Calculation File
● Data-Analysis.py
5. Visualizations
● ratings_by_reviews.png
● Ratings_dist_hist.png
6. Instructions for Running the Code
● get_rating_by_price (Data-Analysis.py)
○ Input: Price and Rating columns for each of Yelp and Zomato
○ Output: Dictionary named “averages.” Keys are the price ranges, and values are the
average star rating for each range
● get_overall_average_rating (Data-Analysis.py)
○ Input: Rating column for each of Yelp and Zomato
○ Output: Float values of the average star rating for each of Yelp and Zomato
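For readers without access to Data-Analysis.py, functions with the documented inputs and outputs might look like the sketch below. Only the function names and the “averages” dictionary come from this report; the table names `Yelp` and `Zomato`, the column names `price` and `rating`, and the function signatures are assumptions made for illustration.

```python
import sqlite3

TABLES = ("Yelp", "Zomato")  # assumed table names


def get_rating_by_price(conn, table):
    """Return a dict mapping each price range to its average star rating."""
    if table not in TABLES:
        raise ValueError("unknown table: %s" % table)
    averages = {}
    # The table name was checked against TABLES above, so formatting it
    # into the query is safe here.
    query = "SELECT price, AVG(rating) FROM %s GROUP BY price" % table
    for price, avg_rating in conn.execute(query):
        averages[price] = avg_rating
    return averages


def get_overall_average_rating(conn, table):
    """Return the average star rating of every restaurant in one table."""
    if table not in TABLES:
        raise ValueError("unknown table: %s" % table)
    query = "SELECT AVG(rating) FROM %s" % table
    return conn.execute(query).fetchone()[0]
```

Calling each function once per table gives one dictionary and one float for Yelp and one of each for Zomato, which can then be compared side by side.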