Using data from a ride-sharing startup, I summarize the number of rides, the number of drivers, and the fares. Stratifying the data by city type (urban, suburban, and rural) shows how these measures differ from one type to the next. With that breakdown, the company can see where its highest and lowest profits are made and can begin to infer why, which supports future goals and planning to optimize efficiency and profitability. Pandas is used to manipulate the data and build DataFrames, and matplotlib is used to produce plots that visually summarize the findings.
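As a minimal sketch of the kind of per-city-type summary described above, the snippet below groups ride records by city type with pandas. The rows are made up for illustration, and the column names (`city`, `type`, `fare`, `driver_count`) are assumptions rather than the project's actual schema.

```python
import pandas as pd

# Made-up sample rows for illustration only; not the project's data.
# Column names are assumptions about how such data might be laid out.
rides = pd.DataFrame({
    "city": ["A", "A", "A", "B", "B", "C"],
    "type": ["Urban", "Urban", "Urban", "Suburban", "Suburban", "Rural"],
    "fare": [10.0, 12.0, 14.0, 30.0, 34.0, 40.0],
})

# One row per city with its (hypothetical) driver count.
cities = pd.DataFrame({
    "city": ["A", "B", "C"],
    "type": ["Urban", "Suburban", "Rural"],
    "driver_count": [50, 20, 5],
})

# Ride-level measures: count, sum, and mean of fare per city type.
ride_summary = rides.groupby("type").agg(
    total_rides=("fare", "count"),
    total_fare=("fare", "sum"),
    average_fare=("fare", "mean"),
)

# Driver counts are summed from the per-city table so that each city's
# drivers are counted once rather than once per ride.
driver_summary = (
    cities.groupby("type")["driver_count"].sum().rename("total_drivers")
)

# Both results are indexed by city type, so they can be joined directly.
summary = ride_summary.join(driver_summary)
print(summary)
```

Keeping rides and cities in separate tables is one way to avoid inflating driver totals when a city appears in many ride rows.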
(1) Overall, the urban city type has the most drivers, the most rides, and the lowest average fare ($) compared with suburban and rural.
(2) In number of rides, the suburban city type overlaps somewhat with urban in the plot, but its average fare ($) is higher and its number of drivers is lower than urban's. The rural city type is even more extreme than suburban. This could be attributable to fewer drivers in the area and greater distances between destinations in rural cities.
(3) The pie charts need to be read in context: urban has the highest percentages for "% of total fare", "% of total drivers", and "% of total rides". Because urban has the most drivers, and especially the most rides, it follows that the urban city type would also have the highest % of total fare. However, this does not mean that individual rides are priced highest in urban cities.
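The distinction in (3) between a city type's share of total fare and its average fare per ride can be illustrated with a small calculation. The totals below are invented for the example and are not the project's results; the variable names are likewise hypothetical.

```python
import pandas as pd

# Hypothetical totals per city type, invented purely for illustration.
totals = pd.DataFrame(
    {"total_rides": [80, 15, 5], "total_fare": [1600.0, 450.0, 200.0]},
    index=["Urban", "Suburban", "Rural"],
)

# Share of each total as a percentage (the quantity a pie chart displays).
pct_of_rides = 100 * totals["total_rides"] / totals["total_rides"].sum()
pct_of_fare = 100 * totals["total_fare"] / totals["total_fare"].sum()

# Average fare per ride is a separate quantity from the share of total fare.
average_fare = totals["total_fare"] / totals["total_rides"]

print(pd.DataFrame({
    "pct_of_rides": pct_of_rides,
    "pct_of_fare": pct_of_fare,
    "average_fare": average_fare,
}))
```

In this made-up example, the urban row has the largest share of total fare yet the smallest average fare, because its share is driven by ride volume. A Series such as `pct_of_fare` could then be passed to matplotlib's `plt.pie` to draw the corresponding chart.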