ML Project - Jupyter Notebook
In [1]:
import pandas as pd
df=pd.read_csv("C:\\Users\\DELL\\OneDrive\\Desktop\\computer network\\deliverytime.txt")
df
Out[1]:
In [2]:
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 45593 entries, 0 to 45592
Data columns (total 11 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 ID 45593 non-null object
1 Delivery_person_ID 45593 non-null object
2 Delivery_person_Age 45593 non-null int64
3 Delivery_person_Ratings 45593 non-null float64
4 Restaurant_latitude 45593 non-null float64
5 Restaurant_longitude 45593 non-null float64
6 Delivery_location_latitude 45593 non-null float64
7 Delivery_location_longitude 45593 non-null float64
8 Type_of_order 45593 non-null object
9 Type_of_vehicle 45593 non-null object
10 Time_taken(min) 45593 non-null int64
dtypes: float64(5), int64(2), object(4)
memory usage: 3.8+ MB
In [4]:
print(df.isnull().sum())  # count the missing values in each column
ID 0
Delivery_person_ID 0
Delivery_person_Age 0
Delivery_person_Ratings 0
Restaurant_latitude 0
Restaurant_longitude 0
Delivery_location_latitude 0
Delivery_location_longitude 0
Type_of_order 0
Type_of_vehicle 0
Time_taken(min) 0
dtype: int64
In [2]:
# We can use the haversine formula to calculate the distance between two
# locations based on their latitudes and longitudes.
import numpy as np

# Set the earth's radius (in kilometers)
R = 6371

# Convert degrees to radians (this helper is called by distcalculate below)
def deg_to_rad(degrees):
    return degrees * (np.pi / 180)

# Function to calculate the distance between two points using the haversine formula
def distcalculate(lat1, lon1, lat2, lon2):
    d_lat = deg_to_rad(lat2 - lat1)
    d_lon = deg_to_rad(lon2 - lon1)
    a = np.sin(d_lat/2)**2 + np.cos(deg_to_rad(lat1)) * np.cos(deg_to_rad(lat2)) * np.sin(d_lon/2)**2
    c = 2 * np.arctan2(np.sqrt(a), np.sqrt(1 - a))
    return R * c

# Calculate the restaurant-to-delivery distance for every row
for i in range(len(df)):
    df.loc[i, 'distance'] = distcalculate(df.loc[i, 'Restaurant_latitude'],
                                          df.loc[i, 'Restaurant_longitude'],
                                          df.loc[i, 'Delivery_location_latitude'],
                                          df.loc[i, 'Delivery_location_longitude'])
In [20]:
df['distance']
Out[20]:
0 3.025149
1 20.183530
2 1.552758
3 7.790401
4 6.210138
...
45588 1.489846
45589 11.007735
45590 4.657195
45591 6.232393
45592 12.074396
Name: distance, Length: 45593, dtype: float64
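The row-by-row loop above works, but it is slow on 45,593 rows. Because the NumPy functions in the haversine formula operate element-wise, the same calculation can probably be applied to whole columns at once. The sketch below is an alternative, not the notebook's original code: `haversine_km` is a hypothetical name, `np.radians` stands in for the degree-to-radian helper, and the two demo rows are made-up coordinates chosen so the result is easy to check by hand.

```python
import numpy as np
import pandas as pd

EARTH_RADIUS_KM = 6371  # same radius the notebook uses for R

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km; accepts scalars, arrays, or pandas Series."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arctan2(np.sqrt(a), np.sqrt(1 - a))

# Made-up coordinates (illustration only), using the notebook's column names
demo = pd.DataFrame({
    "Restaurant_latitude": [0.0, 12.5],
    "Restaurant_longitude": [0.0, 77.5],
    "Delivery_location_latitude": [0.0, 12.5],
    "Delivery_location_longitude": [1.0, 77.5],
})

# One vectorized call replaces the Python-level loop
demo["distance"] = haversine_km(demo["Restaurant_latitude"],
                                demo["Restaurant_longitude"],
                                demo["Delivery_location_latitude"],
                                demo["Delivery_location_longitude"])
print(demo["distance"])
```

The first demo row spans one degree of longitude on the equator, which should come out close to 6371 * pi / 180 km; the second row has identical start and end points, so its distance should be zero.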
The plotly Python library is an interactive, open-source plotting library that supports over 40 unique chart types covering a
wide range of statistical, financial, geographic, scientific, and 3-dimensional use-cases.
In [3]:
#relationship between the distance and time taken to deliver the food:
import plotly.express as px
The plot shows that most delivery partners deliver food within 25-30 minutes, regardless of distance.
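One way to check numerically whether delivery time stays in a narrow band regardless of distance is to group the rows into distance bands and compare the median time in each band. This is a minimal sketch under assumptions: `median_time_by_distance` is a hypothetical helper, the band edges are arbitrary, and the `toy` rows are invented for illustration. On the real data you would pass `df` instead.

```python
import pandas as pd

def median_time_by_distance(frame, bins):
    """Median delivery time (minutes) within each distance band (km)."""
    bands = pd.cut(frame["distance"], bins=bins)
    return frame.groupby(bands, observed=True)["Time_taken(min)"].median()

# Invented rows for illustration only; they are not taken from the dataset
toy = pd.DataFrame({
    "distance": [1.5, 3.0, 4.7, 6.2, 7.8, 11.0, 12.1, 20.2],
    "Time_taken(min)": [24, 26, 25, 30, 27, 28, 26, 29],
})

summary = median_time_by_distance(toy, bins=[0, 5, 10, 25])
print(summary)
```

If the medians of the real data barely move from one band to the next, that supports the reading of the scatter plot; a steady climb would contradict it.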
In [4]:
# Now let’s have a look at the relationship between the time taken to
# deliver the food and the age of the delivery partner:
figure = px.scatter(data_frame=df,
                    x="Delivery_person_Age",
                    y="Time_taken(min)",
                    size="Time_taken(min)",
                    color="distance",
                    trendline="ols",
                    title="Relationship Between Time Taken and Age")
figure.show()
There is a linear relationship between the time taken to deliver the food and the age of the delivery partner: younger
delivery partners take less time to deliver the food than older partners.
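The trend line in the plot is an ordinary least-squares fit, so the strength and direction of the relationship can also be read off as numbers. The sketch below uses NumPy only; `linear_trend` is a hypothetical helper, and the sample points are made up so that they lie exactly on the line y = 0.5x + 10.

```python
import numpy as np

def linear_trend(x, y):
    """Return (slope, intercept, pearson_r) for the least-squares line through (x, y)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, deg=1)  # degree-1 polynomial = straight line
    r = np.corrcoef(x, y)[0, 1]                 # Pearson correlation coefficient
    return slope, intercept, r

# On the notebook's data this would be called as:
#   linear_trend(df["Delivery_person_Age"], df["Time_taken(min)"])

# Made-up points that lie exactly on y = 0.5 * x + 10 (illustration only)
ages = [20, 25, 30, 35]
times = [20.0, 22.5, 25.0, 27.5]
slope, intercept, r = linear_trend(ages, times)
print(slope, intercept, r)
```

A positive slope means time increases with the x variable. Applied to `Delivery_person_Ratings`, the same helper would be expected to return a negative slope if the inverse relationship described for ratings holds.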
Now let’s have a look at the relationship between the time taken to deliver the food and the ratings of the delivery partner:
In [7]:
There is an inverse linear relationship between the time taken to deliver the food and the ratings of the delivery partner. It
means delivery partners with higher ratings take less time to deliver the food compared to partners with low ratings.
Now let’s check whether the type of food ordered by the customer and the type of vehicle used by the delivery partner affect
the delivery time:
In [6]:
fig = px.box(df,
             x="Type_of_vehicle",
             y="Time_taken(min)",
             color="Type_of_order")
fig.show()
So the delivery time does not differ much with the vehicle the delivery partner is driving or with the type of food being
delivered.
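The box plot comparison can be backed by a small table of medians, one cell per vehicle and order type. This is a sketch only: the category values and times in `toy` are invented for illustration and may not match the labels in the real columns. On the real data, the same `pivot_table` call would be made on `df`.

```python
import pandas as pd

# Invented rows for illustration only; category labels are assumptions
toy = pd.DataFrame({
    "Type_of_vehicle": ["motorcycle", "motorcycle", "motorcycle", "motorcycle",
                        "scooter", "scooter", "scooter", "scooter"],
    "Type_of_order": ["Meal", "Meal", "Snack", "Snack",
                      "Meal", "Meal", "Snack", "Snack"],
    "Time_taken(min)": [26, 28, 25, 27, 24, 26, 26, 26],
})

# Median delivery time for every vehicle / order-type combination
medians = toy.pivot_table(index="Type_of_vehicle",
                          columns="Type_of_order",
                          values="Time_taken(min)",
                          aggfunc="median")

# Gap between the slowest and fastest combination, in minutes
spread = medians.max().max() - medians.min().min()
print(medians)
print(spread)
```

A spread of only a few minutes across all cells would agree with the conclusion that vehicle and order type have little effect.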
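So the features that contribute most to the food delivery time based on our analysis are the age of the delivery partner, the
ratings of the delivery partner, and the distance between the restaurant and the delivery location.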
# Splitting data
from sklearn.model_selection import train_test_split
x = np.array(df[["Delivery_person_Age",
                 "Delivery_person_Ratings",
                 "distance"]])
y = np.array(df[["Time_taken(min)"]])
xtrain, xtest, ytrain, ytest = train_test_split(x, y,
                                                test_size=0.10,
                                                random_state=42)
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
lstm (LSTM) (None, 3, 128) 66560
=================================================================
Total params: 117,619
Trainable params: 117,619
Non-trainable params: 0
_________________________________________________________________
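The summary above lists only one layer (66,560 parameters), yet the total is 117,619, so the remaining layers did not survive in this export. The parameter counts can still be checked with the standard formulas for LSTM and dense layers with biases. In the sketch below, `lstm_params` and `dense_params` are hypothetical helper names, and the "guess" at the missing layers is only one stack that happens to add up to the total. It is an assumption, not something the notebook shows.

```python
def lstm_params(units, input_dim):
    """An LSTM has four gates, each with input weights, recurrent weights, and a bias."""
    return 4 * (units * (input_dim + units) + units)

def dense_params(units, input_dim):
    """A dense layer has one weight per input-unit pair plus one bias per unit."""
    return units * input_dim + units

# Output shape (None, 3, 128): 3 timesteps, 128 units. A count of 66,560 is
# consistent with each timestep carrying a single feature (input_dim=1).
first_lstm = lstm_params(units=128, input_dim=1)

# Parameters that must belong to layers missing from the printed summary
remaining = 117_619 - first_lstm

# ASSUMPTION: one possible missing stack, LSTM(64) -> Dense(25) -> Dense(1)
guess = lstm_params(64, 128) + dense_params(25, 64) + dense_params(1, 25)

print(first_lstm, remaining, guess)
```

Other layer combinations could reach the same total, so the guess should be treated as a plausibility check rather than a reconstruction of the model.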
In [19]:
Epoch 1/9
41033/41033 [==============================] - 442s 11ms/step - loss: 65.1387
Epoch 2/9
41033/41033 [==============================] - 392s 10ms/step - loss: 62.0739
Epoch 3/9
41033/41033 [==============================] - 425s 10ms/step - loss: 61.0353
Epoch 4/9
41033/41033 [==============================] - 478s 12ms/step - loss: 60.1918
Epoch 5/9
41033/41033 [==============================] - 446s 11ms/step - loss: 59.7889
Epoch 6/9
41033/41033 [==============================] - 368s 9ms/step - loss: 59.3642
Epoch 7/9
41033/41033 [==============================] - 246s 6ms/step - loss: 58.9183
Epoch 8/9
41033/41033 [==============================] - 246s 6ms/step - loss: 59.3749
Epoch 9/9
41033/41033 [==============================] - 255s 6ms/step - loss: 59.0670
Out[19]:
<keras.callbacks.History at 0x26d33f111f0>
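The training log can be sanity-checked with a little arithmetic. The 41,033 steps per epoch match the size of the training split, which suggests a batch size of 1. If the loss is mean squared error (an assumption, since the compile call is not shown here), its square root gives a typical error in minutes.

```python
import math

n_rows = 45593            # from the df.info() output
test_size = 0.10          # from the train_test_split call
steps_per_epoch = 41033   # from the training log
final_loss = 59.0670      # loss reported after epoch 9

# scikit-learn rounds the test share up and gives the rest to the training set
n_test = math.ceil(test_size * n_rows)
n_train = n_rows - n_test

# Steps per epoch = ceil(n_train / batch_size), so equal counts imply batch size 1
implied_batch_size = n_train / steps_per_epoch

# ASSUMPTION: the loss is mean squared error, so its square root is the RMSE in minutes
rmse_if_mse = math.sqrt(final_loss)

print(n_train, n_test, implied_batch_size, round(rmse_if_mse, 2))
```

A batch size of 1 would also explain why each epoch takes several minutes. Note that the log reports training loss only, so it says nothing yet about performance on the held-out 10%.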
Now let’s test the performance of our model by giving it inputs to predict the food delivery time:
In [22]:
Summary:
To predict the food delivery time in real time, you need to calculate the distance between the food preparation point and the
point of food consumption. After finding the distance between the restaurant and the delivery location, you need to find
relationships between that distance and the time delivery partners took to deliver food over the same distance in the past.