5 Machine Learning Techniques for Sales Forecasting

Comparing Linear Regression, Random Forest Regression, XGBoost, LSTMs, and ARIMA Time Series Forecasting In Python

Molly Ruby
Towards Data Science

Forecasting sales is a common and essential use of machine learning (ML). Sales forecasts can be used to identify benchmarks and determine the incremental impact of new initiatives, plan resources in response to expected demand, and project future budgets. In this article, I will show how to implement five different models to predict sales.

The data for this demonstration can be found on Kaggle and the full code is on GitHub.

Getting Started

The first step is to load the data and transform it into a structure that we will then use for each of our models. In its raw form, each row of data represents a single day of sales at one of ten stores. Our goal is to predict monthly sales, so we will first consolidate all stores and days into total monthly sales.

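import pandas as pd


def load_data():
    # Build the URL from adjacent string literals so that no line breaks
    # end up inside it. Kaggle downloads usually require a logged-in
    # session, so reading a locally saved train.csv may be needed instead.
    url = (
        "https://www.kaggle.com/c/demand-forecasting-kernels-"
        "only/download/ryQFx3IEtFjqjv3s0dXL%2Fversions%2FzjbSfpE39fdJl"
        "MotCpen%2Ffiles%2Ftrain.csv"
    )
    return pd.read_csv(url)


def monthly_sales(data):
    data = data.copy()
    # Drop the day indicator from the date column
    data.date = data.date.apply(lambda x: str(x)[:-3])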
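
To make the consolidation step concrete, here is a minimal sketch of how daily rows could be rolled up into monthly totals. It is an illustration under assumptions, not necessarily the code used for this article: the function name `monthly_sales_sketch`, the `store` and `sales` column names, and the sample values are all hypothetical. Only the `date` column and the date-trimming step come from the code above.

```python
import pandas as pd


def monthly_sales_sketch(data: pd.DataFrame) -> pd.DataFrame:
    """Collapse daily store-level rows into one total per calendar month."""
    data = data.copy()
    # Drop the day indicator, keeping only "YYYY-MM"
    data["date"] = data["date"].apply(lambda x: str(x)[:-3])
    # Assumed step: sum the hypothetical `sales` column over all stores and days
    monthly = data.groupby("date", as_index=False)["sales"].sum()
    # Parse "YYYY-MM" into a timestamp for the first day of each month
    monthly["date"] = pd.to_datetime(monthly["date"], format="%Y-%m")
    return monthly


# Tiny made-up sample: two stores, rows spanning two months
sample = pd.DataFrame(
    {
        "date": ["2013-01-01", "2013-01-02", "2013-01-01", "2013-02-01"],
        "store": [1, 1, 2, 1],
        "sales": [10, 20, 5, 7],
    }
)
monthly_sample = monthly_sales_sketch(sample)
print(monthly_sample)
```

Grouping on the trimmed date string gives one row per month regardless of store. Converting the result back to a datetime keeps the series sorted chronologically and ready for time series models.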
