AIRLINE_SATISFACTION_Data Science Solution on Azure

July 2024
Sanela Nikodinoska
AIRLINE SATISFACTION
DATA SCIENCE SOLUTION ON AZURE

7/4/2024
Agenda
- Introduction
- Automated ML
- Designer
- Notebooks – Python SDK
- Closing

Introduction
For DP-100, Designing and Implementing a Data Science
Solution on Azure, the last and most significant course of
the Data Science Institute held by Semos Education, the
Airline Satisfaction dataset was assigned as the basis for
designing and implementing a data science solution on
Azure. This presentation is an overview of the solutions
implemented in Azure Machine Learning Studio.
Because the Azure subscription was created for learning
purposes only and has since been cancelled, this presentation
is built from screenshots of the most important steps in
developing, training and deploying the ML models.

Automated ML
Screenshots
from Azure
Let’s dive in

First steps
Started with an Azure free trial, created a resource group from the UI and
created the Azure Machine Learning service

Data – created a data asset by uploading the local Airline Satisfaction dataset to Azure
(no screenshot for that)
Automated ML – created two experiments with different primary
metrics and featurization settings

Automated ML – the best model in both experiments was
MaxMinScaler, LightGBM; the experiments ended under the early stopping policy,
which is based on the level of the primary metric
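
The idea behind stopping on the primary metric can be sketched as follows. This is only an illustration of a patience-style rule, not Azure's actual implementation; the names `should_stop` and `run_until_stopped` and the `patience` and `min_delta` values are assumptions made for this example.

```python
def should_stop(scores, patience=3, min_delta=0.001):
    """Return True when the last `patience` scores did not beat the best
    earlier score by at least `min_delta` (higher score = better)."""
    if len(scores) <= patience:
        return False  # not enough history to judge yet
    best_before = max(scores[:-patience])
    best_recent = max(scores[-patience:])
    return best_recent < best_before + min_delta


def run_until_stopped(score_stream, patience=3, min_delta=0.001):
    """Consume primary-metric scores one by one and stop early."""
    history = []
    for score in score_stream:
        history.append(score)
        if should_stop(history, patience, min_delta):
            break
    return history


# Hypothetical primary-metric values from successive trials
trial_scores = [0.90, 0.93, 0.95, 0.95, 0.949, 0.9505, 0.96]
evaluated = run_until_stopped(trial_scores)
print(f"stopped after {len(evaluated)} of {len(trial_scores)} trials")
```

With a rule like this, later trials are never run once the metric has plateaued, which saves compute at the risk of missing a late improvement.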

Automated ML – metrics
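
The metric screenshots are not reproduced in this text. As a reminder of how common binary-classification metrics are derived from predictions, here is a minimal sketch; the function name `binary_metrics` and the sample labels are assumptions for illustration, and the metrics actually used as primary metrics in the experiments are not stated here.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 from two label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels: 1 = satisfied passenger, 0 = not satisfied
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
print(binary_metrics(y_true, y_pred))
```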

Designer
Screenshots
from Azure
Let’s dive in

Authoring
Using components from the Authoring -> Designer tab,
created two pipelines with two estimators each and
one pipeline with a single estimator for the
feature importance component:
- Two-Class Logistic Regression and Two-Class Decision Forest
- Two-Class Support Vector Machine and Two-Class Neural Network
(deep-learning model)
- Two-Class Boosted Decision Tree with the Cross Validate Model
component
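
What a cross-validation step does can be sketched in plain Python. This is not the Designer component itself; `k_fold_indices`, `cross_validate`, the fold count and the toy majority-class "model" are assumptions made for this example.

```python
def k_fold_indices(n_rows, k=5):
    """Yield (train_idx, valid_idx) pairs for k contiguous folds."""
    sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        valid_idx = list(range(start, stop))
        train_idx = list(range(0, start)) + list(range(stop, n_rows))
        yield train_idx, valid_idx
        start = stop


def cross_validate(rows, labels, fit, score, k=5):
    """Train on k-1 folds, score on the held-out fold, return fold scores."""
    scores = []
    for train_idx, valid_idx in k_fold_indices(len(rows), k):
        model = fit([rows[i] for i in train_idx],
                    [labels[i] for i in train_idx])
        scores.append(score(model,
                            [rows[i] for i in valid_idx],
                            [labels[i] for i in valid_idx]))
    return scores


# Toy stand-ins: the "model" is just the majority class of the training fold
def fit_majority(X, y):
    return max(set(y), key=y.count)


def accuracy(model, X, y):
    return sum(1 for label in y if label == model) / len(y)


rows = list(range(9))
labels = [1, 1, 0, 1, 1, 0, 1, 1, 0]
print(cross_validate(rows, labels, fit_majority, accuracy, k=3))
```

Every row is used for validation exactly once, so the spread of the fold scores gives a sense of how stable the model is.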
Jobs / Metrics
After configuring and
submitting the pipelines
and the environment images,
a job was created.
The job overview,
along with its outputs,
logs, child jobs and
metrics, is
presented in the
following screenshots
Registered
Models
The best-performing
models from Automated
ML and Designer
(Neural Network
and LightGBM)
were registered as
custom and MLflow
models
Real-time
Endpoint
A blue/green deployment
of the two best models was
created, and the blue
deployment was tested
for inference by
invoking the endpoint
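
The blue/green idea can be illustrated with a small routing sketch: one endpoint, two deployments, and a traffic split between them. This is a simplified simulation, not the Azure ML SDK; the function names and the percentages are assumptions made for this example.

```python
import random


def pick_deployment(traffic, rng=random):
    """Pick a deployment name using percentage weights that sum to 100."""
    if sum(traffic.values()) != 100:
        raise ValueError("traffic percentages must sum to 100")
    names = list(traffic)
    weights = [traffic[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


def route(traffic, forced=None, rng=random):
    """Route one request; `forced` targets a deployment directly, which is
    how a deployment receiving 0% of live traffic can still be tested."""
    if forced is not None:
        if forced not in traffic:
            raise KeyError(f"unknown deployment: {forced}")
        return forced
    return pick_deployment(traffic, rng)


# Hypothetical rollout: blue serves all live traffic, green is tested directly
traffic = {"blue": 100, "green": 0}
print(route(traffic))                   # always "blue" with this split
print(route(traffic, forced="green"))   # direct test call to green
```

Shifting the split gradually, for example to 90/10, lets the new deployment take a small share of real requests before it replaces the old one.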
DESIGNER
Compute targets
A compute instance for profiling the data asset and compute
clusters for training models and running pipeline sweep jobs
were created
Environments
Custom and curated environments were used
for deploying and testing

DESIGNER
Two-Class Logistic Regression and Two-Class Decision Forest pipeline

DESIGNER
Two-Class Support Vector Machine and Two-Class Neural Network

DESIGNER
Two-Class Boosted Decision Tree with Feature Importance component and Cross Validate Model component

Designer - Feature importance

JOBS – list of all experiments

JOBS – overview of Designer pipeline metrics – Random Forest Classifier,
best model
Note: no screenshots of the successful pipeline runs

JOBS – overview of Designer pipeline metrics – Two-Class Neural Network model

JOBS – overview of Designer pipeline metrics – Two-Class Logistic Regression,
lowest-performing model

Registered models

Deployment of models

Real-time Endpoint

Custom environment for the model
deployed to the endpoint

Predicting / Invoking the Endpoint

Environments – custom and curated environments for deploying and testing

Compute targets

Notebooks
Screenshots
from Azure
Let’s dive in

Notebooks – created a pipeline for training and scoring a
RandomForestClassifier model and tuning its hyperparameters
with a sweep job
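
Conceptually, a sweep job runs the same training step many times with different hyperparameter values and keeps the best trial. The sketch below shows that loop as a plain random search; it does not use the Azure ML SDK, and the search space, the `toy_score` stand-in and the function names are assumptions made for this example (the hyperparameters actually tuned are not listed here).

```python
import random


def sample_config(space, rng):
    """Draw one value per hyperparameter from a discrete search space."""
    return {name: rng.choice(values) for name, values in space.items()}


def sweep(train_and_score, space, max_trials=10, seed=0):
    """Run `max_trials` random trials and return the best one."""
    rng = random.Random(seed)
    trials = []
    for _ in range(max_trials):
        config = sample_config(space, rng)
        trials.append((train_and_score(config), config))
    best_score, best_config = max(trials, key=lambda trial: trial[0])
    return best_config, best_score, trials


def toy_score(config):
    """Stand-in for real training: peaks at n_estimators=100, max_depth=8."""
    return -(abs(config["n_estimators"] - 100) + abs(config["max_depth"] - 8))


# Hypothetical search space for a random forest
space = {"n_estimators": [50, 100, 200], "max_depth": [4, 8, 16]}
best_config, best_score, trials = sweep(toy_score, space, max_trials=20)
print(best_config, best_score)
```

In a real sweep each trial would be a child job that trains the model and logs the primary metric, which is what the child-job screenshots later in this section show.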

Notebooks – running hyperparameter tuning with a
sweep job

Notebooks – results from pipeline – child jobs

Notebooks – results from pipeline – best model

Notebooks – results from pipeline – model metrics

Notebooks – results from pipeline – predicting

Summary
No-code or programmatic
Whichever suits you best
Great business solution
Having all resources in one place
New subscription
For future projects
Getting work done
Finished my course project
So many services
Yet to discover: Azure Databricks, Azure Synapse
Analytics (for data ingestion), Azure AI services,
Azure Data Factory, etc.
Recommend
Definitely!

Closing
Thank you for your time.
Hoping to get your feedback
so I can improve.
Sanela Nikodinoska
snikodinoska@gmail.com
