Pandas DataFrame Notes

The document provides information for a job application test for SessionM. It contains a self-scoring sheet where the applicant rates their experience with various technologies on a scale of 1 to 10. It also contains a problem set with 4 questions for CoolBrand, an online retailer. The questions involve forecasting future product profits, identifying trends and anomalies in historical profit data, comparing future profit estimates for products, and steps for deploying code to production and automating jobs.

Uploaded by

scribd_sandeep

Welcome!

Thank you for applying to SessionM! Please note the following:

● This document contains two parts, a self-scoring sheet and a problem set.
● We will not consider how much time it took you to prepare your submission.
● We intend for you to spend no more than 2 hours on this test.
● We value partial responses and even pseudo-code on any problem in the problem set.
● Once you are ready, please submit your response to avazquezreina@sessionm.com (DropBox
link, zip file, etc.). If you email us a zip file, please also send a separate email telling us
that you did so, in case the original is blocked by our spam/anti-virus filter.

1. Self-scoring sheet
How would you grade your own experience and proficiency in the following areas and technologies?
Please use a scale of 1 to 10, where 10 = high proficiency/expertise, and add any clarification notes you
consider relevant.

● Deploying SW to a production environment
● Doing customer-facing work
● PySpark
● Spark in Scala
● Python
● Pandas
● Scikit-learn
● Statsmodels
● PyMC
● TensorFlow, Keras, or PyTorch
● Hive
● MySQL or PostgreSQL
● ElasticSearch
● Unix/Linux command line
● Shell scripting (Bash, Zsh, etc.)
● Scala
● git (on the command line)
● Luigi, Airflow or Pinball
● Jenkins
● AWS
  ○ EMR
  ○ EC2
  ○ Redshift
  ○ S3
● Google Cloud
● Microsoft Azure

2. Problem set
Each problem (on the next page) has a number of points that we’ll use to score your submission. The total
maximum score you can obtain in this problem set is 100 points. We don’t expect our candidates to reach
this score, or even attempt to solve every problem. We encourage you to allocate your effort wisely to
maximize your score.

We are looking for solutions written in Python and ideally in Spark, and encourage you to use open
source packages and libraries whenever possible. Certain questions are meant to be open ended, but
feel free to contact us with questions at any time. Good luck! :)

Preliminaries
Our client CoolBrand sells a number of products online. We help them track their profit per day per product.
The dataset records this figure in thousands of dollars, with rows representing dates and columns
representing products.
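
As a rough illustration of the layout described above, the dataset could be held in a pandas DataFrame indexed by date with one column per product. The product names and values below are hypothetical stand-ins, not the real CoolBrand data, and `top_products` is a helper introduced only for this sketch.

```python
import pandas as pd

# Hypothetical stand-in for the CoolBrand dataset: rows are dates,
# columns are products, values are daily profit in thousands of dollars.
dates = pd.date_range("2017-01-01", periods=4, freq="D")
profit = pd.DataFrame(
    {
        "product_a": [1.0, 2.0, 3.0, 4.0],
        "product_b": [0.5, 0.5, 0.5, 0.5],
        "product_c": [2.0, 2.0, 2.0, 2.0],
    },
    index=dates,
)


def top_products(frame, n):
    """Return the names of the n products with the largest total profit."""
    return frame.sum().nlargest(n).index.tolist()


top_two = top_products(profit, 2)
```

Ranking products this way is one simple reading of "top products"; other definitions (for example, most recent profit) would be equally defensible.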

1) Forecasting. 60 points

Please help CoolBrand forecast how much profit they will have per product in each day in the days
immediately following the dataset’s time window until the end of the year. Note that they need a forecast
of multiple data points (one per day) per product, not just a total single number per product for the rest of
the year. They will value results even on a handful of their top products!
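
One possible starting point, sketched under strong assumptions: a linear-trend baseline that emits one value per product per remaining day of the year. It assumes a gap-free daily DatetimeIndex and ignores seasonality and uncertainty; the function name `forecast_to_year_end` and the sample data are hypothetical.

```python
import numpy as np
import pandas as pd


def forecast_to_year_end(history):
    """Linear-trend baseline: one forecast per product per remaining day of the year.

    Assumes `history` has a daily DatetimeIndex with no missing days.
    """
    last_day = history.index[-1]
    future = pd.date_range(
        last_day + pd.Timedelta(days=1),
        pd.Timestamp(year=last_day.year, month=12, day=31),
        freq="D",
    )
    x = np.arange(len(history))
    x_future = np.arange(len(history), len(history) + len(future))
    out = {}
    for product in history.columns:
        # Fit profit = intercept + slope * day_number by least squares.
        slope, intercept = np.polyfit(x, history[product].to_numpy(), deg=1)
        out[product] = intercept + slope * x_future
    return pd.DataFrame(out, index=future)


# Hypothetical history ending three days before the end of 2017.
history = pd.DataFrame(
    {"product_a": [1.0, 2.0, 3.0, 4.0]},
    index=pd.date_range("2017-12-25", periods=4, freq="D"),
)
forecast = forecast_to_year_end(history)
```

A baseline like this is mainly useful as a reference point against which a seasonal or probabilistic model can be compared.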

2) Trends and anomalies. 60 points

CoolBrand sometimes notices some trends and anomalies in profitability. They have a hard time
separating them from “natural” random variation. Assuming that they didn’t change anything on their end
(pricing, advertising, promotions, etc) during the dataset’s time window, can you help them identify these
trends and/or anomalies in the data?
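
A minimal sketch of one way to separate anomalies from ordinary variation: the modified z-score based on the median and the median absolute deviation (MAD). The function name, the sample values, and the 3.5 cutoff are choices made for this illustration; a series with a strong trend would need detrending first.

```python
import numpy as np


def flag_anomalies(values, threshold=3.5):
    """Flag points whose modified z-score (median/MAD based) exceeds the threshold."""
    values = np.asarray(values, dtype=float)
    median = np.median(values)
    mad = np.median(np.abs(values - median))
    if mad == 0:
        # No spread around the median: nothing can be scored, so flag nothing.
        return np.zeros(values.shape, dtype=bool)
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    modified_z = 0.6745 * (values - median) / mad
    return np.abs(modified_z) > threshold


# Hypothetical daily profit for one product, with one obvious spike.
daily_profit = [10, 11, 9, 10, 12, 10, 30, 11, 9, 10]
flags = flag_anomalies(daily_profit)
```

Using the median and MAD rather than the mean and standard deviation keeps a single large spike from inflating the scale it is measured against.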

3) Comparisons. 60 points

CoolBrand executives are interested in comparing the total profitability of their products during the rest of
2017. They are looking to get two matrices. The first one P would contain elements P(i,j) with your
estimate of the difference in profit between product i and j (e.g. in thousands of $ or %) during the time left
in the year. The second one C would contain elements C(i,j) with your confidence in the corresponding
P(i,j) estimation. They would like you to compute C(i,j) and P(i,j) on at least a handful of their top products.
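
The two matrices could be sketched as follows, assuming rest-of-year totals per product are available as simulated samples (for instance from a probabilistic forecast). Here P(i,j) is taken as the mean difference in thousands of dollars and C(i,j) as the share of samples that agree with the sign of P(i,j); both definitions, the function name, and the sample numbers are assumptions of this sketch.

```python
import numpy as np


def comparison_matrices(total_samples):
    """Build P and C from simulated rest-of-year totals.

    total_samples: array of shape (n_samples, n_products).
    """
    samples = np.asarray(total_samples, dtype=float)
    # diffs[s, i, j] = total of product i minus total of product j in sample s
    diffs = samples[:, :, None] - samples[:, None, :]
    P = diffs.mean(axis=0)
    # C(i, j): share of samples whose difference has the same sign as P(i, j)
    C = (np.sign(diffs) == np.sign(P)).mean(axis=0)
    return P, C


# Hypothetical simulated totals for two products (four samples each).
samples = [[10, 8], [12, 9], [9, 10], [11, 7]]
P, C = comparison_matrices(samples)
```

With this definition P is antisymmetric, so P(j,i) = -P(i,j), and C is symmetric.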

4) Deploying code to production. 60 points

Can you write unit tests for one of the problems above? What steps would you need to take before, during and
after deploying your code to a production system? Finally, if you had to automate one of these jobs to run
on a daily basis, what technologies would you use, and how exactly would you use them?
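
As an illustration of what unit tests for a forecasting routine might look like, the sketch below uses Python's standard `unittest` module against a hypothetical `naive_forecast` function that simply repeats the last observed value. The function is a stand-in invented for this example, not a suggested solution.

```python
import unittest


def naive_forecast(history, horizon):
    """Hypothetical baseline: repeat the last observed value `horizon` times."""
    if not history:
        raise ValueError("history must not be empty")
    if horizon < 0:
        raise ValueError("horizon must be non-negative")
    return [history[-1]] * horizon


class NaiveForecastTest(unittest.TestCase):
    def test_one_value_per_forecast_day(self):
        self.assertEqual(len(naive_forecast([1.0, 2.0], horizon=5)), 5)

    def test_repeats_last_observation(self):
        self.assertEqual(naive_forecast([1.0, 2.5], horizon=2), [2.5, 2.5])

    def test_empty_history_is_rejected(self):
        with self.assertRaises(ValueError):
            naive_forecast([], horizon=3)
```

Saved in a file, these tests can be run with `python -m unittest`, which is also a natural check to run in a build step before a daily job is deployed.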
