Lasse Benninga
15-02-23
GoDataFest
My Path From Data Engineer
to Analytics Engineer
Lasse
Benninga
• Analytics Engineer @ GDD since
October 2021
• Studied Informatics in Groningen
• Love OSS and Data <3
• Live in Utrecht
• Enjoy podcasts, running
GODATADRIVEN
Chapters
• My (short) career as a Data Engineer
• The Rise of the Analytics Engineer
• Differences between DE and AE
• Tips for up-and-coming AEs
• My (short) career as a Data Engineer
My career
timeline
• Studied HBO Informatics in Groningen: Software Engineering (2013-2017)
• Followed the BI & Big Data Traineeship at Young Capital (2018)
• Placed as a trainee Data Engineer @ KLM (2018)
• In-house DE for KLM (2019)
• DE consultant (2020-2021)
• Analytics Engineer (2021+)
Data Engineer @ KLM
(2018-2020)
• Predictive Maintenance on Boeing fleet
• Worked in a product team of Data Scientists, Data
Engineers,
Aircraft Engineers
Main responsibilities:
• Extracting data from FTP
• Moving data into Hadoop cluster
• Writing Spark (PySpark + Scala)
• Using Containerization (Docker)
• Monitoring data quality
• Creating dashboards
• More on the pipeline side than the insights side
(Cloud) Data Engineer @ Consultancy
(2020-2021)
• Cloud-native Data Engineering
• Tools for provisioning infrastructure like Terraform,
Ansible
• Mainly writing Python, with serverless functions like AWS Lambda
• Less focused on the data and the stakeholders
Working as a Data
Engineer
Liked
• Learned about a lot of new cool technology:
Spark, Hadoop etc.
• A lot of freedom to investigate solutions, try out
said technologies
• Maintaining working infrastructure is
challenging and rewarding
Disliked
• Quite far removed from the business impact
• “Generic” work that does not require very domain
specific knowledge
• Hard to become an expert on the data itself
when you are battling the system
https://godatadriven.com/careers/analytics-engineer-amsterdam/
Enter the Analytics Engineer
(2020)
• The Rise of the Analytics Engineer
What is an
Analytics
Engineer?
• Term coined in 2018 on the blog Locally Optimistic
• Bridges the gap between Data Engineer and Data Analyst
• Relies heavily on “the modern data stack”
• Goes hand-in-hand with the motto of dbt Labs: bringing software engineering best practices to data analysts
https://ourworldindata.org/grapher/historical-cost-of-computer-memory-and-storage?country=~OWID_WRL
Trend #1: Storage has become incredibly cheap over the past years
Trend #2: Managed Cloud Data Warehouse
https://www.castordoc.com/blog/cloud-data-warehousing-the-past-present-and-future
Trend #3: SQL as the Lingua Franca of Data Transformations
Trend #4: Off-the-shelf Ingestion Tooling with Hundreds of Connectors
Fivetran connectors
Trend #5: Move from ETL to ELT Architecture
https://validio.io/blog/dbt-and-the-analytics-engineer--whats-the-hype-about
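The ETL-to-ELT shift of Trend #5 can be illustrated with a small sketch: the raw data is loaded first, untouched, and only afterwards transformed with SQL inside the warehouse. This is not taken from the talk. It assumes an in-memory SQLite database as a stand-in for a cloud data warehouse, and the table names (`raw_orders`, `daily_revenue`) and sample rows are hypothetical.

```python
import sqlite3

# Hypothetical raw records, as an ingestion tool might deliver them: untyped strings.
RAW_ORDERS = [
    ("1", "2023-01-05", " 19.99"),
    ("2", "2023-01-05", "5.00 "),
    ("3", "2023-01-06", "12.50"),
]


def run_elt(rows):
    """Load raw rows as-is, then transform them with SQL inside the database."""
    conn = sqlite3.connect(":memory:")  # assumption: stand-in for a cloud DWH

    # Extract + Load: land the data without cleaning it first.
    conn.execute("create table raw_orders (id text, order_date text, amount text)")
    conn.executemany("insert into raw_orders values (?, ?, ?)", rows)

    # Transform: happens afterwards, in SQL, where the data already lives.
    conn.execute(
        """
        create table daily_revenue as
        select order_date,
               round(sum(cast(trim(amount) as real)), 2) as revenue
        from raw_orders
        group by order_date
        """
    )
    query = "select order_date, revenue from daily_revenue order by order_date"
    return conn.execute(query).fetchall()


result = run_elt(RAW_ORDERS)
print(result)
```

In an ETL setup the trimming and casting would happen in pipeline code before loading. Here the raw table stays available, so the transformation can be changed and re-run without re-ingesting.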
Trend #6: dbt as the central part of the Modern Data Stack
dbt
Released in 2016, data build tool (dbt) is a workflow tool for transforming data inside a data warehouse:
• Open-source software with a SaaS offering (dbt Cloud)
• Connects to most major cloud DWHs
• Runs SQL statements on the DWH
• Creates a DAG of SQL “models”
• Supports documentation in code
• Cloud offering contains Scheduler, IDE, Documentation hosting
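The "DAG of SQL models" idea can be sketched in a few lines: each model is a SQL statement, references to other models define the edges, and a topological sort gives a valid run order. This is a toy illustration, not dbt's actual implementation. The model names and SQL are hypothetical, and the regular expression only handles the simplest form of a `ref()` call.

```python
import re
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical models: name -> SQL text, using dbt-style ref() calls.
MODELS = {
    "stg_flights": "select * from raw.flights",
    "stg_aircraft": "select * from raw.aircraft",
    "fct_maintenance": (
        "select f.*, a.type from {{ ref('stg_flights') }} f "
        "join {{ ref('stg_aircraft') }} a using (aircraft_id)"
    ),
    "rpt_fleet_health": (
        "select type, count(*) from {{ ref('fct_maintenance') }} group by type"
    ),
}

# Matches {{ ref('model_name') }} and captures the model name.
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*['\"](\w+)['\"]\s*\)\s*\}\}")


def build_order(models):
    """Return model names so that every model comes after the models it references."""
    graph = {name: set(REF_PATTERN.findall(sql)) for name, sql in models.items()}
    return list(TopologicalSorter(graph).static_order())


order = build_order(MODELS)
print(order)
```

The staging models come out first and the report last. Any order that respects the references is valid, so the relative position of the two staging models is not guaranteed.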
dbt Labs
https://www.getdbt.com/blog/the-four-priorities-of-an-analytics-team-of-one-lessons-from-lola-com/
• Differences between DE and AE
DE vs AE
Source: dbt Labs
Data
Engineer
• More like a traditional Software
Engineer
• Writes in one or more programming
languages
• Leverages a large set of tools, custom-built or off-the-shelf; manages services and infrastructure
• Less of an expert on the data itself,
more focused on getting data from
A to B
• Often more expensive hourly due
to niche knowledge
Analogy: Plumber
Analytics
Engineer
• Tends to be more like an analyst,
but with technical chops
• SQL-first, then probably Python
• Leverages a lot of SaaS
• More of a subject-matter/domain
expert, depending on the team
• Curates high-quality datasets
• Often cheaper hourly rate than Data
Engineers
Analogy: Librarian
Analytics
Engineer
• AEs own and clean the datasets, building high-quality, performant data models that save other data team members time.
• AEs also work on the design and development of CI/CD pipelines so that data teams can efficiently apply, test, and release analytics.
• AEs think about the data model and modelling techniques.
• AEs apply software engineering best practices like version control, data testing, and documentation.
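As an illustration of the data testing mentioned above, the sketch below expresses two common checks (no nulls, no duplicates in a key column) as SQL queries that count offending values; a check passes when the count is zero. It is not taken from the talk. The `customers` table, its rows, and the `failing_counts` helper are hypothetical, and SQLite stands in for the warehouse.

```python
import sqlite3


def failing_counts(conn, table, column):
    """Count null values and duplicated values in one column; zero means the check passes."""
    # Identifiers are interpolated directly: fine for a sketch with trusted names only.
    not_null_sql = f"select count(*) from {table} where {column} is null"
    unique_sql = (
        f"select count(*) from ("
        f"select {column} from {table} where {column} is not null "
        f"group by {column} having count(*) > 1)"
    )
    return {
        "not_null": conn.execute(not_null_sql).fetchone()[0],
        "unique": conn.execute(unique_sql).fetchone()[0],
    }


conn = sqlite3.connect(":memory:")  # assumption: stand-in for a cloud DWH
conn.execute("create table customers (customer_id integer, name text)")
conn.executemany(
    "insert into customers values (?, ?)",
    [(1, "Ada"), (2, "Grace"), (2, "Grace again"), (None, "Unknown")],
)

report = failing_counts(conn, "customers", "customer_id")
print(report)
```

Because each check is a plain query, checks like these can live in version control next to the models and run in a CI/CD pipeline before a release.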
Tips for up-and-coming
AEs
• Try out a cloud data warehouse + ingestion tool + dbt + visualization tool
• Create an end-to-end project combining these (free) tools
• Add some CI/CD, data quality testing, and monitoring tooling
• Become proficient in SQL (HackerRank/LeetCode)
• Apply at Xebia Data (formerly known as
GoDataDriven) as an Analytics
Engineer!
Questions
