The L&D Guide to Data Fluency
1 The march to data fluency
Digital transformation programs fail for many reasons, but a key one is failing to recognize
that sustainable, organization-wide data skills are a prerequisite for success. Gartner finds
that fewer than 50% of documented corporate strategies mention data analytics as a critical
lever for delivering enterprise-wide value.
The underutilization of data science and business analytics dooms digital transformation
initiatives from the start: to improve digital services and processes, you must have insight
into the data they generate. Forrester estimates that an average of 60 to 73 percent of
organizational data goes untouched for analysis. The key differentiators between the
disruptors and the incumbents are not technological. They lie in the disruptors’ data-driven
culture, the insights they draw from data while examining and iterating upon their services,
and the data fluency skills they foster.
Many organizations have tried to bridge the data gap by creating data science and analytics
teams and investing in data tools. But merely laying out a data infrastructure and hiring data
scientists isolates data science as a service center, and won’t usher in organization-wide data
fluency. In a data-driven organization, data science—and more broadly, data fluency—is an
inclusive methodology for answering organizational questions where everyone is equipped to
answer questions with data. For example, a marketing analyst would be able to use data to
optimize their marketing spend, and a business analyst would visualize and describe data to
prescribe actionable insights. The success of your digital transformation hinges on having the
appropriate data skills across the organization.
Addressing data fluency skill gaps has become even more important given the cost of not
doing so. Organizations digitizing their processes and services expect their employees to
work with data, even when those employees lack the skills to do their best work. The result is
employees who aren’t empowered to act on their data and a culture where, according to
Accenture, about 50% of employees avoid data-related tasks or find alternative methods to
solve tasks without relying on data. All of this hurts the organization’s decision-making
process and ability to iterate, and diminishes the opportunities for career growth across
teams.
Companies are responding by ushering in a new era of upskilling for the digital age. A McKinsey
study found that more than two-thirds of organizations have or plan to have an upskilling
initiative to address skill gaps. More importantly, the same study found that 70% of
organizations that invested in upskilling report positive business impacts that exceed the
initial investment. For example, 48% of organizations have reported moderate to significant
positive effects on bottom-line growth due to upskilling, and 73% have reported moderate to
substantial improvements in employee satisfaction. A Deloitte survey asking executives about
the data maturity of their organizations found that 88% of organizations that have undergone
organization-wide analytics upskilling have exceeded business goals.
The challenges in upskilling for data fluency
Unlike the single skills taught in traditional learning and development initiatives, data
fluency is a methodology for answering business questions. Learning journeys will vary
depending on how much each individual interacts with data. For example, a marketing analyst
who regularly works with Excel may need to learn R or Python to succeed at their job, while a
manager or leader may only need to know how to make educated decisions using data.
This is why a role-based, persona-driven learning journey is more effective at scaling data
fluency training programs. Every persona has a different relationship with data and would
need to acquire competencies in different tools, and grow different skills to thrive in the digital
age.
Creating scalable and personalized learning paths for data fluency across the organization
requires familiarity with the broad landscape of data tools and a nuanced view of the
different types of data personas found in data-driven organizations. In short, there’s no
one-size-fits-all approach to data fluency.
2 A breakdown of data tools and data personas
A breakdown of crucial data tools
While programming languages are at the forefront of data fluency upskilling initiatives, it’s
also essential to consider the entire landscape of data tools. Just as data science (or, more
broadly, data fluency) is a means of answering business questions, data tools are a means of
performing data-related tasks.
1 Programming Languages
Open-source programming languages have skyrocketed in popularity over the last two
decades for data workflows. Apart from being free to use, open-source programming
languages provide a plethora of tools and packages that allow practitioners to hone skills
and perform data tasks across all data fluency competency areas. The most relevant
open-source programming languages for data science are Python, R, and Scala.
2 SQL
SQL (Structured Query Language) allows data professionals to query, access, and
manipulate data inside database management systems. Just as one human language
can have many dialects, SQL has many dialects (e.g., PostgreSQL, Oracle SQL,
SQL Server) that are used by different organizations. The differences are generally
minor, as the dialects share much of their syntax and features.
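To make the idea of querying and aggregating data concrete, here is a minimal sketch that
uses Python's built-in sqlite3 module (SQLite is yet another SQL dialect, not one of those
named above). The campaigns table, its columns, and every figure in it are hypothetical and
exist only for illustration.

```python
import sqlite3

# In-memory SQLite database; the table and its values are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE campaigns (channel TEXT, spend REAL, signups INTEGER)")
conn.executemany(
    "INSERT INTO campaigns VALUES (?, ?, ?)",
    [
        ("email", 1000.0, 200),
        ("search", 3000.0, 300),
        ("social", 2000.0, 100),
        ("email", 500.0, 100),
    ],
)

# Aggregate spend and signups per channel, cheapest cost per signup first.
query = """
    SELECT channel,
           SUM(spend) AS total_spend,
           SUM(signups) AS total_signups,
           SUM(spend) / SUM(signups) AS cost_per_signup
    FROM campaigns
    GROUP BY channel
    ORDER BY cost_per_signup
"""
rows = conn.execute(query).fetchall()
conn.close()

for channel, total_spend, total_signups, cost_per_signup in rows:
    print(f"{channel}: {cost_per_signup:.2f} spent per signup")
```

The same SELECT ... GROUP BY ... ORDER BY pattern should carry over to the other dialects
with little or no change, which is one reason SQL skills transfer well between organizations.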
3 Business Intelligence Tools (BI Tools)
BI tools have gained momentum over the past decade. They are essentially supercharged
spreadsheet tools made for the digital age: they allow for the organization, aggregation,
and visualization of data in easy-to-use point-and-click dashboards, with no coding
required. While there are many business intelligence tools, commonly used ones include
the following:
Tableau offers a robust, flexible, and intuitive interface to connect to raw data and
create beautiful and interactive visualizations that allow teams to get an overview of
their data.
Power BI offers seamless connectivity with various raw data sources and an easy
point-and-click interface to visualize and process data. A key feature of Power BI is
that one version of it comes with Office 365 for the Enterprise and has an interface
that is slightly reminiscent of Excel.
4 Spreadsheets
Spreadsheets have long been the go-to tool for data practitioners. They offer easy,
intuitive drag-and-drop interfaces for manipulating, aggregating, and visualizing
data. However, they fall short when processing large amounts of data and often become
bottlenecks when creating reproducible, shareable analyses. The most popular spreadsheet
tools are Microsoft Excel and Google Sheets.
5 Big Data Tools
As organizations started collecting more and more data, it became imperative to efficiently
structure, organize, and store big data. Many solutions have emerged that accommodate
manipulating big data and orchestrating big data workflows. Notable examples include:
Spark is a framework that allows for large-scale data processing and manipulation. It
can be used from Python, Scala, or R.
6 Command Line Tools
Command line tools are used to systematize file handling, enable version control, work
with cloud tools, and execute data pipelines developed using other data tools (especially
programming languages) easily and at scale. The two most notable command line tools
are the following:
Shell is a command line interface that allows running programs, automating tasks,
and accessing file directories.
Git is a version control tool that allows for easily tracking and experimenting with
changes made to code repositories.
7 Cloud platforms
Cloud platforms provide tools for organizations and teams to store and process data, host
applications, and deploy data pipelines, all using remote computing resources hosted by
cloud providers. They have become the de facto solution for data infrastructure for many
organizations because they are easier to maintain and scale, and more resilient. The most
widely used cloud platform providers are Amazon Web Services (AWS), Microsoft Azure, and
Google Cloud.
The different relationships with data
While each organization and the data it produces are different, there are commonalities in
the relationships individuals have with data. A useful way of thinking about and
scaling data-focused upskilling efforts is with data personas. Each data persona has a
different relationship with data and requires different data fluency competencies to become
empowered to do their best work. Organizations can then map the roles within the
organization to these personas and create a curated, personalized learning experience
based on what each persona needs to learn.
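The role-to-persona mapping described here can be sketched as a small lookup structure.
This is only an illustration under stated assumptions: the persona keys, the job titles, the
tools assigned to each persona, and the learning_path helper are all hypothetical, not a
prescribed catalog.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Persona:
    name: str
    tools: Tuple[str, ...]  # tools this persona is expected to work with


# Hypothetical catalog: the tool assignments are illustrative assumptions.
PERSONAS: Dict[str, Persona] = {
    "consumer": Persona("Data Consumer", ("Spreadsheets", "BI tools")),
    "business_analyst": Persona("Business Analyst", ("Spreadsheets", "SQL", "BI tools")),
    "data_scientist": Persona("Data Scientist", ("SQL", "Python", "Spark")),
}

# Hypothetical mapping from job titles to personas.
ROLE_TO_PERSONA: Dict[str, str] = {
    "Marketing Manager": "consumer",
    "Financial Analyst": "business_analyst",
    "Research Scientist": "data_scientist",
}


def learning_path(role: str, known_tools: Iterable[str]) -> List[str]:
    """Return the persona's tools that the learner has not yet picked up."""
    persona = PERSONAS[ROLE_TO_PERSONA[role]]
    known = set(known_tools)
    return [tool for tool in persona.tools if tool not in known]


# A financial analyst who already knows spreadsheets.
print(learning_path("Financial Analyst", ["Spreadsheets"]))
```

Keeping roles and personas in separate tables means a new job title only needs one new
mapping entry, while the learning content stays attached to the persona.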
1 Data Consumers and Leaders
Data Consumers and Leaders often work in nontechnical roles, but they consume data
insights and analytics to make data-driven decisions. They often need to have
conversations with data professionals and should be able to distinguish when data can
and cannot be used to answer business questions.
2 Business Analysts
Business Analysts are responsible for tying data insights to actionable results that increase
profitability or efficiency. They have deep knowledge of the business domain and often
use SQL alongside non-coding tools to communicate insights derived from data.
3 Data Analysts
Similar to Business Analysts, Data Analysts are responsible for analyzing data and
reporting insights from their analysis. They have a deep understanding of the data
analysis workflow and report their insights through a combination of coding and non-
coding tools.
4 Data Scientists
Data Scientists investigate, extract, and report meaningful insights from the organization’s
data. They communicate these insights to nontechnical stakeholders and have a good
understanding of machine learning workflows and how to tie them back to business
applications. They work almost exclusively with coding tools, conduct analysis, and often
work with big data tools.
5 Machine Learning Scientists
Machine Learning Scientists are responsible for developing machine learning systems at
scale. They derive predictions from data using machine learning models to solve problems
like predicting churn and customer lifetime value, and are responsible for deploying these
models for the organization to use.
6 Statisticians
Similar to Data Scientists, Statisticians work on highly rigorous analysis, which involves
designing and maintaining experiments such as A/B tests and performing hypothesis tests. They
focus on quantifying uncertainty and presenting findings in fields that require exceptional
degrees of rigor, such as finance or healthcare.
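As one illustration of what analyzing an A/B test can involve, the sketch below runs a
two-sided, two-proportion z-test using only Python's standard library. The function name and
the conversion counts are hypothetical, and this test is just one common choice; the
appropriate method depends on how the experiment was designed.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for the difference between two conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical experiment: 100 of 1,000 visitors convert on variant A,
# 130 of 1,000 convert on variant B.
z, p_value = two_proportion_z_test(100, 1000, 130, 1000)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

A small p-value suggests the difference in conversion rates is unlikely to be due to chance
alone, which is the kind of quantified uncertainty this persona is expected to report.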
7 Programmers
Programmers are highly technical individuals who work on data teams to automate
repetitive tasks when accessing and working with an organization’s data.
They bridge the gap between traditional software engineering and data science and have
a thorough understanding of deploying and sharing code at scale.
8 Data Engineers
Data Engineers are responsible for getting the right data into the hands of the right people.
They create and maintain the infrastructure and data pipelines that bring terabytes of raw
data from different sources into one centralized location with clean, relevant data
for the organization.
3 How DataCamp scales learning and development for data fluency
While a clear understanding of data roles and tools is essential to scaling learning and
development programs for data fluency, not every organization can create its own data
university like Airbnb has done. Depending on your organization’s data roles, the tools your
employees use, and the data skills they have, a variety of educational programs, both online
and in person, can get your team up to speed.
At DataCamp, we create persona-driven, personalized learning journeys for each role based on
their skill set and desired learning outcomes. With DataCamp Signal™, learners can assess their
skills in various data competency areas, including data literacy, programming, data analysis,
and more, and receive personalized course recommendations based on their skill gaps. These
courses were created by the best instructors and industry professionals in the world. Unlike
other platforms, DataCamp features a modern learning experience with bite-sized videos and
hands-on coding exercises, so employees learn by doing and stay engaged. Our mobile app
makes it easy for learners to hone skills on the go with short practice sessions to reinforce what
they’ve learned. And with DataCamp projects, they can tackle real-world problems in a risk-free
environment and apply their new data skills right away.
This entire learning experience is easy to implement and manage for teams of any size, with an
administrator dashboard that allows custom learning paths based on roles and departments,
advanced analytics and insights to measure the impact of online learning, and seamless SSO
and LMS integrations. Teams benefit from our Customer Success Managers, who partner with
organizations to accelerate learning adoption and provide valuable recommendations to help
achieve organization-wide data fluency. We have more than 6 million learners around the world,
and we’re just getting started. Close the talent gap. Visit datacamp.com.