Data Strategy
3 key attributes that can help your organization unlock more value from data
Comprehensive
Integrated
Governed
INTRODUCTION
1 “Creating a data-driven culture,” CIO.com, March 2022
BECOMING DATA-DRIVEN
Modern organizations need to easily access and analyze diverse types of data,
including log files, clickstreams, voice, and video. However, these wide-ranging
data types are typically stored in silos across multiple data stores. To extract
intelligence, organizations must break down these silos to unify all types of
data. This important optimization of costs and operations is transforming
the infrastructure from a source of complexity and expense to an engine of
value creation.
The current state of decision making is unsustainable
Gartner reports that 65 percent of decisions made today are more complex
(involving more stakeholders or choices) than they were five years ago.2 To
make better and faster decisions, organizations need the ability to perform
analytics and machine learning (ML) operations in an agile, cost-effective
way, using optimal tools and performance to scale for each use case.
Organizations can no longer waste precious time constantly redeploying and
reconfiguring infrastructure to scale performance and capacity.
Trying to maintain data governance is a full-time job
Traditional data architectures require risky, complicated management
procedures because data is accessed from so many places. Granting, tracking,
auditing, and removing employee access, while simultaneously remaining
in compliance with a growing number of regulations, is a full-time job.
Automating these mandatory data governance tasks frees modern teams to
shift their focus back to innovation.
Data is increasingly difficult to secure
There was a time when IT teams chose between making their architectures fast
or making them secure. Now, they need to deliver both. Meanwhile, security
attacks increased by 31 percent from 2020 to 2021, according to Accenture’s
State of Cybersecurity Resilience 2021 report, while average attacks per
organization increased from 206 to 270 year over year.4 How can organizations
maximize privacy and security?
2 “How to Make Better Business Decisions,” Gartner, October 2021
3 “Half of AI Models Never Make It To Production: Gartner,” EnterpriseAI, August 2022
4 “State of Cybersecurity Resilience 2021: How aligning security and the business creates cyber resilience,” Accenture, 2021
According to a PwC survey of more than a thousand senior executives, highly data-driven
organizations are three times more likely to report significant improvements in decision
making compared to those that rely less on data.5
AWS can help your organization implement an end-to-end strategy that makes data
management easier at every step of the journey—from ingesting, storing, and querying data
to analyzing, visualizing, and running ML models. Regardless of your business challenges, your
data strategy should be:
1. Comprehensive: Equipped with the right tools, with the optimal price performance for any
user, type of data, and use case
2. Integrated: The ability to integrate data that is stored and analyzed in different tools and
systems to gain a better understanding of your business and predict what will happen
3. Governed: Governance of all your data to securely give data access when and where your
users need it to speed innovation
A data-driven mindset may also require a broader cultural change in which both goals and
decisions are supported by the data strategy.
Follow the link below to explore why data plays a vital role in enabling this cultural change.
And learn why a growing number of companies are leveraging data-driven capabilities to
automate a set of business-critical use cases.
5 Farrell, M., “Data and Intuition: Good Decisions Need Both,” Harvard Business Publishing, January 2023
1. Comprehensive
Equipped with the right tools, with optimal price performance for any
user, use case, and data type
Businesses need to build future-proof data strategies that can meet their
needs now and in the future. It takes more than just a single data lake, data
warehouse, or business intelligence (BI) tool to harness data effectively. It
requires an end-to-end data strategy with a comprehensive set of tools that
accounts for the scale and variety of data and the many purposes for which
you want to use it. In fact, 94 percent of the top 1,000 AWS customers use
more than 10 AWS Database and Analytics services.
Building with a cloud provider that innovates to continuously bring you all
the data tools you will need and more with the right price performance for
your use case ensures that you have a data strategy that grows with you.
AWS has the broadest and deepest set of data capabilities to support any
data workload or use case. From data storage to analytics, machine learning
(ML), and generative AI, to end-user tools and solutions, AWS provides the
right capability to address your use case, so you don’t have to compromise
on performance, cost, or results. AWS is continually accelerating its pace of
innovation, so you will never outgrow AWS for your data needs.
To make decisions in real-time, you will need streaming data services such as Amazon Kinesis
Data Streams (Amazon KDS), which allow you to build applications for high-frequency event
data, such as clickstream data, and gain access to insights in seconds. Amazon Kinesis Data
Firehose simply and reliably loads data streams into data lakes, warehouses, and analytics
services, with no extract, transform, and load (ETL) or cumbersome data preparation required.
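As a rough illustration of the clickstream case, the sketch below shapes one event for a Kinesis Data Streams PutRecord call. The parameter names StreamName, Data, and PartitionKey are those of the boto3 put_record call. The stream name, the event fields, and the FakeKinesisClient stand-in are assumptions made here so the example runs without an AWS account.

```python
import json
from datetime import datetime, timezone


def build_click_record(user_id: str, page: str) -> dict:
    """Build the keyword arguments for a Kinesis PutRecord call."""
    event = {
        "user_id": user_id,
        "page": page,
        "ts": datetime.now(timezone.utc).isoformat(),
    }
    return {
        "StreamName": "clickstream-demo",  # hypothetical stream name
        "Data": json.dumps(event).encode("utf-8"),
        # Records sharing a partition key are routed to the same shard.
        "PartitionKey": user_id,
    }


class FakeKinesisClient:
    """Offline stand-in (an assumption) for boto3.client("kinesis")."""

    def __init__(self):
        self.records = []

    def put_record(self, **kwargs):
        self.records.append(kwargs)
        return {
            "ShardId": "shardId-000000000000",
            "SequenceNumber": str(len(self.records)),
        }


def send_click(client, user_id: str, page: str) -> dict:
    """Send one clickstream event through any client exposing put_record."""
    return client.put_record(**build_click_record(user_id, page))


client = FakeKinesisClient()
send_click(client, "user-42", "/pricing")
print(json.loads(client.records[0]["Data"])["page"])  # prints /pricing
```

In a real deployment, the fake client would be swapped for a boto3 Kinesis client, and a consumer application would read the events from the stream.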
Enabling data insights throughout the organization
It’s no longer just data-savvy individuals who can rapidly extract valuable, relevant insights
from data to help inform decision making. ML-powered BI solutions such as Amazon
QuickSight enable easy connectivity to data sources. Business analysts can use this data to
showcase fresh trends and predictive insights on interactive BI visualizations and dashboards.
Amazon QuickSight Q uses ML, allowing users to query their data in plain language
without writing a single line of code. Business users can even ask “why” questions to better
understand factors that are impacting data trends. They can also forecast metrics by stating
something such as, “Forecast sales for the next 12 months” to receive an immediate response
based on the insights of past data and seasonality. With Amazon SageMaker Canvas, a visual
point-and-click interface enables business analysts to generate accurate ML predictions
without prior experience. In just a few clicks, analysts can import data from various sources,
automatically prepare data, and build and analyze ML models.
Boosting data proficiency
Having employees who can use data effectively will help your organization achieve its data
objectives. Invest in educating and upskilling your workforce in data, analytics, and ML with
AWS Training.
Scale data-driven decision making throughout your organization
• Amazon QuickSight: Meet varying analytic needs from the same source of truth
through modern interactive dashboards, paginated reports, embedded analytics, and
natural language queries
• Amazon SageMaker Canvas: AWS no-code interface that enables business analysts to
generate accurate ML predictions without prior experience
• Amazon DataZone: Simplifies governed access to data for business users
• AWS Training and Certification: More than 150 professional development courses
related to data, analytics, and ML
• Amazon Bedrock: The easiest way to build and scale generative AI applications with
foundation models
CUSTOMER STORY
BMW Group democratizes data usage at scale
BMW Group moved to an AWS-based centralized data lake for its
agility, flexibility, and its ability to process terabytes of telemetry
data from millions of vehicles daily. Anonymized data from vehicle
sensors and other sources across the enterprise is now easily
accessible for internal teams who create customer-facing and
internal applications. Building up a human-readable data catalog
and clearly displaying data resources proved essential, boosting the
productivity of data analysts, data scientists, and engineers.
2. Integrated
Break down silos so data can be put to work effectively
Opportunities to transform your business with data exist all along the
value chain. But making such a transformation requires you to see the full
picture of your customer and business. With data spread across multiple
departments, services, on-premises databases, and third-party applications,
you need to be able to easily integrate data across silos to get the best
insights. Companies have various approaches to how they are unifying
data—data mesh, lake house, data fabric, and so on—but typically, it
involves a data lake as a foundational element. Data lakes allow you to
collect, store, organize, and process valuable data from your data silos and
make it available to analytics, visualization, and ML tools in a governed way.
Zero-ETL
Many organizations have multiple data lakes in addition to data warehouses, analytics
tools, ML tools, and software-as-a-service (SaaS) applications. Integrating data across silos
requires complex ETL pipelines, which can take hours, if not days. That’s just not fast enough
for modern decision making. Organizations should adopt technologies that automate or
eliminate ETL where possible.
AWS is investing in a Zero-ETL future, allowing organizations to automatically integrate
all of their data. This includes bringing ML to the data source with SageMaker integration
into Amazon Redshift, Amazon Aurora, Amazon Athena, and Amazon Neptune; integrating
Amazon Aurora and Amazon Redshift for real-time analytics; and providing a direct
integration between Amazon S3 and Amazon Redshift for real-time data streams. In addition,
you can run queries across data stored in operational databases, data warehouses, and data
lakes to provide insights across multiple data sources with no data movement using Amazon
Athena and Amazon Redshift.
Connect with hundreds of data sources
• Amazon AppFlow: Integrate data lakes and data warehouses with 50+ sources of data
• AWS Data Exchange: Access 350+ third-party providers and 3,500+ public data products
• Amazon SageMaker Data Wrangler: Build ML models with 40+ data sources with a
single click
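The idea of running one query across separate stores with no data movement can be illustrated with a small local analogy. The sketch below is not Amazon Athena or Amazon Redshift. It uses Python's built-in sqlite3 module and two independent in-memory databases, standing in for an operational store and a warehouse, to show a single SQL join that spans both without an ETL step copying rows from one to the other. The table names and sample rows are assumptions.

```python
import sqlite3

# One connection, two independent in-memory databases standing in for
# an operational store ("main") and a warehouse ("wh").
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS wh")

conn.execute(
    "CREATE TABLE main.orders (order_id INTEGER, customer_id INTEGER, amount REAL)"
)
conn.execute("CREATE TABLE wh.customers (customer_id INTEGER, region TEXT)")
conn.executemany(
    "INSERT INTO main.orders VALUES (?, ?, ?)",
    [(1, 10, 120.0), (2, 11, 80.0), (3, 10, 50.0)],
)
conn.executemany(
    "INSERT INTO wh.customers VALUES (?, ?)",
    [(10, "EMEA"), (11, "APAC")],
)


def revenue_by_region(connection):
    """Join across both stores in one query, without copying rows between them."""
    rows = connection.execute(
        """
        SELECT c.region, SUM(o.amount)
        FROM main.orders AS o
        JOIN wh.customers AS c ON c.customer_id = o.customer_id
        GROUP BY c.region
        ORDER BY c.region
        """
    ).fetchall()
    return dict(rows)


print(revenue_by_region(conn))  # {'APAC': 80.0, 'EMEA': 170.0}
```

The point of the analogy is the shape of the query: each table stays where it lives, and the engine resolves the join at query time.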
3. Governed
Free your teams to move faster with governed data access across the
data lifecycle
As more data migrates to the cloud, driven by the cloud’s near-infinite scale
and horsepower, it’s imperative that enterprise data governance models
evolve in lockstep. IT and business leaders need up-to-date policies to
protect data as it moves back and forth among different repositories and to
accommodate changing privacy and data security regulations about where
data can be stored.6
6 Wexler, J., “A unified approach to data governance,” CIO, August 2021
Simplifying data access permissions
Implementing a successful governance strategy continues to present a unique set of
challenges. It’s time-consuming and challenging for organizations to give internal or
external data consumers the right level of access to specific datasets. They
often engage in heavy lifting, such as manual scripts or investigating individual data clusters,
to figure out which consumers have access to what data.
Manual work can also lead to costly data quality issues across different teams and
departments. Without centralized governance tools, data gets locked down in silos, which
means you won’t be able to access and analyze all the data you may need to solve problems
or identify large areas of opportunity.
Govern holistically with AWS
• AWS Lake Formation: Makes it easy to govern and audit the actions taken with data
in your data lake on Amazon S3
• Amazon DataZone: A data management service to catalog, discover, share, and
govern data
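To make the contrast with manual scripts concrete, the sketch below shows the core idea behind a centralized governance tool: every grant is recorded in one registry, so the question of which consumers have access to what data becomes a single lookup. This is a toy model and not the AWS Lake Formation API. The GrantRegistry class, the principal names, and the dataset names are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    principal: str   # user, team, or role (hypothetical names)
    dataset: str
    permission: str  # e.g. "SELECT" or "INSERT"


class GrantRegistry:
    """A single place to record, revoke, and audit data access grants."""

    def __init__(self):
        self._grants = set()

    def grant(self, principal, dataset, permission):
        self._grants.add(Grant(principal, dataset, permission))

    def revoke_all(self, principal):
        """Remove every grant for a principal, e.g. when an employee leaves."""
        self._grants = {g for g in self._grants if g.principal != principal}

    def who_can_access(self, dataset):
        """Return each principal's permissions on one dataset, sorted by name."""
        access = defaultdict(set)
        for g in self._grants:
            if g.dataset == dataset:
                access[g.principal].add(g.permission)
        return {p: sorted(access[p]) for p in sorted(access)}


registry = GrantRegistry()
registry.grant("analyst-team", "sales.orders", "SELECT")
registry.grant("etl-service", "sales.orders", "INSERT")
registry.grant("etl-service", "sales.orders", "SELECT")
registry.grant("former-employee", "sales.orders", "SELECT")
registry.revoke_all("former-employee")

print(registry.who_can_access("sales.orders"))
# {'analyst-team': ['SELECT'], 'etl-service': ['INSERT', 'SELECT']}
```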
7 “The Best Offense Is a Great Defense,” TechCrunch Brand Studio, sponsored by AWS, 2022
8 “The Economic Impact of Data Innovation 2023,” Splunk, 2022
AWS offers a comprehensive set of resources to help you govern and ensure AI and ML models
are built in a responsible way, with data practices that mitigate bias and protect data privacy.
This includes purpose-built capabilities like Amazon SageMaker Clarify, transparency tools like
AWS AI Service Cards and Amazon SageMaker Model Cards, and a course from Machine Learning
University (MLU) on fairness and bias. Data scientists can use governance controls in SageMaker
to gain end-to-end visibility into ML models, including training, version history, and model
performance—all in one place. Amazon Titan foundation models, which you can use to build
generative AI applications, are built to detect and remove harmful content in the data, reject
inappropriate content in the user input, and filter the models’ outputs that contain inappropriate
content such as hate speech, profanity, and violence.
CUSTOMER STORY
By simplifying governance, OneFootball saw a 40% increase in the utilization of its analytics platform
OneFootball has grown rapidly to become one of the world’s most
popular digital media platforms for soccer (“fútbol”) enthusiasts. To
better use data for the benefit of the company and 70 million fans
of “the beautiful game,” OneFootball built a nimbler solution on AWS
in just a few days. Since integrating data from its inefficient backend
databases into its cloud-based data lake, OneFootball has radically
simplified data ingestion and eliminated legacy ETL workloads
altogether. Beautiful game, indeed.
Pinterest puts customers first with governance
A scalable, automated fine-grained access control (FGAC) system
built using Amazon S3 ensured Pinterest’s growing data wouldn’t
outgrow the company’s existing controls. FGAC controls access to data based on multiple
criteria, offering options such as role-based access control plus security for petabyte-scale
datasets. It also
enabled creators and businesses on the platform to self-identify as a
member of an underrepresented group while ensuring that sensitive
data wouldn’t be used for any other purpose, such as advertising.
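A minimal sketch of what fine-grained access control can look like in code follows. It is an illustration under stated assumptions and not Pinterest's actual system: the Policy class, the role names, the column names, and the purpose labels are all hypothetical. Each policy combines two criteria, the columns a role may see and the purposes for which it may read data, so a sensitive column can be exposed for one purpose and withheld for another, such as advertising.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    role: str
    allowed_columns: set
    allowed_purposes: set = field(default_factory=set)


def filter_row(row: dict, policy: Policy, purpose: str) -> dict:
    """Return only the columns this policy permits for the stated purpose."""
    if purpose not in policy.allowed_purposes:
        return {}  # purpose not permitted: expose nothing
    return {k: v for k, v in row.items() if k in policy.allowed_columns}


row = {"creator_id": 7, "country": "DE", "self_identified_group": "example-group"}

inclusion = Policy(
    role="inclusion_analyst",
    allowed_columns={"creator_id", "self_identified_group"},
    allowed_purposes={"inclusion_reporting"},
)
ads = Policy(
    role="ads_engineer",
    allowed_columns={"creator_id", "country"},
    allowed_purposes={"advertising"},
)

print(filter_row(row, ads, "advertising"))        # {'creator_id': 7, 'country': 'DE'}
print(filter_row(row, inclusion, "advertising"))  # {}
```

Keeping the purpose check separate from the column check means the sensitive column never reaches an advertising workload, even when the same underlying row is shared.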
Making security more strategic
AWS has prioritized security since day one, with continuously protected, high-performing,
resilient, and efficient infrastructure for your workloads and applications. World-class security
experts who monitor the AWS infrastructure also build and maintain a broad selection of
innovative security services, which can help simplify the complexities of your own security and
regulatory requirements.
AWS Security services and solutions can enable a mix of important advantages:
• Getting to insights faster – Provide the right level of access to your resources at all times
while maintaining confidence that your data is protected. AWS Security is built with
performance in mind, so you get maximum protection and data governance that doesn’t slow
you down.
• Reducing downtime – Tougher, more modern cloud security helps keep your enterprise
moving, so you don’t have to stop analyzing data to perform a discrete security process. It
can be integrated into every step along the way.
• Staying within your budget – AWS keeps security cost-effective and scales with the evolving
needs of your security risks and requirements, protecting your organization’s investments
and its commitment to data initiatives.
• Keeping your focus – From infrastructure to services, AWS takes security
into account at every step along the way, so you can spend more time transforming data
into better decisions that drive business results and less time worrying about security and
governance.
A history of unmatched reliability and security
• Amazon S3: Store and retrieve any amount of data with the best security.
• AWS Lake Formation: Build a secure data lake in days with fine-grained access control.
• Multi-AZ Regions: Ensure seamless failovers if an Availability Zone (AZ) is disrupted.
CONCLUSION
Organizations that are data-driven seek the truth by treating data not as the sole property
of siloed departments but as an organizational asset for all to use. Realizing a modern data
strategy for your organization is possible, no matter its size, location, or business needs. AWS
provides the most comprehensive set of services over the entire end-to-end data journey for
any workload, type of data, and desired outcome.
Learn more about why AWS is the best place to unlock value from your data and turn real-
time insights into meaningful innovation. And explore how we can help your teams with
infrastructure, tooling, and implementation support via the world’s leading professional
services and partner network. When it comes to data, AWS customers know how to do it better.
©️ 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.