The Definitive Guide To Data Classification
The Definitive Guide To Data Classification
THE DEFINITIVE
GUIDE TO DATA
CLASSIFICATION
DATA CLASSIFICATION FOR DATA
PROTECTION SUCCESS
2020 EDITION
1
THE DEFINITIVE GUIDE TO DATA CLASSIFICATION
TABLE OF CONTENTS
03 Introduction
04 Part One: What is Data Classification?
06 Part Two: Data Classification Myths
08 Part Three: Why Data Classification is Foundational
12 Part Four: The Resurgence of Data Classification
16 Part Five: How Do You Want to Classify Your Data
21 Part Six: Selling Data Classification to the Business
26 Part Seven: Getting Successful with Data Classification
32 Part Eight: Digital Guardian Data Classification & Protection
2
INTRODUCTION
InfoSec professionals will perennially be challenged with more to do than time, budget, and staffing will allow. The most effective method to address this is through prioritization, and in
the case of your growing data, prioritization comes from data classification. In this guide you will learn what classification is, why it is important, even foundational to data security, and
much more.
3
THE DEFINITIVE GUIDE TO DATA CLASSIFICATION
PART ONE
WHAT IS DATA
CLASSIFICATION?
4
PART ONE: WHAT IS DATA CLASSIFICATION?
DATA CLASSIFICATION
WHAT: Data classification is a process of consistently categorizing HOW: There are a few key questions organizations need to ask to
data based on specific and pre-defined criteria so that this data can help define classification buckets. Answering these will guide your data
be efficiently and effectively protected. classification efforts and get the program started.
• What are the data types? (Structured vs Unstructured)
• What data needs to be classified?
• Where is my sensitive data?
• What are some examples of classification levels?
• How can data be protected and which controls should be used?
• Who is accessing my data?
CONFIDENTIAL
DATA 5
THE DEFINITIVE GUIDE TO DATA CLASSIFICATION
PART TWO
DATA
CLASSIFICATION
MYTHS
6
PART TWO: DATA CLASSIFICATION MYTHS
3 MYTHS OF DATA
CLASSIFICATION
MYTH 1: MYTH 2: MYTH 3:
LONG TIME TO VALUE. IT'S TOO COMPLICATED. IT'S ANOTHER LEVEL OF
Automated classification drives insights from day one. Many data classification projects get bogged down
BUREAUCRACY.
Automation for both context and content brings order because of overly complex classification schemes. When
Data classification can be an enabler and a way to
to all your sensitive data; quickly and easily. it comes to classification more is not better; more
simplify data protection. By understanding what
is just more complex.
portion of your data is sensitive, resources are allocated
Data collection and visibility can continue until the
appropriately.
organization is prepared to deploy and operationalize a PricewatershouseCoopers, Forrester, and AWS
policy. Even without a policy, insights from automated all recommend startting with just three categories.
Everyone understands what needs to be protected.
data classification can drive security improvements. Starting with three can dramatically simplify getting
Sensitive and regulated data is prioritized; public data
your program off the ground. If after deployment more
is given lower priority, or destroyed, to eliminate future
are needed your decision will be driven by data, not
risk to its theft.
speculation.
7
THE DEFINITIVE GUIDE TO DATA CLASSIFICATION
PART THREE
WHY DATA
CLASSIFICATION IS
FOUNDATIONAL
8
PART THREE: WHY DATA CLASSIFICATION IS FOUNDATIONAL
9
PART THREE: WHY DATA CLASSIFICATION IS FOUNDATIONAL
(source: Understanding Insider Threats Published: May 2, 2016, Erik T. Heidt, Anton Chuvakin)
10
PART THREE: WHY DATA CLASSIFICATION IS FOUNDATIONAL
(source: Rethinking Data Discovery and Data Classification Strategies, Forrester Research Inc., July 10, 2018, Heidi Shey)
11
THE DEFINITIVE GUIDE TO DATA CLASSIFICATION
PART FOUR
THE RESURGENCE
OF DATA
CLASSIFICATION
12
PART FOUR: THE RESURGENCE OF DATA CLASSIFICATION
CLASSIFICATION HELPS
PROTECT AGAINST ALL THREATS
The value to classification was once limited to protection from insider threats. With the growth in outsider threats, classification takes on a new
importance. It provides the guidance for information security pros to allocate resources towards defending the crown jewels against all threats.
Internal actors cause both malicious and unintentional data loss. With a classification program
in place the mistyped email address in a message with sensitive data is flagged. Files that are
intentionally being leaked are classified as sensitive and get the attention of security solutions,
such as Data Loss Prevention (DLP).
External actors seek data that can be monetized. Understanding which data within your
organization has the greatest value, and the greatest risk for theft, is where classification
delivers value. By understanding the greater potential impact of an attack on sensitive
data, advanced threat detection tools escalate alarms accordingly to allow more immediate
response.
13
PART FOUR: THE RESURGENCE OF DATA CLASSIFICATION
14
PART FOUR: THE RESURGENCE OF DATA CLASSIFICATION
ADOPTION MOMENTUM
44% 33%
of enterprises currently use data classification
as part of their overall information risk program.
more are evaluating data
classification with over half that
number planning to implement data
classification in the next 18 months.
(source: Forrester's Global Business Technographics® Security Survey, Forrester Research Inc., 2015)
15
THE DEFINITIVE GUIDE TO DATA CLASSIFICATION
PART FIVE
HOW DO YOU WANT
TO CLASSIFY YOUR
DATA?
16
ONE SIZE CHOOSE CLASSIFICATION METHODS BASED ON
“
THE DATA TYPES MOST IMPORTANT TO YOUR
DOES NOT BUSINESS
FIT ALL
Use the combination of data
classification approaches and
techniques most appropriate for the
datasets they're trying to secure.
(source: Innovation Insight for Unstructured Data Classification, Gartner, May 2017, Marc-Antoine Meunier, Brian Reed)
17
PART FIVE: HOW DO YOU WANT TO CLASSIFY YOUR DATA
CONTENT
Content-based answers “What is in the document?”
Context-based answers “How is the data being used,” User-based relies on user knowledge and discretion at
“Who is accessing it,” “Where are they moving it,” “When creation, edit, or review to flag sensitive documents.
are they accessing it”.
CONTEXT USER
18
PART FIVE: HOW DO YOU WANT TO CLASSIFY YOUR DATA
Compliance data is often structured Intellectual property seldom follows Where a mix of regulated data and Data owners should know their data
and/or residing in predictable a pattern like a credit card number. intellectual property drive enterprise best. A user-based classification
locations. Leading with a content- To address, this context classification growth, organizations looking to better approach allows them to apply this
based classification will provide the looks to other attributes to assign understand and protect their data look knowledge to improve classification
greatest ability to accurately classify classification. The application used to a blended approach. accuracy.
PII, PHI, PCI, and GDPR data. or the storage location are two ways
IP can be classified to support data
protection.
19
AUTOMATED DYNAMIC DATA CLASSIFICATION REQUIRES BOTH
“
TOOLS AND HUMAN INTERVENTION
AND MANUAL
— BETTER
TOGETHER Recognize that data is a living thing. Dynamic data
classification requires the integration of both manual
processes involving employees as well as tools for
automation and enforcement. Human intervention
provides much-needed context for data classification,
while tools enable efficiency and policy enforcement.
(source: Rethinking Data Discovery and Data Classification Strategies, Forrester Research Inc., July 10, 2018, Heidi Shey)
20
THE DEFINITIVE GUIDE TO DATA CLASSIFICATION
PART SIX
SELLING DATA
CLASSIFICATION TO
THE BUSINESS
21
PART SIX: SELLING DATA CLASSIFICATION TO THE BUSINESS
The ultimate technical responsibility for The P&L leaders who watch the top (and The feet on the street; the knowledge Legal is there when things go wrong
data protection falls upon one, or both, bottom) line numbers of the business workers that are often writing the code, and data leaks. Often the backstop in a
of these roles. Where the CIO is running units. This role has a more immediate creating the CAD documents, or drafting data protection program, legal needs to
the IT operations, the CISO is securing reason to support data classification – the M&A proposals. They are closest understand the scope of the sensitive
the IT operations. For them to be loss of data in their business unit could to the data and are instrumental to any data (exposure) and the protection in
effective they both need to understand result in revenue impact, fines, or both. protection program, it must serve its place (mitigating factors) to ensure the
the sensitive data landscape. protective purpose without impeding organization is properly managing the
Classification drives visibility and business. risk. Risk is unavoidable in business,
• CIO: Classification guides and protection of both customer data (PII) but which risks to accept needs to be a
simplifies IT infrastructure and the product development data (IP) Including the users in a classification calculated and conscious decision.
investment decisions by cataloging that fuels growth. program heightens awareness to the
volume, location, and type of sensitive need to protect data and the negative
data. repercussions if that data leaks.
• CISO: Classification highlights where
to allocate the security resources and
can spot security gaps before they
become breaches.
22
CLASSIFICATION
"QUICK WINS" TIP
Get users involved early. Any change that requires workflow
TIP modifications can be a source of friction. If your data classification
project involves user-based classification (and not all do, some
rely wholly on automated data classification techniques), getting
the users on board ahead of the project means that when roll-out
happens they are educated, enabled, and understand the needs,
along with the benefits, in *their* terms.
23
PART SIX: SELLING DATA CLASSIFICATION TO THE BUSINESS
POSITIONING DATA
CLASSIFICATION
DATA CHAMPIONS EXECUTIVES
The data champions are those who have the most To a data intensive organization (something that
invested in the data. The goal here is to ensure most are becoming whether they realize it or not)
they understand: protecting their data is paramount to sustainable
• What they are creating has value competitive advantage.
• The value is worth protecting from both • Classification can drive revenue growth by
internal and external threats enabling secure partnerships and growth
• They are an important piece of the protection initiatives
• Classification can reduce spend by limiting
the scope of data needing protection
and increasing the efficiency of existing
investments
• Classification can reduce risk by highlighting
where sensitive data is and where it is going
24
PART SIX: SELLING DATA CLASSIFICATION TO THE BUSINESS
OVERCOMING OBJECTIONS
“We’ve gotten along just fine without it.” This passive message is akin to saying “I’ve
never needed insurance in the past,” and reflects a misunderstanding of the importance
of classification or a misperception that it is only for more mature organizations. While
organizations can protect their data without classification, it comes at the expense of
Building your data
efficiency. protection on
• With classification, data loss prevention and advanced threat protection have the insight
to understand the difference between regulated, internal only, and public data. This insight
classification is the
intelligently elevates data risks based on the impact of a breach. foundation needed for
• Without classification, data protection solutions, including data loss prevention and
advanced threat protection, will be prone to higher false positives and false negatives, and
success.
alerts will be of lower fidelity.
25
THE DEFINITIVE GUIDE TO DATA CLASSIFICATION
PART SEVEN
GETTING SUCCESSFUL
WITH DATA
CLASSIFICATION
26
PART SEVEN: GETTING SUCCESSFUL WITH DATA CLASSIFICATION
DATA PROTECTION
FRAMEWORK
Many organizations need help getting started. Forrester created a framework to guide you on this journey. Their “Data Security & Control Framework”
(figure below) breaks the problem of controlling and securing data into three steps: Define, Dissect, Defend. With these steps completed your
organizations better understands your data and can then allocate resources to more efficiently protect critical assets. At the top of their framework:
Discovery and Classification.
DEFINE
DEFINE: This involves data discovery and data classification.
Data discovery Data classification
DEFEND: To defend your data, there are only four levers you
DEFEND can pull — controlling access, inspecting data usage patterns for
abuse, disposing of data when the organization no longer needs it
Access Inspect Dispose Kill or “killing” data via encryption to devalue it in the event that it is
stolen.
(source: The Future Of Data Security And Privacy: Growth And Competitive Differentiation, Forrester Research, Inc., October 16, 2019, Heidi Shey, Enza Iannopollo)
27
PART SEVEN: GETTING SUCCESSFUL WITH DATA CLASSIFICATION
1. EXEC BUY-IN
Put in place technical and procedural controls to Document the goals, objectives, and strategic intent
enforce policies.
5. SOLUTIONS 2. POLICY behind the classification projects.
OPT
E
LYZ
IMIZ
ANA
E
Find sensitive data - wherever it resides - 4. DISCOVERY 3. SCOPE Create guardrails around your program; clearly define
including endpoint, database, and cloud. what is in and out of scope.
28
PART SEVEN: GETTING SUCCESSFUL WITH DATA CLASSIFICATION
Below is an example policy matrix illustrating the document types, risks, and protective controls. (Click here for a blank template)
EXAMPLE DOCUMENT Product datasheet, job postings. Strategic planning document, product Customer database, payment card
roadmaps, CAD drawings. information, health record information.
REPERCUSSIONS IF LEAKED None Loss of competitive advantage, loss of Fines, customer churn, reputational
brand equity, reputational damage. damage.
CONTROLS IN PLACE N/A Education and awareness training, file Education and awareness training,
encryption, data loss prevention, automated encryption, data loss
advanced threat protection, reporting and prevention, advanced threat protection,
auditing. reporting and auditing.
29
PART SEVEN: GETTING SUCCESSFUL WITH DATA CLASSIFICATION
EXAMPLE DOCUMENT
REPERCUSSIONS IF LEAKED
CONTROLS IN PLACE
30
PART SEVEN: GETTING SUCCESSFUL WITH DATA CLASSIFICATION
GUIDANCE FOR
CLASSIFYING DATA
Consistently classifying your data requires a structured approach that eliminates as much guesswork as possible. Forrester suggests
evaluating data across three dimensions, ranking it as High, Medium, or Low with regard to Identifiability, Sensitivity, and Scarcity, to
build your data protection map. Forrester’s data classification scorecard provides the detailed mapping of all the combinations.
• Data that ranks low across all three, (product datasheet), typically falls into the “Public” category.
• Data that ranks high across all three, (payment card information), typically falls into “Restricted” category.
• Data that is a mixture of high, medium, and low rankings, (intellectual property), typically resides in the “Private” category.
(source: Forrester Research: How Dirty Is Your Data? Strategic Plan: The Customer Trust And Privacy Playbook Fatemeh Khatibloo May 14, 2018)
31
THE DEFINITIVE GUIDE TO DATA CLASSIFICATION
PART EIGHT
DIGITAL GUARDIAN
DATA CLASSIFICATION
& PROTECTION
32
PART EIGHT: DIGITAL GUARDIAN NEXT GENERATION DATA CLASSIFICATION & PROTECTION
DG DATA PROTECTION
To protect your expanding and valuable pool of data from insider and outsider threats organizations need a data-centric plan.
Below is a 4 step framework to take control of and protect your knowledge assets and keep them protected without impacting
the speed of business.
1 2
Discovery - You need to know exactly where your Classification – Structure and organization for
sensitive data is to protect it. This includes on DATA DATA your data enables your data security program and
DISCOVERY CLASSIFICATION
laptops, desktops, and servers, but also in the cloud. delivers more accurate protection.
4 3
Education & Enforcement – Provide real-time
Policies – Now that you know the Where and the
alerts for potentially risky behavior allowing users ENFORCEMENT POLICIES What, it is time to define How you are going to
to self correct. If needed, implement data
protect it.
protection policies and ensure they are followed.
33
PART EIGHT: DIGITAL GUARDIAN NEXT GENERATION DATA CLASSIFICATION & PROTECTION
DG DATA CLASSIFICATION
Digital Guardian classifies via context, content, and user based to cover the spectrum from fully automated to fully manual classification.
Digital Guardian’s data classification integrates into our data protection suite. This integration, and the built-in automation, delivers a more accurate data protection program to limit false
positives and false negatives.
By combining data discovery, data classification, policies, and enforcement Digital Guardian provides the comprehensive data protection needed to stop data theft.
EXAMPLE METHODS
34
PART EIGHT: DIGITAL GUARDIAN NEXT GENERATION DATA CLASSIFICATION & PROTECTION
AUTOMATION CONTINUUM
Automation drives repeatability and predictability, it also speeds implementation time. But it needs to be augmented with the knowledge of the data owners. Digital
Guardian delivers classification options that cover the spectrum from fully automated to fully manual to match your organizations' needs.
• Automated context and content classification gets your program operational quickly and provides consistent results for more accurate data security and to
demonstrate compliance.
• Manual, user-based classification incorporates the intimate knowledge and bigger-picture view data owners possess, delivering the accuracy and compliance
automation and AI cannot (yet).
• A blend of manual and automated provides the insights needed to scale securely and protect all your sensitive data.
Most DLP solutions require you to spend time Classify and tag based on Classify and tag based on User classification relies on the data owner to apply
identifying and classifying your sensitive data before predefined context, such as predefined content. Content the tag to the document at creation, or after
protection starts. Upon installation, DG proactively file properties, file location, or inspection engine identifies modification.
finds, classifies and tags files. application used. patterns in files or databases
then applies classification
tags to them.
35
PART EIGHT: DIGITAL GUARDIAN NEXT GENERATION DATA CLASSIFICATION & PROTECTION
36
PART EIGHT: DIGITAL GUARDIAN NEXT GENERATION DATA CLASSIFICATION & PROTECTION
37
THE DEFINITIVE GUIDE TO DATA CLASSIFICATION
THE DEFINITIVE
GUIDE TO DATA
CLASSIFICATION
2020 EDITION
QUESTIONS?
1-781-788-8180
info@digitalguardian.com
www.digitalguardian.com
©2020 Digital Guardian. All rights reserved.
38