Data governance defines roles, responsibilities, and processes to ensure accountability for, and ownership of, data assets across the enterprise.

Data governance definition

Data governance is a system to define who within an organization has authority and control over data assets, and how those data assets may be used. It encompasses the people, processes, and technologies required to manage and protect data assets.

The Data Governance Institute defines it as a system of decision rights and accountabilities for information-related processes, executed according to agreed-upon models that describe who can take what actions with what information, and when, under what circumstances, using what methods. And the Data Management Association (DAMA) International defines it as the planning, oversight, and control over management of data, and the use of data and data-related resources.

Data governance framework

Data governance may best be thought of as a function that supports an organization’s overarching data management strategy. Such a framework provides your organization with a holistic approach to collecting, managing, securing, and storing data. To help understand what a framework should cover, DAMA envisions data management as a wheel, with data governance as the hub from which the following 10 data management knowledge areas radiate:

- Data architecture: The overall structure of data and data-related resources as an integral part of the enterprise architecture.
- Data modeling and design: Analysis, design, building, testing, and maintenance.
- Data storage and operations: Structured physical data assets, storage deployment, and management.
- Data security: Ensuring privacy, confidentiality, and appropriate access.
- Data integration and interoperability: Acquisition, extraction, transformation, movement, delivery, replication, federation, virtualization, and operational support.
- Documents and content: Storing, protecting, indexing, and enabling access to data found in unstructured sources, and making this data available for integration and interoperability with structured data.
- Reference and master data: Managing shared data to reduce redundancy and ensure better data quality through standardized definition and use of data values.
- Data warehousing and business intelligence (BI): Managing analytical data processing and enabling access to decision support data for reporting and analysis.
- Metadata: Collecting, categorizing, maintaining, integrating, controlling, managing, and delivering metadata.
- Data quality: Defining, monitoring, and maintaining data integrity, and improving data quality.

When establishing a strategy, each of the above facets of data collection, management, archiving, and use should be considered.

The Business Application Research Center (BARC) warns that data governance is a highly complex, ongoing program, not a Big Bang initiative, and it runs the risk of participants losing trust and interest over time. To counter that, BARC recommends starting with a manageable or application-specific prototype project, and then expanding across the company based on lessons learned.

BARC recommends the following steps for implementation:

1. Define goals and understand benefits
2. Analyze the current state and perform a delta analysis
3. Derive a roadmap
4. Convince stakeholders and budget the project
5. Develop and plan the data governance program
6. Implement the data governance program
7. Monitor and control

Data governance vs. data management

Data governance is just one part of the overall discipline of data management, though an important one. Where data governance is about the roles, responsibilities, and processes to ensure accountability for and ownership of data assets, DAMA defines data management as an overarching term that describes the processes used to plan, specify, enable, create, acquire, maintain, use, archive, retrieve, control, and purge data.
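To make the distinction concrete, here is a minimal Python sketch, under stated assumptions, of how a governance rule (who may take which action on which data) can gate a data management operation (the action itself). The role names, the "customer" data domain, and every class and function name here are hypothetical illustrations, not part of any DAMA or Data Governance Institute specification.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    # A few of the data management actions DAMA lists.
    CREATE = "create"
    USE = "use"
    ARCHIVE = "archive"
    PURGE = "purge"


@dataclass(frozen=True)
class DecisionRight:
    """Governance rule: which role may take which action on which data domain."""
    role: str
    action: Action
    domain: str


# Hypothetical rule set, as a governance body might agree on it.
RIGHTS = frozenset({
    DecisionRight("data_steward", Action.USE, "customer"),
    DecisionRight("data_steward", Action.ARCHIVE, "customer"),
    DecisionRight("data_owner", Action.PURGE, "customer"),
})


def is_permitted(role: str, action: Action, domain: str) -> bool:
    """Governance check: is this decision right in the agreed rule set?"""
    return DecisionRight(role, action, domain) in RIGHTS


def purge(role: str, domain: str) -> str:
    """Data management operation, gated by the governance check."""
    if not is_permitted(role, Action.PURGE, domain):
        raise PermissionError(f"{role} may not purge {domain} data")
    # Placeholder for the real purge logic.
    return f"purged {domain} data"
```

In this sketch, `purge("data_owner", "customer")` succeeds, while `purge("data_steward", "customer")` raises `PermissionError`. The rule set is the governance part; the `purge` function is the management part. A real program would also need to capture the "when" and "under what circumstances" elements of the Data Governance Institute definition, which this sketch leaves out.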
While data management has become a common term for the discipline, it’s sometimes referred to as data resource management or enterprise information management (EIM). Gartner describes EIM as an integrative discipline to structure, describe, and govern information assets across organizational and technical boundaries to improve efficiency, promote transparency, and enable business insight.

Data governance and gen AI

Older models of data governance may need to adjust in the age of gen AI to account for the automated data pipelines required. Likewise, compliance may become a moving target as regulatory environments evolve. These issues require an end-to-end strategy for data management and data governance that covers every step of the data journey, from ingesting, storing, and querying data to analyzing, visualizing, and running AI and ML models.

AWS believes there are two emerging areas of focus that governance needs to address:

- Many LLM use cases rely on enterprise knowledge drawn from unstructured data sources, including documents, transcripts, and images, alongside structured data from data warehouses. Unstructured data is typically stored in siloed systems and not managed or governed with the same rigor as structured data.
- Gen AI applications introduce a much higher number of data interactions than conventional applications, requiring the implementation of data security, privacy, and access control policies as part of the gen AI user workflow.

For more on these issues and others, see 3 things to get right with data management for gen AI projects.

Importance of data governance

Most companies already have some form of governance for individual applications, business units, or functions, even if the processes and responsibilities are informal. As a practice, data governance is about establishing systematic, formal control over these processes and responsibilities.
Doing so can help companies remain responsive, especially as they grow to a size in which it’s no longer efficient for individuals to perform cross-functional tasks. Several of the overall benefits of data management can only be realized after the enterprise has established systematic data governance. Some of these benefits include:

- Better, more comprehensive decision support stemming from consistent, uniform data across the organization
- Clear rules for changing processes and data that help the business and IT become more agile and scalable
- Reduced costs in other areas of data management through the provision of central control mechanisms
- Increased efficiency through the ability to reuse processes and data
- Improved confidence in data quality and documentation of data processes
- Improved compliance with data regulations

Goals of data governance

The goal is to establish the methods, set of responsibilities, and processes to standardize, integrate, protect, and store corporate data. According to the Data Governance Institute, the universal goals for data governance programs include:

- Enabling better decision-making
- Reducing operational friction
- Protecting the needs of data stakeholders
- Training management and staff to adopt common approaches to data issues
- Building standard, repeatable processes
- Reducing costs and increasing effectiveness through coordination of efforts
- Ensuring transparency of processes

According to BARC, an organization’s key goals should be to:

- Minimize risks
- Establish internal rules for data use
- Implement compliance requirements
- Improve internal and external communication
- Increase the value of data
- Facilitate the administration of the above
- Reduce costs
- Help to ensure the continued existence of the company through risk management and optimization

BARC notes that such programs always span the strategic, tactical, and operational levels in enterprises, and they must be treated as ongoing, iterative processes.
Data governance principles

According to the Data Governance Institute, eight principles are at the center of all successful data governance and stewardship programs:

- All participants must have integrity in their dealings with each other. They must be truthful and forthcoming in discussing the drivers, constraints, options, and impacts for data-related decisions.
- Data governance and stewardship processes require transparency. It must be clear to all participants and auditors how and when data-related decisions and controls were introduced into the processes.
- Data-related decisions, processes, and controls subject to data governance must be auditable. They must be accompanied by documentation to support compliance-based and operational auditing requirements.
- Programs must define who is accountable for cross-functional data-related decisions, processes, and controls.
- Programs must define who is accountable for stewardship activities that are the responsibilities of individual contributors and groups of data stewards.
- Programs must define accountabilities in a manner that introduces checks and balances between business and technology teams, and between those who create and collect information, those who manage it, those who use it, and those who introduce standards and compliance requirements.
- Programs must introduce and support standardization of enterprise data.
- Programs must support proactive and reactive change management activities for reference data values, and the structure and use of master data and metadata.

Best practices of data governance

Data governance strategies must be adapted to best suit an organization’s processes, needs, and goals.
Still, there are six core best practices worth following:

- Identify critical data elements and treat data as a strategic resource
- Set policies and procedures for the entire data lifecycle
- Involve business users in the governance process
- Don’t neglect master data management
- Understand the value of information
- Don’t over-restrict data use

For more on doing data governance right, see 6 best practices for good data governance.

Challenges in data governance

Good data governance is no simple task. It requires teamwork, investment, and resources, as well as planning and monitoring. Some of the top challenges of a data governance program include:

- Lack of data leadership: Like other business functions, data governance requires strong executive leadership. The leader needs to give the governance team direction, develop policies for everyone in the organization to follow, and communicate with other leaders across the company.
- Lack of resources: Data governance initiatives can struggle for lack of investment in budget or staff. Data governance must be owned by and paid for by someone, but it rarely generates revenue on its own. Data governance and data management overall, however, are essential to leverage data to generate revenue.
- Siloed data: Data has a way of becoming siloed and segmented over time, especially as lines of business or other functions develop new data sources, apply new technologies, and so on. Your data governance program needs to continually break down new silos.

For more on these difficulties and others, see 7 data governance mistakes to avoid.

Data governance software and vendors

Data governance is an ongoing program rather than a technology solution, but there are tools with data governance features that can help support your program. The tool that suits your enterprise will depend on your needs, data volume, and budget.
According to PeerSpot, some of the more popular solutions include:

- Microsoft Purview Data Governance: The Purview portal is a unified platform for managing and governing data across sources including Azure, Microsoft 365, on-premises, and multicloud environments.
- Informatica Intelligent Data Management Cloud (IDMC): Used for data governance, metadata management, masking, and transformation, IDMC enables the centralization of master data, managing ETL processes, ensuring data quality, and maintaining compliance.
- Collibra Governance: Collibra is an enterprise-wide solution that automates many governance and stewardship tasks. It includes a policy manager, data helpdesk, data dictionary, and business glossary.
- Alation Data Catalog: Alation is an enterprise data catalog that automatically indexes data by source. One of its key capabilities, TrustCheck, provides real-time guardrails to workflows. Meant specifically to support self-service analytics, TrustCheck attaches guidelines and rules to data assets.
- erwin Data Intelligence (DI) for Data Governance: erwin DI combines data catalog and data literacy capabilities to provide awareness of and access to available data assets. It provides guidance on the use of those data assets and ensures data policies and best practices are followed.
- Varonis Data Governance Suite: Varonis’s solution automates data protection and management tasks, leveraging a scalable Metadata Framework that enables organizations to manage data access, view audit trails of every file and email event, identify data ownership across different business units, and find and classify sensitive data and documents.
- Ataccama ONE Platform: This data management and governance solution enables data profiling, data quality management, data integration, master data management, and metadata management, all to help organizations understand the quality and structure of their data.
- SAS Information Governance: Combining data management capabilities and search tools, SAS Information Governance gives users the ability to find, catalog, and protect data.
- SAP Data Governance: SAP Data Governance consolidates and manages master data across the business. It uses prebuilt data models, rules, workflows, and user interfaces to help users quickly deploy tasks.
- IBM Data Governance: IBM Data Governance leverages ML to collect and curate data assets. The integrated data catalog helps enterprises find, curate, analyze, prepare, and share data.

Data governance certifications

Data governance is a system, but there are GRC certifications and master data management certifications that can help your organization gain an edge, including the following:

- Certified Governance Risk and Compliance (CGRC)
- Certified Information Management Professional (CIMP) Master Data Management
- Certified in Data Protection (CDP)
- Certified Public Sector Data Governance Professional (PSDGP)
- Certified in Risk and Information Systems Control (CRISC)
- Certified in Risk Management Assurance (CRMA)
- Certified in the Governance of Enterprise IT (CGEIT)
- DAMA Certified Data Management Professional (CDMP)
- Data Governance and Stewardship Professional (DGSP)
- GRC Professional (GRCP)
- Information Governance Professional (IGP)
- Master Data Management Certification (MDM)
- SAP Certified Application Associate – SAP Master Data Governance

For related certifications, see Top 10 governance, risk, and compliance certifications and 10 master data management certifications that will pay off.

Data governance roles

Each enterprise composes its data governance differently, but there are some commonalities.

Steering committee

Governance programs span the enterprise, generally starting with a steering committee comprising senior management, often C-level individuals or VPs accountable for lines of business.
Morgan Templar, author of Get Governed: Building World Class Data Governance Programs, says steering committee members’ responsibilities include setting the overall governance strategy with specific outcomes, championing the work of data stewards, and holding the governance organization accountable to timelines and outcomes.

Data owner

Templar says data owners are the people responsible for ensuring that information within a specific data domain is governed across systems and lines of business. They’re generally members of the steering committee, though they may not be voting members. Data owners are responsible for:

- Approving data glossaries and other data definitions
- Ensuring the accuracy of information across the enterprise
- Directing data quality activities
- Reviewing and approving master data management approaches, outcomes, and activities
- Working with other data owners to resolve data issues
- Providing second-level review for issues identified by data stewards
- Providing the steering committee with input on software solutions, policies, or regulatory requirements of their data domain

Data steward

Data stewards are accountable for the day-to-day management of data. They’re subject matter experts who understand and communicate the meaning and use of information, Templar says, and they work with other data stewards across the organization as the governing body for most data decisions. Data stewards are responsible for:

- Being subject matter experts for their data domain
- Identifying data issues and working with other data stewards to resolve them
- Acting as a member of the data steward council
- Proposing, discussing, and voting on data policies and committee activities
- Reporting to the data owner and other stakeholders within a data domain
- Working cross-functionally across lines of business to ensure their domain’s data is managed and understood