Solution Path For Implementing A Comprehensive Architecture For Data and Analytics Strategies
Carlton Sapp
How can I build and maintain a holistic data management architecture to support business intelligence and advanced analytics capabilities?
1. Enterprise Information Management
2. Acquire and Organize
3. Enable for Analytics
4. Enable Business Insights
5. Extend and Automate With AI/ML
6. Deploy and Integrate Into Operations
Continuous Improvement: Align With Business Strategy; Governance; People and Skills; Account for Citizen Roles; Agile Database Development; Cloud Versus On-Premises
"Enable Essential Data Governance for Successful Big Data Architecture Deployment"
"Applying Effective Data Governance to Secure Your Data Lake"
"Implement Agile Database Development to Support Your Continuous Delivery Initiative"
"Migrating Enterprise Databases and Data to the Cloud"
ID: 351281 © 2018 Gartner, Inc.
[Figure: Enterprise information management. Components: Metadata Management, Master Data Management, Information Lifecycle Management, Information Governance, External. Other labels: Governance, Compliance, Value, Protection, Insights, Trust, Findability.]
Social
▪ Specialist insight
▪ Tribal knowledge
▪ Usage tips
▪ Context and veracity
Business
▪ Definitions
▪ Quality rules
▪ Integration rules
▪ Usage rules
Technical
▪ Data model
▪ Type
▪ Format
▪ Structure
Operational
▪ Lineage
▪ Quality
▪ Provenance
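The four metadata categories listed above (social, business, technical, operational) could be held together in a single catalog record. The following is a minimal sketch under that assumption; the class name `DatasetMetadata`, its fields, and the sample values are hypothetical and do not come from the source.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetMetadata:
    """Hypothetical catalog record grouping metadata by the four categories."""
    name: str
    social: dict = field(default_factory=dict)       # e.g. usage tips, tribal knowledge
    business: dict = field(default_factory=dict)     # e.g. definitions, quality rules
    technical: dict = field(default_factory=dict)    # e.g. data model, type, format
    operational: dict = field(default_factory=dict)  # e.g. lineage, quality, provenance

    def missing_categories(self):
        """Return the categories that have no entries yet."""
        categories = ("social", "business", "technical", "operational")
        return [c for c in categories if not getattr(self, c)]


# Illustrative values only.
orders = DatasetMetadata(
    name="orders",
    business={"definition": "One row per confirmed customer order"},
    technical={"format": "parquet", "type": "table"},
    operational={"lineage": ["erp.sales_orders"]},
)
print(orders.missing_categories())
```

A check like `missing_categories` is one way to flag datasets whose social metadata, which is typically the hardest to capture, has not yet been recorded.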
[Figure: Security domains around the security team: Cloud Security and Emerging Technology Security; Threat and Vulnerability Management; Applications and Data Security; IAM; Endpoint and Mobile Security; Security Monitoring and Operations; Network and Gateway Security.]
▪ Data security governance defines policies and controls
▪ Data is pervasive; data security must be too
▪ Data residency and hacking are top risks
▪ Application security testing is critical
[Figure: End-to-end data and analytics architecture.
Streaming/In Motion: LOB apps, IT log, IoT feeds, video, image, audio, API; message broker; high-performance stream ingestion; pub/sub; policy orchestration; RT algorithm; real-time streaming.
Staging/At Rest: operational systems; ERPs; centralized/monolithic SQL RDBMS; distributed columnar, traditional and other NoSQL; IMDB; image feeds, doc, message queues; API.
Logical Data Warehouse Layer: data virtualization; API integration; data management; distributed storage/process (Hadoop and other services); in-memory; temporary storage; SQL.
Data Preparation/Access: transform, aggregate, model, enrich; marketplace or data store; datasets.
Analytic Capabilities: traditional reporting, self-service, smart data discovery, advanced analytics; analyze, optimize, forecast, plan, discover, predict.
Delivery: interact, embed, report, collaborate, automate; device, visual exploration, dashboards, analytic app, API, storytelling/narrative.
Other labels: ingestion strategy, administration.]
[Figure: End-to-end data and analytics architecture with logical data lakes. Layers: Streaming/In Motion; Staging/At Rest; Logical Data Warehouse Layer; Data Preparation/Access; Analytic Capabilities (traditional reporting, self-service, smart data discovery, advanced analytics); delivery (interact, embed, report, collaborate, automate). Logical Data Lakes: internal data lake, external data lake, data science data lake. Additional sources and stores: video, text, IT log; object stores; cloud stores.]
[Figure: Data management infrastructure.
Consumers: external apps; operational query and reporting; standard and self-service reporting/ad hoc reporting; data science, AI, machine learning, statistical, predictive; APIs; data sourcing.
Management: semantic management; metadata management; repository; virtualization; distributed process.
Access/Virtualization/Federation: remote and third-party systems; ODS; DW; marts; data lake (not just Hadoop); graph, document, …; legacy DWs; sandboxes; staging.
Data sources; security.
Infrastructure platforms: on-premises servers (compute) and/or cloud servers (compute), each made up of nodes.
Key: queryable, process, governance, source, ingest, platform, HTAP.]
[Figure: Extending analytics with AI/ML.
Analytic capabilities: traditional reporting, self-service, smart data, advanced analytics; analyze, optimize, model, enrich; interact, automate; device, storytelling/narrative, data store; administration.
Machine learning flow: ERP and databases; stream processing platform; preprocessing data; sample selection; training/testing set; clustering algorithm; learning algorithm; execution; external; administration; compute.
Other labels: Expert Systems; Insight Engines; Quantitative.]
▪ AutoML (DataRobot)
▪ Decision engines
▪ Intelligent search
▪ Knowledge-based systems
R Distribution Services
▪ Archive network service (CRAN, MRAN)
▪ R distribution services
▪ Package managers
[Figure: In-database execution. Labels: Application; Connecting; Server; ML Services; Python Runtime; Results.]
A stored procedure contains Python code that executes in-database.
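The idea of running Python inside the database, next to the data, can be approximated with a small, self-contained sketch. It uses SQLite from the Python standard library purely as a stand-in: SQLite has no stored procedures, so a registered user-defined function plays that role here. The function `score`, the `orders` table, and the threshold are hypothetical, and this is not the product shown in the figure.

```python
import sqlite3


def score(amount):
    """Hypothetical scoring rule standing in for an analytic model."""
    return 1 if amount > 100 else 0


conn = sqlite3.connect(":memory:")

# Register the Python function so the database engine can call it from SQL.
conn.create_function("score", 1, score)

conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 50.0), (2, 250.0)])

# The scoring logic runs during query execution; only results come back.
results = conn.execute(
    "SELECT id, score(amount) FROM orders ORDER BY id"
).fetchall()
print(results)
```

The design point is that the application sends a query and receives results, while the rows themselves are never copied out to a separate analytics process.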
[Figure: End-to-end data and analytics architecture by stage: Acquire, Organize, Analyze, Deliver. Layers: Data Sources; Streaming/In Motion (including an IoT platform); Staging/At Rest; Logical Data Warehouse Layer; Data Preparation/Access; Analytic Capabilities; delivery (interact, embed, report, collaborate, automate). Analytics styles highlighted: self-service analytics, push-method analytics, enterprise analytics. Other labels: external, administration.]
[Figure: Integrated analytic/processing engines. Labels: Model Interchange Format Standards (shown twice); Embedded Analytic Model. Annotation: executes code associated with analytic models.]
[Figure: Stakeholders and outcomes. Stakeholders: Executives; Business; Data and Information Stewards. Outcomes: Bottom-Line and Other Results; Performance Metrics; Data Metrics (Process, Task, Data). Other labels: Feedback on Strategy.]
Data Quality
Compliance
▪ Regulatory compliance
▪ Corporate policies and procedures
▪ IT policies and procedures
Information Profiling
▪ Data inventory and analysis
▪ Data quality, veracity
▪ Lineage and provenance
Data Stewards
[Figure: Roles and skills. Labels: Data Modeller; Operations Planning; Admin/Monitoring; Report/Analysis Developers (Business, IT/Business); Platforms, Including Appliances, Cloud, …; Virtualization; Information Architect; Business Intelligence Manager; Self-Service Enablement Specialist; Virtualization Architect; Governance; IT or Business; Database Administrators. Key: Infrastructure, Cloud, App; DW; Agile; Lake; Governance.]
On-Premises
This is the traditional option, but it can tie the business down to processing and maintaining infrastructure and paying the associated costs. It is best suited for data needs that are well understood and predictable or that are governed by stringent regulations.
Private Cloud
Unlike public cloud, the infrastructure is dedicated to a single organization. This can help meet certain needs such as strong security control. Private clouds can also be hosted at an external provider. They are best suited for mission-critical applications that have very high security and uptime requirements.
Public
IaaS
This gives clients full control over deploying their data store(s) in the public cloud. It also reduces some of the infrastructure overhead but still requires operating system and data store skills.
BDaaS
This gives clients access to an already-installed data store that they can configure to their needs. The advantage of this option is that businesses don't need to procure and manage infrastructure, and they don't need to invest in the corresponding skills. Cloud providers use multitenancy, in which multiple clients use the same data store in the public cloud but have a clear separation of data.
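The trade-off described above, namely how much the client still has to run under each deployment option, can be sketched as a simple lookup. This is an illustrative simplification, not a Gartner model: the dictionary `CLIENT_RESPONSIBILITIES` and the helper `options_without` are hypothetical, and entries marked "assumed" are not stated in the text.

```python
# Responsibilities the client keeps under each deployment option.
# Only the IaaS and BDaaS rows follow the text directly; rows marked
# "assumed" are illustrative guesses and should be adjusted as needed.
CLIENT_RESPONSIBILITIES = {
    # Infrastructure is stated in the text; the other two are assumed.
    "on-premises": {"infrastructure", "operating system", "data store"},
    # Assumed: the text does not say who operates a private cloud.
    "private cloud": {"infrastructure", "operating system", "data store"},
    # Stated: still requires operating system and data store skills.
    # Simplified: the text says only "some" infrastructure overhead is removed.
    "public IaaS": {"operating system", "data store"},
    # Stated: no infrastructure to manage and no corresponding skills needed.
    "public BDaaS": set(),
}


def options_without(responsibility):
    """Return the options where the client does not carry this responsibility."""
    return sorted(
        name
        for name, duties in CLIENT_RESPONSIBILITIES.items()
        if responsibility not in duties
    )


print(options_without("infrastructure"))
print(options_without("data store"))
```

A table like this only captures operational burden; regulatory constraints, security control, and uptime requirements, which the text names as reasons to prefer on-premises or private cloud, would need to be weighed separately.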