SNOWFLAKE AND POWER BI
ANGEL ABUNDEZ
VP SOLUTIONS ARCHITECTURE, DESIGNMIND
2/18/2020
AGENDA
• Benefits of each technology
• Architectural Scenarios
• Usage scenarios
• Developer Best Practices
• Data Governance
© 2018 Snowflake Computing Inc. All Rights Reserved
A NEW ARCHITECTURE FOR DATA WAREHOUSING
Multi-cluster, shared data, in the cloud
Traditional architectures:
• Shared-disk: shared storage, single cluster
• Shared-nothing: decentralized, local storage, single cluster
Snowflake:
• Multi-cluster, shared data: centralized, scale-out storage; multiple, independent compute clusters
SEPARATE COMPUTE, SAME DATA
Workloads: Data science, ETL, Dev/QA, BI/Visualization (auto scaling)
• Elastic scaling for storage: low-cost cloud storage, fully replicated and resilient
• Elastic scaling for compute: virtual warehouses scale up and down instantly, without downtime, to support workload needs
• Dedicated performance SLAs: each warehouse can access the same tables at the same time without performance penalty (including ETL)
• Test/Dev/Staging/QA: reference objects in multiple databases with one SQL statement
• Elastic scaling for concurrency: auto-scaling maintains constant query performance
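The Test/Dev/Staging/QA point above (referencing objects in multiple databases with one SQL statement) could look roughly like the sketch below. It only builds the SQL text, using fully qualified database.schema.table names. The database, schema, table, and column names (PROD_DB, DEV_DB, SALES, ORDERS, ORDER_ID) are hypothetical and are not taken from the deck.

```python
def qualified_name(database: str, schema: str, table: str) -> str:
    """Return a fully qualified object name in database.schema.table form."""
    return f"{database}.{schema}.{table}"


def missing_rows_sql(source_table: str, target_table: str, key: str) -> str:
    """Build one SQL statement that lists keys present in source_table
    but absent from target_table, even when the two tables live in
    different databases."""
    return (
        f"SELECT s.{key}\n"
        f"FROM {source_table} AS s\n"
        f"LEFT JOIN {target_table} AS t ON s.{key} = t.{key}\n"
        f"WHERE t.{key} IS NULL;"
    )


# Hypothetical names, for illustration only.
prod_orders = qualified_name("PROD_DB", "SALES", "ORDERS")
dev_orders = qualified_name("DEV_DB", "SALES", "ORDERS")
sql = missing_rows_sql(prod_orders, dev_orders, "ORDER_ID")

if __name__ == "__main__":
    print(sql)
```

Because both tables are addressed by their full names, a single statement like this can compare production and development copies without moving data between databases.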
SECURE BY DESIGN
• Authentication: embedded multi-factor authentication; federated authentication available
• Access control: role-based access control model; granular privileges on all objects & actions
• Data encryption: all data encrypted, always, end-to-end; encryption keys managed automatically
• External validation: certified against enterprise-class requirements
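As an illustration of the role-based access control model and granular privileges listed above, here is a minimal sketch that generates the GRANT statements for a read-only reporting role, such as one a Power BI connection might use. The role, warehouse, database, and schema names are hypothetical, and the set of privileges is an assumption about a typical read-only setup rather than something the deck prescribes.

```python
from typing import List


def read_only_grants(role: str, warehouse: str, database: str, schema: str) -> List[str]:
    """Build Snowflake statements that create a role and give it
    read-only access to every table in one schema."""
    qualified_schema = f"{database}.{schema}"
    return [
        f"CREATE ROLE IF NOT EXISTS {role};",
        # The role needs a warehouse to run queries on.
        f"GRANT USAGE ON WAREHOUSE {warehouse} TO ROLE {role};",
        # USAGE on the database and schema is required before SELECT works.
        f"GRANT USAGE ON DATABASE {database} TO ROLE {role};",
        f"GRANT USAGE ON SCHEMA {qualified_schema} TO ROLE {role};",
        f"GRANT SELECT ON ALL TABLES IN SCHEMA {qualified_schema} TO ROLE {role};",
    ]


if __name__ == "__main__":
    # Hypothetical object names, for illustration only.
    for statement in read_only_grants("PBI_READER", "BI_WH", "ANALYTICS", "REPORTING"):
        print(statement)
```

Keeping the statements in one function makes it easier to apply the same privilege set to every reporting role, which helps avoid ad hoc grants.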
Any data, any way, anywhere
Power BI
© 2017 MICROSOFT. ALL RIGHTS RESERVED.
CREATE → COLLABORATE → DISTRIBUTE
App Workspaces
Co-owned Dashboards
Co-owned Reports
Co-owned Datasets
Co-owned Apps
Distribute
Sharing
App Workspaces / Apps
Mobile apps
Embedding in apps
Static: PPT, Email
Other: Cortana, Publish to
web, Static (Alerts, Print)
Development Production
App Workspaces
Co-owned Dashboards
Co-owned Reports
Co-owned Datasets
Co-owned Apps
A balanced approach…
[Diagram labels: IT Pro; Existing Data; LOB Applications; Files; Data Marts; Data Warehouse; Analysis Cubes; End Users; Power BI/Excel Power Tools; Power BI/SharePoint]
Power BI Delivery Approaches

Business-Led Self-Service BI (Bottom-Up Approach)
Analysis using any type of data source; emphasis on data exploration and freedom to innovate
• Ownership: Business supports all elements of the solution
• Scope of Power BI use by business users: Data preparation, data modeling, report creation & execution
• Governed by: Business

IT-Managed Self-Service BI (Blended Approach)
A “managed” approach wherein reporting utilizes only predefined/governed data sources
• Ownership: IT: data + semantic layer; Business: reports
• Scope of Power BI use by business users: Creation of reports and dashboards
• Governed by: IT: data + semantic layer; Business: reports

Corporate BI (Top-Down Approach)
Utilization of reports and dashboards published by IT for business users to consume
• Ownership: IT supports all elements of the solution
• Scope of Power BI use by business users: Execution of published reports
• Governed by: IT

Ownership Transfer
Over time, ownership and maintenance of self-service solutions deemed critical to the business may transfer to IT. It’s also possible for business users to adopt a prototype created by IT.
POWER BI ARCHITECTURE
USAGE SCENARIO DEMOS
TRADITIONAL STAR SCHEMA, DENORMALIZED DATASETS, SHARED DATASETS/DATAFLOWS
PUTTING IT TOGETHER
ANALYTICS ARCH W/ GATEWAY
Pros
• Better governance on service accounts used
• Dedicated resources for scheduled refresh or DirectQuery
• Control refresh or DirectQuery performance
• Low cost per node
Cons
• Additional maintenance of Gateway Cluster
• Workspace contributors or higher need access to Gateway Data Sources
NOW THINGS ARE SIMPLER… OR ARE THEY?
ANALYTICS ARCH WITHOUT GATEWAY
Pros
• No infrastructure dependency
• Fewer steps to deploy new datasets
• Zero resource cost to integrate
• Share datasets across workspaces
Cons
• Cannot enforce credentials used by datasets
• Can introduce inconsistent deployments leading to:
    • Dataset sprawl
    • Unauthorized access
    • Varying terminology for the same metrics
WANT EVEN BETTER PERFORMANCE?
ANALYTICS ARCH W/ ANALYSIS SERVICES
Pros
• Best performance possible
• Shared dataset across any workspaces
• Gets around the 1 GB per-file limitation
Cons
• Relies on Azure Active Directory (AAD) and/or Active Directory (AD)
• Need to learn Visual Studio
    • Different UI
    • Different deployment process
• Need to learn Analysis Services
    • Performance best practices
    • Processing options
    • Monitoring and maintenance
DEVELOPER BEST PRACTICES
• Make sure your queries fold (Query Folding), so that transformations are pushed down to Snowflake as native queries
• Create a DATE dimension in Snowflake that includes the fiscal calendar
• If a report uses DirectQuery, double-check that the Scheduled Cache Refresh setting is Weekly or Monthly. Otherwise, every cache refresh sends queries to Snowflake and YOU WILL GET CHARGED!
• Use the certification process in Power BI for Certified Datasets
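The DATE dimension recommendation above can be sketched as follows. This is a minimal illustration that generates the rows in Python before they are loaded into Snowflake. The column names, the July fiscal-year start, and the convention of labelling a fiscal year by the calendar year in which it ends are all assumptions, not details from the deck.

```python
from datetime import date, timedelta
from typing import Dict, List, Tuple


def fiscal_year_and_quarter(day: date, fiscal_start_month: int = 7) -> Tuple[int, int]:
    """Return (fiscal_year, fiscal_quarter) for a date.

    Assumption: the fiscal year is labelled by the calendar year in which
    it ends, so with a July start, 2020-07-01 falls in fiscal year 2021.
    """
    # Months elapsed since the fiscal year began (0..11).
    months_into_fiscal_year = (day.month - fiscal_start_month) % 12
    fiscal_quarter = months_into_fiscal_year // 3 + 1
    if fiscal_start_month == 1 or day.month < fiscal_start_month:
        fiscal_year = day.year
    else:
        fiscal_year = day.year + 1
    return fiscal_year, fiscal_quarter


def build_date_dimension(start: date, end: date, fiscal_start_month: int = 7) -> List[Dict]:
    """Generate one row per calendar day between start and end, inclusive."""
    rows = []
    day = start
    while day <= end:
        fiscal_year, fiscal_quarter = fiscal_year_and_quarter(day, fiscal_start_month)
        rows.append({
            "DATE_KEY": int(day.strftime("%Y%m%d")),  # e.g. 20200218
            "FULL_DATE": day.isoformat(),
            "CALENDAR_YEAR": day.year,
            "CALENDAR_QUARTER": (day.month - 1) // 3 + 1,
            "FISCAL_YEAR": fiscal_year,
            "FISCAL_QUARTER": fiscal_quarter,
        })
        day += timedelta(days=1)
    return rows


if __name__ == "__main__":
    for row in build_date_dimension(date(2020, 6, 29), date(2020, 7, 2)):
        print(row)
```

Materializing the fiscal attributes once in the warehouse means every Power BI dataset can join to the same table instead of recomputing fiscal periods in each report.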
DATA GOVERNANCE
[Comparison diagram: w/ Gateway vs. w/o Gateway]

Snowflake + Power BI: Cloud Analytics for Everyone
