Data Modeling with Snowflake: A practical guide to accelerating Snowflake development using universal data modeling techniques

Ebook, 664 pages, 3 hours

Name: Data Modeling with Snowflake: A practical guide to accelerating Snowflake development using universal data modeling techniques
Author: Serge Gershkovich
ISBN: 9781837632787

By Serge Gershkovich and Kent Graziano

About this ebook

The Snowflake Data Cloud is one of the fastest-growing platforms for data warehousing and application workloads. Snowflake's scalable, cloud-native architecture and expansive set of features and objects enable you to deliver data solutions more quickly than ever before.
Yet, we must ensure that these solutions are developed using recommended design patterns and accompanied by documentation that’s easily accessible to everyone in the organization.
This book will help you get familiar with simple and practical data modeling frameworks that accelerate agile design and evolve with the project from concept to code. These universal principles have helped guide database design for decades, and this book pairs them with unique Snowflake-native objects and examples like never before – giving you a two-for-one crash course in theory as well as direct application.
By the end of this Snowflake book, you’ll have learned how to leverage Snowflake’s innovative features, such as time travel, zero-copy cloning, and change-data-capture, to create cost-effective, efficient designs through time-tested modeling principles that are easily digestible when coupled with real-world examples.

Language: English
Publisher: Packt Publishing
Release date: May 31, 2023
ISBN: 9781837632787

Author

Serge Gershkovich

Related to Data Modeling with Snowflake

Related ebooks

Database Design and Modeling with Google Cloud: Learn database design and development to take your data to applications, analytics, and AI
Ebook by Abirami Sukumaran
Rating: 0 out of 5 stars
0 ratings
Data Engineering with dbt: A practical guide to building a cloud-based, pragmatic, and dependable data platform with SQL
Ebook by Roberto Zagni
Rating: 0 out of 5 stars
0 ratings
Hands-On Big Data Modeling: Effective database design techniques for data architects and business intelligence professionals
Ebook by James Lee
Rating: 0 out of 5 stars
0 ratings
Data Analysis and Business Modeling with Excel 2013
Ebook by David Rojas
Rating: 1 out of 5 stars
1/5
Mastering Tableau 2023: Implement advanced business intelligence techniques, analytics, and machine learning models with Tableau
Ebook by Marleen Meier
Rating: 0 out of 5 stars
0 ratings
Cloud Data Architectures Demystified: Gain the expertise to build Cloud data solutions as per the organization's needs (English Edition)
Ebook by Ashok Boddeda
Rating: 0 out of 5 stars
0 ratings
The Predictive Project Manager
Ebook by Puneet Mathur
Rating: 0 out of 5 stars
0 ratings
Mastering Data Engineering and Analytics with Databricks: A Hands-on Guide to Build Scalable Pipelines Using Databricks, Delta Lake, and MLflow (English Edition)
Ebook by Manoj Kumar
Rating: 0 out of 5 stars
0 ratings
Agile Machine Learning with DataRobot: Automate each step of the machine learning life cycle, from understanding problems to delivering value
Ebook by Bipin Chadha
Rating: 0 out of 5 stars
0 ratings
Data Analysis and Harmonization: A Simple Guide
Ebook by Jeff Voivoda
Rating: 0 out of 5 stars
0 ratings
Expert T-SQL Window Functions in SQL Server 2019: The Hidden Secret to Fast Analytic and Reporting Queries
Ebook by Kathi Kellenberger
Rating: 0 out of 5 stars
0 ratings
Beginning Power BI with Excel 2013: Self-Service Business Intelligence Using Power Pivot, Power View, Power Query, and Power Map
Ebook by Dan Clark
Rating: 0 out of 5 stars
0 ratings
Deep Learning with Azure: Building and Deploying Artificial Intelligence Solutions on the Microsoft AI Platform
Ebook by Mathew Salvaris
Rating: 0 out of 5 stars
0 ratings
Optimizing Databricks Workloads: Harness the power of Apache Spark in Azure and maximize the performance of modern big data workloads
Ebook by Anirudh Kala
Rating: 0 out of 5 stars
0 ratings
Mastering Snowflake Platform: Generate, fetch, and automate Snowflake data as a skilled data practitioner (English Edition)
Ebook by Pooja Kelgaonkar
Rating: 0 out of 5 stars
0 ratings
Data Lakehouse in Action: Architecting a modern and scalable data analytics platform
Ebook by Pradeep Menon
Rating: 0 out of 5 stars
0 ratings
Deep Learning for Data Architects: Unleash the power of Python's deep learning algorithms (English Edition)
Ebook by Shekhar Khandelwal
Rating: 0 out of 5 stars
0 ratings
Monetizing Machine Learning: Quickly Turn Python ML Ideas into Web Applications on the Serverless Cloud
Ebook by Manuel Amunategui
Rating: 0 out of 5 stars
0 ratings
Principles of Data Fabric: Become a data-driven organization by implementing Data Fabric solutions efficiently
Ebook by Sonia Mezzetta
Rating: 0 out of 5 stars
0 ratings
Learning Tableau 2022: Create effective data visualizations, build interactive visual analytics, and improve your data storytelling capabilities
Ebook by Joshua N. Milligan
Rating: 0 out of 5 stars
0 ratings
Scalable Big Data Architecture: A practitioners guide to choosing relevant Big Data architecture
Ebook by Bahaaldine Azarmi
Rating: 2 out of 5 stars
2/5
Technology Operating Models for Cloud and Edge: Create your purpose-built distributed operating model for public, hybrid, multicloud, and edge
Ebook by Ahilan Ponnusamy
Rating: 0 out of 5 stars
0 ratings
Data Engineering with Apache Spark, Delta Lake, and Lakehouse: Create scalable pipelines that ingest, curate, and aggregate complex data in a timely and secure way
Ebook by Manoj Kukreja
Rating: 0 out of 5 stars
0 ratings
Scalable Data Analytics with Azure Data Explorer: Modern ways to query, analyze, and perform real-time data analysis on large volumes of data
Ebook by Jason Myerscough
Rating: 0 out of 5 stars
0 ratings
Practical Azure SQL Database for Modern Developers: Building Applications in the Microsoft Cloud
Ebook by Davide Mauri
Rating: 0 out of 5 stars
0 ratings
BigQuery for Data Warehousing: Managed Data Analysis in the Google Cloud
Ebook by Mark Mucchetti
Rating: 0 out of 5 stars
0 ratings
Azure Synapse Analytics Cookbook: Implement a limitless analytical platform using effective recipes for Azure Synapse
Ebook by Gaurav Agarwal
Rating: 0 out of 5 stars
0 ratings
SQL Query Design Patterns and Best Practices: A practical guide to writing readable and maintainable SQL queries using its design patterns
Ebook by Steve Hughes
Rating: 0 out of 5 stars
0 ratings
Ultimate Data Engineering with Databricks: Develop Scalable Data Pipelines Using Data Engineering's Core Tenets Such as Delta Tables, Ingestion, Transformation, Security, and Scalability
Ebook by Mayank Malhotra
Rating: 0 out of 5 stars
0 ratings
Ultimate Data Engineering with Databricks
Ebook by Mayank Malhotra
Rating: 0 out of 5 stars
0 ratings

Data Modeling & Design For You

The Secrets of ChatGPT Prompt Engineering for Non-Developers
Ebook by Cea West
Rating: 5 out of 5 stars
5/5
Thinking in Algorithms: Strategic Thinking Skills, #2
Ebook by Albert Rutherford
Rating: 4 out of 5 stars
4/5
Data Analytics for Beginners: Introduction to Data Analytics
Ebook by Anthony S. Williams
Rating: 4 out of 5 stars
4/5
DAX Patterns: Second Edition
Ebook by Marco Russo
Rating: 5 out of 5 stars
5/5
Neural Networks for Beginners: An Easy-to-Follow Introduction to Artificial Intelligence and Deep Learning
Ebook by Brian Murray
Rating: 2 out of 5 stars
2/5
Mastering Agile User Stories
Ebook by DeEtta Balthazar
Rating: 4 out of 5 stars
4/5
150 Most Poweful Excel Shortcuts: Secrets of Saving Time with MS Excel
Ebook by Andrei Besedin
Rating: 3 out of 5 stars
3/5
Data Visualization: a successful design process
Ebook by Andy Kirk
Rating: 4 out of 5 stars
4/5
Raspberry Pi :Raspberry Pi Guide On Python & Projects Programming In Easy Steps
Ebook by Jason Scotts
Rating: 3 out of 5 stars
3/5
Power Pivot and Power BI: The Excel User's Guide to DAX, Power Query, Power BI & Power Pivot in Excel 2010-2016
Ebook by Rob Collie
Rating: 4 out of 5 stars
4/5
The Esri Guide to GIS Analysis, Volume 3: Modeling Suitability, Movement, and Interaction
Ebook by Andy Mitchell
Rating: 0 out of 5 stars
0 ratings
Living in Data: A Citizen's Guide to a Better Information Future
Ebook by Jer Thorp
Rating: 4 out of 5 stars
4/5
Managing Data Using Excel
Ebook by Mark Gardener
Rating: 5 out of 5 stars
5/5
Mastering Python Design Patterns
Ebook by Sakis Kasampalis
Rating: 0 out of 5 stars
0 ratings
Hacks To Crush Plc Program Fast & Efficiently Everytime... : Coding, Simulating & Testing Programmable Logic Controller With Examples
Ebook by Michael Blake
Rating: 5 out of 5 stars
5/5
Machine Learning: A Comprehensive, Step-by-Step Guide to Learning and Understanding Machine Learning Concepts, Technology and Principles for Beginners: 1
Ebook by Peter Bradley
Rating: 0 out of 5 stars
0 ratings
Data Analytics with Python: Data Analytics in Python Using Pandas
Ebook by Frank Millstein
Rating: 3 out of 5 stars
3/5
Mastering Hadoop
Ebook by Sandeep Karanth
Rating: 0 out of 5 stars
0 ratings
Supercharge Power BI: Power BI is Better When You Learn To Write DAX
Ebook by Matt Allington
Rating: 5 out of 5 stars
5/5
Supercharge Excel: When you learn to Write DAX for Power Pivot
Ebook by Matt Allington
Rating: 0 out of 5 stars
0 ratings
A Concise Guide to Object Orientated Programming
Ebook by Alasdair Gilchrist
Rating: 0 out of 5 stars
0 ratings
Principles of Data Science
Ebook by Sinan Ozdemir
Rating: 4 out of 5 stars
4/5
Machine Learning Interview Questions
Ebook by Tech Interviews
Rating: 5 out of 5 stars
5/5
Microsoft Access: Database Creation and Management through Microsoft Access
Ebook by Steven Bright
Rating: 0 out of 5 stars
0 ratings
Hands-On Data Science for Marketing: Improve your marketing strategies with machine learning using Python and R
Ebook by Yoon Hyup Hwang
Rating: 5 out of 5 stars
5/5
Tableau Desktop Certified Associate: Exam Guide: Develop your Tableau skills and prepare for Tableau certification with tips from industry experts
Ebook by Dmitry Anoshin
Rating: 0 out of 5 stars
0 ratings
Python Data Analysis
Ebook by Ivan Idris
Rating: 4 out of 5 stars
4/5
The Systems Thinker - Mental Models: The Systems Thinker Series, #3
Ebook by Albert Rutherford
Rating: 0 out of 5 stars
0 ratings
Kafka in Action
Ebook by Dylan Scott
Rating: 0 out of 5 stars
0 ratings
Advanced Deep Learning with Python: Design and implement advanced next-generation AI solutions using TensorFlow and PyTorch
Ebook by Ivan Vasilev
Rating: 0 out of 5 stars
0 ratings

Related podcast episodes

Using Data To Illuminate The Intentionally Opaque Insurance Industry: The insurance industry is notoriously opaque and hard to navigate. Max Cho found that fact frustrating enough that he decided to build a business of making policy selection more navigable. In this episode he shares his journey of data collection and analysis and the challenges of automating an intentionally manual industry.
by Data Engineering Podcast
0 ratings
The Future of Data Science Platforms is Accessibility // Skylar Payne // Coffee Session #65
by MLOps.community
0 ratings
Defining A Strategy For Your Data Products: The primary application of data has moved beyond analytics. With the broader audience comes the need to present data in a more approachable format. This has led to the broad adoption of data products being the delivery mechanism for information. In this episode Ranjith Raghunath shares his thoughts on how to build a strategy for the development, delivery, and evolution of data products.
by Data Engineering Podcast
0 ratings
Harnessing Generative AI For Creating Educational Content With Illumidesk: Generative AI has unlocked a massive opportunity for content creation. There is also an unfulfilled need for experts to be able to share their knowledge and build communities. Illumidesk was built to take advantage of this intersection. In this episode Greg Werner explains how they are using generative AI as an assistive tool for creating educational material, as well as building a data driven experience for learners.
by Data Engineering Podcast
0 ratings
MLOps Meetup #29 // Scaling Machine Learning Capabilities in Large Organizations // Bertjan Broeksema & Axel Goblet
by MLOps.community
0 ratings
Surveying The Market Of Database Products: Databases are the core of most applications, whether transactional or analytical. In recent years the selection of database products has exploded, making the critical decision of which engine(s) to use even more difficult. In this episode Tanya Bragin shares her experiences as a product manager for two major vendors and the lessons that she has learned about how teams should approach the process of tool selection.
by Data Engineering Podcast
0 ratings
Designing Data Platforms For Fintech Companies: Working with financial data requires a high degree of rigor due to the numerous regulations and the risks involved in security breaches. In this episode Andrey Korchack, CTO of fintech startup Monite, discusses the complexities of designing and implementing a data platform in that sector.
by Data Engineering Podcast
0 ratings
Building Applications With Data As Code On The DataOS: The modern data stack has made it more economical to use enterprise grade technologies to power analytics at organizations of every scale. Unfortunately it has also introduced new overhead to manage the full experience as a single workflow. At the Modern Data Company they created the DataOS platform as a means of driving your full analytics lifecycle through code, while providing automatic knowledge graphs and data discovery. In this episode Srujan Akula explains how the system is implemented and how you can start using it today with your existing data systems.
by Data Engineering Podcast
0 ratings
Unlocking Your dbt Projects With Practical Advice For Practitioners: The dbt project has become overwhelmingly popular across analytics and data engineering teams. While it is easy to adopt, there are many potential pitfalls. Dustin Dorsey and Cameron Cyr co-authored a practical guide to building your dbt project. In this episode they share their hard-won wisdom about how to build and scale your dbt projects.
by Data Engineering Podcast
0 ratings
Modern Customer Data Platform Principles: Databases and analytics architectures have gone through several generational shifts. A substantial amount of the data that is being managed in these systems is related to customers and their interactions with an organization. In this episode Tasso Argyros, CEO of ActionIQ, gives a summary of the major epochs in database technologies and how he is applying the capabilities of cloud data warehouses to the challenge of building more comprehensive experiences for end-users through a modern customer data platform (CDP).
by Data Engineering Podcast
0 ratings
Is it the Data or the Algorithm? Common pitfalls in Data Science and Deep Learning with Sara Beck: Sara Beck is the Machine Learning Solution Principal at Slalom Build. She thinks about Data Science and Deep Learning and how diagnosing and anticipating common data science pitfalls can help prevent issues before they happen.
by Hanselminutes with Scott Hanselman
0 ratings
Data Sharing Across Business And Platform Boundaries: Sharing data is a simple concept, but complicated to implement well. There are numerous business rules and regulatory concerns that need to be applied. There are also numerous technical considerations to be made, particularly if the producer and consumer of the data aren't using the same platforms. In this episode Andrew Jefferson explains the complexities of building a robust system for data sharing, the techno-social considerations, and how the Bobsled platform that he is building aims to simplify the process.
by Data Engineering Podcast
0 ratings
Visualize - Bringing Structure to Unstructured Data // Markus Stoll // #258
by MLOps.community
0 ratings
[Exclusive] Databricks Roundtable // Introducing DBRX: The Future of Language Models
by MLOps.community
0 ratings
How Column-Aware Development Tooling Yields Better Data Models: Architectural decisions are all based on certain constraints and a desire to optimize for different outcomes. In data systems one of the core architectural exercises is data modeling, which can have significant impacts on what is and is not possible for downstream use cases. By incorporating column-level lineage in the data modeling process it encourages a more robust and well-informed design. In this episode Satish Jayanthi explores the benefits of incorporating column-aware tooling in the data modeling process.
by Data Engineering Podcast
0 ratings
Small Data, Big Impact: The Story Behind DuckDB // Hannes Mühleisen & Jordan Tigani // #202
by MLOps.community
0 ratings
Eliminate The Overhead In Your Data Integration With The Open Source dlt Library: Cloud data warehouses and the introduction of the ELT paradigm has led to the creation of multiple options for flexible data integration, with a roughly equal distribution of commercial and open source options. The challenge is that most of those options are complex to operate and exist in their own silo. The dlt project was created to eliminate overhead and bring data integration into your full control as a library component of your overall data system. In this episode Adrian Brudaru explains how it works, the benefits that it provides over other data integration solutions, and how you can start building pipelines today.
by Data Engineering Podcast
0 ratings
Datapreneurs - How Todays Business Leaders Are Using Data To Define The Future: Data has been one of the most substantial drivers of business and economic value for the past few decades. Bob Muglia has had a front-row seat to many of the major shifts driven by technology over his career. In his recent book "Datapreneurs" he reflects on the people and businesses that he has known and worked with and how they relied on data to deliver valuable services and drive meaningful change.
by Data Engineering Podcast
0 ratings
#131 - Data Essentials in Software Architecture - Pramod Sadalage
by Tech Lead Journal
0 ratings
Build Your Second Brain One Piece At A Time: Generative AI promises to accelerate the productivity of human collaborators. Currently the primary way of working with these tools is through a conversational prompt, which is often cumbersome and unwieldy. In order to simplify the integration of AI capabilities into developer workflows Tsavo Knott helped create Pieces, a powerful collection of tools that complements the tools that developers already use. In this episode he explains the data collection and preparation process, the collection of model types and sizes that work together to power the experience, and how to incorporate it into your workflow to act as a second brain.
UNLIMITED
Build Your Second Brain One Piece At A Time: Generative AI promises to accelerate the productivity of human collaborators. Currently the primary way of working with these tools is through a conversational prompt, which is often cumbersome and unwieldy. In order to simplify the integration of AI capabilities into developer workflows Tsavo Knott helped create Pieces, a powerful collection of tools that complements the tools that developers already use. In this episode he explains the data collection and preparation process, the collection of model types and sizes that work together to power the experience, and how to incorporate it into your workflow to act as a second brain.
byData Engineering Podcast
0 ratings
0% found this document useful
#134 - A Developer-Centric Approach to Measuring and Improving Productivity - Margaret-Anne Storey & Abi Noda
UNLIMITED
#134 - A Developer-Centric Approach to Measuring and Improving Productivity - Margaret-Anne Storey & Abi Noda
byTech Lead Journal
0 ratings
0% found this document useful
An Overview Of The Sate Of Data Orchestration In An Increasingly Complex Data Ecosystem: Data systems are inherently complex and often require integration of multiple technologies. Orchestrators are centralized utilities that control the execution and sequencing of interdependent operations. This offers a single location for managing visibility and error handling so that data platform engineers can manage complexity. In this episode Nick Schrock, creator of Dagster, shares his perspective on the state of data orchestration technology and its application to help inform its implementation in your environment.
UNLIMITED
An Overview Of The Sate Of Data Orchestration In An Increasingly Complex Data Ecosystem: Data systems are inherently complex and often require integration of multiple technologies. Orchestrators are centralized utilities that control the execution and sequencing of interdependent operations. This offers a single location for managing visibility and error handling so that data platform engineers can manage complexity. In this episode Nick Schrock, creator of Dagster, shares his perspective on the state of data orchestration technology and its application to help inform its implementation in your environment.
byData Engineering Podcast
0 ratings
0% found this document useful
Reduce Friction In Your Business Analytics Through Entity Centric Data Modeling: For business analytics the way that you model the data in your warehouse has a lasting impact on what types of questions can be answered quickly and easily. The major strategies in use today were created decades ago when the software and hardware for warehouse databases were far more constrained. In this episode Maxime Beauchemin of Airflow and Superset fame shares his vision for the entity-centric data model and how you can incorporate it into your own warehouse design.
UNLIMITED
Reduce Friction In Your Business Analytics Through Entity Centric Data Modeling: For business analytics the way that you model the data in your warehouse has a lasting impact on what types of questions can be answered quickly and easily. The major strategies in use today were created decades ago when the software and hardware for warehouse databases were far more constrained. In this episode Maxime Beauchemin of Airflow and Superset fame shares his vision for the entity-centric data model and how you can incorporate it into your own warehouse design.
byData Engineering Podcast
0 ratings
0% found this document useful
Scaling yourself ‘down’ as an engineering leader w/ James Everingham #155: James Everingham, co-founder and VP of Engineering @ Lightspark, joins our podcast to share his best tools for scaling yourself down – not up – as an engineering leader. He discusses his latest career move shifting down in scale and how that impacts your risk tolerance as a leader. We also cover some of James’ favorite leadership methods, including the Socratic method, principle-based decision-making, and creating narratives as a product / eng org goal-setting tool, plus how he’s employed those tools effectively throughout his career. We also address navigating the balance between process & anti-process, approaches to product planning & finding PMF, and adapting your communication style to work within a smaller vs. large org.
UNLIMITED
Scaling yourself ‘down’ as an engineering leader w/ James Everingham #155: James Everingham, co-founder and VP of Engineering @ Lightspark, joins our podcast to share his best tools for scaling yourself down – not up – as an engineering leader. He discusses his latest career move shifting down in scale and how that impacts your risk tolerance as a leader. We also cover some of James’ favorite leadership methods, including the Socratic method, principle-based decision-making, and creating narratives as a product / eng org goal-setting tool, plus how he’s employed those tools effectively throughout his career. We also address navigating the balance between process & anti-process, approaches to product planning & finding PMF, and adapting your communication style to work within a smaller vs. large org.
byThe Engineering Leadership Podcast
0 ratings
0% found this document useful
Data and Analytics Strategy with Head of Data and Analytics at Google Cloud: #dataanalytics #analytics Watch this important CXOTalk episode for a discussion on data and analytics strategy with Bruno Aziza, Head of Data and Analytics at Google Cloud. Bruno shares insights on the convergence of data and workloads, data...
UNLIMITED
Data and Analytics Strategy with Head of Data and Analytics at Google Cloud: #dataanalytics #analytics Watch this important CXOTalk episode for a discussion on data and analytics strategy with Bruno Aziza, Head of Data and Analytics at Google Cloud. Bruno shares insights on the convergence of data and workloads, data...
byCXOTalk: Leadership, AI, and the Digital Economy
0 ratings
0% found this document useful
Deciphering Data Architectures with James Serra
UNLIMITED
Deciphering Data Architectures with James Serra
byInsights Tomorrow
0 ratings
0% found this document useful
66: A guide to data models and dynamic dashboards for marketers
UNLIMITED
66: A guide to data models and dynamic dashboards for marketers
byHumans of Martech
0 ratings
0% found this document useful
Defining a Database with Tony Baer
UNLIMITED
Defining a Database with Tony Baer
byScreaming in the Cloud
0 ratings
0% found this document useful
All Data Scientists Should Learn Software Engineering Principles // Catherine Nelson // #245
UNLIMITED
All Data Scientists Should Learn Software Engineering Principles // Catherine Nelson // #245
byMLOps.community
0 ratings
0% found this document useful
Adding An Easy Mode For The Modern Data Stack With 5X: The "modern data stack" promised a scalable, composable data platform that gave everyone the flexibility to use the best tools for every job. The reality was that it left data teams in the position of spending all of their engineering effort on integrating systems that weren't designed with compatible user experiences. The team at 5X understand the pain involved and the barriers to productivity and set out to solve it by pre-integrating the best tools from each layer of the stack. In this episode founder Tarush Aggarwal explains how the realities of the modern data stack are impacting data teams and the work that they are doing to accelerate time to value.
UNLIMITED
Adding An Easy Mode For The Modern Data Stack With 5X: The "modern data stack" promised a scalable, composable data platform that gave everyone the flexibility to use the best tools for every job. The reality was that it left data teams in the position of spending all of their engineering effort on integrating systems that weren't designed with compatible user experiences. The team at 5X understand the pain involved and the barriers to productivity and set out to solve it by pre-integrating the best tools from each layer of the stack. In this episode founder Tarush Aggarwal explains how the realities of the modern data stack are impacting data teams and the work that they are doing to accelerate time to value.
byData Engineering Podcast
0 ratings
0% found this document useful

Skip carousel

Roundtable: AI And The Practice Of Architecture
Architecture Australia
UNLIMITED
Roundtable: AI And The Practice Of Architecture
Mar 4, 2024
Engagement in practice Gwyllim Jahn: To begin with, we are interested in understanding how AI is being used in your practices. Xavier, could you explain who in your practice is using AI and what they’re using it for? Xavier De Kestelier: We work clos
10 min read
So Predictable? AI And Landscape Architecture
Landscape Architecture Australia
UNLIMITED
So Predictable? AI And Landscape Architecture
Apr 30, 2023
6 min read
Getting The edge
The European Business Review
UNLIMITED
Getting The edge
Feb 25, 2021
7 min read
Quantum Leap
Marketing
UNLIMITED
Quantum Leap
Jul 11, 2019
6 min read
Tech Talk
St. Louis Magazine
UNLIMITED
Tech Talk
Apr 30, 2024
3 min read
Cloudy With No Chance Of Erp
Architectural Review Asia Pacific
UNLIMITED
Cloudy With No Chance Of Erp
Nov 11, 2019
ERP (enterprise resource planning) was born around the time the first ‘[Something] for Dummies’ book was published*. It’s typically inflexible, uncompromising software designed for large businesses, like banks, large corporations, manufacturing and s
2 min read
Enterprise Soaring Success
Linux Format
UNLIMITED
Enterprise Soaring Success
Aug 27, 2019
7 min read
AI In Action: Case Studies
Architecture Australia
UNLIMITED
AI In Action: Case Studies
Mar 4, 2024
9 min read
Generative AI: What Leaders Need To Know
Rotman Management
UNLIMITED
Generative AI: What Leaders Need To Know
Jan 1, 2024
12 min read
Data Fabric
PC Pro Magazine
UNLIMITED
Data Fabric
Aug 13, 2020
3 min read
Questions for Tim Brown, CEO, IDEO
Rotman Management
UNLIMITED
Questions for Tim Brown, CEO, IDEO
Jan 1, 2018
You have said that, at its best, design creates relationships between people and technologies. Please explain. When I use the term ‘technologies’, I mean anything that is constructed by human beings — whether it’s an iPod, an automobile, a rapid tran
8 min read
Intel ...ON TE FUTURE OF... Computing
TechLife
UNLIMITED
Intel ...ON TE FUTURE OF... Computing
Jan 13, 2020
5 min read
Intel …ON THE FUTURE OF… Computing
T3
UNLIMITED
Intel …ON THE FUTURE OF… Computing
Sep 27, 2019
5 min read
Jonathan Ellis INTERVIEW
Linux Format
UNLIMITED
Jonathan Ellis INTERVIEW
Oct 22, 2019
6 min read
Five Steps To Join The Era Of Industry 4.0
Architectural Review Asia Pacific
UNLIMITED
Five Steps To Join The Era Of Industry 4.0
Sep 4, 2019
When 3D modelling tool Revit first arrived on the scene, Australian architects were some of the world’s earliest adopters, with local users outnumbering Europe and the US combined. As a country, we’re often ahead of the curve, and should be building
1 min read
01 Ready Or Not, AI Is Here To Assist You
HWM Singapore
UNLIMITED
01 Ready Or Not, AI Is Here To Assist You
Jul 11, 2023
4 min read
Saxo Bank And Thoughtworks: Enabling Data Democratization At A Global Investment Bank
Business Today
UNLIMITED
Saxo Bank And Thoughtworks: Enabling Data Democratization At A Global Investment Bank
Jan 20, 2023
2 min read
Better Design Decisions: Architecture And Data
Architecture Australia
UNLIMITED
Better Design Decisions: Architecture And Data
Jan 23, 2022
5 min read
Intel …ON THE FUTURE OF… Computing
T3 Australia
UNLIMITED
Intel …ON THE FUTURE OF… Computing
Nov 4, 2019
5 min read
Maya
3D World
UNLIMITED
Maya
Jan 25, 2022
3 min read
What is… SMAC?
PC Pro Magazine
UNLIMITED
What is… SMAC?
Apr 10, 2022
3 min read
The Algorithmic Leader
Rotman Management
UNLIMITED
The Algorithmic Leader
Jan 1, 2020
9 min read
“We’re Learning As We Go And Accepting Any False Starts As Being A Part Of The Process”
PC Pro Magazine
UNLIMITED
“We’re Learning As We Go And Accepting Any False Starts As Being A Part Of The Process”
Jul 8, 2021
6 min read
Taming Your Tech Talent
Inc.
UNLIMITED
Taming Your Tech Talent
Mar 1, 2017
ETELKA LEHOCZKY WHEN ANASTASIA LENG QUIT Google to start Hatch.co, a shopping site for handmade goods, in 2012, one of the skills she’d developed at the tech giant proved crucial. Managing some of the world’s best IT talent gave the marketing specia
2 min read
AI Tools That Actually Work
Entrepreneur
UNLIMITED
AI Tools That Actually Work
Jul 16, 2024
6 min read
Salesforce Buys Slack in a $27.4B Deal
Techfastly
UNLIMITED
Salesforce Buys Slack in a $27.4B Deal
Feb 4, 2021
4 min read
Pragmatic Parametricism
Architectural Review Asia Pacific
UNLIMITED
Pragmatic Parametricism
Nov 13, 2020
4 min read
Customer-centric From Its Core
NZ Marketing
UNLIMITED
Customer-centric From Its Core
Sep 16, 2018
What’s been the biggest change you have seen in your career? I think one of the biggest things I am seeing recently is data becoming more and more important in how businesses operate and how they deliver customer experiences. Even a few years ago whe
2 min read
Q&A
Rotman Management
UNLIMITED
Q&A
May 1, 2023
Describe the capability that companies like Netflix, UPS, Amazon and Caesars Entertainment have in common. These are all leading firms in their industries with respect to leveraging analytics as a source of competitive advantage. We now have so much
7 min read
SATYA NADELLA The Man Who Brought Back Microsoft’s Lost Charm
Techfastly
UNLIMITED
SATYA NADELLA The Man Who Brought Back Microsoft’s Lost Charm
Mar 1, 2022
4 min read

Related categories

Skip carousel

Reviews for Data Modeling with Snowflake

Rating: 0 out of 5 stars

0 ratings

0 ratings0 reviews

Book preview

Data Modeling with Snowflake - Serge Gershkovich

BIRMINGHAM—MUMBAI

Data Modeling with Snowflake

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Group Product Manager: Reshma Raman

Publishing Product Manager: Apeksha Shetty

Content Development Editor: Manikandan Kurup

Technical Editor: Sweety Pagaria

Copy Editor: Safis Editing

Project Coordinator: Farheen Fathima

Proofreader: Safis Editing

Indexer: Hemangini Bari

Production Designer: Shankar Kalbhor

Marketing Coordinator: Nivedita Singh

Cover Design: Elena Kadantseva

First published: May 2023

Production reference: 2180523

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham

B3 2PB, UK.

ISBN 978-1-83763-445-3

www.packtpub.com

To Elena, the entity without whose relationship none of this data could have been modeled.

– Serge Gershkovich

Foreword

My first exposure to relational design and modeling concepts was in the late 1980s. I had built a few things in dBase II in the early ’80s, then dBase III a little later, but had no formal training. On a US government contract, a forward-looking manager of mine asked me if I was interested in learning something new about designing databases that he had just learned. He then walked me through the material from a class on entity-relationship modeling and normalization (taught by IBM) that he had just returned from (they were actually copies of transparencies from the class). It was amazing and made so much sense to me. That was when I learned about forms of normalization, which led me to read more in a book by Dr. C. J. Date and eventually into building new databases using an early version of Oracle (version 5.1a to be exact).

Initially, I drew models on paper and whiteboards, starting with the Chen-style notation. Eventually, I did them with primitive drawing tools (such as MacDraw!) long before modern data modeling tools were available.

To say things have changed in the last few decades is an understatement.

We now have modern cloud-based, high-performance databases such as Snowflake and cloud-based data modeling and design tools such as SqlDBM. What we can do today with data and these tools is something I never dreamed of (e.g., I can now easily switch between modeling notations such as Chen, IE, and Barker on-the-fly).

For nearly a decade, during the initial era of Big Data, Hadoop, and NoSQL, it was declared far and wide, “Data modeling is dead.” Many of us cringed, knowing that was false; worse, we also knew that the sentiment would lead to big problems down the road (data swamps, anyone?). Unfortunately, the next generation, and other newbies, joining the industry during those times got zero exposure to data modeling of any form or the logic and theory behind it.

As the industry evolved and the cloud entered the picture, people started asking questions such as, “How will we ever get a handle on all this data?” and “How are we going to make it usable to our business users?” If only there were a way to draw a picture or map that most people could read and understand…

What a concept!

And thus, data modeling reentered the popular discussion in blogs, podcasts, webinars, and the like.

But now the question became, “Do we need to model differently for modern data and data platforms?”

Yes and no.

The fundamentals and benefits of database modeling have not changed. However, the cloud-native architecture of modern platforms such as Snowflake has redefined the rules (and costs) of how data is stored, shared, and processed. This book is an excellent start in bridging the time-tested techniques of relational database modeling with the revolutionary features and facets of Snowflake’s scalable data platform. It is appropriate for those new to the concept of data modeling as well as veteran data modelers who are beginning to work with modern cloud databases.

In this book, Serge takes you from the history of data modeling and its various forms and notations to exploring the core features of Snowflake architecture to construct performant and cost-effective solutions. By learning to apply these decades-old, proven approaches to the revolutionary features of The Data Cloud, you can better leverage the data assets in your organization to remain competitive and become a 21st-century data-driven organization.

With all this in context, this book will be your guide and a launchpad into the world of modern data modeling in The Data Cloud.

Enjoy!

#LongLiveDataModeling

Kent Graziano, The Data Warrior

May 2023

Contributors

About the author

Serge Gershkovich is a seasoned data architect with decades of experience designing and maintaining enterprise-scale data warehouse platforms and reporting solutions. He is a leading subject matter expert, speaker, content creator, and Snowflake Data Superhero. Serge earned a bachelor of science degree in information systems from the State University of New York (SUNY) Stony Brook. Throughout his career, Serge has worked in model-driven development from SAP BW/HANA to dashboard design to cost-effective cloud analytics with Snowflake. He currently serves as product success lead at SqlDBM, an online database modeling tool.

I want to thank Anna, Ed, and Ajay for recognizing the potential that even I didn’t know I had. This book happened thanks to your guidance and encouragement. To my loving wife, Elena, thank you for your unwavering support throughout this process.

About the reviewers

Hazal Sener is a senior developer advocate at SqlDBM. She graduated with honors from Istanbul Technical University and earned a master’s degree in geomatics engineering. Following her studies, Hazal started her career in the geographic information system (GIS) surveying industry, where, over five years ago, she discovered her passion for data. In 2019, Hazal joined the Business Intelligence team at a top-five business-to-business (B2B) bed bank as a data warehouse modeler, where she built warehouse models and transformational pipelines and optimized SQL queries. Hazal’s passion for data led her to her current position as a senior developer advocate at SqlDBM. In this role, Hazal provides technical guidance and educates clients on the tool’s features and capabilities.

Oliver Cramer is owner of data provisioning at Aquila Capital. As product manager of a data warehouse, he is responsible for guiding various teams. Creating guidelines and standards is also within his scope. His current focus is building larger teams under the heading of analytics engineering.

Keith Belanger is a very passionate data professional. With over 25 years of experience in data architecture and information management, he is highly experienced at assembling and directing high-performing data-focused teams and solutions. He combines a deep technical and data background with a business-oriented mindset. He enjoys working with business and IT teams on data strategies to solve everyday business problems. He is a recognized Snowflake Data Superhero, Certified Data Vault 2.0 Practitioner, Co-Chair of the Boston Snowflake User Group, and North America Data Vault User Group board member. He has worked in the data and analytics space in a wide range of verticals, including manufacturing, property and casualty insurance, life insurance, and health care.

Table of Contents

Preface

Part 1: Core Concepts in Data Modeling and Snowflake Architecture

Unlocking the Power of Modeling

Technical requirements

Modeling with purpose

Leveraging the modeling toolkit

The benefits of database modeling

Operational and analytical modeling scenarios

A look at relational and transformational modeling

What modeling looks like in operational systems

What modeling looks like in analytical systems

Summary

Further reading

References

An Introduction to the Four Modeling Types

Design and process

Ubiquitous modeling

Conceptual

What it is

What it looks like

Logical

What it is

What it looks like

Physical modeling

What it is

What it looks like

Transformational

What it is

What it looks like

Summary

Further reading

Mastering Snowflake’s Architecture

Traditional architectures

Shared-disk architecture

Shared-nothing architecture

Snowflake’s solution

Snowflake’s three-tier architecture

Storage layer

Compute layer

Services layer

Snowflake’s features

Zero-copy cloning

Time Travel

Hybrid Unistore tables

Beyond structured data

Costs to consider

Storage costs

Compute costs

Service costs

Saving cash by using cache

Services layer

Warehouse cache

Storage layer

Summary

Further reading

Mastering Snowflake Objects

Stages

File formats

Tables

Physical tables

Stage metadata tables

Snowflake views

Caching

Security

Materialized views

Streams

Loading from streams

Change tracking

Tasks

Combining tasks and streams

Summary

References

Speaking Modeling through Snowflake Objects

Entities as tables

How Snowflake stores data

Clustering

Attributes as columns

Snowflake data types

Storing semi-structured data

Constraints and enforcement

Identifiers as primary keys

Benefits of a PK

Specifying a PK

Keys taxonomy

Sequences

Alternate keys as unique constraints

Relationships as foreign keys

Benefits of an FK

Mandatory columns as NOT NULL constraints

Summary

Seeing Snowflake’s Architecture through Modeling Notation

A history of relational modeling

RM versus entity-relationship diagram

Visual modeling conventions

Depicting entities

Depicting relationships

Adding conceptual context to Snowflake architecture

The benefit of synchronized modeling

Summary

Part 2: Applied Modeling from Idea to Deployment

Putting Conceptual Modeling into Practice

Embarking on conceptual design

Dimensional modeling

Understanding dimensional modeling

Setting the record straight on dimensional modeling

Starting a conceptual model in four easy steps

From bus matrix to a conceptual model

Modeling in reverse

Identify the facts and dimensions

Establish the relationships

Propose and validate the business processes

Summary

Further reading

Putting Logical Modeling into Practice

Expanding from conceptual to logical modeling

Adding attributes

Cementing the relationships

Many-to-many relationships

Weak entities

Inheritance

Summary

Database Normalization

An overview of database normalization

Data anomalies

Update anomaly

Insertion anomaly

Deletion anomaly

Domain anomaly

Database normalization through examples

1NF

2NF

3NF

BCNF

4NF

5NF

DKNF

6NF

Data models on a spectrum of normalization

Summary

Database Naming and Structure

Naming conventions

Case

Object naming

Suggested conventions

Organizing a Snowflake database

Organization of databases and schemas

OLTP versus OLAP database structures

Database environments

Summary

Putting Physical Modeling into Practice

Technical requirements

Considerations before starting the implementation

Performance

Cost

Data quality and integrity

Data security

Non-considerations

Expanding from logical to physical modeling

Physicalizing the logical objects

Defining the tables

Deploying a physical model

Creating an ERD from a physical model

Summary

Part 3: Solving Real-World Problems with Transformational Modeling

Putting Transformational Modeling into Practice

Technical requirements

Separating the model from the object

Shaping transformations through relationships

Join elimination using constraints

When to use RELY for join elimination

When to be careful using RELY

Joins and set operators

Performance considerations and monitoring

Common query problems

Additional query considerations

Putting transformational modeling into practice

Gathering the business requirements

Reviewing the relational model

Building the transformational model

Summary

Modeling Slowly Changing Dimensions

Technical requirements

Dimensions overview

SCD types

Example scenario

Recipes for maintaining SCDs in Snowflake

Setting the stage

Type 1 – merge

Type 2 – Type 1-like performance using streams

Type 3 – one-time update

Summary

Modeling Facts for Rapid Analysis

Technical requirements

Fact table types

Fact table measures

Getting the facts straight

The world’s most versatile transactional fact table

The leading method for recovering deleted records

Type 2 slowly changing facts

Maintaining fact tables using Snowflake features

Building a reverse balance fact table with Streams

Recovering deleted records with leading load dates

Handling time intervals in a Type 2 fact table

Summary

Modeling Semi-Structured Data

Technical requirements

The benefits of semi-structured data in Snowflake

Getting hands-on with semi-structured data

Schema-on-read != schema-no-need

Converting semi-structured data into relational data

Summary

Modeling Hierarchies

Technical requirements

Understanding and distinguishing between hierarchies

A fixed-depth hierarchy

A slightly ragged hierarchy

A ragged hierarchy

Maintaining hierarchies in Snowflake

Recursively navigating a ragged hierarchy

Handling changes

Summary

Scaling Data Models through Modern Techniques

Technical requirements

Demystifying Data Vault 2.0

Building the Raw Vault

Loading with multi-table inserts

Modeling the data marts

Star schema

Snowflake schema

Discovering Data Mesh

Start with the business

Adopt governance guidelines

Emphasize data quality

Encourage a culture of data sharing

Summary

Appendix

Technical requirements

The exceptional time traveler

The secret column type Snowflake refuses to document

Read the functional manual (RTFM)

Summary

Index

Other Books You May Enjoy

Preface

Snowflake is one of the leading cloud data platforms and is gaining popularity among organizations looking to migrate their data to the cloud. With its game-changing features, Snowflake is unlocking new possibilities for self-service analytics and collaboration. However, Snowflake’s scalable consumption-based pricing model demands that users fully understand its revolutionary three-tier cloud architecture and pair it with universal modeling principles to ensure they are unlocking value and not letting money vaporize into the cloud.

Data modeling is essential for building scalable and cost-effective designs in data warehousing. Effective modeling techniques not only help businesses build efficient data models but also enable them to better understand their business. Though modeling is largely database-agnostic, pairing modeling techniques with game-changing Snowflake features can help build the most performant and cost-effective solutions on Snowflake.

This book combines the best practices in data modeling with Snowflake’s powerful features to offer you the most efficient and effective approach to data modeling in Snowflake. Using these techniques, you can optimize your data warehousing processes, improve your organization’s data-driven decision-making capabilities, and save valuable time and resources.

Who this book is for

Database modeling is a simple yet foundational tool for enhancing communication and decision-making within enterprise teams and streamlining development. By pairing modeling-first principles with the specifics of Snowflake architecture, this book will serve as an effective tool for data engineers looking to build cost-effective Snowflake systems and for business users looking for an easy way to understand them.

The three main personas who are the target audience of this content are as follows:

Data engineers: This book takes a Snowflake-centered approach to designing data models. It pairs universal modeling principles with unique architectural facets of the data cloud to help build performant and cost-effective solutions.

Data architects: While familiar with modeling concepts, many architects may be new to the Snowflake platform and are eager to learn and incorporate its best features into their designs for improved efficiency and maintenance.

Business analysts: Many analysts transition from business or functional roles and are cast into the world of data without a formal introduction to database best practices and modeling conventions. This book will give them the tools to navigate their data landscape and confidently create their own models and analyses.

What this book covers

Chapter 1, Unlocking the Power of Modeling, explores the role that models play in simplifying and guiding our everyday experience. This chapter unpacks the concept of modeling into its constituents: natural language, technical, and visual semantics. This chapter also gives you a glimpse into how modeling differs across various types of databases.

Chapter 2, An Introduction to the Four Modeling Types, looks at the four types of modeling covered in this book: conceptual, logical, physical, and transformational. This chapter gives an overview of where and how each type of modeling is used and what it looks like. This foundation gives you a taste of where the upcoming chapters will lead.

Chapter 3, Mastering Snowflake’s Architecture, provides a history of the evolution of database architectures and highlights the advances that make the data cloud a game changer in scalable computing. Understanding the underlying architecture will inform how Snowflake’s three-tier architecture unlocks unique capabilities in the models we design in later chapters.

Chapter 4, Mastering Snowflake Objects, explores the various Snowflake objects we will use in our modeling exercises throughout the book. This chapter looks at the memory footprints of the different table types, change tracking through streams, and the use of tasks to automate data transformations, among many other topics.

Chapter 5, Speaking Modeling through Snowflake Objects, bridges universal modeling concepts such as entities and relationships with accompanying Snowflake architecture, storage, and handling. This chapter breaks down the fundamentals of Snowflake data storage, detailing micro-partitions and clustering so that you can make informed and cost-effective design decisions.

Chapter 6, Seeing Snowflake’s Architecture through Modeling Notation, explores why there are so many competing and overlapping visual notations in modeling and how to use the ones that work. This chapter zeroes in on the most concise and intuitive notations you can use to plan and design database models and make them accessible to business users simultaneously.

Chapter 7, Putting Conceptual Modeling into Practice, starts the journey of creating a conceptual model by engaging with domain experts from the business and understanding the elements of the underlying business. This chapter uses Kimball’s dimensional modeling method to identify the facts and dimensions, establish the bus matrix, and launch the design process. We also explore how to work backward using the same technique to align a physical model to a business model.

Chapter 8, Putting Logical Modeling into Practice, continues the modeling journey by expanding the conceptual model with attributes and business nuance. This chapter explores how to resolve many-to-many relationships, expand weak entities, and tackle inheritance in modeling entities.

Chapter 9, Database Normalization, demonstrates that normal doesn’t necessarily mean better—there are trade-offs. While most database models fall within the first to third normal forms, this chapter takes you all the way to the sixth, with detailed examples to illustrate the differences. This chapter also explores the various data anomalies that normalization aims to mitigate.

Chapter 10, Database Naming and Structure, takes the ambiguity out of database object naming and proposes a clear and consistent standard. This chapter focuses on the conventions that will enable you to scale and adjust your model and avoid breaking downstream processes. By considering how Snowflake handles cases and uniqueness, you can make confident and consistent design decisions for your physical objects.

Chapter 11, Putting Physical Modeling into Practice, translates the logical model from the previous chapter into a fully deployable physical model. In this process, we handle the security and governance concerns accompanying a physical model and its deployment. This chapter also explores physicalizing logical inheritance and demonstrates how to go from DDL to generating a visual diagram.

Chapter 12, Putting Transformational Modeling into Practice, demonstrates how to use the physical model to drive transformational design and achieve performance gains through join elimination in Snowflake. The chapter discusses the types of joins and set operators available in Snowflake and provides guidance on monitoring Snowflake queries to identify common issues. Using these techniques, you will practice creating transformational designs from business requirements.

Chapter 13, Modeling Slowly Changing Dimensions, delves into the concept of slowly changing dimensions (SCDs) and provides you with recipes for maintaining SCDs efficiently using Snowflake features. You will learn about the challenges of keeping record counts in dimension tables in check and how mini dimensions can help address this issue. The chapter also discusses creating multifunctional surrogate keys and compares them with hashing techniques.

Chapter 14, Modeling Facts for Rapid Analysis, focuses on fact tables and explains the different types of fact tables and measures. You will discover versatile reporting structures such as the reverse balance and range-based factless facts and learn how to recover deleted records. This chapter also provides related Snowflake recipes for building and maintaining all the operations mentioned.

Chapter 15, Modeling Semi-Structured Data, explores techniques required to use and model semi-structured data in Snowflake. This chapter demonstrates that while Snowflake makes querying semi-structured data easy, there is effort involved in transforming it into a relational format that users can understand. We explore the benefits of converting semi-structured data to a relational schema and review a rule-based method for doing so.

Chapter 16, Modeling Hierarchies, provides you with an understanding of the different types of hierarchies and their uses in data warehouses. The chapter distinguishes between hierarchy types and discusses modeling techniques for maintaining each of them. You will also learn about Snowflake features for traversing a recursive tree structure and techniques for handling changes in hierarchy dimensions.

Chapter 17, Scaling Data Models through Modern Frameworks, discusses the utility of Data Vault methodology in modern data platforms and how it addresses the challenges of managing large, complex, and rapidly changing data environments. This chapter also discusses the efficient loading of the Data Vault with multi-table inserts and creating Star and Snowflake schema models for reporting information marts. Additionally, you will be introduced to Data Mesh and its application in managing data in large, complex organizations. Finally, the chapter reviews modeling best practices mentioned throughout the book.

Chapter 18, Appendix, collects all the fun and practical Snowflake recipes that couldn’t fit into the structure of the main chapters. This chapter showcases useful techniques such as the exceptional time traveler, exposes the (secret) virtual column type, and more!

To get the most out of this book

This book will rely heavily on the design and use of visual modeling diagrams. While a diagram can be drawn by hand, maintained in Excel, or constructed in PowerPoint, a modeling tool with dedicated layouts and functions is recommended. As the exercises in this book will take you from conceptual database-agnostic diagrams to deployable and runnable Snowflake code, a tool that supports Snowflake syntax and can generate deployable DDL is recommended.

This book uses visual examples from SqlDBM, an online database modeling tool that supports Snowflake. A free trial is available on their website here: https://sqldbm.com/Home/.

Another popular online diagramming solution is LucidChart (https://www.lucidchart.com/pages/). Although LucidChart does not support Snowflake as of this writing, it also offers a free tier for designing ER diagrams as well as other models such as Unified Modeling Language (UML) and network diagrams.

If you are using the digital version of this book, we advise you to type the code yourself or access the code from the book’s GitHub repository (a link is available in the next section). Doing so will help you avoid any potential errors related to the copying and pasting of code.

Download the example code files

You can download the example code files for this book from GitHub at https://github.com/PacktPublishing/Data-Modeling-with-Snowflake. Any updates to the code will be published in the GitHub repository.

We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Conventions used

There are a number of text conventions used throughout this book.

Code in text: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: Adding a discriminator between the CUSTOMER supertype and the LOYALTY_CUSTOMER subtype adds context that would otherwise be lost at the database level.

A block of code is set as follows:

-- Query the change tracking metadata to observe
-- only inserts from the timestamp till now
select * from myTable
changes(information => append_only)
at(timestamp => $cDts);

Bold: Indicates a new term, an important word, or words that you see onscreen. For instance, words in menus or dialog boxes appear in bold. Here is an example: Subtypes share common characteristics with a supertype entity but have additional attributes that make them distinct.

Tips or important notes appear like this.

Get in touch

Feedback from our readers is always welcome.

General feedback: If you have questions about any aspect of this book, email us at customercare@packtpub.com and mention the book title in the subject of your message.

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata and fill in the form.

Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at copyright@packt.com with a link to the material.

If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.

Share Your Thoughts

Once you’ve read Data Modeling with Snowflake, we’d love to hear your thoughts! Please click here to go straight to the Amazon review page for this book and share your feedback.

Your review is important to us and the tech community and will help us make sure we’re delivering excellent quality content.

Download a free PDF copy of this book

Thanks for purchasing this book!

Do you like to read on the go but are unable to carry your print books everywhere? Is your eBook purchase not compatible with the device of your choice?

Don’t


