Data Modeling with Snowflake: A practical guide to accelerating Snowflake development using universal data modeling techniques
By Serge Gershkovich and Kent Graziano
()
About this ebook
The Snowflake Data Cloud is one of the fastest-growing platforms for data warehousing and application workloads. Snowflake's scalable, cloud-native architecture and expansive set of features and objects enables you to deliver data solutions quicker than ever before.
Yet, we must ensure that these solutions are developed using recommended design patterns and accompanied by documentation that’s easily accessible to everyone in the organization.
This book will help you get familiar with simple and practical data modeling frameworks that accelerate agile design and evolve with the project from concept to code. These universal principles have helped guide database design for decades, and this book pairs them with unique Snowflake-native objects and examples like never before – giving you a two-for-one crash course in theory as well as direct application.
By the end of this Snowflake book, you’ll have learned how to leverage Snowflake’s innovative features, such as time travel, zero-copy cloning, and change-data-capture, to create cost-effective, efficient designs through time-tested modeling principles that are easily digestible when coupled with real-world examples.
Related to Data Modeling with Snowflake
Related ebooks
Database Design and Modeling with Google Cloud: Learn database design and development to take your data to applications, analytics, and AI Rating: 0 out of 5 stars0 ratingsData Engineering with dbt: A practical guide to building a cloud-based, pragmatic, and dependable data platform with SQL Rating: 0 out of 5 stars0 ratingsHands-On Big Data Modeling: Effective database design techniques for data architects and business intelligence professionals Rating: 0 out of 5 stars0 ratingsData Analysis and Business Modeling with Excel 2013 Rating: 1 out of 5 stars1/5Mastering Tableau 2023: Implement advanced business intelligence techniques, analytics, and machine learning models with Tableau Rating: 0 out of 5 stars0 ratingsThe Predictive Project Manager Rating: 0 out of 5 stars0 ratingsData Analysis and Harmonization: A Simple Guide Rating: 0 out of 5 stars0 ratingsExpert T-SQL Window Functions in SQL Server 2019: The Hidden Secret to Fast Analytic and Reporting Queries Rating: 0 out of 5 stars0 ratingsBeginning Power BI with Excel 2013: Self-Service Business Intelligence Using Power Pivot, Power View, Power Query, and Power Map Rating: 0 out of 5 stars0 ratingsDeep Learning with Azure: Building and Deploying Artificial Intelligence Solutions on the Microsoft AI Platform Rating: 0 out of 5 stars0 ratingsMastering Snowflake Platform: Generate, fetch, and automate Snowflake data as a skilled data practitioner (English Edition) Rating: 0 out of 5 stars0 ratingsData Lakehouse in Action: Architecting a modern and scalable data analytics platform Rating: 0 out of 5 stars0 ratingsDeep Learning for Data Architects: Unleash the power of Python's deep learning algorithms (English Edition) Rating: 0 out of 5 stars0 ratingsMonetizing Machine Learning: Quickly Turn Python ML Ideas into Web Applications on the Serverless Cloud Rating: 0 out of 5 stars0 ratingsPrinciples of Data Fabric: Become a data-driven organization by implementing Data Fabric solutions efficiently Rating: 0 out of 5 stars0 ratingsScalable Big Data Architecture: A practitioners guide to choosing relevant Big Data architecture Rating: 2 out of 5 stars2/5Practical Azure SQL Database for Modern Developers: Building Applications in the Microsoft Cloud Rating: 0 out of 5 stars0 ratingsBigQuery for Data Warehousing: Managed Data Analysis in the Google Cloud Rating: 0 out of 5 stars0 ratingsAzure Synapse Analytics Cookbook: Implement a limitless analytical platform using effective recipes for Azure Synapse Rating: 0 out of 5 stars0 ratingsUltimate Data Engineering with Databricks Rating: 0 out of 5 stars0 ratings
Data Modeling & Design For You
The Secrets of ChatGPT Prompt Engineering for Non-Developers Rating: 5 out of 5 stars5/5Thinking in Algorithms: Strategic Thinking Skills, #2 Rating: 4 out of 5 stars4/5Data Analytics for Beginners: Introduction to Data Analytics Rating: 4 out of 5 stars4/5DAX Patterns: Second Edition Rating: 5 out of 5 stars5/5Neural Networks for Beginners: An Easy-to-Follow Introduction to Artificial Intelligence and Deep Learning Rating: 2 out of 5 stars2/5Mastering Agile User Stories Rating: 4 out of 5 stars4/5150 Most Poweful Excel Shortcuts: Secrets of Saving Time with MS Excel Rating: 3 out of 5 stars3/5Data Visualization: a successful design process Rating: 4 out of 5 stars4/5Raspberry Pi :Raspberry Pi Guide On Python & Projects Programming In Easy Steps Rating: 3 out of 5 stars3/5The Esri Guide to GIS Analysis, Volume 3: Modeling Suitability, Movement, and Interaction Rating: 0 out of 5 stars0 ratingsLiving in Data: A Citizen's Guide to a Better Information Future Rating: 4 out of 5 stars4/5Managing Data Using Excel Rating: 5 out of 5 stars5/5Mastering Python Design Patterns Rating: 0 out of 5 stars0 ratingsData Analytics with Python: Data Analytics in Python Using Pandas Rating: 3 out of 5 stars3/5Mastering Hadoop Rating: 0 out of 5 stars0 ratingsSupercharge Power BI: Power BI is Better When You Learn To Write DAX Rating: 5 out of 5 stars5/5Supercharge Excel: When you learn to Write DAX for Power Pivot Rating: 0 out of 5 stars0 ratingsA Concise Guide to Object Orientated Programming Rating: 0 out of 5 stars0 ratingsPrinciples of Data Science Rating: 4 out of 5 stars4/5Machine Learning Interview Questions Rating: 5 out of 5 stars5/5Microsoft Access: Database Creation and Management through Microsoft Access Rating: 0 out of 5 stars0 ratingsPython Data Analysis Rating: 4 out of 5 stars4/5The Systems Thinker - Mental Models: The Systems Thinker Series, #3 Rating: 0 out of 5 stars0 ratingsKafka in Action Rating: 0 out of 5 stars0 ratingsAdvanced Deep Learning with Python: Design and implement advanced next-generation AI solutions using TensorFlow and PyTorch Rating: 0 out of 5 stars0 ratings
Reviews for Data Modeling with Snowflake
0 ratings0 reviews
Book preview
Data Modeling with Snowflake - Serge Gershkovich
BIRMINGHAM—MUMBAI
Data Modeling with Snowflake
Copyright © 2023 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Group Product Manager: Reshma Raman
Publishing Product Manager: Apeksha Shetty
Content Development Editor: Manikandan Kurup
Technical Editor: Sweety Pagaria
Copy Editor: Safis Editing
Project Coordinator: Farheen Fathima
Proofreader: Safis Editing
Indexer: Hemangini Bari
Production Designer: Shankar Kalbhor
Marketing Coordinator: Nivedita Singh
Cover Design: Elena Kadantseva
First published: May 2023
Production reference: 2180523
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.
ISBN 978-1-83763-445-3
www.packtpub.com
To Elena, the entity without whose relationship none of this data could have been modeled.
– Serge Gershkovich
Foreword
My first exposure to relational design and modeling concepts was in the late 1980s. I had built a few things in dBase II in the early ‘80s, then Dbase III a little later, but had no formal training. On a US government contract, a forward-looking manager of mine asked me if I was interested in learning something new about designing databases that he had just learned. He then walked me through the material from a class on entity-relationship modeling and normalization (taught by IBM) that he had just returned from (they were actually copies of transparencies from the class). It was amazing and made so much sense to me. That was when I learned about forms of normalization, which led me to read more in a book by Dr. CJ Date and eventually into building new databases using an early version of Oracle (version 5.1a to be exact).
Initially, I drew models on paper and whiteboards, starting with the Chen-style notation. Eventually, I did them with primitive drawing tools (such as MacDraw!) long before modern data modeling tools were available.
To say things have changed in the last few decades is an understatement.
We now have modern cloud-based, high-performance databases such as Snowflake and cloud-based data modeling and design tools such as SqlDBM. What we can do today with data and these tools is something I never dreamed of (e.g., I can now easily switch between modeling notations such as Chen, IE, and Barker on-the-fly).
For nearly a decade, during the initial era of Big Data, Hadoop, and NoSQL, it was declared far and wide, Data modeling is dead.
While many of us cringed and knew that was false, worse, we also knew that the sentiment would lead to big problems down the road (data swamps, anyone?). Unfortunately, the next generation, and other newbies, joining the industry during those times got zero exposure to data modeling of any form or the logic and theory behind it.
As the industry evolved and the cloud entered the picture, people started asking questions such as, How will we ever get a handle on all this data?
and How are we going to make it usable to our business users?
If only there were a way to draw a picture or map that most people could read and understand…
What a concept!
And thus, data modeling reentered the popular discussion in blogs, podcasts, webinars, and the like.
But now the question became, Do we need to model differently for modern data and data platforms?
Yes and no.
The fundamentals and benefits of database modeling have not changed. However, the cloud-native architecture of modern platforms such as Snowflake has redefined the rules (and costs) of how data is stored, shared, and processed. This book is an excellent start in bridging the time-tested techniques of relational database modeling with the revolutionary features and facets of Snowflake’s scalable data platform. It is appropriate for those new to the concept of data modeling as well as veteran data modelers who are beginning to work with modern cloud databases.
In this book, Serge takes you from the history of data modeling and its various forms and notations to exploring the core features of Snowflake architecture to construct performant and cost-effective solutions. By learning to apply these decades-old, proven approaches to the revolutionary features of The Data Cloud, you can better leverage the data assets in your organization to remain competitive and become a 21st-century data-driven organization.
With all this in context, this book will be your guide and a launchpad into the world of modern data modeling in The Data Cloud.
Enjoy!
#LongLiveDataModeling
Kent Graziano, The Data Warrior
May 2023
Contributors
About the author
Serge Gershkovich is a seasoned data architect with decades of experience designing and maintaining enterprise-scale data warehouse platforms and reporting solutions. He is a leading subject matter expert, speaker, content creator, and Snowflake Data Superhero. Serge earned a bachelor of science degree in information systems from the State University of New York (SUNY) Stony Brook. Throughout his career, Serge has worked in model-driven development from SAP BW/HANA to dashboard design to cost-effective cloud analytics with Snowflake. He currently serves as product success lead at SqlDBM, an online database modeling tool.
I want to thank Anna, Ed, and Ajay for recognizing the potential that even I didn’t know I had. This book happened thanks to your guidance and encouragement. To my loving wife, Elena, thank you for your unwavering support throughout this process.
About the reviewers
Hazal Sener is a senior developer advocate at SqlDBM. She graduated with honors from Istanbul Technical University and earned a master’s degree in geomatics engineering. Following her studies, Hazal started her career in the geographic information system (GIS) surveying industry, where, over five years ago, she discovered her passion for data. In 2019, Hazal joined the Business Intelligence team at a top-five business-to-business (B2B) bed bank as a data warehouse modeler and built warehouse models and transformational pipelines and optimized SQL queries there. Hazal’s passion for data leads her to her current position as a senior developer advocate at SqlDBM. In this role, Hazal provides technical guidance and educates clients on the tool’s features and capabilities.
Oliver Cramer is owner of data provisioning at Aquila Capital. As product manager of a data warehouse, he is responsible for guiding various teams. Creating guidelines and standards is also within his scope. His current focus is building larger teams under the heading of analytics engineering.
Keith Belanger is a very passionate data professional. With over 25 years of experience in data architecture and information management, he is highly experienced at assembling and directing high-performing data-focused teams and solutions. He combines a deep technical and data background with a business-oriented mindset. He enjoys working with business and IT teams on data strategies to solve everyday business problems. He is a recognized Snowflake Data Superhero, Certified Data Vault 2.0 Practitioner, Co-Chair of the Boston Snowflake User Group, and North America Data Vault User Group board member. He has worked in the data and analytics space in a wide range of verticals, including manufacturing, property and casualty insurance, life insurance, and health care.
Table of Contents
Preface
Part 1: Core Concepts in Data Modeling and Snowflake Architecture
1
Unlocking the Power of Modeling
Technical requirements
Modeling with purpose
Leveraging the modeling toolkit
The benefits of database modeling
Operational and analytical modeling scenarios
A look at relational and transformational modeling
What modeling looks like in operational systems
What modeling looks like in analytical systems
Summary
Further reading
References
2
An Introduction to the Four Modeling Types
Design and process
Ubiquitous modeling
Conceptual
What it is
What it looks like
Logical
What it is
What it looks like
Physical modeling
What it is
What it looks like
Transformational
What it is
What it looks like
Summary
Further reading
3
Mastering Snowflake’s Architecture
Traditional architectures
Shared-disk architecture
Shared-nothing architecture
Snowflake’s solution
Snowflake’s three-tier architecture
Storage layer
Compute layer
Services layer
Snowflake’s features
Zero-copy cloning
Time Travel
Hybrid Unistore tables
Beyond structured data
Costs to consider
Storage costs
Compute costs
Service costs
Saving cash by using cache
Services layer
Warehouse cache
Storage layer
Summary
Further reading
4
Mastering Snowflake Objects
Stages
File formats
Tables
Physical tables
Stage metadata tables
Snowflake views
Caching
Security
Materialized views
Streams
Loading from streams
Change tracking
Tasks
Combining tasks and streams
Summary
References
5
Speaking Modeling through Snowflake Objects
Entities as tables
How Snowflake stores data
Clustering
Attributes as columns
Snowflake data types
Storing semi-structured data
Constraints and enforcement
Identifiers as primary keys
Benefits of a PK
Specifying a PK
Keys taxonomy
Sequences
Alternate keys as unique constraints
Relationships as foreign keys
Benefits of an FK
Mandatory columns as NOT NULL constraints
Summary
6
Seeing Snowflake’s Architecture through Modeling Notation
A history of relational modeling
RM versus entity-relationship diagram
Visual modeling conventions
Depicting entities
Depicting relationships
Adding conceptual context to Snowflake architecture
The benefit of synchronized modeling
Summary
Part 2: Applied Modeling from Idea to Deployment
7
Putting Conceptual Modeling into Practice
Embarking on conceptual design
Dimensional modeling
Understanding dimensional modeling
Setting the record straight on dimensional modeling
Starting a conceptual model in four easy steps
From bus matrix to a conceptual model
Modeling in reverse
Identify the facts and dimensions
Establish the relationships
Propose and validate the business processes
Summary
Further reading
8
Putting Logical Modeling into Practice
Expanding from conceptual to logical modeling
Adding attributes
Cementing the relationships
Many-to-many relationships
Weak entities
Inheritance
Summary
9
Database Normalization
An overview of database normalization
Data anomalies
Update anomaly
Insertion anomaly
Deletion anomaly
Domain anomaly
Database normalization through examples
1NF
2NF
3NF
BCNF
4NF
5NF
DKNF
6NF
Data models on a spectrum of normalization
Summary
10
Database Naming and Structure
Naming conventions
Case
Object naming
Suggested conventions
Organizing a Snowflake database
Organization of databases and schemas
OLTP versus OLAP database structures
Database environments
Summary
11
Putting Physical Modeling into Practice
Technical requirements
Considerations before starting the implementation
Performance
Cost
Data quality and integrity
Data security
Non-considerations
Expanding from logical to physical modeling
Physicalizing the logical objects
Defining the tables
Deploying a physical model
Creating an ERD from a physical model
Summary
Part 3: Solving Real-World Problems with Transformational Modeling
12
Putting Transformational Modeling into Practice
Technical requirements
Separating the model from the object
Shaping transformations through relationships
Join elimination using constraints
When to use RELY for join elimination
When to be careful using RELY
Joins and set operators
Performance considerations and monitoring
Common query problems
Additional query considerations
Putting transformational modeling into practice
Gathering the business requirements
Reviewing the relational model
Building the transformational model
Summary
13
Modeling Slowly Changing Dimensions
Technical requirements
Dimensions overview
SCD types
Example scenario
Recipes for maintaining SCDs in Snowflake
Setting the stage
Type 1 – merge
Type 2 – Type 1-like performance using streams
Type 3 – one-time update
Summary
14
Modeling Facts for Rapid Analysis
Technical requirements
Fact table types
Fact table measures
Getting the facts straight
The world’s most versatile transactional fact table
The leading method for recovering deleted records
Type 2 slowly changing facts
Maintaining fact tables using Snowflake features
Building a reverse balance fact table with Streams
Recovering deleted records with leading load dates
Handling time intervals in a Type 2 fact table
Summary
15
Modeling Semi-Structured Data
Technical requirements
The benefits of semi-structured data in Snowflake
Getting hands-on with semi-structured data
Schema-on-read != schema-no-need
Converting semi-structured data into relational data
Summary
16
Modeling Hierarchies
Technical requirements
Understanding and distinguishing between hierarchies
A fixed-depth hierarchy
A slightly ragged hierarchy
A ragged hierarchy
Maintaining hierarchies in Snowflake
Recursively navigating a ragged hierarchy
Handling changes
Summary
17
Scaling Data Models through Modern Techniques
Technical requirements
Demystifying Data Vault 2.0
Building the Raw Vault
Loading with multi-table inserts
Modeling the data marts
Star schema
Snowflake schema
Discovering Data Mesh
Start with the business
Adopt governance guidelines
Emphasize data quality
Encourage a culture of data sharing
Summary
18: Appendix
Technical requirements
The exceptional time traveler
The secret column type Snowflake refuses to document
Read the functional manual (RTFM)
Summary
Index
Other Books You May Enjoy
Preface
Snowflake is one of the leading cloud data platforms and is gaining popularity among organizations looking to migrate their data to the cloud. With its game-changing features, Snowflake is unlocking new possibilities for self-service analytics and collaboration. However, Snowflake’s scalable consumption-based pricing model demands that users fully understand its revolutionary three-tier cloud architecture and pair it with universal modeling principles to ensure they are unlocking value and not letting money vaporize into the cloud.
Data modeling is essential for building scalable and cost-effective designs in data warehousing. Effective modeling techniques not only help businesses build efficient data models but also enable them to better understand their business. Though modeling is largely database-agnostic, pairing modeling techniques with game-changing Snowflake features can help build Snowflake’s most performant and cost-effective solutions.
This book combines the best practices in data modeling with Snowflake’s powerful features to offer you the most efficient and effective approach to data modeling in Snowflake. Using these techniques, you can optimize your data warehousing processes, improve your organization’s data-driven decision-making capabilities, and save valuable time and resources.
Who this book is for
Database modeling is a simple, yet foundational tool for enhancing communication and decision-making within enterprise teams and streamlining development. By pairing modeling-first principles with the specifics of Snowflake architecture, this book will serve as an effective tool for data engineers looking to build cost-effective Snowflake systems for business users looking for an easy way to understand them.
The three main personas who are the target audience of this content are as follows:
Data engineers: This book takes a Snowflake-centered approach to designing data models. It pairs universal modeling principles with unique architectural facets of the data cloud to help build performant and cost-effective solutions.
Data architects: While familiar with modeling concepts, many architects may be new to the Snowflake platform and are eager to learn and incorporate its best features into their designs for improved efficiency and maintenance.
Business analysts: Many analysts transition from business or functional roles and are cast into the world of data without a formal introduction to database best practices and modeling conventions. This book will give them the tools to navigate their data landscape and confidently create their own models and analyses.
What this book covers
Chapter 1, Unlocking the Power of Modeling, explores the role that models play in simplifying and guiding our everyday experience. This chapter unpacks the concept of modeling into its constituents: natural language, technical, and visual semantics. This chapter also gives you a glimpse into how modeling differs across various types of databases.
Chapter 2, An Introduction to the Four Modeling Types, looks at the four types of modeling covered in this book: conceptual, logical, physical, and transformational. This chapter gives an overview of where and how each type of modeling is used and what it looks like. This foundation gives you a taste of where the upcoming chapters will lead.
Chapter 3, Mastering Snowflake’s Architecture, provides a history of the evolution of database architectures and highlights the advances that make the data cloud a game changer in scalable computing. Understanding the underlying architecture will inform how Snowflake’s three-tier architecture unlocks unique capabilities in the models we design in later chapters.
Chapter 4, Mastering Snowflake Objects, explores the various Snowflake objects we will use in our modeling exercises throughout the book. This chapter looks at the memory footprints of the different table types, change tracking through streams, and the use of tasks to automate data transformations, among many other topics.
Chapter 5, Speaking Modeling through Snowflake Objects, bridges universal modeling concepts such as entities and relationships with accompanying Snowflake architecture, storage, and handling. This chapter breaks down the fundamentals of Snowflake data storage, detailing micro partitions and clustering so that you can make informed and cost-effective design decisions.
Chapter 6, Seeing Snowflake’s Architecture through Modeling Notation, explores why there are so many competing and overlapping visual notations in modeling and how to use the ones that work. This chapter zeroes in on the most concise and intuitive notations you can use to plan and design database models and make them accessible to business users simultaneously.
Chapter 7, Putting Conceptual Modeling into Practice, starts the journey of creating a conceptual model by engaging with domain experts from the business and understanding the elements of the underlying business. This chapter uses Kimball’s dimensional modeling method to identify the facts and dimensions, establish the bus matrix, and launch the design process. We also explore how to work backward using the same technique to align a physical model to a business model.
Chapter 8, Putting Logical Modeling into Practice, continues the modeling journey by expanding the conceptual model with attributes and business nuance. This chapter explores how to resolve many-to-many relationships, expand weak entities, and tackle inheritance in modeling entities.
Chapter 9, Database Normalization, demonstrates that normal doesn’t necessarily mean better—there are trade-offs. While most database models fall within the first to third normal forms, this chapter takes you all the way to the sixth, with detailed examples to illustrate the differences. This chapter also explores the various data anomalies that normalization aims to mitigate.
Chapter 10, Database Naming and Structure, takes the ambiguity out of database object naming and proposes a clear and consistent standard. This chapter focuses on the conventions that will enable you to scale and adjust your model and avoid breaking downstream processes. By considering how Snowflake handles cases and uniqueness, you can make confident and consistent design decisions for your physical objects.
Chapter 11, Putting Physical Modeling into Practice, translates the logical model from the previous chapter into a fully deployable physical model. In this process, we handle the security and governance concerns accompanying a physical model and its deployment. This chapter also explores physicalizing logical inheritance and demonstrates how to go from DDL to generating a visual diagram.
Chapter 12, Putting Transformational Modeling into Practice, demonstrates how to use the physical model to drive transformational design and improve performance gains through join elimination in Snowflake. The chapter discusses the types of joins and set operators available in Snowflake and provides guidance on monitoring Snowflake queries to identify common issues. Using these techniques, you will practice creating transformational designs from business requirements.
Chapter 13, Modeling Slowly Changing Dimensions, delves into the concept of slowly changing dimensions (SCDs) and provides you with recipes for maintaining SCDs efficiently using Snowflake features. You will learn about the challenges of keeping record counts in dimension tables in check and how mini dimensions can help address this issue. The chapter also discusses creating multifunctional surrogate keys and compares them with hashing techniques.
Chapter 14, Modeling Facts for Rapid Analysis, focuses on fact tables and explains the different types of fact tables and measures. You will discover versatile reporting structures such as the reverse balance and range-based factless facts and learn how to recover deleted records. This chapter also provides related Snowflake recipes for building and maintaining all the operations mentioned.
Chapter 15, Modeling Semi-Structured Data, explores techniques required to use and model semi-structured data in Snowflake. This chapter demonstrates that while Snowflake makes querying semi-structured data easy, there is effort involved in transforming it into a relational format that users can understand. We explore the benefits of converting semi-structured data to a relational schema and review a rule-based method for doing so.
Chapter 16, Modeling Hierarchies, provides you with an understanding of the different types of hierarchies and their uses in data warehouses. The chapter distinguishes between hierarchy types and discusses modeling techniques for maintaining each of them. You will also learn about Snowflake features for traversing a recursive tree structure and techniques for handling changes in hierarchy dimensions.
Chapter 17, Scaling Data Models through Modern Frameworks, discusses the utility of Data Vault methodology in modern data platforms and how it addresses the challenges of managing large, complex, and rapidly changing data environments. This chapter also discusses the efficient loading of the Data Vault with multi-table inserts and creating Star and Snowflake schema models for reporting information marts. Additionally, you will be introduced to Data Mesh and its application in managing data in large, complex organizations. Finally, the chapter reviews modeling best practices mentioned throughout the book.
Chapter 18, Appendix, collects all the fun and practical Snowflake recipes that couldn’t fit into the structure of the main chapters. This chapter showcases useful techniques such as the exceptional time traveler, exposes the (secret) virtual column type, and more!
To get the most out of this book
This book will rely heavily on the design and use of visual modeling diagrams. While a diagram can be drawn by hand, maintained in Excel, or constructed in PowerPoint, a modeling tool with dedicated layouts and functions is recommended. As the exercises in this book will take you from conceptual database-agnostic diagrams to deployable and runnable Snowflake code, a tool that supports Snowflake syntax and can generate deployable DDL is recommended.
This book uses visual examples from SqlDBM, an online database modeling tool that supports Snowflake. A free trial is available on their website here: https://sqldbm.com/Home/.
Another popular online diagramming solution is LucidChart (https://www.lucidchart.com/pages/). Although LucidChart does not support Snowflake as of this writing, it also offers a free tier for designing ER diagrams as well as other models such as Unified Modeling Language (UML) and network diagrams.
If you are using the digital version of this book, we advise you to type the code yourself or access the code from the book’s GitHub repository (a link is available in the next section). Doing so will help you avoid any potential errors related to the copying and pasting of code.
Download the example code files
You can download the example code files for this book from GitHub at https://github.com/PacktPublishing/Data-Modeling-with-Snowflake. If there’s an update to the code, it will be updated in the GitHub repository.
We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
Conventions used
There are a number of text conventions used throughout this book.
Code in text: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: Adding a discriminator between the CUSTOMER supertype and the LOYALTY_CUSTOMER subtype adds context that would otherwise be lost at the database level.
A block of code is set as follows:
-- Query the change tracking metadata to observe
-- only inserts from the timestamp till now
select * from myTable
changes(information => append_only)
at(timestamp => $cDts);
Bold: Indicates a new term, an important word, or words that you see onscreen. For instance, words in menus or dialog boxes appear in bold. Here is an example: Subtypes share common characteristics with a supertype entity but have additional attributes that make them distinct.
Tips or important notes
Appear like this.
Get in touch
Feedback from our readers is always welcome.
General feedback: If you have questions about any aspect of this book, email us at customercare@packtpub.com and mention the book title in the subject of your message.
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata and fill in the form.
Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at copyright@packt.com with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.
Share Your Thoughts
Once you’ve read Data Modeling with Snowflake, we’d love to hear your thoughts! Please click here to go straight to the Amazon review page for this book and share your feedback.
Your review is important to us and the tech community and will help us make sure we’re delivering excellent quality content.
Download a free PDF copy of this book
Thanks for purchasing this book!
Do you like to read on the go but are unable to carry your print books everywhere? Is your eBook purchase not compatible with the device of your choice?
Don’t