Graph Databases in Action: Examples in Gremlin

Ebook, 715 pages, 6 hours

Name: Graph Databases in Action: Examples in Gremlin
Authors: Josh Perryman, Dave Bechberger
ISBN: 9781638350101

By Josh Perryman and Dave Bechberger

About this ebook

Summary
Relationships in data often look far more like a web than an orderly set of rows and columns. Graph databases shine when it comes to revealing valuable insights within complex, interconnected data such as demographics, financial records, or computer networks. In Graph Databases in Action, experts Dave Bechberger and Josh Perryman illuminate the design and implementation of graph databases in real-world applications. You'll learn how to choose the right database solutions for your tasks, and how to use your new knowledge to build agile, flexible, and high-performing graph-powered applications!

Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications.

About the technology
Isolated data is a thing of the past! Now, data is connected, and graph databases—like Amazon Neptune, Microsoft Cosmos DB, and Neo4j—are the essential tools of this new reality. Graph databases represent relationships naturally, speeding the discovery of insights and driving business value.

About the book
Graph Databases in Action introduces you to graph database concepts by comparing them with relational database constructs. You'll learn just enough theory to get started, then progress to hands-on development. Discover use cases involving social networking, recommendation engines, and personalization.

What's inside
    Graph databases vs. relational databases
    Systematic graph data modeling
    Querying and navigating a graph
    Graph patterns
    Pitfalls and antipatterns

About the reader
For software developers. No experience with graph databases required.

About the author
Dave Bechberger and Josh Perryman have decades of experience building complex data-driven systems and have worked with graph databases since 2014.

Table of Contents

PART 1 - GETTING STARTED WITH GRAPH DATABASES

1 Introduction to graphs

2 Graph data modeling

3 Running basic and recursive traversals

4 Pathfinding traversals and mutating graphs

5 Formatting results

6 Developing an application

PART 2 - BUILDING ON GRAPH DATABASES

7 Advanced data modeling techniques

8 Building traversals using known walks

9 Working with subgraphs

PART 3 - MOVING BEYOND THE BASICS

10 Performance, pitfalls, and anti-patterns

11 What's next: Graph analytics, machine learning, and resources

Language: English

Publisher: Manning

Release date: Oct 17, 2020

ISBN: 9781638350101

Author

Josh Perryman

Josh Perryman is a technologist with over two decades of diverse experience building and maintaining complex systems, including high-performance computing (HPC) environments. Since 2014 he has focused on graph databases, especially in distributed or big data environments, and he regularly blogs and speaks at conferences about graph databases.

UNLIMITED
Data Fabric
Aug 13, 2020
3 min read
5 Tools That Integrate Your Cloud Storage Into Windows File Explorer
PCWorld
UNLIMITED
5 Tools That Integrate Your Cloud Storage Into Windows File Explorer
Apr 30, 2024
6 min read
RapidWeaver Classic
MacFormat
UNLIMITED
RapidWeaver Classic
Aug 23, 2022
3 min read
Browser wars 2020
APC
UNLIMITED
Browser wars 2020
Nov 2, 2020
8 min read

Related categories

Skip carousel

Reviews for Graph Databases in Action

Rating: 0 out of 5 stars

0 ratings

0 ratings0 reviews

Book preview

Graph Databases in Action - Josh Perryman

Graph Databases in Action

Examples in Gremlin

Dave Bechberger and Josh Perryman

Foreword by Ted Wilmes

Manning

Shelter Island

For more information on this and other Manning titles go to manning.com

Copyright

For online information and ordering of these and other Manning books, please visit manning.com. The publisher offers discounts on these books when ordered in quantity.

For more information, please contact

Special Sales Department

Manning Publications Co.

20 Baldwin Road

PO Box 761

Shelter Island, NY 11964

Email: orders@manning.com

No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by means electronic, mechanical, photocopying, or otherwise, without prior written permission of the publisher.

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in the book, and Manning Publications was aware of a trademark claim, the designations have been printed in initial caps or all caps.

♾ Recognizing the importance of preserving what has been written, it is Manning’s policy to have the books we publish printed on acid-free paper, and we exert our best efforts to that end. Recognizing also our responsibility to conserve the resources of our planet, Manning books are printed on paper that is at least 15 percent recycled and processed without the use of elemental chlorine.

ISBN: 9781617296376

foreword

preface

acknowledgments

about this book

about the authors

about the cover illustration

Part 1. Getting started with graph databases

1 Introduction to graphs

1.1 What is a graph?

What is a graph database?

Comparison with other types of databases

Why can’t I use SQL?

1.2 Is my problem a graph problem?

Explore the questions

I’m still confused. . . . Is this a graph problem?

2 Graph data modeling

2.1 The data modeling process

Data modeling terms

Four-step process for data modeling

2.2 Understand the problem

Domain and scope questions

Business entity questions

Functionality questions

2.3 Developing the whiteboard model

Identifying and grouping entities

Identifying relationships between entities

2.4 Constructing the logical data model

Translating entities to vertices

Translating relationships to edges

Finding and assigning properties

2.5 Checking our model

3 Running basic and recursive traversals

3.1 Setting up your environment

Starting the Gremlin Server

Starting the Gremlin Console, connecting to the Gremlin Server, and loading the data

3.2 Traversing a graph

Using a logical data model (schema) to plan traversals

Planning the steps through the graph data

Fundamental concepts of traversing a graph

Writing traversals in Gremlin

Retrieving properties with values steps

3.3 Recursive traversals

Using recursive logic

Writing recursive traversals in Gremlin

4 Pathfinding traversals and mutating graphs

4.1 Mutating a graph

Creating vertices and edges

Removing data from our graph

Updating a graph

Extending our graph

4.2 Paths

Cycles in graphs

Finding the simple path

4.3 Traversing and filtering edges

Introducing the E and V steps for traversing edges

Filtering with edge properties

Include edges in path results

Performant edge counts and denormalization

5 Formatting results

5.1 Review of values steps

5.2 Constructing our result payload

Applying aliases in Gremlin

Projecting results instead of aliasing

5.3 Organizing our results

Ordering results returned from a graph traversal

Grouping results returned from a graph traversal

Limiting results

5.4 Combining steps into complex traversals

6 Developing an application

6.1 Starting the project

Selecting our tools

Setting up the project

Obtaining a driver

Preparing the database server instance

6.2 Connecting to our database

Building the cluster configuration

Setting up the GraphTraversalSource

6.3 Retrieving data

Retrieving a vertex

Using Gremlin language variants (GLVs)

Adding terminal steps

Creating the Java method in our application

6.4 Adding, modifying, and deleting data

Adding vertices

Adding edges

Updating properties

Deleting elements

6.5 Translating our list and path traversals

Getting a list of results

Implementing recursive traversals

Implementing paths

Part 2. Building on graph databases

7 Advanced data modeling techniques

7.1 Reviewing our current data models

7.2 Extending our logical data model

7.3 Translating entities to vertices

Using generic labels

Denormalizing graph data

Translating relationships to edges

Finding and assigning properties

Moving properties to edges

Checking our model

7.4 Extending our data model for personalization

7.5 Comparing the results

8 Building traversals using known walks

8.1 Preparing to develop our traversals

Identifying the required elements

Selecting a starting place

Setting up test data

8.2 Writing our first traversal

Designing our traversal

Developing the traversal code

8.3 Pagination and graph databases

8.4 Recommending the highest-rated restaurants

Designing our traversal

Developing the traversal code

8.5 Writing the last recommendation engine traversal

Designing our traversal

Adding this traversal to our application

9 Working with subgraphs

9.1 Working with subgraphs

Extracting a subgraph

Traversing a subgraph

9.2 Building a subgraph for personalization

9.3 Building the traversal

Reversing the traversing direction

Evaluating the individualized results of the subgraph

9.4 Implementing a subgraph with a remote connection

Connecting with TinkerPop’s Client class

Adding this traversal to our application

Part 3. Moving beyond the basics

10 Performance, pitfalls, and anti-patterns

10.1 Slow-performing traversals

Explaining our traversal

Profiling our traversal

Indexes

10.2 Dealing with supernodes

It’s about instance data

It’s about the database

What makes a supernode?

Monitoring for supernodes

What to do if you have a supernode

10.3 Application anti-patterns

Using graphs for non-graph use cases

Dirty data

Lack of adequate testing

10.4 Traversal anti-patterns

Not using parameterized traversals

Using unlabeled filtering steps

11 What’s next: Graph analytics, machine learning, and resources

11.1 Graph analytics

Pathfinding

Centrality

Community detection

Graphs and machine learning

Additional resources

11.2 Final thoughts

appendix. Apache TinkerPop installation and overview

index

front matter

foreword

At the dawn of a new decade, developers are confronted with a myriad of database options when beginning a new project. The stalwart relational database still rules the roost, maintaining popularity in both legacy and greenfield projects. This is for good reason; flexibility and forty-plus years of cumulative engineering history are hard to argue with. Despite the success of relational databases, the last decade saw an explosion of new commercial and open-source database systems that were designed around alternative models and query languages. Some tackle traditional RDBMS workloads with a new twist, perhaps focusing on horizontal scale-out or on high performance via in-memory optimizations that have become available due to decreases in RAM prices. Many other systems diverged from the relational model altogether. Out of this set, we find a variety of focus areas and modeling paradigms. This book focuses on one of the more expressive and powerful developments, the graph model, and the property graph in particular.

Graph databases aren’t a new thing. Hierarchical and navigational databases have existed since the 60s, but these have recently experienced an increase in developer popularity. I think this is largely due to the intuitiveness of the property graph data model. People are already wired to think in graphs. If you draw a graph on a whiteboard, technical and non-technical folks get it. Consequently, after you overlay the graph model onto your software tasks at hand, everything starts to look like a graph problem.

With all that said, we’re still dealing with technology, and the available property graph databases are newer technology at that, so there isn’t any magic. This is where Dave and Josh come in. I can’t imagine a better pair to help lay out the signposts and guide you on the journey to graph understanding. Both are accomplished graph architects and developers who have been involved in this junior space since before its recent uptick in popularity. Having worked in graph-based product development and consulting, they’ve racked up years of real-world experience.

This experience has influenced their pragmatic approach to the problems of graph application development, and though both are proponents of graphs, they’re proponents with a healthy dose of skepticism and are not overly fascinated with the technology. After all, as mentioned, one of the first and most important questions new developers have is, “Is this a graph problem?” As you make your way through this book, you’ll hone an intuition for translating real-world problems into graph data models and build up your chops in Gremlin, a popular and powerful property graph query language. The rubber meets the road in chapter 6, where you use this knowledge to build your first graph application. By the time you’ve finished, you’ll have the knowledge to evaluate whether a graph database is a good fit for your next project and, if so, to execute on that vision, having already built an example graph database application.

Ted Wilmes

Data Architect & JanusGraph Technical Steering Committee Member

Expero Inc.

preface

Two complementary trends started in the mid to late 2000s. First, companies began using and collecting more data on their customers, competition, and users than ever before. Second, the information companies wanted from this data became more complex, often containing hidden connections. These two trends drove the need for an easier exploration of expansive, yet highly connected data. Graph databases met that need.

Both authors have gotten an up-close and personal view of this market as graph technology, its usage, and its adoption have matured. We both started using graph databases in the mid-2010s while working for a niche software consulting company. Independently, we each worked on projects that used graph databases to solve specific types of complex data problems. At that time, graph databases were new and very rough. Despite the challenges of working with new technologies, we both recognized the power of this tool and were hooked.

Since then, we have spent countless hours banging our heads against a proverbial wall to understand all the intricacies and nuances of building graph-backed applications. This book is the distillation of those countless hours of struggle. It is our hope that the hands-on nature of this book will provide a solid, foundational understanding of the skills needed to build graph-backed applications and, in the process, help you to avoid some of the pitfalls that we encountered.

acknowledgments

This book has been a labor of love, and sometimes frustration, so we first and foremost need to thank our wives (Melody and Meredith), and then acknowledge family and friends for their endless patience and for indulging us as we shared our latest esoteric discoveries while working with graph databases. Without their support we never could have made it through the countless hours it took to create this book.

A big thank you goes out to Dr. Denise Gosnell, Kelly Mondor, Ted Wilmes, and Daniel Farrell for all the specific insights, interviews, and support you provided, which helped us immensely in creating this book.

We would also like to thank the team at Manning Publications for allowing us the time and opportunity to publish this book. We would like to thank the entire Manning staff and specifically our publishers Marjan Bace and Michael Stephens, as well as our editors Frances Lefkowitz, Nick Watts, Alex Ott, Lori Weidert, and Frances Buran for all the amazing feedback and endless patience you have shown. Our appreciation also goes out to all the reviewers whose comments and reviews were invaluable in solidifying the organization and in clarifying the focus of this book: Scott Bartram, Andrew Blair, Alain Couniot, Douglas Duncan, Mike Erickson, John Guthrie, Mike Haller, Milorad Imbra, Ramaninder Singh Jhajj, Mike Jensen, Nicholas Robert Keers, Mladen Knežić, Miguel Montalvo, Luis Moux, Nick Rakochy, Ron Sher, Deshuang Tang, Richard Vaughan, and Matthew Welke.

We would also like to thank the team at Expero Inc., without whom Josh and Dave would never have met, nor would have ever started their exploration of graph databases. Our many years of working side by side with the exceptionally talented Experonauts were a fruitful starting point that eventually led to writing this book.

about this book

This book is written for anyone building applications using graph databases. It is designed to provide a foundational understanding of graphs and graph databases, as well as to provide a framework for building applications using common graph database patterns. To teach this framework, this book follows the development lifecycle of a fictitious application called DiningByFriends. We use this application throughout the book to provide a realistic grounding of graph principles and examples of the concepts and content we teach. In many areas throughout this book, we compare and contrast the differences between building a graph-backed application and using the more traditional relational database model. By the end of this book, you will not only have the skills needed to build your own graph-backed application, but you will have built your first application, DiningByFriends.

Who should read this book

This book is for application developers, data engineers, and database developers who want to use graph databases as the backing data store for their applications. Throughout this book, we do not expect the reader to have any prior experience using graph databases, but you should be familiar with data modeling concepts, specifically with relational database development, as these are used heavily throughout as a common point of reference. Although all the application code is written in Java, any developer with object-oriented application development experience should be able to follow along with the concepts and content.

How this book is organized: A roadmap

This book is organized into 3 parts comprising 11 chapters. In part 1, Getting started with graph databases, we establish the foundation for our DiningByFriends application:

Chapter 1 begins with an introduction to graphs and graph terminology. We discuss how graph databases differ from relational databases and how you can use graph databases to solve highly connected data problems. We finish this chapter by discussing what makes a problem a good candidate for using a graph database.

Chapter 2 is where we hit the ground running by building an initial data model for our DiningByFriends application. We start with the types of information needed to begin the data modeling process. We then show how to turn this information into a conceptual data model. Finally, we walk through a framework for taking our business needs and our conceptual data model and turning them into our initial data model using the elements of a graph database: vertices, edges, and properties.

Chapter 3 begins a set of three chapters focused on learning the process of querying a graph database, known as traversing. We begin by teaching you how to retrieve and filter data from our graph. We follow this with learning how to navigate the structure of our graph and how that differs from working with a relational database. Then we finish up this chapter by demonstrating the ease with which you can recursively traverse through a graph to retrieve complex, interconnected data.

Chapter 4 continues our exploration of graph traversals with data mutation use cases. We then show how you can traverse the graph to find the entities and relationships that connect two items, known as the path. Finally, we look at how to leverage properties on relationships to filter the traversals and increase their performance.

Chapter 5 finishes our initial focus on graph traversals with a discussion of ways to format the results of our traversal into a desired output. Additionally, you learn how to perform common operations such as sorting, filtering, and limiting the results returned.

Chapter 6 begins the process of building our DiningByFriends application by taking the traversals we developed in chapters 3, 4, and 5 and walking through incorporating these into a Java application. Then we’ll process the results to complete this first part.

In part 2, Building on graph databases, we extend the concepts introduced in part 1:

Chapter 7 uses the foundations of data modeling from chapter 2, as well as what you learned about traversing a graph, to extend the data model for more complex use cases, such as recommendation engines and personalization.

Chapter 8 leverages a recommendation engine use case to demonstrate the power of using a known-walk pattern to create a robust recommendation application pattern.

Chapter 9 uses our personalization use case to demonstrate how to use a subgraph access pattern within a graph-backed application.

In part 3, Moving beyond the basics, we move past the DiningByFriends application to discuss our next steps in the application development process:

Chapter 10 discusses how to debug and troubleshoot common performance problems with traversals. We then investigate exactly what supernodes are and why they cause issues in graph-backed applications. We follow up these common performance problems with common application and traversal pitfalls and anti-patterns, as well as how to recognize and avoid them.

Chapter 11 takes a forward-looking view and discusses some of the next steps you might want to take with your graph-backed application. We also discuss some of the most common graph analytics algorithms and how you can apply these to solve a specific problem. Finally, we wrap up this chapter with a brief overview of how to leverage graphs in machine learning (ML) applications.

About the code

This book contains many examples of source code, both in numbered listings and in line with normal text. In both cases, source code is formatted in a fixed-width font like this to separate it from ordinary text.

In many cases, the original source code has been reformatted; we’ve added line breaks and reworked indentation to accommodate the available page size in the book. In rare cases, even this was not enough and code listings include line-continuation markers (➥). Additionally, code annotations accompany many of the listings, highlighting important concepts.

The code for the examples in this book is available for download from the Manning website at https://www.manning.com/books/graph-databases-in-action, and from GitHub at https://github.com/bechbd/graph-databases-in-action.

About the technologies

Our goal throughout this book is to equip the reader with the conceptual knowledge needed to build graph-backed applications. However, in order to provide practical examples of these concepts, we had to make decisions regarding the technologies used for demonstration.

Our first decision was to pick the type of database. We decided to use a labeled property graph database, instead of, for example, an RDF store or triplestore database. Labeled property graph databases are the most common type we have seen in production use and seem to be the ones with the most momentum behind them. Additionally, these are the closest to the familiar concepts of relational databases, so labeled property graph databases are quite effective for comparisons.

This led us to our next decision: the traversal language to use, openCypher or Gremlin.

While there’s a strong case for using openCypher, the goal of this book is to remain as vendor-agnostic as possible. It is important to us that these concepts and techniques are easily transferable to many popular databases when you start to build your applications. In the end, we decided to use the Apache TinkerPop version 3.4.x framework because it currently has the most database vendors with compatible implementations.

We have been questioned multiple times during the proposal and review processes as to why we chose this stack over a Neo4j/Cypher stack. Given the popularity of the Neo4j ecosystem, this is a fair question that deserves a fuller answer. There are three reasons we chose TinkerPop’s Gremlin for the illustrations throughout this book:

Gremlin is a better tool for teaching how a traversal works.

Gremlin is a common language of choice for enterprise applications.

Gremlin is the most portable language between property graph databases.

As for the first reason, we believe that the imperative design of Gremlin provides a better teaching tool for learning how a graph traversal works compared to the declarative approach of Cypher/openCypher. The syntax of Gremlin requires that we think about how we are moving through our graph in order to determine where we will move next. While we do appreciate the simplicity of Cypher/openCypher, it can also obfuscate critical technical matters, especially when dealing with issues of performance or scale. So while Cypher/openCypher is a great starting point for learning how to work with connected data, we feel that Gremlin is better suited for building high performing, scalable data applications.
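
To make the idea of moving through a graph one step at a time concrete, here is a minimal sketch in plain Java. It is not TinkerPop code and it is not taken from the book's listings: the TraversalSketch class, the out helper, the sample names, and the friends relationship are all hypothetical and chosen only for illustration. Each call to out moves from the current set of vertices along their outgoing edges, which is roughly the mental model behind chaining steps in a Gremlin traversal.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, hand-rolled stand-in for a graph. This is NOT the TinkerPop API.
class TraversalSketch {

    // Adjacency list: each person maps to the people reached by an outgoing "friends" edge.
    // The names and edges are made-up sample data.
    static Map<String, List<String>> sampleGraph() {
        Map<String, List<String>> graph = new LinkedHashMap<>();
        graph.put("Ted", List.of("Josh"));
        graph.put("Josh", List.of("Dave", "Hank"));
        graph.put("Dave", List.of());
        graph.put("Hank", List.of());
        return graph;
    }

    // One traversal step: from every current vertex, follow its outgoing edges.
    static List<String> out(Map<String, List<String>> graph, List<String> current) {
        List<String> next = new ArrayList<>();
        for (String vertex : current) {
            next.addAll(graph.getOrDefault(vertex, List.of()));
        }
        return next;
    }

    public static void main(String[] args) {
        Map<String, List<String>> graph = sampleGraph();

        // Roughly the shape of a Gremlin traversal such as
        //   g.V().has("person", "first_name", "Ted").out("friends").out("friends")
        // where the label and property names are assumptions for this sketch.
        List<String> start = List.of("Ted");               // where we begin
        List<String> friends = out(graph, start);          // first step
        List<String> friendsOfFriends = out(graph, friends); // second step

        System.out.println(friendsOfFriends); // prints [Dave, Hank]
    }
}
```

The point of the sketch is that we decide, at every step, where we are in the graph and where we move next. A declarative query would instead describe the pattern to match and leave the order of those moves to the database engine.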

As for the second reason, many enterprise applications were built using TinkerPop-enabled databases, which makes Gremlin their query language of choice. Some organizations have both Cypher/openCypher and Gremlin applications. But in our experience, the bigger, more complex enterprise-level projects seem to have chosen one of the many TinkerPop-enabled databases or cloud services.

As for the third reason, at this time, it is easy to say that Gremlin is the most widely available query language across graph database engines. Nearly all of the major cloud vendors (Amazon Web Services, Microsoft Azure, IBM, Huawei, and so forth) offer graph databases or services compatible with Gremlin. The lone exception is the Google Cloud Platform, which offers Neo4j as a service.

Our goal is not to advocate for one database or language over another. We seek to provide you with a solid foundation for how to use a graph database when building applications with highly connected data and to illustrate how graph databases work under the covers. We think that Gremlin provides the best path to accomplish this.

With the decision to use TinkerPop’s Gremlin made, we had to pick a specific TinkerPop-enabled database to use. In the spirit of remaining vendor agnostic, we’ve decided to use TinkerGraph for the examples. TinkerGraph is the graph implementation used in the Gremlin Server and Gremlin Console, the reference software provided as part of the Apache Software Foundation’s TinkerPop project.

Finally, we had to decide on an application programming language to build our example application, DiningByFriends. As Java is the most common language we have used with graph databases, we chose that as our application language. We should note that it is possible to build the same application with other languages such as C#, JavaScript and Python. Not only is it possible, we have done so ourselves. But all the traversals provided in this book are written in Gremlin and any application code is written in Java.

While almost all the concepts presented throughout this book are not specific to TinkerPop-enabled databases, there are a few we discuss that are unique to TinkerPop. When this is the case, we’ll note where a TinkerPop-specific feature is used so that you’re aware that a particular feature might not be available in your graph database of choice. If no such note is given, it is safe to assume that the concept we discuss is applicable to other labeled property graph databases as well.

liveBook discussion forum

Purchase of Graph Databases in Action includes free access to a private web forum run by Manning Publications where you can make comments about the book, ask technical questions, and receive help from the authors and from other users. To access the forum, go to https://livebook.manning.com/#!/book/graph-databases-in-action/discussion. You can also learn more about Manning’s forums and the rules of conduct at https://livebook.manning.com/#!/discussion.

Manning’s commitment to our readers is to provide a venue where a meaningful dialogue between individual readers and between readers and the authors can take place. It is not a commitment to any specific amount of participation on the part of the authors, whose contribution to the forum remains voluntary (and unpaid). We suggest you try asking the authors some challenging questions lest their interest stray! The forum and the archives of previous discussions will be accessible from the publisher’s website as long as the book is in print.

about the authors

Dave Bechberger is a data architect and developer with over two decades of experience. He uses his extensive knowledge of graph and other big data technologies to build highly performant and scalable data platforms in complex data domains such as bioinformatics, oil and gas, and supply chain management. Since the mid-2010s, Dave has worked with graph databases as a consultant, consumer, and vendor. He is an active member of the graph community and has presented on a wide range of graph-related topics at national and international conferences.

Josh Perryman also has over two decades of experience building and maintaining complex systems. Since 2014, he has focused on graph databases, especially in distributed or big data environments, and he regularly blogs and speaks at conferences about graph databases. Josh has worked with a variety of industries, including enterprise software, financial services, consumer products, and government intelligence agencies. In addition to consulting and product work, he has designed Gremlin training courses that have been delivered all over the world.

about the cover illustration

The figure on the cover of Graph Databases in Action is captioned Femme de la Foret Noire, or a woman from the Black Forest, in Southwest Germany. The illustration is taken from a collection of dress costumes from various countries by Jacques Grasset de Saint-Sauveur (1757-1810), titled Costumes civils actuels de tous les peuples connus, published in France in 1788. Each illustration is finely drawn and colored by hand. The rich variety of Grasset de Saint-Sauveur’s collection reminds us vividly of how culturally apart the world’s towns and regions were just 200 years ago. Isolated from each other, people spoke different dialects and languages. In the streets or in the countryside, it was easy to identify where they lived and what their trade or station in life was just by their dress.

The way we dress has changed since then and the diversity by region, so rich at the time, has faded away. It is now hard to tell apart the inhabitants of different continents, let alone different towns, regions, or countries. Perhaps we have traded cultural diversity for a more varied personal life--certainly for a more varied and fast-paced technological life.

At a time when it is hard to tell one computer book from another, Manning celebrates the inventiveness and initiative of the computer business with book covers based on the rich diversity of regional life of two centuries ago, brought back to life by Grasset de Saint-Sauveur’s pictures.

Part 1. Getting started with graph databases

Journeys into new technologies take work, and in this book, our journey will extend your current knowledge of building relational database applications to demonstrate how you can solve complex data problems by building graph databases and graph-backed applications. In this first part, we ease into your journey by establishing concepts, terms, and processes, while highlighting the critical differences required when approaching a problem with a graph mindset.

Chapter 1 introduces the core concepts of graphs and discusses the types of problems that are well suited for these models. In chapter 2, we establish a data modeling methodology and build a simple data model for a social network that we’ll use in our example application, DiningByFriends. The next three chapters introduce the most common operations that you’ll use to find and manipulate data in graph databases. We approach these operations in three stages, starting with the basics of moving around a graph in chapter 3. Chapter 4 then covers how to perform basic CRUD (Create/Read/Update/Delete) operations before extending the work we did in chapter 3 to perform more complex recursive and pathfinding traversals. In chapter 5, we close our introduction by using simple graph operations to examine ways to organize your results. Chapter 6 completes this part by synthesizing the work from chapters 2 through 5 into our working Java application, DiningByFriends.

1 Introduction to graphs

This chapter covers

An introduction to graphs and graph terminology

How graph databases help solve highly connected data problems

The advantages of graph databases over relational databases

Identifying problems that make good candidates for using a graph database

Modern applications are built on data, and that data is ever increasing in both size and complexity. Even as the complexity of our data grows, so do our expectations of what insight our applications can derive from that data. If you are old enough, you likely remember when applications took a long time to load data and had limited features. Today’s reality is different; applications provide powerful, flexible, and immediate insight into data. But for every 100 questions modern applications answer, the most common data tool they use (namely, a relational database) handles only about 88 of those questions well. That leaves about 12 of every 100 questions where relational databases struggle. These remaining questions deal with the links and connections within the data, those aspects of the data that can generate powerful and unique insights. This puts us at a crossroads: we can use the relational database hammer to pound away at those questions and make it work well enough, or we can take a step back and look at what other tools can answer these questions better, faster, and with less effort.

By reading this book, you have decided to take a step back from your relational database hammer and investigate a road less traveled: graph databases. This book is written for developers, engineers, and architects who are interested in other ways to solve problems specific to working with highly connected data. We assume you are already familiar with relational databases but are interested in learning when, where, and how graph databases are a better tool.

Our goal with this book is to equip you with the techniques needed to add graph databases as another tool in your toolbelt. We like to think of this book as the guide that we wish we had when we started building graph-backed applications. Throughout this book, we’ll demonstrate common graph patterns that highlight how graph databases enable navigation and exploration of data in ways not easily accomplished with a traditional relational database.

Our primary approach is through an example of building a fictitious restaurant review and recommendation application we call DiningByFriends. As we move through the software development life cycle from planning, to analysis, to design, and on to implementation, this application demonstrates how to think about and work with graph data. Each chapter builds on the previous chapter, and by the end of this book, we’ll have created a functioning application on a graph database. We believe that putting the concepts immediately to work by solving a realistic set of problems, even if they are somewhat simplistic, is the best way to get comfortable using a new technology. Let’s begin our journey with an introduction to what graphs and graph databases are and how they compare with traditional tools such as relational databases.

1.1 What is a graph?

When you look at a road map, examine an organizational chart, or use social networks such as Facebook, LinkedIn, or Twitter, you use a graph. Graphs are a nearly ubiquitous way to think about real-world scenarios because they abstract out the items and the relationships being represented, and this abstraction allows for quick and efficient processing of the connections within the data.

Let’s demonstrate with a common task: going to the supermarket. Take out a piece of paper and draw out a plan for getting from your house to your supermarket. Chances are it looks something like figure 1.1.

Figure 1.1 A graph representing directions to the supermarket

Figure 1.1 shows a graph where the key items and relationships are represented by abstractions. First, we abstracted key locations, like intersections, and represented these as circles. We then designated the connections between these key intersections as lines, showing how the key intersections are related. This is just one example of how we naturally represent real-world problems as graphs.
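To make this abstraction concrete, here is a minimal sketch in plain Python (not Gremlin, and not code from the book) of a graph like the one just described. The intersection names and the roads between them are hypothetical, chosen only for illustration; circles become dictionary keys, and lines become pairs of connected locations. A breadth-first search then finds a route that uses the fewest roads.

```python
from collections import deque

# Hypothetical roads (lines) between hypothetical locations (circles).
# These names are illustrative assumptions, not taken from figure 1.1.
roads = [
    ("house", "corner_a"),
    ("corner_a", "corner_b"),
    ("corner_a", "corner_c"),
    ("corner_b", "supermarket"),
    ("corner_c", "supermarket"),
]


def build_adjacency(edges):
    """Map each location to the list of locations directly connected to it."""
    adjacency = {}
    for a, b in edges:
        # Roads can be traveled in both directions, so record both ends.
        adjacency.setdefault(a, []).append(b)
        adjacency.setdefault(b, []).append(a)
    return adjacency


def shortest_route(adjacency, start, goal):
    """Breadth-first search: return a route with the fewest roads, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        location = path[-1]
        if location == goal:
            return path
        for neighbor in adjacency.get(location, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None


graph = build_adjacency(roads)
route = shortest_route(graph, "house", "supermarket")
print(" -> ".join(route))
```

The point of the sketch is only that the whole problem reduces to two kinds of things: locations and the connections between them. Everything else about the real streets has been abstracted away.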

It is human nature to abstract real-world entities and their relationships, and the mathematical name for this abstract construct is a graph. When thinking about a set of data that contains a vast array of highly interconnected items, we might also describe this data set as a web of interconnected things, which is just another way of saying a graph.

On maps, cities are frequently represented by circles, and the roads that connect these are represented by lines. On an organizational chart (org chart), a circle usually represents a person, normally with an associated title, and lines that connect these

Book preview

Graph Databases in Action - Josh Perryman

Graph Databases in Action

contents

foreword

preface

acknowledgments

about this book
