ScyllaDB Architecture -
built for speed
Tzach Livyatan, VP Product
Tzach Livyatan
■ VP Product ScyllaDB
■ Love Databases and NoSQL
Presentation Agenda
■ High Availability
■ Data Modeling
■ Implementation
Distributed
Node
HW
Control
High Availability
NoSQL – By Data Model (in order of increasing complexity)
Key / Value: Redis, Aerospike, RocksDB
Document store: MongoDB, Couchbase
Wide column store: Scylla, Apache Cassandra, HBase, DynamoDB
Graph: Neo4j, JanusGraph
NoSQL – By Availability vs Consistency
Pick Two: Availability, Partition Tolerance, Consistency
PACELC: Latency vs Consistency
Cluster - Node Ring
[Diagram: ring of five nodes, Node 1 through Node 5]
Data Replication
■ Replication Factor: number of nodes on which data (rows and partitions) is replicated
■ Done automatically
■ Set for keyspace
CREATE KEYSPACE mykeyspace WITH replication = {
'class': 'NetworkTopologyStrategy',
'replication_factor' : 3}
AND durable_writes = true;
Replication Factor (RF) = 3
[Diagram: the five-node ring, Node 1 through Node 5, with RF = 3]
Multiple Data Centers
[Diagram: USA DC, Asia DC, EU DC]
CREATE KEYSPACE mykeyspace WITH replication = {
'class': 'NetworkTopologyStrategy',
'us_1' : 3,
'eu' : 3,
'asia' : 3};
Consistency Level
■ CL: number of replicas that must acknowledge a read or write
■ E.g.: ONE, QUORUM, LOCAL_QUORUM, ALL
■ Tunable Consistency: CL set per operation
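The relationship between CL and RF can be sketched in a few lines. This is a minimal illustration, not ScyllaDB code: the function names are hypothetical, LOCAL_QUORUM is modeled simply as a quorum over the local data center's RF, and the overlap check is the usual rule of thumb that a read sees the latest acknowledged write when the read and write replica sets must intersect (R + W > RF).

```python
def required_acks(cl: str, rf: int) -> int:
    """Number of replicas that must acknowledge an operation (hypothetical helper)."""
    cl = cl.upper()
    if cl in ("1", "ONE"):
        return 1
    if cl in ("QUORUM", "LOCAL_QUORUM"):
        # A quorum is a strict majority of the replicas.
        # For LOCAL_QUORUM, rf is assumed to be the RF of the local data center.
        return rf // 2 + 1
    if cl == "ALL":
        return rf
    raise ValueError(f"unsupported consistency level: {cl}")


def read_sees_latest_write(write_cl: str, read_cl: str, rf: int) -> bool:
    """True when any read replica set must overlap any write replica set."""
    return required_acks(write_cl, rf) + required_acks(read_cl, rf) > rf


if __name__ == "__main__":
    for write_cl, read_cl in [("QUORUM", "QUORUM"), ("ONE", "ONE"), ("ONE", "ALL")]:
        print(write_cl, read_cl, read_sees_latest_write(write_cl, read_cl, rf=3))
```

Because the CL is chosen per operation, an application can pay for overlap only on the reads and writes that need it.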
Cluster Level Write
Cluster Level Read
CL =1
Datacenters
USA DC
Asia DC
LOCAL_QUORUM
Scylla Architecture
Data Modeling
CQL Example
Query:
SELECT * FROM heartrate_v10
WHERE pet_chip_id = 80d39c78-9dc0-11eb-a8b3-0242ac130003
LIMIT 1;

SELECT * FROM heartrate_v10
WHERE pet_chip_id = 80d39c78-9dc0-11eb-a8b3-0242ac130003
AND time >= '2021-05-01 01:00+0000'
AND time < '2021-05-01 01:03+0000';
https://gist.github.com/tzach/7486f1a0cc904c52f4514f20f14d2a97
Wide Partition Example
CREATE TABLE heartrate_v10 (
pet_chip_id uuid,
owner uuid,
time timestamp,
heart_rate int,
PRIMARY KEY (pet_chip_id, time)
);
pet_chip_id time heart_rate
80d39c78-9dc0-11eb-a8b3-0242ac130003 2021-05-01 01:00:00.000000+0000 120
80d39c78-9dc0-11eb-a8b3-0242ac130003 2021-05-01 01:01:00.000000+0000 121
80d39c78-9dc0-11eb-a8b3-0242ac130003 2021-05-01 01:02:00.000000+0000 120
Partition Key: pet_chip_id. Clustering Key: time.
Architecture
pet_chip_id time heart_rate
80d39c78-9dc0-11eb-a8b3-0242ac130003 2021-05-01 01:00:00.000000+0000 120
80d39c78-9dc0-11eb-a8b3-0242ac130003 2021-05-01 01:01:00.000000+0000 121
80d39c78-9dc0-11eb-a8b3-0242ac130003 2021-05-01 01:02:00.000000+0000 120
Partitioner
Hash Function
Partition Key
Token Range
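The chain from partition key through the hash function to a token range can be sketched as follows. This is a simplified model under stated assumptions: MD5 from Python's standard library stands in for the real partitioner hash (ScyllaDB's default partitioner uses Murmur3), each node owns a single evenly spaced token, and replicas are simply the next RF nodes clockwise on the ring. The function names are hypothetical.

```python
import bisect
import hashlib
import uuid

RING_SIZE = 2**64  # size of the token space in this sketch


def token_for(partition_key: uuid.UUID) -> int:
    """Hash the partition key to a token (MD5 is a stand-in hash, see lead-in)."""
    digest = hashlib.md5(partition_key.bytes).digest()
    return int.from_bytes(digest[:8], "big")


def build_ring(nodes):
    """Assign one evenly spaced token per node; returns a list sorted by token."""
    step = RING_SIZE // len(nodes)
    return [(i * step, node) for i, node in enumerate(nodes)]


def replicas_for(ring, token, rf):
    """Pick the node owning the token's range, then the next rf - 1 nodes clockwise."""
    tokens = [t for t, _ in ring]
    # A node owns the range (previous token, its own token]; wrap past the last token.
    start = bisect.bisect_left(tokens, token) % len(ring)
    return [ring[(start + i) % len(ring)][1] for i in range(rf)]


ring = build_ring(["Node 1", "Node 2", "Node 3", "Node 4", "Node 5"])
key = uuid.UUID("80d39c78-9dc0-11eb-a8b3-0242ac130003")
replicas = replicas_for(ring, token_for(key), rf=3)
print(replicas)
```

All rows sharing a pet_chip_id hash to the same token, so the whole wide partition lives on the same set of replicas.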
Wide Partition Example
Advanced Data Modeling
Materialized Views (MV)
Secondary Index (SI)
Change Data Capture (CDC)
Collections
User Defined Types
Time To Live (TTL)
…
View is another table
1. INSERT INTO heartrate (pet_chip_id, owner, time, heart_rate) VALUES (..);
2. INSERT INTO heartrate
3. INSERT INTO heartrate_by_owner
[Diagram nodes: Coordinator, Base replica, View replica]
View is another table
1. SELECT * FROM heartrate_by_owner WHERE owner = '642a..';
2. SELECT * FROM heartrate_by_owner WHERE owner = '642a..';
[Diagram nodes: Coordinator, Base replica, View replica]
Global Secondary Index - Different Partition Key
1. SELECT * FROM heartrate_v10 WHERE owner = '642a..';
2. SELECT name FROM pet_by_owner_index WHERE owner = '642a..';
3. SELECT * FROM heartrate_v10 WHERE pet_chip_id IN (...) AND time IN (...)
[Diagram nodes: Coordinator, View replica, Base replica]
Write Path - Replica
Read Path - Replica
[Diagram labels: Cache, SSTables, Bloom Filter, Memory, Disk; numbered steps 1, 2, 2.5, 3, 4, 5, …]
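The role of the Bloom filter in the read path can be illustrated with a small sketch: a per-SSTable filter answers "definitely not here" or "maybe here", so a read can skip SSTables that cannot contain the key. This shows the general technique only, not ScyllaDB's implementation; the class, the bit-array size, the number of hashes and the SHA-256-based hashing are all assumptions chosen for readability.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, key: str):
        # Derive num_hashes bit positions by salting the key with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key: str) -> bool:
        return all(self.bits[pos] for pos in self._positions(key))


def sstables_to_read(key: str, sstables):
    """Return names of SSTables whose filter says the key may be present."""
    return [name for name, bloom in sstables if bloom.might_contain(key)]


with_key = BloomFilter()
with_key.add("80d39c78-9dc0-11eb-a8b3-0242ac130003")
empty = BloomFilter()
candidates = sstables_to_read(
    "80d39c78-9dc0-11eb-a8b3-0242ac130003",
    [("sstable-1", with_key), ("sstable-2", empty)],
)
print(candidates)
```

The filter lives in memory, so a negative answer avoids a disk read entirely.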
Storage - Log-Structured Merge Tree
[Diagram, three steps over time: (1) SSTable 1; (2) SSTable 1 and SSTable 2; (3) SSTable 1, SSTable 2, SSTable 3 and SSTable 4, with a merged SSTable 1+2+3]
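The merge of several SSTables into one can be sketched as below. It is a deliberately simplified model: each SSTable is a dictionary mapping a key to a (write timestamp, value) pair, the newest timestamp wins, and tombstones, TTLs and the choice of compaction strategy are left out. The function name and sample data are hypothetical.

```python
def compact(sstables):
    """Merge SSTables into one, keeping the newest version of each key."""
    merged = {}
    for table in sstables:
        for key, (timestamp, value) in table.items():
            # Keep the entry with the highest write timestamp.
            if key not in merged or timestamp > merged[key][0]:
                merged[key] = (timestamp, value)
    # SSTables are sorted by key, so return the result in key order.
    return dict(sorted(merged.items()))


sstable_1 = {"a": (1, 120), "b": (1, 100)}
sstable_2 = {"a": (2, 121)}  # newer write to the same key
sstable_3 = {"c": (3, 90)}

merged = compact([sstable_1, sstable_2, sstable_3])
print(merged)
```

Since SSTables are immutable, the inputs are only deleted once the merged file is complete; new writes keep landing in newer SSTables in the meantime.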
Implementation
ScyllaDB Design Decisions
1 C++ instead of Java
2 All Things Async
3 Shard per Core
4 Unified Cache
5 I/O Scheduler
6 Autonomous
Shards
Small, Medium, Large Machines
Why larger nodes?
■ Fewer nodes, so time between failures is longer
■ Ease of maintenance
■ No noisy neighbours
■ No virtualization or container overhead
■ No other moving parts
■ Scale up before out!
Linear Scale Ingestion
Ingestion time stays constant while data volume and throughput double at each step (2X)
Network Comparison
Traditional Stack (Cassandra): application threads; TCP/IP, scheduler and queues in the kernel; NIC queues; memory.
Seastar's Sharded Stack: per core, an application (database) instance with its own TCP/IP stack, task scheduler, queues and smp queue, all in userspace; each has its own NIC queue via DPDK, and the kernel isn't involved.
ScyllaDB Has Its Own Task Scheduler
Traditional Stack: threads, each with its own stack, scheduled onto the CPUs
■ Thread is a function pointer
■ Stack is a byte array from 64k to megabytes
Scylla's Stack: one scheduler per CPU, running promises and tasks
■ Promise is a pointer to an eventually computed value
■ Task is a pointer to a lambda function
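The promise and task definitions above can be made concrete with a toy model. This sketch is in Python rather than C++ and is not Seastar's API: the Promise and Scheduler classes are hypothetical, and the scheduler is a single run queue standing in for the per-CPU scheduler. It shows the idea that a task is just a small callable queued for later, and a promise holds a value that continuations will receive once it is computed.

```python
from collections import deque


class Scheduler:
    """Single run queue standing in for one CPU's task scheduler."""

    def __init__(self):
        self._queue = deque()

    def submit(self, task):
        self._queue.append(task)

    def run(self):
        # Tasks run to completion one after another; none has its own stack.
        while self._queue:
            self._queue.popleft()()


class Promise:
    """Holds a value that will be computed later."""

    def __init__(self, scheduler):
        self._scheduler = scheduler
        self._ready = False
        self._value = None
        self._continuations = []

    def then(self, fn):
        """Schedule fn to run on the value; returns a promise for fn's result."""
        next_promise = Promise(self._scheduler)

        def task():
            next_promise.set_value(fn(self._value))

        if self._ready:
            self._scheduler.submit(task)
        else:
            self._continuations.append(task)
        return next_promise

    def set_value(self, value):
        self._value = value
        self._ready = True
        for task in self._continuations:
            self._scheduler.submit(task)
        self._continuations.clear()


scheduler = Scheduler()
results = []
heart_rate = Promise(scheduler)
heart_rate.then(lambda v: v + 1).then(results.append)

heart_rate.set_value(120)   # the value arrives; continuations become tasks
before_run = list(results)  # nothing has executed yet
scheduler.run()
print(results)
```

A queued task costs little more than the closure it captures, in contrast to the 64k-to-megabytes stack each thread needs.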
ScyllaDB Design Decisions: Unified Cache
Cassandra: key cache, row cache, on-heap / off-heap, Linux page cache, SSTables. Complex tuning.
Scylla: unified cache, SSTables.
ScyllaDB Design Decisions: Unified Cache (continued)
Cassandra: key cache, row cache, on-heap / off-heap, Linux page cache, SSTables
[Diagram: App thread, Kernel, SSD]
1. Page fault
2. Suspend thread
3. Initiate I/O
4. Context switch
5. I/O completes
6. Interrupt
7. Context switch
8. Map page
9. Resume thread
ScyllaDB Design Decisions: I/O Scheduler
[Diagram: Query, Commitlog and Compaction each feed a queue into the Userspace I/O Scheduler, which feeds the Disk; label: Max useful disk concurrency]
ScyllaDB Design Decisions: Autonomous
[Diagram labels: Memtable, Seastar Scheduler, Compaction, Query, Repair, Commitlog, SSD, WAN, CPU; Compaction Backlog Monitor and Memory Monitor, each marked "Adjust priority"]
Different types of loads
■ OLTP
● Small work items
● Latency sensitive
● Involves a narrow portion of the data
■ OLAP
● Large work items
● Throughput oriented
● Performed on large amounts of data
Workload Prioritization
Load #1: 200 shares
Load #2: 400 shares
Load #3: 800 shares
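As a rough worked example of what the share counts imply, the sketch below assumes resources are split in proportion to shares while all three loads are active and competing. That proportional model and the function name are assumptions made for illustration; the actual scheduler behavior is more involved.

```python
from fractions import Fraction


def resource_fractions(shares):
    """Split resources in proportion to shares (assumes all loads are active)."""
    total = sum(shares.values())
    return {name: Fraction(count, total) for name, count in shares.items()}


fractions = resource_fractions({"Load #1": 200, "Load #2": 400, "Load #3": 800})
for name, fraction in fractions.items():
    print(f"{name}: {fraction} ({float(fraction):.1%})")
```

Under this model Load #3 gets four times the resources of Load #1, and a load that goes idle leaves its portion to the others.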
More than 1M req/sec on i4i.8xlarge
https://github.com/scylladb/1m-ops-demo by Attila Tóth
Summary
■ Built for High Availability
■ Designed for modern hardware
■ Uses a fully async, shared-nothing, shard-per-core architecture
■ Superior throughput and consistently low latency
■ Exposes the internal scheduler to the user as Workload Prioritization
Scylla vs Competition
■ 1/7th the cost
■ 26x better in a real-life scenario
■ 10x volume
■ 9.3x throughput
■ 1/4x latency
■ 4 Scylla nodes vs 40 Cassandra nodes
■ 2.5X cheaper
■ 11x better latency
■ 1/5th cost in a benchmark
■ 20x better in a real-life scenario
■ No throttling
■ No locking
Compared against: CockroachDB, Google's Bigtable, DynamoDB, Cassandra (chart is log-scaled)
Stay in Touch
Tzach Livyatan
tzach@scylladb.com
https://twitter.com/TzachL
https://github.com/tzach
https://www.linkedin.com/in/tzach/

Paradigm Shifts in User Modeling: A Journey from Historical Foundations to Em...
 
Observability For You and Me with OpenTelemetry
Observability For You and Me with OpenTelemetryObservability For You and Me with OpenTelemetry
Observability For You and Me with OpenTelemetry
 
How RPA Help in the Transportation and Logistics Industry.pptx
How RPA Help in the Transportation and Logistics Industry.pptxHow RPA Help in the Transportation and Logistics Industry.pptx
How RPA Help in the Transportation and Logistics Industry.pptx
 
Why do You Have to Redesign?_Redesign Challenge Day 1
Why do You Have to Redesign?_Redesign Challenge Day 1Why do You Have to Redesign?_Redesign Challenge Day 1
Why do You Have to Redesign?_Redesign Challenge Day 1
 
Quantum Communications Q&A with Gemini LLM
Quantum Communications Q&A with Gemini LLMQuantum Communications Q&A with Gemini LLM
Quantum Communications Q&A with Gemini LLM
 
Quality Patents: Patents That Stand the Test of Time
Quality Patents: Patents That Stand the Test of TimeQuality Patents: Patents That Stand the Test of Time
Quality Patents: Patents That Stand the Test of Time
 
HTTP Adaptive Streaming – Quo Vadis (2024)
HTTP Adaptive Streaming – Quo Vadis (2024)HTTP Adaptive Streaming – Quo Vadis (2024)
HTTP Adaptive Streaming – Quo Vadis (2024)
 

A Deep Dive into ScyllaDB's Architecture

  • 1. ScyllaDB Architecture - built for speed Tzach Livyatan, VP Product
  • 2. Tzach Livyatan ■ VP Product ScyllaDB ■ Love Databases and NoSQL
  • 3. Presentation Agenda ■ High Availability ■ Data Modeling ■ Implementation (diagram labels: Distributed, Node, HW, Control)
  • 5. NoSQL – By Data Model: Key / Value (Redis, Aerospike, RocksDB); Document store (MongoDB, Couchbase); Wide column store (Scylla, Apache Cassandra, HBase, DynamoDB); Graph (Neo4j, JanusGraph). Diagram label: Complexity
  • 6. NoSQL – By Availability vs Consistency: Pick Two of Availability, Partition Tolerance, Consistency. PACELC: Latency vs Consistency
  • 7. Cluster - Node Ring: Node 1, Node 2, Node 3, Node 4, Node 5
  • 8. Data Replication ■ Replication Factor: number of nodes on which data (rows and partitions) is replicated ■ Done automatically ■ Set per keyspace CREATE KEYSPACE mykeyspace WITH replication = { 'class': 'NetworkTopologyStrategy', 'replication_factor' : 3} AND durable_writes = true;
  • 9. Replication Factor (RF) = 3: Node 1, Node 2, Node 3, Node 4, Node 5
  • 10. Multiple Data Centers: USA DC, Asia DC, EU DC 'us_1' : 3, 'eu' : 3, 'asia' : 3
  • 11. Consistency Level ■ CL: # of nodes that must acknowledge read/write ■ E.g.: 1, QUORUM, LOCAL_QUORUM, ALL ■ Tunable Consistency: CL set per operation
  • 17. CQL Example Query: SELECT * from heartrate_v10 WHERE pet_chip_id = 80d39c78-9dc0-11eb-a8b3-0242ac130003 LIMIT 1; SELECT * from heartrate_v10 WHERE pet_chip_id = 80d39c78-9dc0-11eb-a8b3-0242ac130003 AND time >= '2021-05-01 01:00+0000' AND time < '2021-05-01 01:03+0000'; https://gist.github.com/tzach/7486f1a0cc904c52f4514f20f14d2a97
  • 18. Wide Partition Example CREATE TABLE heartrate_v10 ( pet_chip_id uuid, owner uuid, time timestamp, heart_rate int, PRIMARY KEY (pet_chip_id, time) ); Rows (pet_chip_id, time, heart_rate): (80d39c78-9dc0-11eb-a8b3-0242ac130003, 2021-05-01 01:00:00.000000+0000, 120); (80d39c78-9dc0-11eb-a8b3-0242ac130003, 2021-05-01 01:01:00.000000+0000, 121); (80d39c78-9dc0-11eb-a8b3-0242ac130003, 2021-05-01 01:02:00.000000+0000, 120). Partition Key: pet_chip_id; Clustering Key: time
  • 19. Architecture. Rows (pet_chip_id, time, heart_rate): (80d39c78-9dc0-11eb-a8b3-0242ac130003, 2021-05-01 01:00:00.000000+0000, 120); (80d39c78-9dc0-11eb-a8b3-0242ac130003, 2021-05-01 01:01:00.000000+0000, 121); (80d39c78-9dc0-11eb-a8b3-0242ac130003, 2021-05-01 01:02:00.000000+0000, 120). Diagram labels: Partitioner, Hash Function, Partition Key, Token Range
  • 21. Advanced Data Modeling: Materialized Views (MV), Secondary Index (SI), Change Data Capture (CDC), Collections, User Defined Types, Time To Live (TTL), …
  • 22. View is another table: 1. INSERT INTO heartrate (pet_chip_id, Owner, Time, heart_rate) VALUES (..); 2. INSERT INTO heartrate 3. INSERT INTO heartrate_by_owner (diagram labels: Coordinator, Base replica, View replica)
  • 23. View is another table: 1. SELECT * FROM heartrate_by_owner WHERE owner = '642a..'; 2. SELECT * FROM heartrate_by_owner WHERE owner = '642a..'; (diagram labels: Coordinator, Base replica, View replica)
  • 24. Global Sec Index - Different Partition key: 1. SELECT * FROM heartrate_v10 WHERE owner = '642a..'; 2. SELECT name FROM pet_by_owner_index WHERE owner = '642a..'; 3. SELECT * FROM heartrate_v10 WHERE pet_chip_id in (...) AND time in (...) (diagram labels: Coordinator, Base replica, View replica)
  • 25. Write Path - Replica
  • 26. Read Path - Replica (diagram labels: SSTables, Cache, Memory, Disk, Bloom Filter; steps 1, 2, 2.5, 3, 4, 5, …)
  • 27. Storage - Log-Structured Merge Tree (diagram labels: SSTable 1, Time)
  • 28. Storage - Log-Structured Merge Tree (diagram labels: SSTable 1, SSTable 2, Time)
  • 29. Storage - Log-Structured Merge Tree (diagram labels: SSTable 1, SSTable 2, SSTable 3, SSTable 4, SSTable 1+2+3, Time)
  • 31. ScyllaDB Design Decisions: C++ instead of Java. 1. C++ 2. All Things Async 3. Shard per Core 4. Unified Cache 5. I/O Scheduler 6. Autonomous
  • 32. ScyllaDB Design Decisions: 1. C++ 2. All Things Async 3. Shard per Core 4. Unified Cache 5. I/O Scheduler 6. Autonomous
  • 33. ScyllaDB Design Decisions: Shards. 1. C++ 2. All Things Async 3. Shard per Core 4. Unified Cache 5. I/O Scheduler 6. Autonomous
  • 34. Small, Medium, Large Machines Why larger nodes? ■ Time between failures is shorter ■ Ease of maintenance ■ No noisy neighbours ■ No virtualization, container overhead ■ No other moving parts ■ Scale up before out!
  • 35. Linear Scale Ingestion Constant Time while volume & throughput double 2X 2X 2X 2X 2X
  • 36. Network Comparison: Traditional Stack (Cassandra) vs Seastar's Sharded Stack. Traditional Stack labels: Application threads, queues, TCP/IP, Scheduler, NIC Queues, Kernel, Memory. Sharded Stack labels, repeated per core: Application (Core Database), TCP/IP, Task Scheduler, queues, smp queue, NIC Queue, DPDK, Userspace, Kernel (isn't involved)
  • 37. ScyllaDB Has Its Own Task Scheduler: Traditional Stack vs Scylla's Stack (diagram labels: Promise, Task, Scheduler, Thread, Stack, CPU). Promise is a pointer to an eventually computed value. Task is a pointer to a lambda function. Thread is a function pointer. Stack is a byte array from 64k to megabytes
  • 38. ScyllaDB Design Decisions. Cassandra: Key cache, Row cache, On-heap / Off-heap, Linux page cache, SSTables. Scylla: Unified cache, SSTables. Diagram label: Complex Tuning
  • 39. ScyllaDB Design Decisions. Cassandra: Key cache, Row cache, On-heap / Off-heap, Linux page cache, SSTables. Diagram lanes: App thread, Kernel, SSD. Sequence: Page fault, Suspend thread, Initiate I/O, Context switch, I/O completes, Interrupt, Context switch, Map page, Resume thread
  • 40. ScyllaDB Design Decisions (diagram labels: Query, Commitlog, Compaction, Queue, Queue, Queue, Userspace I/O Scheduler, Disk, Max useful disk concurrency)
  • 41. ScyllaDB Design Decisions (diagram labels: Memtable, Seastar Scheduler, Compaction, Query, Repair, Commitlog, SSD, WAN, CPU, Compaction Backlog Monitor: Adjust priority, Memory Monitor: Adjust priority)
  • 42. Different types of loads ■ OLTP ● Small work items ● Latency sensitive ● Involves a narrow portion of the data ■ OLAP ● Large work items ● Throughput oriented ● Performed on large amounts of data
  • 43. Workload Prioritization Load #3 800 shares Load #2 400 shares Load #1 200 shares
  • 44. Scylla Design Decisions: 1. C++ 2. All Things Async 3. Shard per Core 4. Unified Cache 5. I/O Scheduler 6. Autonomous
  • 45. More than 1M req/sec on i4i.8xlarge https://github.com/scylladb/1m-ops-demo by Attila Tóth
  • 46. Summary ■ Built for High Availability ■ Designed for modern hardware ■ Uses a fully async, shared-nothing, shard-per-core architecture ■ Superior throughput and consistent low latency ■ Exposes the internal scheduler to the user as Workload Prioritization
  • 47. Scylla vs Competition ■ 1/7th the cost ■ 26x better in a real life scenario ■ 10x volume ■ 9.3x throughput ■ 1/4x latency ■ 4 Scylla nodes vs 40 Cassandra ■ 2.5X cheaper ■ 11x better latency ■ 1/5th cost in a benchmark ■ 20x better real-life scenario ■ No throttling ■ No locking CockroachDB Google’s Bigtable DynamoDB Cassandra logscaled
  • 48. Stay in Touch Tzach Livyatan tzach@scylladb.com https://twitter.com/TzachL https://github.com/tzach https://www.linkedin.com/in/tzach/
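
The ring, replication factor, and consistency level slides (7 to 11 and 19) can be tied together in a short sketch. This is an illustration under stated assumptions, not ScyllaDB's implementation: the `Ring` class, `token_for`, and `quorum` are hypothetical names, MD5 stands in for the real partitioner hash, and each node owns a single token (real clusters assign many token ranges per node). It hashes a partition key to a token, walks the ring to pick RF replicas, and computes how many of them a QUORUM operation needs.

```python
import bisect
import hashlib


def token_for(key: str) -> int:
    """Map a key to a 64-bit token. MD5 is a stand-in hash, not ScyllaDB's partitioner."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class Ring:
    """Hypothetical single-token-per-node ring, for illustration only."""

    def __init__(self, nodes, replication_factor):
        if not 1 <= replication_factor <= len(nodes):
            raise ValueError("replication factor must be between 1 and the node count")
        self.rf = replication_factor
        # Place each node on the ring at the token derived from its name.
        self.entries = sorted((token_for(name), name) for name in nodes)
        self.tokens = [token for token, _ in self.entries]

    def replicas(self, partition_key):
        """Return the RF nodes found by walking clockwise from the key's token."""
        start = bisect.bisect_left(self.tokens, token_for(partition_key))
        count = len(self.entries)
        return [self.entries[(start + i) % count][1] for i in range(self.rf)]


def quorum(replication_factor: int) -> int:
    """Replicas that must acknowledge at CL=QUORUM: a strict majority of RF."""
    return replication_factor // 2 + 1


if __name__ == "__main__":
    ring = Ring([f"node{i}" for i in range(1, 6)], replication_factor=3)
    owners = ring.replicas("80d39c78-9dc0-11eb-a8b3-0242ac130003")
    print("replicas:", owners)
    print("QUORUM needs", quorum(ring.rf), "of", ring.rf, "replicas")
```

With RF = 3, QUORUM requires 2 acknowledgements, so a read or write can still succeed with one replica down. LOCAL_QUORUM applies the same majority rule to the replication factor of the local data center only.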