"$10 thousand per minute of downtime: architecture, queues, streaming and fintech", Max Baginskiy

Solidgate

About me
Head of Engineering
Previously Tech Lead and Platform Engineer
10 years in software engineering
Last 6 years in Go, fan of DevOps
Build teams (5 teams, 30+ people hired) and architecture

Agenda
Company intro
Architecture of the system
Queues and Streams to choose from
Low latency streaming using outbox
CDC compared to our solution
Questions.

About company
7+ years online
70 engineers: 50 SW engineers, 20 Infra + Data engineers + AQA
PCI DSS compliant
European acquirer

Business figures
$2.5B annually
15-18M transactions monthly
$10K per 1 min of downtime
40+ integrated payment methods and providers

ALB traffic
We have 100x less traffic on the ALB during high season than Shopify.
Stripe served 250 million API calls per day in 2020.

Kafka Producer
We started integrating Kafka last year
20 rps average
2 million events per day

RabbitMQ Producer
100-120 rps average
10 million events per day

Logs
1.5-2k rps of logs
150 million events per day, 200-300 GB of logs daily
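
As a quick sanity check of the traffic figures, an average rate converts to a daily volume by multiplying by 86,400 seconds. The sketch below uses a made-up helper name (eventsPerDay) and only the numbers quoted on the Kafka, RabbitMQ and Logs slides.

```go
package main

import "fmt"

// eventsPerDay converts an average requests-per-second rate into a daily volume.
// The helper name is hypothetical; it is not taken from the talk.
func eventsPerDay(rps float64) float64 {
	const secondsPerDay = 24 * 60 * 60 // 86,400
	return rps * secondsPerDay
}

func main() {
	fmt.Printf("Kafka, 20 rps: %.1fM events/day\n", eventsPerDay(20)/1e6)
	fmt.Printf("RabbitMQ, 100-120 rps: %.1fM-%.1fM events/day\n",
		eventsPerDay(100)/1e6, eventsPerDay(120)/1e6)
	fmt.Printf("Logs, 1.5-2k rps: %.1fM-%.1fM events/day\n",
		eventsPerDay(1500)/1e6, eventsPerDay(2000)/1e6)
}
```

20 rps works out to about 1.7 million events per day and 100-120 rps to roughly 8.6-10.4 million, so the daily totals on the slides are rounded figures consistent with the averages.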

Let the story begin

Non functional requirements
Durability out of the box
Queue replay
Single active consumer support
Easy to set up and maintain
Partitioning
Easy scaling for publisher and consumer
Extensibility: schema registry support, dynamic routing, enrichment

NFR - explanation
What if a message is lost between services while processing, and we retry the payment?
What if a message is lost between the callback service and the callback processor?
What if a message is lost between the payment and finance systems?
What if … what if … what if …?

RabbitMQ dive in
Erlang
Written in Erlang. Erlang was created by Ericsson, which makes telecommunication devices.
Proof of fail-safety
AXD301 ATM switch example: calculated uptime of 99.9999999%, only one problem in many years.
Mnesia as storage
Mnesia doesn’t support recovery from split brain and other types of failures.

RabbitMQ Durability
Mechanisms
Publisher confirms are a MUST have
RabbitMQ can store data to disk and has different partition handling modes, such as autoheal
Different types of queues: Quorum, Mirrored
Has Streaming in “beta”.
What if publisher confirms are disabled?
Delivery after the exchange might not happen
Persistence might not happen
A few replicas might not acknowledge the message in a Quorum queue
An overwhelmed cluster will not accept messages, but the publisher will not know.
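
The idea behind publisher confirms can be sketched without a real broker: the publisher treats a message as sent only after the broker acknowledges it, and retries otherwise. Everything in the block below (the Broker interface, PublishConfirmed, flakyBroker) is hypothetical and written for illustration; it is not the API of any RabbitMQ client library.

```go
package main

import (
	"errors"
	"fmt"
)

// Broker is a hypothetical stand-in for a broker client. Publish reports
// whether the broker confirmed (acked) the message.
type Broker interface {
	Publish(body string) (acked bool, err error)
}

// ErrNotConfirmed is returned when no attempt was confirmed by the broker.
var ErrNotConfirmed = errors.New("message was not confirmed by the broker")

// PublishConfirmed retries until the broker acks the message or the attempts
// run out. It returns the number of attempts made.
func PublishConfirmed(b Broker, body string, maxAttempts int) (int, error) {
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		acked, err := b.Publish(body)
		if err == nil && acked {
			return attempt, nil
		}
	}
	return maxAttempts, ErrNotConfirmed
}

// flakyBroker simulates an overwhelmed cluster: it silently drops the first
// `failures` messages without confirming them.
type flakyBroker struct {
	failures int
	stored   []string
}

func (f *flakyBroker) Publish(body string) (bool, error) {
	if f.failures > 0 {
		f.failures--
		return false, nil
	}
	f.stored = append(f.stored, body)
	return true, nil
}

func main() {
	b := &flakyBroker{failures: 2}
	attempts, err := PublishConfirmed(b, "payment-42", 5)
	fmt.Println(attempts, err, b.stored) // 3 <nil> [payment-42]
}
```

Without the acknowledgement, the publisher in this sketch would have considered the first, dropped attempt successful. Retrying can produce duplicates when an ack is lost, so consumers generally need to be idempotent.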

Quorum queues
+ Pros
Have consensus built in
Data written to disk, metadata in memory
Can easily handle restarts.
− Cons
Don’t scale well - after a restart, replicating millions of messages can take hours.
Don’t have a “replay” mechanism
Consumers don’t scale
Don’t preserve the order of messages.

Split brain - autoheal
ignore
Use when network reliability is the highest practically possible and node availability is of topmost importance.
pause_minority
Appropriate when clustering across racks or availability zones in a single region, and the probability of losing a majority of nodes (zones) at once is considered to be very low.
autoheal
Appropriate when you are more concerned with continuity of service than with data consistency across nodes.
Summary - no way to guarantee that autoheal will work properly
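
To make pause_minority concrete: a node pauses itself when the part of the cluster it can still see (itself included) holds half or fewer of all nodes. The function below is a minimal model of that rule with a hypothetical name, not RabbitMQ code.

```go
package main

import "fmt"

// shouldPause models the pause_minority rule: a node pauses when the nodes it
// can see (itself included) make up half or fewer of the whole cluster.
func shouldPause(visibleNodes, clusterSize int) bool {
	return visibleNodes*2 <= clusterSize
}

func main() {
	// A 3-node cluster split 2 / 1: only the isolated node pauses.
	fmt.Println(shouldPause(2, 3), shouldPause(1, 3)) // false true
	// A 4-node cluster split 2 / 2: both halves pause, so nothing is served.
	fmt.Println(shouldPause(2, 4)) // true
}
```

The 2 / 2 case shows why an odd number of nodes (or zones) is the usual choice with this mode.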

RabbitMQ streaming, problem #1

RabbitMQ streaming. Go client, issue #2

RabbitMQ streaming. Go client, issue #3

RabbitMQ + RabbitMQ streaming
New feature that not a lot of companies use.
The Go client is not ready; I’m afraid to ask about Python or Node.js.
Hard to support. Requires updates of Erlang and then RabbitMQ.
Streaming is a plugin that requires a specific version of RabbitMQ.
Not made for fintech: lack of proper durability, lack of functionality.

Kafka dive in
Java
Written in Java (and Scala) by LinkedIn, then open-sourced and licensed under the Apache license.
Highly available and durable
Has a WAL, works in a cluster, saves data to disk by default.
Blazing fast
Sequential writes, zero copy.

Kafka dive in
Kafka relies on sequential writes and zero copy to optimize disk usage.
Has a WAL for replication and durability.
ZooKeeper, as a separate system, tracks the health of the cluster.
Can work even without ZooKeeper.
Chaos engineering shows that Kafka is a highly available and durable solution.

Debezium
Debezium how-to:
Create a replication slot
Run the Debezium Java service in a cluster
Configure it with Groovy

Debezium
+ Pros
Uses the WAL directly - doesn’t create additional load on the WAL (no additional data is written).
Production ready, tested solution
Low latency.
− Cons
How to replay data? Can you specify a Log Sequence Number? What if you need to stream only a fraction of what is written to the WAL?
Missing Buf (protobuf on steroids)
Low flexibility and hard configurability
DB isolation.
Groovy, which is not easy to use
Random disconnects and the need to restart.

Transactional outbox
Why use Transactional Outbox?
Neither Kafka nor CDC can flexibly re-stream data.
Without specific instruments you can’t remove specific events from Kafka.
Replay with Kafka will require setup of additional services.
Consistent state through the use of transactions.

Outbox table - WAL
ID - ULID (sortable UUIDs).
Bucket - partitioning. Read/Write partitioning.
Payload - JSON body of the domain model.
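
The three outbox columns listed above could be modelled roughly as follows. This is a sketch under assumptions: the struct and function names are made up, and assigning buckets by hashing an order ID is only one possible scheme; the talk does not say how buckets are chosen.

```go
package main

import (
	"encoding/json"
	"fmt"
	"hash/fnv"
)

// OutboxRecord mirrors the three columns named on the slide.
type OutboxRecord struct {
	ID      string          // ULID in the talk; any sortable unique ID works here
	Bucket  uint32          // partition number used to split reads and writes
	Payload json.RawMessage // JSON body of the domain model
}

// bucketFor maps a key (a hypothetical order ID) to one of n buckets.
// Hash-based assignment is an assumption made for this sketch.
func bucketFor(key string, n uint32) uint32 {
	h := fnv.New32a()
	h.Write([]byte(key))
	return h.Sum32() % n
}

// newRecord serializes the domain model and places it into a bucket.
func newRecord(id, orderID string, model any, buckets uint32) (OutboxRecord, error) {
	body, err := json.Marshal(model)
	if err != nil {
		return OutboxRecord{}, err
	}
	return OutboxRecord{ID: id, Bucket: bucketFor(orderID, buckets), Payload: body}, nil
}

func main() {
	// The ID below is just a sample ULID-shaped string.
	rec, err := newRecord("01ARZ3NDEKTSV4RRFFQ69G5FAV", "order-1",
		map[string]string{"status": "paid"}, 8)
	if err != nil {
		panic(err)
	}
	fmt.Printf("id=%s bucket=%d payload=%s\n", rec.ID, rec.Bucket, rec.Payload)
}
```

With a stable key-to-bucket mapping, all events of one order land in the same bucket, so each reader can own a subset of buckets without coordinating on individual rows.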

Choosing Go lib
confluent-kafka-go - CGO + librdkafka
ibm/sarama
segmentio/kafka-go

Outbox table - WAL
BatchSize - usually 10.
Batch Timeout - 100ms.
RequiredAcks - all nodes should confirm the message.
Async - false for synchronous error handling.

Schema registry -
“Spec first” approach - speeds up development.
Backward compatibility support - linters.
Reusable “mental model” = simplified migration from API to stream.
Client, server and models are generated for various languages.
Simplified versioning.

Taxer v1 option we built
Update payment in Gate (Kotlin).
Transaction: save the payment update and create a record in Outbox.
Orderstreamer (Go) - reads a batch from outbox.
Publish data to Stream.
Update offset in meta table.
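
The streamer half of the steps above (read a batch, publish, update the offset) can be sketched as a single loop iteration. The interfaces and in-memory implementations below are hypothetical stand-ins for the outbox table, the Kafka topic and the meta table; this is not the actual Orderstreamer code.

```go
package main

import "fmt"

// Event is one outbox row. IDs are assumed to sort in insertion order.
type Event struct {
	ID      string
	Payload string
}

// Hypothetical stand-ins for the outbox table, the stream and the meta table.
type Outbox interface {
	ReadAfter(offset string, limit int) []Event
}
type Stream interface {
	Publish(events []Event) error
}
type Meta interface {
	Offset() string
	SetOffset(string)
}

// streamOnce reads a batch, publishes it, then advances the offset. The offset
// moves only after a successful publish, so a crash in between re-sends the
// batch (at-least-once delivery).
func streamOnce(o Outbox, s Stream, m Meta, batchSize int) (int, error) {
	events := o.ReadAfter(m.Offset(), batchSize)
	if len(events) == 0 {
		return 0, nil
	}
	if err := s.Publish(events); err != nil {
		return 0, err
	}
	m.SetOffset(events[len(events)-1].ID)
	return len(events), nil
}

type memOutbox []Event

func (o memOutbox) ReadAfter(offset string, limit int) []Event {
	var out []Event
	for _, e := range o {
		if e.ID > offset && len(out) < limit {
			out = append(out, e)
		}
	}
	return out
}

type memStream struct{ published []Event }

func (s *memStream) Publish(events []Event) error {
	s.published = append(s.published, events...)
	return nil
}

type memMeta struct{ offset string }

func (m *memMeta) Offset() string     { return m.offset }
func (m *memMeta) SetOffset(v string) { m.offset = v }

func main() {
	outbox := memOutbox{{"01A", "created"}, {"01B", "paid"}, {"01C", "refunded"}}
	stream, meta := &memStream{}, &memMeta{}
	for {
		n, err := streamOnce(outbox, stream, meta, 2)
		if err != nil || n == 0 {
			break
		}
	}
	fmt.Println(len(stream.published), meta.Offset()) // 3 01C
}
```

Because the offset lives outside the stream, replaying is a matter of resetting it to an earlier ID.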

v1 comparison with typical architecture
+ Pros
We have a full transaction log that can be replayed, reworked, saved, fixed
Only 1 new tech - Kafka
Streamer + Leaser = 200 lines of code + 800 lines of tests. It can be used as a library, not a service
Buf/Go/PostgreSQL - everything reused - maintenance simplified.
− Cons
WAL amplification - 2x. Transactional outbox requires 1 more write for each operation
High delay - 2 min for events
More CPU load than just reading from the WAL.

Orderstreamer delay: 2 min
ULIDs don’t allow us to understand the commit order of events or to spot missing parts.

Solution - Logical Clock + Autoincrement
Logical clocks allow a distributed system to enforce a partial ordering of events without physical clocks.
You can also detect missing events with them.

v2 Implementation
Auto increment instead of ULID will help you to report and look for missing IDs. It’s more like “logical time”.
Look for missing IDs, save them in the meta table for 2 mins and restream them when they appear.

v2 Summary
+ Pros
Reduces delay time from 2 mins to literally seconds
We can use this approach not only in reports/taxes but also in processing.
− Cons
Ordering can be broken, but we can support several models of eventual consistency
Higher DB CPU utilization.

v3 reading WAL
WAL “reading” can be implemented in just 200 lines of code, along with replication slot creation and publication creation.
In PostgreSQL replication slots you have access to received_lsn, latest_end_lsn. It seems like you can replay changes.

v3 WAL lib - next time
See you at the next Highload!
