Managing data consistency in a microservice architecture using Sagas
Chris Richardson

Founder of eventuate.io
Author of Microservices Patterns
Founder of the original CloudFoundry.com
Author of POJOs in Action

@crichardson
chris@chrisrichardson.net
http://eventuate.io http://learn.microservices.io
Presentation goal

Distributed data management challenges in a microservice architecture

Sagas as the transaction model

About Chris

Consultant and trainer focusing on modern application architectures including microservices (http://www.chrisrichardson.net/)

Founder of a startup that is creating an open-source/SaaS platform that simplifies the development of transactional microservices (http://eventuate.io)

For more information

40% discount with code ctwsaturn18

http://learn.microservices.io
Agenda

ACID is not an option

Overview of sagas
Coordinating sagas

Countermeasures for data anomalies


Using reliable and transactional messaging

The microservice architecture structures an application as a set of loosely coupled services
Microservices enable continuous delivery/deployment

Successful software development rests on three things that enable one another:
Process: Continuous delivery/deployment
Organization: Small, agile, autonomous, cross-functional teams
Architecture: Microservice architecture

Services = testability and deployability
Teams own services

Each team owns one or more services

Catalog Team: responsible for Catalog Service
Review Team: responsible for Review Service
Order Team: responsible for Order Service
… Team: responsible for … Service

Microservice architecture: database per service

Browser -> (HTML) -> Store Front UI
Mobile Device -> (REST) -> API Gateway
Store Front UI and API Gateway -> (REST) -> Customer Service, Order Service, … Service
Customer Service -> Customer Database
Order Service -> Order Database
… Service -> … Database

Private database != private database server

Loose coupling = encapsulated data

Order Service -> Order Database: Order table (orderTotal)
Customer Service -> Customer Database: Customer table (creditLimit, availableCredit)

Change schema without coordinating with other teams


But…

How to maintain data consistency?

How to implement queries?

How to maintain data consistency?
createOrder(customerId, orderTotal)

Pre-conditions:
• customerId is valid
Post-conditions (span services):
• Order was created
• Customer.availableCredit -= orderTotal

Customer class invariant:
• availableCredit >= 0
• availableCredit <= creditLimit

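As a minimal sketch of the invariant above, the Customer side could be written as follows. The class layout and the method name reserve_credit are assumptions made for illustration; they are not taken from the talk's code.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    credit_limit: int
    available_credit: int

    def __post_init__(self) -> None:
        # Invariant from the slide: 0 <= availableCredit <= creditLimit
        if not 0 <= self.available_credit <= self.credit_limit:
            raise ValueError("invariant violated")

    def reserve_credit(self, order_total: int) -> None:
        # Post-condition of createOrder: availableCredit -= orderTotal.
        # The new value is checked before it is stored, so a rejected
        # reservation leaves the customer unchanged.
        new_available = self.available_credit - order_total
        if not 0 <= new_available <= self.credit_limit:
            raise ValueError("reservation would violate the invariant")
        self.available_credit = new_available


customer = Customer(credit_limit=100, available_credit=100)
customer.reserve_credit(30)
print(customer.available_credit)
```

Inside one service this check is easy. The difficulty the talk addresses is that the Order lives in a different service, so the two post-conditions cannot be enforced by a single local transaction.
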
Cannot use ACID transactions that span services

Distributed transactions:

BEGIN TRANSACTION
…
-- ORDERS is private to the Order Service
SELECT ORDER_TOTAL
FROM ORDERS WHERE CUSTOMER_ID = ?

-- CUSTOMERS is private to the Customer Service
SELECT CREDIT_LIMIT
FROM CUSTOMERS WHERE CUSTOMER_ID = ?

INSERT INTO ORDERS …

COMMIT TRANSACTION

2PC is not an option
Guarantees consistency

BUT
2PC coordinator is a single point of failure
Chatty: at least O(4n) messages, with retries O(n^2)
Reduced throughput due to locks
Not supported by many NoSQL databases (or message brokers)
CAP theorem => 2PC impacts availability
….

Instead of ACID: BASE
Basically Available
Soft state
Eventually consistent

http://queue.acm.org/detail.cfm?id=1394128
Agenda

ACID is not an option

Overview of sagas
Coordinating sagas

Countermeasures for data anomalies


Using reliable and transactional messaging

From a 1987 paper

Use Sagas instead of 2PC

Not this: one distributed transaction spanning Service A, Service B and Service C
Saga: a sequence of local transactions, one each in Service A, Service B and Service C

Create Order Saga
createOrder() initiates the saga

1. Order Service, local transaction: createOrder() -> Order state=PENDING
2. Customer Service, local transaction: reserveCredit() -> Customer
3. Order Service, local transaction: approve order() -> Order state=APPROVED

But what about rollback?

BEGIN TRANSACTION

UPDATE …

INSERT ….

…. BUSINESS RULE VIOLATED!!!!

ROLLBACK TRANSACTION    <- Really simple!

Rolling back sagas

Use compensating transactions

Developer must write application logic to “rollback” eventually consistent transactions

Careful design required!

Saga: Every Ti has a Ci

Transactions: T1 T2 …
Compensating transactions: C1 C2 …

If T2 FAILS, the saga executes: T1 T2 C1

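The (Ti, Ci) pairing can be sketched as a small in-process runner. This is only an illustration under simplifying assumptions: the steps run synchronously in one process, whereas a real saga spans services and is driven by messages. The names run_saga and Step are hypothetical.

```python
from typing import Callable, List, Tuple

# A step pairs a transaction Ti with its compensating transaction Ci.
Step = Tuple[Callable[[], None], Callable[[], None]]


def run_saga(steps: List[Step]) -> bool:
    """Run each Ti in order; on failure run the Ci of completed steps in reverse."""
    completed: List[Callable[[], None]] = []
    for transaction, compensation in steps:
        try:
            transaction()
        except Exception:
            # The failed Ti rolled back locally, so only the steps that
            # committed before it are compensated, newest first.
            for compensate in reversed(completed):
                compensate()
            return False
        completed.append(compensation)
    return True


log: List[str] = []


def t2_fails() -> None:
    log.append("T2")
    raise RuntimeError("insufficient credit")


ok = run_saga([
    (lambda: log.append("T1"), lambda: log.append("C1")),
    (t2_fails, lambda: log.append("C2")),
])
print(ok, log)  # False ['T1', 'T2', 'C1']
```
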
Create Order Saga - rollback
createOrder() initiates the saga

1. Order Service, local transaction: createOrder() -> Order
2. Customer Service, local transaction: reserveCredit() FAILS (insufficient credit) -> Customer
3. Order Service, local transaction: reject order() -> Order

Writing compensating transactions isn’t always easy

Write them so they will always succeed

If a compensating transaction fails => no clear way to recover

Challenge: Undoing changes when data has already been changed by a different transaction/saga

More on this later

Non-compensatable actions

For example, an email can’t be unsent once it has been sent.

Move actions that can’t be undone to the end of the saga

More on this later.

Sagas complicate API design
Synchronous API vs Asynchronous Saga

Request initiates the saga. When to send back the response?

Option #1: Send response when saga completes:
+ Response specifies the outcome
- Reduced availability

Option #2: Send response immediately after creating the saga (recommended):
+ Improved availability
- Response does not specify the outcome. Client must poll or be notified

Revised Create Order API

createOrder()
returns id of newly created order
NOT fully validated

getOrder(id)
Called periodically by client to get outcome of validation

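A minimal sketch of this API shape, assuming an in-memory order store. The function complete_saga is a hypothetical stand-in for the saga finishing at some later time; it is not part of the API described above.

```python
import itertools
from typing import Dict

_ids = itertools.count(1)
_orders: Dict[int, str] = {}  # order id -> state


def create_order(customer_id: int, order_total: int) -> int:
    """Create the order as PENDING and return its id immediately."""
    order_id = next(_ids)
    _orders[order_id] = "PENDING"  # NOT fully validated yet
    return order_id


def get_order(order_id: int) -> str:
    """Polled by the client to learn the outcome of validation."""
    return _orders[order_id]


def complete_saga(order_id: int, approved: bool) -> None:
    # Hypothetical hook: stands in for the saga reaching its final step.
    _orders[order_id] = "APPROVED" if approved else "REJECTED"


order_id = create_order(customer_id=101, order_total=1234)
print(get_order(order_id))      # still PENDING right after the request
complete_saga(order_id, approved=True)
print(get_order(order_id))      # outcome visible on a later poll
```
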
Minimal impact on UI

UI hides asynchronous API from the user

Saga will usually appear instantaneous (<= 100ms)

If it takes longer, the UI displays a “processing” popup

Server can push notification to UI

Agenda

ACID is not an option

Overview of sagas
Coordinating sagas

Countermeasures for data anomalies


Using reliable and transactional messaging

How to sequence the saga transactions?

After the completion of transaction Ti “something” must decide what step to execute next

Success: which T(i+1) - branching

Failure: C(i - 1)

Use asynchronous, broker-based messaging

Order Service, Customer Service, …. <-> Message broker

Guaranteed delivery ensures a saga completes even when its participants are temporarily unavailable

Saga step = a transaction local to a service

Service -> update -> Database
Service -> publish message/event -> Message Broker

Transactional DB: BEGIN … COMMIT
DDD: Aggregate
NoSQL: single ‘record’

Choreography: distributed decision making

vs.

Orchestration: centralized decision making

Option #1: Choreography-based coordination using events

1. Create Order -> Order Service: create() Order (state, total)
2. Order Service publishes Order created on the Order events channel
3. Customer Service: reserveCredit() on Customer (creditLimit, creditReservations)
4. Customer Service publishes Credit Reserved OR Credit Limit Exceeded on the Customer events channel
5. Order Service: approve()/reject() Order

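The event flow above can be sketched with a toy in-memory broker. Everything here is an assumption for illustration: the Broker class delivers synchronously, the channel names are invented, and the event dictionaries are not the Eventuate Tram message format.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

Handler = Callable[[dict], None]


class Broker:
    """Toy in-memory stand-in for a message broker (synchronous delivery)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, channel: str, handler: Handler) -> None:
        self._subscribers[channel].append(handler)

    def publish(self, channel: str, event: dict) -> None:
        for handler in self._subscribers[channel]:
            handler(event)


class OrderService:
    def __init__(self, broker: Broker) -> None:
        self.broker = broker
        self.orders: Dict[int, dict] = {}
        broker.subscribe("customer-events", self.on_customer_event)

    def create(self, order_id: int, customer_id: int, total: int) -> None:
        self.orders[order_id] = {"state": "PENDING", "total": total}
        self.broker.publish("order-events", {
            "type": "OrderCreated", "order_id": order_id,
            "customer_id": customer_id, "total": total})

    def on_customer_event(self, event: dict) -> None:
        order = self.orders[event["order_id"]]
        if event["type"] == "CreditReserved":
            order["state"] = "APPROVED"   # approve()
        elif event["type"] == "CreditLimitExceeded":
            order["state"] = "REJECTED"   # reject()


class CustomerService:
    def __init__(self, broker: Broker, credit_limits: Dict[int, int]) -> None:
        self.broker = broker
        self.credit_limits = credit_limits
        # customer id -> {order id -> reserved amount}
        self.reservations: DefaultDict[int, Dict[int, int]] = defaultdict(dict)
        broker.subscribe("order-events", self.on_order_event)

    def on_order_event(self, event: dict) -> None:
        if event["type"] != "OrderCreated":
            return
        customer_id, total = event["customer_id"], event["total"]
        reserved = sum(self.reservations[customer_id].values())
        if reserved + total <= self.credit_limits[customer_id]:
            self.reservations[customer_id][event["order_id"]] = total
            outcome = "CreditReserved"
        else:
            outcome = "CreditLimitExceeded"
        self.broker.publish("customer-events",
                            {"type": outcome, "order_id": event["order_id"]})


broker = Broker()
orders = OrderService(broker)
customers = CustomerService(broker, credit_limits={101: 100})
orders.create(order_id=1, customer_id=101, total=60)
orders.create(order_id=2, customer_id=101, total=60)
print(orders.orders[1]["state"], orders.orders[2]["state"])  # APPROVED REJECTED
```

No component holds the whole sequence: each service only reacts to the other's events, which is the distributed decision making the slide refers to.
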
Order Service: publishing domain events

Create order
Publish event

Customer Service: consuming domain events

Subscribe to event
Attempt to reserve credit
Publish event on success
Publish event on failure

https://github.com/eventuate-tram/eventuate-tram-examples-customers-and-orders

More complex choreography example

Create Order -> Order Service
Order Service publishes Order created on the Order events channel
Consumer Service publishes Consumer Validated on the Consumer events channel
Inventory Service publishes Inventory Reserved on the Inventory events channel
Accounting Service publishes Card Authorized on the Accounting events channel

Benefits and drawbacks of choreography

Benefits:
Simple, especially when using event sourcing
Participants are loosely coupled

Drawbacks:
Cyclic dependencies - services listen to each other’s events
Overloads domain objects, e.g. Order and Customer know too much
Events = indirect way to make something happen

https://github.com/eventuate-examples/eventuate-examples-java-customers-and-orders

Option #2: Orchestration-based saga coordination

createOrder() creates a CreateOrderSaga, which invokes each participant in turn:

1. Order Service, local transaction: createOrder() -> Order state=PENDING
2. Customer Service, local transaction: reserveCredit() -> Customer
3. Order Service, local transaction: approve order() -> Order state=APPROVED

A saga (orchestrator) is a persistent object that tracks the state of the saga and invokes the participants

Saga orchestrator behavior

On create:
Invokes a saga participant
Persists state in database
Wait for a reply

On reply:
Load state from database
Determine which saga participant to invoke next
Invokes saga participant
Updates its state
Persists updated state
Wait for a reply
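
The create/reply loop can be sketched as two handlers around a persisted state record. This is an illustrative sketch, not the Eventuate Tram Sagas API: saga_store stands in for the database, sent_commands for the command channels, and the command names are assumptions. A real orchestrator would also have to make "persist state" and "send command" atomic.

```python
import json
from typing import Dict, List, Tuple

STEPS = ["reserveCredit", "approveOrder"]  # commands, in the order they are sent

saga_store: Dict[str, str] = {}            # saga id -> JSON state (stand-in database)
sent_commands: List[Tuple[str, str]] = []  # (saga id, command) sent to participants


def _persist(saga_id: str, state: dict) -> None:
    saga_store[saga_id] = json.dumps(state)


def _invoke(saga_id: str, command: str) -> None:
    sent_commands.append((saga_id, command))


def on_create(saga_id: str, order_id: int) -> None:
    state = {"order_id": order_id, "step": 0, "status": "RUNNING"}
    _invoke(saga_id, STEPS[0])   # invoke the first participant
    _persist(saga_id, state)     # persist state, then wait for a reply


def on_reply(saga_id: str, success: bool) -> None:
    state = json.loads(saga_store[saga_id])   # load state from the store
    if not success:
        state["status"] = "REJECTED"
        _invoke(saga_id, "rejectOrder")       # compensate the created order
    elif state["step"] + 1 < len(STEPS):
        state["step"] += 1                    # decide which participant is next
        _invoke(saga_id, STEPS[state["step"]])
    else:
        state["status"] = "COMPLETED"
    _persist(saga_id, state)                  # persist the updated state


on_create("saga-1", order_id=1)
on_reply("saga-1", success=True)   # credit reserved -> send approveOrder
on_reply("saga-1", success=True)   # order approved -> saga completed
print(json.loads(saga_store["saga-1"])["status"])
```
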


CreateOrderSaga orchestrator

1. Create Order -> Order Service: OrderService create() creates the CreateOrderSaga and the Order (state, total…)
2. CreateOrderSaga sends reserveCredit() on the Customer command channel
3. Customer Service: reserveCredit() on Customer (creditLimit, creditReservations, ...)
4. Customer Service replies creditReserved() on the Saga reply channel
5. CreateOrderSaga: approve() Order

CreateOrderSaga definition

Saga’s data
Sequence of steps
step = (Ti, Ci)
Build command to send

Customer Service command handler

Route command to handler
Reserve credit
Make reply message

Eventuate Tram Sagas

Open-source Saga orchestration framework

Currently for Java

https://github.com/eventuate-tram/eventuate-tram-sagas

https://github.com/eventuate-tram/eventuate-tram-sagas-examples-customers-and-orders

Benefits and drawbacks of orchestration

Benefits:
Centralized coordination logic is easier to understand
Reduced coupling, e.g. Customer knows less. Simply has API
Reduces cyclic dependencies

Drawbacks:
Risk of smart sagas directing dumb services

Agenda

ACID is not an option

Overview of sagas
Coordinating sagas

Countermeasures for data anomalies


Using reliable and transactional messaging

Lack of isolation complicates business logic

Time ->
1. Order Service, local transaction: createOrder() -> Order state=PENDING
2. Customer Service, local transaction: reserveCredit() -> Customer
?  Order Service, local transaction: cancelOrder() while the Order is still PENDING

How to cancel a PENDING
Order?
Don’t throw an OrderNotCancellableException

Questionable user experience


“Interrupt” the Create Order saga?
Cancel Order Saga: set order.state = CANCELLED
Causes Create Order Saga to rollback
But is that enough to cancel the order?
Cancel Order saga waits for the Create Order saga to complete?
Suspiciously like a distributed lock
But perhaps that is ok

Countermeasure Transaction Model

http://bit.ly/semantic-acid-ctm - paywall 😥

Sagas are ACD
Atomicity
Saga implementation ensures that all transactions are
executed OR all are compensated

Consistency
Referential integrity within a service handled by local databases
Referential integrity across services handled by application
Durability
Durability handled by local databases

Lack of I => anomalies

Outcome of concurrent execution != a sequential execution

Sounds scary
BUT
It’s common to relax isolation to improve performance

Anomaly: Lost update

Time ->
Create Order Saga: Ti: Create Order ... Tj: Approve Order
Cancel Order Saga: Ti: Cancel Order (between Ti and Tj of the Create Order Saga)

Approve Order overwrites the cancelled order

Anomaly: Dirty reads …

Time ->
Saga 1: Ti: Reserve Credit ... Ci: Unreserve credit
Saga 2: Ti: Reserve Credit (between Ti and Ci of Saga 1), reads uncommitted changes

Order rejected unnecessarily

… Anomaly: Dirty reads

Time ->
Saga 1: Ti: Unreserve credit ... Ci: Reserve credit
Saga 2: Ti: Reserve Credit (between Ti and Ci of Saga 1), reads uncommitted changes

Credit limit exceeded!

Anomaly: non-repeatable/fuzzy read

Time ->
Saga 1: Ti: Begin Revise Order ... Tj: Finalize Revise Order
Saga 2: Ti: Update Order (between Ti and Tj of Saga 1)

Order has changed since Ti

Countermeasures for reducing* impact of isolation anomalies…

* i.e. good enough, eventually consistent application

Saga structure

Compensatable transactions: T1 (C1), T2 (C2), …
Pivot transaction = GO/NO GO: Tn+1
Retriable transactions that can’t fail: Tn+2, ….

Countermeasure: Semantic lock

Compensatable transaction sets flag, retriable transaction releases it

Indicates a possible dirty read

Flag = lock: prevents other transactions from accessing it
Require deadlock detection, e.g. timeout

Flag = warning - treat the data differently, e.g. Order.state = PENDING, a pending deposit

Example:
Create Pending Order: Order.state = PENDING
…
Reject Order: Order.state = REJECTED
or
Approve Order: Order.state = APPROVED
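
A minimal sketch of the "flag = lock" variant, using Order.state as the flag. The class and the choice to have cancel() return False while the order is locked (so the caller can retry or time out) are assumptions made for illustration.

```python
class Order:
    def __init__(self) -> None:
        # The compensatable transaction (create) sets the flag.
        self.state = "PENDING"

    def _require_pending(self) -> None:
        if self.state != "PENDING":
            raise ValueError(f"order is {self.state}, expected PENDING")

    def approve(self) -> None:
        # The retriable transaction releases the flag.
        self._require_pending()
        self.state = "APPROVED"

    def reject(self) -> None:
        # The compensating transaction also releases the flag.
        self._require_pending()
        self.state = "REJECTED"

    def cancel(self) -> bool:
        # Flag = lock: while the Create Order saga is in flight, other
        # sagas may not change the order. The caller retries later and
        # gives up after a timeout.
        if self.state == "PENDING":
            return False
        self.state = "CANCELLED"
        return True


order = Order()
print(order.cancel())   # False: still locked by the Create Order saga
order.approve()
print(order.cancel())   # True: lock released, cancel proceeds
```
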
Countermeasure:
Commutative updates

Commutative: g(f(x)) = f(g(x))

For example:
Account.debit() compensates for Account.credit()

Account.credit() compensates for Account.debit()


Avoids lost updates

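A small sketch of why credit and debit commute, assuming an account with no overdraft check (an assumption: a debit that can be refused for insufficient funds would no longer commute with a credit).

```python
class Account:
    def __init__(self, balance: int = 0) -> None:
        self.balance = balance

    def credit(self, amount: int) -> None:
        self.balance += amount

    def debit(self, amount: int) -> None:
        # No overdraft check here, which is what keeps the operations commutative.
        self.balance -= amount


a, b = Account(100), Account(100)

# Same two operations applied in opposite orders.
a.credit(30)
a.debit(50)
b.debit(50)
b.credit(30)

print(a.balance, b.balance)  # 80 80
```

Because each operation adds or subtracts a delta rather than overwriting the balance, a compensating debit or credit cannot wipe out another saga's update.
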
Countermeasure: Pessimistic view

Reorder saga to reduce risk

Before: Cancel Order -> Increase avail. credit -> Cancel Delivery (compensation: Reduce avail. credit)
After: Cancel Order -> Cancel Delivery -> Increase avail. credit

Won’t be compensated, so no dirty read
Countermeasure: Re-read value
Verify unchanged, possibly restart

Time ->
Saga 1: Ti: Read Order ... Tj: Re-read, then update Order
Saga 2: Ti: Update Order (between Ti and Tj of Saga 1)

Prevents lost update

~Offline optimistic lock pattern
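
One common way to implement the re-read check is a version number on the record, sketched below. The repository class, its method names and the StaleOrder exception are hypothetical; the talk does not prescribe this mechanism.

```python
from typing import Dict, Tuple


class StaleOrder(Exception):
    """The order changed since it was read; the saga step must restart."""


class OrderRepository:
    def __init__(self) -> None:
        self._rows: Dict[int, Tuple[int, dict]] = {}  # id -> (version, data)

    def insert(self, order_id: int, data: dict) -> None:
        self._rows[order_id] = (1, dict(data))

    def read(self, order_id: int) -> Tuple[int, dict]:
        version, data = self._rows[order_id]
        return version, dict(data)

    def update(self, order_id: int, expected_version: int, data: dict) -> None:
        # Re-read the current version and verify it is unchanged
        # before writing; otherwise refuse the update.
        current_version, _ = self._rows[order_id]
        if current_version != expected_version:
            raise StaleOrder(f"order {order_id} changed since it was read")
        self._rows[order_id] = (current_version + 1, dict(data))


repo = OrderRepository()
repo.insert(1, {"state": "PENDING"})

v_saga1, _ = repo.read(1)                         # Saga 1, Ti: read Order
v_saga2, _ = repo.read(1)
repo.update(1, v_saga2, {"state": "CANCELLED"})   # Saga 2, Ti: update Order

try:
    repo.update(1, v_saga1, {"state": "APPROVED"})  # Saga 1, Tj: re-read fails
except StaleOrder:
    print("Saga 1 must restart; the cancellation was not overwritten")
```
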
Countermeasure: version file

Time ->
Saga 1: Ti: Create Order ... Tj: Reserve Credit ... Tj: Authorize Credit Card
Saga 2: Ti: Cancel Order ... Tj: Reverse Credit Card Payment

“log” of changes:
1. reverse()
2. authorize()

Enables operations to commute
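A minimal sketch of the idea, assuming the payment side keeps a per-order log of the operations it has received. The PaymentLog class and its fields are hypothetical names for illustration.

```python
from typing import Dict, List


class PaymentLog:
    """Records operations per order so out-of-order requests still commute."""

    def __init__(self) -> None:
        self._log: Dict[int, List[str]] = {}
        self.authorized: Dict[int, bool] = {}

    def reverse(self, order_id: int) -> None:
        self._log.setdefault(order_id, []).append("reverse")
        self.authorized[order_id] = False

    def authorize(self, order_id: int) -> None:
        operations = self._log.setdefault(order_id, [])
        already_reversed = "reverse" in operations
        operations.append("authorize")
        # If the reversal arrived first, the late authorization is skipped.
        self.authorized[order_id] = not already_reversed


in_order, out_of_order = PaymentLog(), PaymentLog()

in_order.authorize(1)
in_order.reverse(1)

out_of_order.reverse(1)      # 1. reverse() arrives first
out_of_order.authorize(1)    # 2. authorize() arrives late

print(in_order.authorized[1], out_of_order.authorized[1])  # False False
```
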
Countermeasure: By value

Business risk determines strategy


Low risk => semantic ACID

High risk => use 2PC/distributed transaction

Agenda

ACID is not an option

Overview of sagas
Coordinating sagas

Countermeasures for data anomalies


Using reliable and transactional messaging

Use asynchronous, broker-based messaging

Order Service, Customer Service, …. <-> Message broker

Guaranteed delivery ensures a saga completes even when its participants are temporarily unavailable

Messaging must be transactional

Service -> update -> Database
Service -> publish -> Message Broker

How to make atomic without 2PC?

Option #1: Use database table as a message queue

reserveCredit() -> Customer Service
Local (ACID) transaction: INSERT into CUSTOMER_CREDIT_RESERVATIONS table, INSERT into MESSAGE table
Message Publisher: QUERY and DELETE on MESSAGE table, Publish to Message Broker

CUSTOMER_CREDIT_RESERVATIONS table
ORDER_ID  CUSTOMER_ID  TOTAL
99        101          1234

MESSAGE table
ID     TYPE            DATA  DESTINATION
84784  CreditReserved  {…}   …

See BASE: An Acid Alternative, http://bit.ly/ebaybase

Publishing messages by polling the MESSAGE table

Message Publisher:
SELECT * FROM MESSAGES WHERE ….
Publish message
DELETE …

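The table-as-queue approach can be sketched with Python's built-in sqlite3 module. The schema, the channel name "customer-events" and the publish callback are assumptions for illustration; the slide does not give the real column types or the WHERE clause.

```python
import json
import sqlite3
from typing import Callable, List, Tuple

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer_credit_reservations (
    order_id INTEGER, customer_id INTEGER, total INTEGER);
CREATE TABLE message (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    type TEXT, data TEXT, destination TEXT);
""")


def reserve_credit(order_id: int, customer_id: int, total: int) -> None:
    # One local ACID transaction: the business row and the outgoing
    # message are committed together or not at all.
    with conn:
        conn.execute(
            "INSERT INTO customer_credit_reservations VALUES (?, ?, ?)",
            (order_id, customer_id, total))
        conn.execute(
            "INSERT INTO message (type, data, destination) VALUES (?, ?, ?)",
            ("CreditReserved", json.dumps({"order_id": order_id}),
             "customer-events"))


def publish_pending(publish: Callable[[str, str, str], None]) -> int:
    """Poll the message table, publish each row, then delete it."""
    rows = conn.execute(
        "SELECT id, type, data, destination FROM message ORDER BY id"
    ).fetchall()
    for message_id, type_, data, destination in rows:
        publish(destination, type_, data)
        # A crash between publish and delete causes a re-publish later,
        # so delivery is at-least-once and consumers should be idempotent.
        with conn:
            conn.execute("DELETE FROM message WHERE id = ?", (message_id,))
    return len(rows)


sent: List[Tuple[str, str, dict]] = []
reserve_credit(order_id=99, customer_id=101, total=1234)
count = publish_pending(lambda dest, type_, data: sent.append((dest, type_, json.loads(data))))
print(count, sent)
```
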
Transaction log tailing

Order Service -> Update -> Datastore (ORDER table, MESSAGE table)
Datastore -> Changes -> Transaction log
Transaction log miner reads the Transaction log -> Publish -> Message Broker

Transaction log tailing

Eventuate Local - reads MySQL binlog


Oracle Golden Gate

AWS DynamoDB streams


MongoDB - Read the oplog

Transaction log tailing: benefits and drawbacks

Benefits:
No 2PC
No application changes required
Guaranteed to be accurate

Drawbacks:
Obscure
Database specific solutions
Tricky to avoid duplicate publishing

Option #2: Event sourcing:
event-centric persistence

Event sourcing: persists an object as a sequence of events

Service -> save events and publish -> Event Store

Event table
Entity id  Entity type  Event id  Event type     Event data
101        Order        901       OrderCreated   …
101        Order        902       OrderApproved  …
101        Order        903       OrderShipped   …

Every state change => event

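Rebuilding an entity from the event table can be sketched as a replay over its events. The rows mirror the table above; the event data is left empty because the slide elides it, and the mapping from event type to order state (including "SHIPPED") is an assumption for illustration.

```python
from typing import List, NamedTuple, Optional


class Event(NamedTuple):
    entity_id: int
    entity_type: str
    event_id: int
    event_type: str
    data: dict


# Rows from the event table; the data column is elided in the source.
event_table: List[Event] = [
    Event(101, "Order", 901, "OrderCreated", {}),
    Event(101, "Order", 902, "OrderApproved", {}),
    Event(101, "Order", 903, "OrderShipped", {}),
]

# Hypothetical mapping from event type to the resulting order state.
STATE_AFTER = {
    "OrderCreated": "PENDING",
    "OrderApproved": "APPROVED",
    "OrderShipped": "SHIPPED",
}


def load_order_state(entity_id: int,
                     up_to_event_id: Optional[int] = None) -> Optional[str]:
    """Replay the order's events in order; optionally stop at an earlier event."""
    state = None
    for event in sorted(event_table, key=lambda e: e.event_id):
        if event.entity_id != entity_id or event.entity_type != "Order":
            continue
        if up_to_event_id is not None and event.event_id > up_to_event_id:
            break
        state = STATE_AFTER[event.event_type]
    return state


print(load_order_state(101))                      # current state
print(load_order_state(101, up_to_event_id=902))  # state as of event 902
```

Stopping the replay early is what makes temporal queries possible: the history is the primary record, and the current state is derived from it.
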
Guarantees: state change -> event is published

Implementing choreography-based sagas is straightforward

Preserves history of domain objects

Supports temporal queries

Built-in auditing

Summary

Microservices tackle complexity and accelerate development

Database per service is essential for loose coupling

Use ACID transactions within services

Use orchestration-based or choreography-based sagas to maintain data consistency across services

Use countermeasures to reduce impact of anomalies caused by lack of isolation

chris@chrisrichardson.net

Questions?

http://learn.microservices.io
