Building on Quicksand
About Me
• Work at ThoughtWorks Pune
• DDD and Distributed Computing enthusiast
• A fan of Pat Helland
Twitter handle: @shripadagashe
Blog: https://shripad-agashe.github.io
Evolution of systems
Image source: https://commons.wikimedia.org/wiki/File:Front_Z9_2094.jpg
Image source: https://en.wikipedia.org/wiki/Solaris_Cluster#/media/File:Sun_Microsystems_Solaris_computer_cluster.jpg
Vs
Reliability is a steep curve
[Chart: Downtime vs % availability. X-axis: % availability (99, 99.9, 99.99, 99.999). Y-axis: downtime in seconds.]
% Availability   Downtime per year   Downtime per month   Downtime per day
99               3.65 days           7.2 hours            14.4 minutes
99.9             8.76 hours          43.8 minutes         1.44 minutes
99.99            52 minutes          4.38 minutes         9 seconds
99.999           5.26 minutes        25.9 seconds         0.8 seconds
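The downtime figures follow from simple arithmetic: allowed downtime is the unavailable fraction multiplied by the length of the period. A minimal sketch, assuming a 365-day year and a 24-hour day; the function name `downtime_seconds` is illustrative and not from the slides:

```python
SECONDS_PER_DAY = 24 * 60 * 60


def downtime_seconds(availability_percent: float, period_days: float) -> float:
    """Allowed downtime in seconds for a given availability over a period."""
    unavailable_fraction = 1 - availability_percent / 100
    return unavailable_fraction * period_days * SECONDS_PER_DAY


for pct in (99, 99.9, 99.99, 99.999):
    per_year_hours = downtime_seconds(pct, 365) / 3600
    per_day_seconds = downtime_seconds(pct, 1)
    print(f"{pct}%: {per_year_hours:.2f} hours/year, {per_day_seconds:.2f} seconds/day")
```

Each extra nine divides the allowed downtime by ten, which is why the curve is so steep.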
And it has a price
[Chart: price vs % availability (99, 99.9, 99.99, 99.999); the vertical scale, 0 to 1000, is unlabeled.]
Probability theory to the rescue
Intersection of independent events
P(A ∩ B) = P(A) * P(B)
For 99% availability, i.e. 1% unavailability, the probability that 2 independent servers are unavailable at the same time
= (0.01) * (0.01) = 0.0001
More on Probability
• Systems in series
• Availability = P(A) * P(B)
• Systems in parallel
• Availability = 1 - (1 - P(A)) * (1 - P(B))
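The two rules can be checked numerically. This is a minimal sketch that assumes component failures are independent, which real deployments only approximate; the function names are illustrative:

```python
import math


def series_availability(*availabilities: float) -> float:
    """Every component must be up, so the availabilities multiply."""
    return math.prod(availabilities)


def parallel_availability(*availabilities: float) -> float:
    """One component being up is enough: 1 minus the chance that all are down."""
    all_down = math.prod(1 - a for a in availabilities)
    return 1 - all_down


print(series_availability(0.99, 0.99))    # two 99% systems chained together
print(parallel_availability(0.99, 0.99))  # two 99% systems as redundant copies
```

Chaining two 99% systems lowers availability to about 98%, while running them redundantly raises it to about 99.99%.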
So architectural patterns evolve around it
[Diagram: App, App, DB]
Box Cylinder Architecture Best Practices
• App layer should be stateless
• Architecture should be layered
But DB is still on a single machine
DB high availability is achieved via replication
Active-Active Replication, Active-Passive Replication
[Diagram: Primary DB and Secondary DB, asynchronous replication]
[Diagram: Primary DB and Secondary DB, synchronous replication]
Move to transaction model
[Sequence diagram. Participants: Client, In Memory, Disk. Steps: Write, Write, Commit, Write to Disk.]
Enterprise organization
Image Source: https://www.flickr.com/photos/mwichary/2356663850
Conway’s Law
Organization    IT Systems
Inventory       Inventory System
Sales           Sales System
Finance         Finance System
Fulfilment      Fulfilment System
Integration via DB
[Diagram: App 1 and App 2, each drawn as App, App, DB]
What Enabled it:
• 2 Phase Commit
• XA transaction
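Two-phase commit can be sketched as a coordinator that first collects votes and only then tells everyone to commit. The sketch below is an in-memory illustration only: the `Participant` class and its methods are hypothetical, and it ignores coordinator crashes, timeouts, and durable logging, which a real XA implementation has to handle.

```python
class Participant:
    """Hypothetical resource manager, for example one application's database."""

    def __init__(self, name: str, can_commit: bool = True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self) -> bool:
        # Phase 1: vote yes only if the work can be made durable.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self) -> None:
        self.state = "committed"

    def rollback(self) -> None:
        self.state = "aborted"


def two_phase_commit(participants: list) -> bool:
    """Return True if every participant committed, False if all rolled back."""
    # Phase 1: collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    # Phase 2: commit only on a unanimous yes, otherwise roll everyone back.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False
```

Every participant has to answer in both phases, so the slowest or least available system sets the pace for the whole transaction.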
Possible Alternative
[Diagram: App 1 and App 2, each drawn as App, App, DB, connected through a Service]
What Enabled it:
• SOAP
• REST
Bouquets and brickbats
+
• Integration is simple
• Familiar to most developers
• Easier to reason about
-
• Any sync call will add to latency
• Sync calls expose the system to variations in the behavior of external systems
Possible alternative
[Diagram: App 1 and App 2, each drawn as App, App, DB, linked by Replication]
Replication Patterns
• Via file
• Batch app for replication
• Event driven replication using message queues
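Event-driven replication can be sketched with an in-process queue standing in for a message broker. The `App1` and `App2` classes and the `sku-42` key are hypothetical, and a real system would use a durable queue rather than Python's `queue.Queue`:

```python
import queue


class App1:
    """Hypothetical source system: writes locally, then publishes the change."""

    def __init__(self, outbox: queue.Queue):
        self.db = {}
        self.outbox = outbox

    def write(self, key, value):
        self.db[key] = value           # local write, no call to App 2
        self.outbox.put((key, value))  # change event for later replication


class App2:
    """Hypothetical target system: applies change events when it gets to them."""

    def __init__(self, inbox: queue.Queue):
        self.db = {}
        self.inbox = inbox

    def drain(self):
        # Apply every queued event; until this runs, App 2 holds stale data.
        while True:
            try:
                key, value = self.inbox.get_nowait()
            except queue.Empty:
                break
            self.db[key] = value


events = queue.Queue()
app1, app2 = App1(events), App2(events)
app1.write("sku-42", 7)
stale = app2.db.get("sku-42")  # the event has not been applied yet
app2.drain()
fresh = app2.db.get("sku-42")  # visible once the queue is drained
```

The write in App 1 never waits for App 2, and the gap between `stale` and `fresh` is the replication window.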
Bouquets and brickbats
+
• As there is no sync call, it does not add latency to the app
• As systems are isolated, the chances of failure propagation are minimal
• With Pub Sub, changes can be propagated to multiple subscribers with minimal additional work
-
• Integration may not be trivial
• Async propagation of data needs careful reasoning
Probabilistic Business Rules
• When we have asynchronous replication we have windows of failure that mean work may be lost or delayed.
• Distribution + Asynchrony → Probabilities of Enforcement
Source: http://db.cs.berkeley.edu/cs286/papers/quicksand-cidr2009.pdf
Asynchrony and Truth
Image source: https://www.flickr.com/photos/stevenpisano/16595925953
Here comes Eventual Consistency
• Eventual consistency guarantees that a read returns some subset of previous writes; eventually it will return all writes.
• There is no guarantee of order
• There is no time bound on "eventually"
• It is a loosely defined term that guarantees very little. The application should tolerate any subset of writes, without any time guarantee
• Rather than a single concept, consistency is a spectrum
• On one end of the spectrum is strong consistency
• On the other end is eventual consistency
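The points above can be illustrated with two replicas that receive the same writes at different times and in different orders. This sketch assumes each write carries a version number and that replicas keep the highest version (last-writer-wins); that resolution rule is an assumption made for the example, not something the slides prescribe.

```python
class Replica:
    """Hypothetical replica that keeps the highest-versioned value per key."""

    def __init__(self):
        self.data = {}  # key -> (version, value)

    def apply(self, key, version, value):
        # Last-writer-wins on version, so delivery order does not matter.
        current = self.data.get(key)
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]


# Three writes to the same key: (key, version, value).
writes = [("visitors", 1, 1), ("visitors", 2, 2), ("visitors", 3, 3)]

a, b = Replica(), Replica()
a.apply(*writes[0])  # replica a has seen only the first write
b.apply(*writes[2])  # replica b saw the third write first...
b.apply(*writes[0])  # ...and then the first, out of order
interim = (a.read("visitors"), b.read("visitors"))  # the replicas disagree

for w in writes:  # eventually every write reaches every replica
    a.apply(*w)
    b.apply(*w)
converged = (a.read("visitors"), b.read("visitors"))
```

Between `interim` and `converged` a reader may see either answer, and nothing in the model says how long that interval lasts.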
Eventual consistency through a simple example
Official scorekeeper:
    score = Read("visitors");
    Write("visitors", score + 1);
Umpire:
    if middle of 9th inning then
        vScore = Read("visitors");
        hScore = Read("home");
        if vScore < hScore
            end game;
Radio reporter:
    do {
        vScore = Read("visitors");
        hScore = Read("home");
        report vScore and hScore;
        sleep (30 minutes);
    }
Sportswriter:
    While not end of game {
        drink beer;
        smoke cigar;
    }
    go out to dinner;
    vScore = Read("visitors");
    hScore = Read("home");
    write article;
Statistician:
    Wait for end of game;
    score = Read("home");
    stat = Read("season-runs");
    Write("season-runs", stat + score);
Stat watcher:
    stat = Read("season-runs");
    discuss stats with friends;
Strong Consistency      See all previous writes.
Eventual Consistency    See subset of previous writes.
Consistent Prefix       See initial sequence of writes.
Bounded Staleness       See all "old" writes.
Monotonic Reads         See increasing subset of writes.
Read My Writes          See all writes performed by reader.
Source: http://cacm.acm.org/magazines/2013/12/169945-replicated-data-consistency-explained-through-baseball/fulltext#F8
Not everyone needs the same thing
• Different roles often have different tolerances for stale information
• The trade-off between correctness and availability can bring in more revenue
• The trade-off is often driven by business value
What's the Risk Appetite
• Consistency is often a cost of doing business
• The major point is that availability (and its cousins offline and latency-reduction) may be traded off with classic notions of consistency. This tradeoff may frequently be applied across many different aspects at many levels of granularity within a single application.
• Locally clear a check if the face value is less than $10,000. If it exceeds $10,000, double check with all the replicas to make sure it clears
• Schedule the shipment of a "Harry Potter" book based on a local opinion of the inventory. In contrast, the one and only one Gutenberg bible requires strict coordination!
Source: http://db.cs.berkeley.edu/cs286/papers/quicksand-cidr2009.pdf
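The check-clearing rule reads naturally as a risk threshold in code. A minimal sketch, assuming balances are plain numbers and that the replica balances are already at hand; `clear_check` and its parameters are hypothetical names. The source leaves a check of exactly $10,000 unspecified, so this sketch sends it down the coordinated path.

```python
LOCAL_LIMIT = 10_000  # threshold from the example above, in dollars


def clear_check(amount, local_balance, replica_balances):
    """Return (cleared, path), where path records how the decision was made."""
    if amount < LOCAL_LIMIT:
        # Cheap path: trust the local, possibly stale, view of the balance.
        return local_balance >= amount, "local"
    # Expensive path: every replica must agree that the funds are there.
    agreed = all(balance >= amount for balance in replica_balances)
    return agreed, "coordinated"


# A small check clears on local knowledge even though a replica disagrees.
small = clear_check(500, 1_000, [0])
# A large check is refused because one replica has seen a lower balance.
large = clear_check(20_000, 50_000, [50_000, 5_000])
```

The `small` result is a guess that may later need an apology; the business accepts that risk in exchange for speed below the threshold.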
Memories, Guesses, Apologies
• The idea is that everything is done locally with a subset of the global knowledge.
• You know what you know when an action is performed. Since you have only a subset of the knowledge, your actions are really only guesses.
• When your knowledge as a replica increases, you may have an "Oh, crap!" moment.
Source: http://db.cs.berkeley.edu/cs286/papers/quicksand-cidr2009.pdf
More on Apologies
• Every business has to be ready for apologies.
• Consider a case where the only book in inventory is scheduled for delivery.
• In preparing the book for shipment, it is run over by the forklift in the warehouse.
• So, correct software notwithstanding, you will need to apologize.
Source: http://db.cs.berkeley.edu/cs286/papers/quicksand-cidr2009.pdf
How to apologize
• First, recognize whether you need to apologize
• Identify promises that could not be kept
• A unique identifier shared across systems becomes critical for identifying failures
• Typically, mistakes are identified during reconciliation
• Depending on severity, an apology can be handled by the system through rules or escalated to a human
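A reconciliation pass along these lines can be sketched as a comparison keyed on the shared identifier. The order ids, the values, and the `auto_limit` severity threshold are all hypothetical; the slides do not specify how severity is measured.

```python
def reconcile(promised, fulfilled, auto_limit=100.0):
    """Split broken promises into automatic and human-handled apologies.

    promised:  dict mapping order id -> order value
    fulfilled: set of order ids that were actually completed
    auto_limit is a hypothetical severity threshold, not from the slides.
    """
    automatic, needs_human = [], []
    for order_id, value in sorted(promised.items()):
        if order_id in fulfilled:
            continue  # promise kept, nothing to apologize for
        if value <= auto_limit:
            automatic.append(order_id)    # rule-based apology, e.g. a refund
        else:
            needs_human.append(order_id)  # escalate to a person
    return automatic, needs_human


promised = {"ord-1": 20.0, "ord-2": 35.0, "ord-3": 900.0}
fulfilled = {"ord-1"}
automatic, needs_human = reconcile(promised, fulfilled)
```

Without an identifier that both systems agree on, the membership test in the loop has nothing reliable to match against.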
What inhibits business trade offs
• The layering of an arbitrary application atop a storage subsystem inhibits reordering (and also apologies)
• Logical delete vs actual delete of a row
• Only when commutative operations are used can we achieve the desired loose coupling.
• Application operations can be commutative
Source: http://db.cs.berkeley.edu/cs286/papers/quicksand-cidr2009.pdf
Commutative Business Transactions
• Order-insensitive logic
• Valid account creation: create a dummy account first and attach the customer to it later
• Credit to account
• Visibility of business operations
• Logical deletion of record
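Why credits are order-insensitive can be shown by applying the same operations in every possible order. In this sketch the `set` operation is a hypothetical counter-example added for contrast; it does not appear in the slides.

```python
import itertools


def apply_credits(balance, credits):
    """Credits are additions, so any delivery order gives the same balance."""
    for amount in credits:
        balance += amount
    return balance


credits = [50, 20, 5]
# Collect the final balance for every possible delivery order.
results = {apply_credits(100, order) for order in itertools.permutations(credits)}


def apply_ops(balance, ops):
    """Mixing in an overwriting 'set' operation makes the order matter."""
    for kind, amount in ops:
        balance = balance + amount if kind == "credit" else amount
    return balance


mixed = [("credit", 50), ("set", 10)]
order_a = apply_ops(100, mixed)                  # credit first, then overwrite
order_b = apply_ops(100, list(reversed(mixed)))  # overwrite first, then credit
```

`results` holds a single balance, while `order_a` and `order_b` differ, which is why replicas that exchange overwrites need coordination and replicas that exchange credits do not.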
Back to the Future
Vs
We need to partner with people like him
Image source: https://commons.wikimedia.org/wiki/File:Jackie_Stewart_2011_British_Grand_Prix.jpg
So what's in it for you
• The technique explained requires business people to be IT-sympathetic
• Business has to align with IT and see IT as a competitive advantage
• A "developing with us" rather than "developing for us" mentality
Key takeaway
• Move from a DB-centric view of consistency to an application-centric view of consistency
• Carefully make trade-offs in IT systems to limit losses and increase upside
• Look for inspiration in business practices developed for a world without instant information
Image source: https://commons.wikimedia.org/wiki/File:Jackie_Stewart_2011_British_Grand_Prix.jpg
