Facebook Scaling Walkthrough

Moritz Haarmann - Ultrasuperlargescale Systems
Numbers
•   800,000,000 active users

•   > 50% log in on any given day

•   250,000,000 photos every single day (Flickr:
    6 bn total)

•   30,000,000,000 new pieces of content monthly
30,000,000,000
As if everyone now living on earth posts 4.28 updates every month.
Building Blocks
Facebook is built using



•   Web Servers (running HipHop for PHP)

•   Services (Search, Ads)

•   Memcached & MySQL

•   Immense amounts of glue
Write Strategy

•   Writes take place centrally in California

•   3.5 million changed rows per second (peak)

•   2010: 1,800 DB servers

•   Horizontal scaling approach not disclosed

•   Consistency is important (avoiding "unhappy
    users")
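Since the horizontal scaling approach is not disclosed, the sketch below only illustrates the generic idea behind user-based sharding: route every row belonging to a user deterministically to one of N database servers. The modulo scheme and all names are assumptions, not Facebook's actual mechanism.

```python
# Hypothetical user-to-shard routing; NOT Facebook's disclosed scheme.
N_SHARDS = 1800  # roughly the 2010 DB server count mentioned above

def shard_for_user(user_id: int, n_shards: int = N_SHARDS) -> int:
    """Map a user id to one fixed shard, so all of a user's rows
    live on the same server and single-user queries need no
    cross-shard joins."""
    return user_id % n_shards

# The mapping is stable: the same user always lands on the same shard.
assert shard_for_user(42) == shard_for_user(42)
```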
Glue

•   Massively distributed architecture

•   Glue keeping it together

•   Many systems built in-house to meet
    ginormous requirements
Haystack


•   Photos

•   Handles everything from HTTP to storage

•   Aimed at minimizing IO operations

•   Append-Only!
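A toy illustration of the Haystack idea (the real on-disk format is more involved; this only shows the two properties named above, an append-only store plus an in-memory index so that reading a photo costs a single IO):

```python
# Toy append-only photo store in the spirit of Haystack; illustrative
# only, not Facebook's actual format.
import io

class TinyHaystack:
    def __init__(self):
        self.log = io.BytesIO()   # stands in for one large store file
        self.index = {}           # photo_id -> (offset, size), kept in RAM

    def put(self, photo_id: str, data: bytes) -> None:
        offset = self.log.seek(0, io.SEEK_END)  # append only, never overwrite
        self.log.write(data)
        self.index[photo_id] = (offset, len(data))

    def get(self, photo_id: str) -> bytes:
        offset, size = self.index[photo_id]     # lookup needs no disk IO
        self.log.seek(offset)
        return self.log.read(size)              # one read per photo

store = TinyHaystack()
store.put("p1", b"jpeg-bytes")
assert store.get("p1") == b"jpeg-bytes"
```

Deletes in such a scheme are handled by dropping the index entry and compacting later, which is what keeps the write path strictly append-only.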
Memcached


•   Placed between MySQL and Web Tier

•   Stores only "plain data", no joins or other
    complicated processing

•   Faster to let the web server assemble data
    than to run complex queries in MySQL
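The read path this implies is the classic cache-aside pattern; a minimal sketch in which a plain dict stands in for memcached and a stub function for the MySQL query (all names are illustrative):

```python
# Cache-aside sketch: check memcached first, fall back to MySQL on a
# miss, then populate the cache with the plain row.
cache = {}

def query_db(user_id):
    # Placeholder for the real MySQL lookup.
    return {"id": user_id, "name": f"user-{user_id}"}

def get_user(user_id):
    key = f"user:{user_id}"
    row = cache.get(key)
    if row is None:            # miss: fall through to the database
        row = query_db(user_id)
        cache[key] = row       # store the plain row, no joins
    return row
```

Any joining or filtering then happens in the web tier, which is what lets memcached stay a dumb, fast key-value layer.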
BigPipe


•   Assembles the output pages

•   Everything that is needed is retrieved in parallel

•   Fault tolerant, will work even if parts of a page
    are not available
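The pagelet idea can be sketched as follows (the structure is an assumption for illustration; real BigPipe streams each pagelet's chunk over a single HTTP response as it becomes ready):

```python
# BigPipe-style page assembly sketch: fetch all pagelets in parallel,
# and degrade a failed pagelet to a placeholder instead of failing
# the whole page.
from concurrent.futures import ThreadPoolExecutor

def render_page(pagelets):
    """pagelets: dict of name -> zero-arg callable returning HTML."""
    chunks = []
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(fn) for name, fn in pagelets.items()}
        for name, fut in futures.items():
            try:
                chunks.append(fut.result())
            except Exception:
                # Fault tolerance: the page still renders without this part.
                chunks.append(f"<div id='{name}'><!-- unavailable --></div>")
    return "".join(chunks)
```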
What else?
Live Profiling


•   Facebook monitors their live systems
    continuously at a PHP method level (using
    XHProf)
Graceful Degradation


•   High awareness (monitoring) of performance
    problems

•   Features can be disabled (very fine-grained)
    to keep the core features running smoothly
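A minimal sketch of such a feature gate (flag names and structure are hypothetical; the point is only that operations can flip a flag off to shed load from a misbehaving feature while the core keeps running):

```python
# Hypothetical fine-grained feature flags for graceful degradation.
FLAGS = {"chat": True, "photo_tag_suggestions": True}

def feature_enabled(name: str) -> bool:
    return FLAGS.get(name, False)   # unknown features default to off

def render_sidebar():
    parts = ["friends"]             # core feature, always present
    if feature_enabled("chat"):
        parts.append("chat")        # optional feature, gated by flag
    return parts

FLAGS["chat"] = False               # degrade gracefully under load
```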
Keeping it running


•   New features are launched 'dark', without
    visible elements, to stress test the backend
    with real load

•   Incremental roll-outs decrease the impact of
    a bug or malfunction
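A common way to implement incremental roll-outs is hash-based percentage bucketing; the sketch below is an assumption about the mechanism, not Facebook's disclosed implementation:

```python
# Hypothetical percentage roll-out: hash (feature, user) into 100 stable
# buckets, so raising the percentage adds users without reshuffling the
# ones already enabled.
import hashlib

def in_rollout(user_id: int, feature: str, percent: int) -> bool:
    digest = hashlib.md5(f"{feature}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < percent

# A 'dark' launch is percent=0 for the visible UI while the backend code
# path still runs; the visible roll-out then ramps e.g. 1% -> 10% -> 100%.
```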
Open Source



•   Most parts are open source

•   Either used as-is, or created in-house and then open-sourced
Big Bang

•   On September 23, 2010, Facebook was down
    for most users for about 3 hours

•   A cache value wrongly identified as 'invalid' led
    to requests hammering the DB tier

•   A system designed to prevent failures created
    one!

•   The only way to recover was to completely shut
    down access to the DB - downtime
https://www.facebook.com/note.php?note_id=431441338919&id=9445547199&ref=mf (great comments, too)
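A toy model of that feedback loop, heavily simplified from the postmortem's description (all names and structure invented for illustration): every client that sees the "invalid" cached value discards it and re-queries the database, but the database itself holds the bad value, so the cache never shields the DB tier.

```python
# Toy model of the 2010 outage's failure mode, for illustration only.
db_value = "BAD"          # the misconfigured value lives in the database
cache = {}
db_queries = 0

def read_config():
    global db_queries
    value = cache.get("config")
    if value is None or value == "BAD":   # the "fix": drop and re-fetch
        cache.pop("config", None)
        db_queries += 1                   # every client hits the DB tier
        value = db_value                  # ...and gets the bad value back
        cache["config"] = value
    return value

for _ in range(1000):     # 1000 clients -> 1000 DB queries; cache is useless
    read_config()
```

The fix-it logic that was meant to repair a bad cache entry is exactly what turns one bad value into a self-sustaining stampede.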
Thanks.
Sources

•   http://royal.pingdom.com/2010/06/18/the-software-behind-facebook/

•   https://www.facebook.com/note.php?note_id=76191543919

•   https://www.facebook.com/notes/facebook-engineering/bigpipe-pipelining-web-pages-for-high-performance/389414033919

•   http://blog.kissmetrics.com/facebook-statistics/
