Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
skip to main content
10.5555/2535461.2535468guideproceedingsArticle/Chapter ViewAbstractPublication PagesConference Proceedingsacm-pubtype
Article

TAO: Facebook's distributed data store for the social graph

Published: 26 June 2013 Publication History
  • Get Citation Alerts
  • Abstract

    We introduce a simple data model and API tailored for serving the social graph, and TAO, an implementation of this model. TAO is a geographically distributed data store that provides efficient and timely access to the social graph for Facebook's demanding workload using a fixed set of queries. It is deployed at Facebook, replacing memcache for many data types that fit its model. The system runs on thousands of machines, is widely distributed, and provides access to many petabytes of data. TAO can process a billion reads and millions of writes each second.

    References

    [1]
    Amazon SimpleDB. http://aws.amazon.com/simpledb/.
    [2]
    Facebook - Company Info. http://newsroom.fb.com.
    [3]
    LevelDB. https://code.google.com/p/leveldb.
    [4]
    Project Voldemort. http://project-voldemort.com/.
    [5]
    J. Baker, C. Bond, J. C. Corbett, J. Furman, A. Khorlin, J. Larson, J.-M. Leon, Y. Li, A. Lloyd, and V. Yushprakh. Megastore: Providing scalable, highly available storage for interactive services. In Proceedings of the Conference on Innovative Data system Research, CIDR, 2011.
    [6]
    F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach, M. Burrows, T. Chandra, A. Fikes, and R. Gruber. Bigtable: A distributed storage system for structured data. In Proceedings of the 7th USENIX Symposium on Operating System Design and Implementation, OSDI. USENIX Assoc., 2006.
    [7]
    B. F. Cooper, R. Ramakrishnan, U. Srivastava, A. Silberstein, P. Bohannon, H.-A. Jacobsen, N. Puz, D. Weaver, and R. Yerneni. PNUTS: Yahoo!'s hosted data serving platform. PVLDB, 1(2), 2008.
    [8]
    J. C. Corbett, J. Dean, M. Epstein, A. Fikes, C. Frost, J. J. Furman, S. Ghemawat, A. Gubarev, C. Heiser, P. Hochschild, W. Hsieh, S. Kanthak, E. Kogan, H. Li, A. Lloyd, S. Melnik, D. Mwaura, D. Nagle, S. Quinlan, R. Rao, L. Rolig, Y. Saito, M. Szymaniak, C. Taylor, R. Wang, and D. Woodford. Spanner: Google's globally-distributed database. In Proceedings of the 10th USENIX Conference on Operating Systems Design and Implementation, OSDI, Berkeley, CA, USA, 2012.
    [9]
    F. Dabek, J. Li, E. Sit, J. Robertson, M. F. Kaashoek, and R. Morris. Designing a DHT for low latency and high throughput. In Proceedings of the 1st Symposium on Networked Systems Design and Implementation, NSDI, Berkeley, CA, USA, 2004.
    [10]
    G. DeCandia, D. Hastorun, M. Jampani, G. Kakulapati, A. Lakshman, A. Pilchin, S. Sivasubramanian, P. Vosshall, and W. Vogels. Dynamo: Amazon's highly available key-value store. In Proceedings of 21st ACM Symposium on Operating Systems Principles, SOSP, New York, NY, USA, 2007.
    [11]
    FlockDB. http://engineering.twitter.com/2010/05/introducingflockdb.html.
    [12]
    L. Glendenning, I. Beschastnikh, A. Krishnamurthy, and T. Anderson. Scalable consistency in scatter. In Proceedings of the 23rd ACM Symposium on Operating Systems Principles, SOSP, New York, NY, USA, 2011.
    [13]
    S. D. Gribble, E. A. Brewer, J. M. Hellerstein, and D. Culler. Scalable, distributed data structures for internet service construction. In Proceedings of the 4th Symposium on Operating System Design and Implementation, OSDI, Berkeley, CA, USA, 2000.
    [14]
    U. Kang, C. E. Tsourakakis, and C. Faloutsos. PEGASUS: mining peta-scale graphs. Knowledge Information Systems, 27(2), 2011.
    [15]
    D. Karger, E. Lehman, T. Leighton, R. Panigrahy, M. Levine, and D. Lewin. Consistent Hashing and Random trees: Distributed Caching Protocols for Relieving Hot Spots on the World Wide Web. In Proceedings of the 29th annual ACM Symposium on Theory of Computing, STOC, 1997.
    [16]
    J. Li, J. Stribling, T. M. Gil, R. Morris, and M. F. Kaashoek. Comparing the performance of distributed hash tables under churn. In Proceedings of the Third International Conference on Peer-to-Peer Systems, IPTPS, Berlin, Heidelberg, 2004.
    [17]
    W. Lloyd, M. J. Freedman, M. Kaminsky, and D. G. Andersen. Don't settle for eventual: scalable causal consistency for wide-area storage with COPS. In T. Wobber and P. Druschel, editors, Proceedings of the 23rd ACM Symposium on Operating System Design and Implementation, SOSP. ACM, 2011.
    [18]
    W. Lloyd, M. J. Freedman, M. Kaminsky, and D. G. Andersen. Stronger semantics for low-latency geo-replicated storage. In Proceedings of the 10th USENIX conference on Networked Systems Design and Implementation, NSDI, 2013.
    [19]
    G. Malewicz, M. H. Austern, A. J. C. Bik, J. C. Dehnert, I. Horn, N. Leiser, and G. Czajkowski. Pregel: a system for large-scale graph processing. In A. K. Elmagarmid and D. Agrawal, editors, Proceedings of the ACM SIGMOD International Conference on Management of Data. ACM, 2010.
    [20]
    Neo4j. http://neo4j.org/.
    [21]
    R. Nishtala, H. Fugal, S. Grimm, M. Kwiatkowski, H. Lee, H. C. Li, R. McElroy, M. Paleczny, D. Peek, P. Saab, D. Stafford, T. Tung, and V. Venkataramani. Scaling Memcache at Facebook. In Proceedings of the 10th USENIX conference on Networked Systems Design and Implementation, NSDI, 2013.
    [22]
    E. Nygren, R. K. Sitaraman, and J. Sun. The Akamai network: a platform for high-performance internet applications. SIGOPS Operating Systems Review, 44(3), Aug. 2010.
    [23]
    C. Olston, B. Reed, U. Srivastava, R. Kumar, and A. Tomkins. Pig latin: a not-so-foreign language for data processing. In J. T.- L.Wang, editor, Proceedings of the ACM SIGMOD International Conference on Management of Data. ACM, 2008.
    [24]
    V. Ramasubramanian and E. G. Sirer. Beehive: O(1)lookup performance for power-law query distributions in peer-to-peer overlays. In Proceedings of the 1st Symposium on Networked Systems Design and Implementation, NSDI, Berkeley, CA, USA, 2004.
    [25]
    S. Ratnasamy, P. Francis, M. Handley, R. Karp, and S. Shenker. A scalable content-addressable network. In Proceedings of the 2001 conference on Applications, technologies, architectures, and protocols for computer communications, SIGCOMM, New York, NY, USA, 2001.
    [26]
    Redis. http://redis.io/.
    [27]
    E. G. S. Robert Escriva, Bernard Wong. Hyperdex: A distributed, searchable key-value store for cloud computing. Technical report, Department of Computer Science, Cornell University, Ithaca, New York, December 2011.
    [28]
    A. I. T. Rowstron and P. Druschel. Pastry: Scalable, decentralized object location, and routing for large-scale peer-to-peer systems. In Proceedings of the IFIP/ACM International Conference on Distributed Systems Platforms Heidelberg, Middleware, London, UK, UK, 2001.
    [29]
    M. Satyanarayanan. The evolution of coda. ACM Transactions on Computer Systems, 20(2), May 2002.
    [30]
    B. Shao and H. Wang. Trinity. http://research.microsoft.com/enus/projects/trinity/.
    [31]
    Y. Sovran, R. Power, M. K. Aguilera, and J. Li. Transactional storage for geo-replicated systems. In Proceedings of the 23rd ACM Symposium on Operating Systems Principles, SOSP, New York, NY, USA, 2011.
    [32]
    I. Stoica, R. Morris, D. Karger, M. F. Kaashoek, and H. Balakrishnan. Chord: A scalable peer-to-peer lookup service for internet applications. In Proceedings of the 2001 conference on Applications, technologies, architectures, and protocols for computer communications, SIGCOMM, New York, NY, USA, 2001.
    [33]
    D. B. Terry, M. M. Theimer, K. Petersen, A. J. Demers, M. J. Spreitzer, and C. H. Hauser. Managing update conflicts in Bayou, a weakly connected replicated storage system. In Proceedings of the 15th ACM Symposium on Operating Systems Principles, SOSP, New York, NY, USA, 1995.
    [34]
    The Apache Software Foundation. http://hbase.apache.org, 2010.
    [35]
    W. Vogels. Eventually consistent. Queue, 6(6), Oct. 2008.

    Cited By

    View all
    • (2023)FrozenHot Cache: Rethinking Cache Management for Modern HardwareProceedings of the Eighteenth European Conference on Computer Systems10.1145/3552326.3587446(557-573)Online publication date: 8-May-2023
    • (2021)RAMP-TAOProceedings of the VLDB Endowment10.14778/3476311.347637914:12(3014-3027)Online publication date: 28-Oct-2021
    • (2021)AUTOGRProceedings of the VLDB Endowment10.14778/3461535.346154114:9(1517-1530)Online publication date: 22-Oct-2021
    • Show More Cited By

    Recommendations

    Comments

    Information & Contributors

    Information

    Published In

    cover image Guide Proceedings
    USENIX ATC'13: Proceedings of the 2013 USENIX conference on Annual Technical Conference
    June 2013
    364 pages

    Sponsors

    • VMware
    • Akamai: Akamai
    • Google Inc.
    • EMC2: EMC2
    • Facebook: Facebook

    Publisher

    USENIX Association

    United States

    Publication History

    Published: 26 June 2013

    Qualifiers

    • Article

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)0
    • Downloads (Last 6 weeks)0

    Other Metrics

    Citations

    Cited By

    View all
    • (2023)FrozenHot Cache: Rethinking Cache Management for Modern HardwareProceedings of the Eighteenth European Conference on Computer Systems10.1145/3552326.3587446(557-573)Online publication date: 8-May-2023
    • (2021)RAMP-TAOProceedings of the VLDB Endowment10.14778/3476311.347637914:12(3014-3027)Online publication date: 28-Oct-2021
    • (2021)AUTOGRProceedings of the VLDB Endowment10.14778/3461535.346154114:9(1517-1530)Online publication date: 22-Oct-2021
    • (2021)MonkeyDB: effectively testing correctness under weak isolation levelsProceedings of the ACM on Programming Languages10.1145/34855465:OOPSLA(1-27)Online publication date: 15-Oct-2021
    • (2021)Leveraging NVMe SSDs for Building a Fast, Cost-effective, LSM-tree-based KV StoreACM Transactions on Storage10.1145/348096317:4(1-29)Online publication date: 15-Oct-2021
    • (2021)KangarooProceedings of the ACM SIGOPS 28th Symposium on Operating Systems Principles10.1145/3477132.3483568(243-262)Online publication date: 26-Oct-2021
    • (2021)WukongACM SIGOPS Operating Systems Review10.1145/3469379.346938855:1(77-83)Online publication date: 6-Jun-2021
    • (2021)A Large-scale Analysis of Hundreds of In-memory Key-value Cache Clusters at TwitterACM Transactions on Storage10.1145/346852117:3(1-35)Online publication date: 16-Aug-2021
    • (2021) XStore: Fast RDMA-Based Ordered Key-Value Store Using Remote Learned CacheACM Transactions on Storage10.1145/346852017:3(1-32)Online publication date: 16-Aug-2021
    • (2021)Don't be a blockheadProceedings of the Workshop on Hot Topics in Operating Systems10.1145/3458336.3465300(144-151)Online publication date: 1-Jun-2021
    • Show More Cited By

    View Options

    View options

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media