DOI: 10.1145/3230543.3230569

Sincronia: near-optimal network design for coflows

Published: 07 August 2018

    Abstract

    We present Sincronia, a near-optimal network design for coflows that can be implemented on top of any transport layer (for flows) that supports priority scheduling. Sincronia achieves this using a key technical result: we show that, given a "right" ordering of coflows, any per-flow rate allocation mechanism achieves average coflow completion time within 4X of the optimal as long as (co)flows are prioritized with respect to the ordering.
    Sincronia uses a simple greedy mechanism to periodically order all unfinished coflows; each host sets priorities for its flows using the corresponding coflow order and offloads flow scheduling and rate allocation to the underlying priority-enabled transport layer. We evaluate Sincronia over a real testbed comprising 16 servers and commodity switches, and using simulations across a variety of workloads. Evaluation results suggest that Sincronia not only admits a practical, near-optimal design but also improves upon state-of-the-art network designs for coflows (sometimes by as much as 8X).
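
    The per-host step described above (turning a coflow order into flow priorities) might be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the coflow ordering is taken as a given input (the greedy ordering mechanism itself is not reproduced here), and the names `Flow`, `flow_priorities`, `host_view`, and the `num_levels` cap on distinct priority levels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    """One flow of a coflow (hypothetical record, for illustration only)."""
    coflow_id: str
    src: str
    dst: str
    size_bytes: int


def flow_priorities(ordered_coflow_ids, flows, num_levels=8):
    """Give every flow the priority of its coflow's rank in the ordering.

    Level 0 is the highest priority. `num_levels` is an assumed cap on the
    number of distinct levels the transport exposes; coflows ranked beyond
    the cap share the lowest level.
    """
    rank = {cid: i for i, cid in enumerate(ordered_coflow_ids)}
    return {f: min(rank[f.coflow_id], num_levels - 1) for f in flows}


def host_view(priorities, host):
    """Restrict the priority map to the flows that `host` sends."""
    return {f: p for f, p in priorities.items() if f.src == host}


if __name__ == "__main__":
    flows = [
        Flow("A", "h1", "h2", 100),
        Flow("B", "h1", "h3", 50),
        Flow("C", "h2", "h3", 10),
    ]
    # Assumed ordering: coflow B first, then A, then C.
    prio = flow_priorities(["B", "A", "C"], flows, num_levels=2)
    for f, p in host_view(prio, "h1").items():
        print(f.coflow_id, f.dst, p)
```

    In this sketch all flows of a coflow share one priority, so rate allocation among them is left entirely to the underlying transport, which mirrors the division of labor the abstract describes.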

    Published In

    SIGCOMM '18: Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication
    August 2018
    604 pages
    ISBN:9781450355674
    DOI:10.1145/3230543
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. approximation algorithms
    2. coflow
    3. datacenter networks

    Qualifiers

    • Research-article

    Conference

    SIGCOMM '18: ACM SIGCOMM 2018 Conference
    August 20 - 25, 2018
    Budapest, Hungary

    Acceptance Rates

    Overall Acceptance Rate 554 of 3,547 submissions, 16%

    Article Metrics

    • Downloads (last 12 months): 172
    • Downloads (last 6 weeks): 22

    Cited By

    • (2024) Impossibility Results for Data-Center Routing with Congestion Control and Unsplittable Flows. Proceedings of the 43rd ACM Symposium on Principles of Distributed Computing (358-368). DOI: 10.1145/3662158.3662777. Online publication date: 17-Jun-2024.
    • (2024) Scheduling Coflows in Hybrid Optical-Circuit and Electrical-Packet Switches With Performance Guarantee. IEEE/ACM Transactions on Networking 32:3 (2299-2314). DOI: 10.1109/TNET.2024.3354245. Online publication date: Jun-2024.
    • (2024) Weighted Scheduling of Time-Sensitive Coflows. IEEE Transactions on Cloud Computing 12:2 (644-658). DOI: 10.1109/TCC.2024.3384514. Online publication date: Apr-2024.
    • (2024) Efficient Approximation Algorithms for Scheduling Coflows With Total Weighted Completion Time in Identical Parallel Networks. IEEE Transactions on Cloud Computing 12:1 (116-129). DOI: 10.1109/TCC.2023.3340729. Online publication date: Jan-2024.
    • (2024) SARS: Towards minimizing average Coflow Completion Time in MapReduce systems. Computer Networks 247 (110429). DOI: 10.1016/j.comnet.2024.110429. Online publication date: Jun-2024.
    • (2023) Saba: Rethinking Datacenter Network Allocation from Application's Perspective. Proceedings of the Eighteenth European Conference on Computer Systems (623-638). DOI: 10.1145/3552326.3587450. Online publication date: 8-May-2023.
    • (2023) Recent Advances in Data Intensive Applications: Survey. 2023 10th International Conference on Wireless Networks and Mobile Communications (WINCOM) (1-6). DOI: 10.1109/WINCOM59760.2023.10322920. Online publication date: 26-Oct-2023.
    • (2023) Consistent Low Latency Scheduler for Distributed Key-Value Stores. IEEE Transactions on Parallel and Distributed Systems 34:12 (3012-3027). DOI: 10.1109/TPDS.2023.3315777. Online publication date: Dec-2023.
    • (2023) Scheduling Coflows by Online Identification in Data Center Network. IEEE Transactions on Emerging Topics in Computing 11:4 (1057-1069). DOI: 10.1109/TETC.2023.3315512. Online publication date: Oct-2023.
    • (2023) Bottleneck-Aware Non-Clairvoyant Coflow Scheduling With Fai. IEEE Transactions on Cloud Computing 11:1 (1011-1025). DOI: 10.1109/TCC.2021.3128360. Online publication date: 1-Jan-2023.
