DOI: 10.1145/2588555.2593681
research-article

A software-defined networking based approach for performance management of analytical queries on distributed data stores

Published: 18 June 2014

Abstract

Data analytics applications increasingly access data from distributed data stores, generating a large amount of network traffic. As a result, distributed analytic queries are prone to poor performance when they encounter network contention, which can be quite common in a shared network. Typical distributed query optimizers have no way to solve this problem because they treat the network as a black box: they are unable to monitor it, let alone control it. With the advent of software-defined networking (SDN), we show how SDN can be effectively exploited for performance management of analytical queries in distributed data store environments. More specifically, we present a group of methods that leverage SDN's visibility into and control of the network's state, enabling distributed query processors to achieve performance improvements and differentiation for analytical queries. We demonstrate the effectiveness of the methods through detailed experimental studies on a system running on a software-defined network with commercial switches. To the best of our knowledge, this is the first work to analyze and show the opportunities of SDN for distributed query optimization. It is our hope that this will open up a rich area of research and technology development in distributed data-intensive computing.
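
To make the idea of network visibility concrete, the following is a minimal sketch of how a query processor might use per-path bandwidth figures when choosing which data store to read from. It is an illustration only and not the method described in the paper: the `Replica` type, the `available_mbps` field (standing in for a value an SDN controller could report), and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    """A data store holding a copy of the needed data (hypothetical type)."""
    name: str
    # Assumption: currently available bandwidth on the path to this store,
    # in megabits per second, as an SDN controller might report it.
    available_mbps: float


def transfer_seconds(size_mb: float, available_mbps: float) -> float:
    """Estimate the time to ship size_mb megabytes over the given bandwidth."""
    if available_mbps <= 0:
        return float("inf")  # a fully congested path is never preferred
    return size_mb * 8 / available_mbps  # megabytes -> megabits


def pick_replica(size_mb: float, replicas: list[Replica]) -> Replica:
    """Choose the replica with the lowest estimated transfer time."""
    return min(replicas, key=lambda r: transfer_seconds(size_mb, r.available_mbps))


replicas = [Replica("store-a", 100.0), Replica("store-b", 400.0)]
best = pick_replica(500.0, replicas)
print(best.name, transfer_seconds(500.0, best.available_mbps))  # store-b 10.0
```

An optimizer that treats the network as a black box would have no `available_mbps` value to consult, and would have to fall back on a fixed bandwidth assumption for every path.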

    Published In

    SIGMOD '14: Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data
    June 2014
    1645 pages
    ISBN:9781450323765
    DOI:10.1145/2588555
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. analytical queries
    2. distributed data stores
    3. software-defined networking

    Qualifiers

    • Research-article

    Conference

    SIGMOD/PODS'14
    Acceptance Rates

    SIGMOD '14 Paper Acceptance Rate 107 of 421 submissions, 25%;
    Overall Acceptance Rate 785 of 4,003 submissions, 20%

