Large-scale cluster-based Internet services often host partitioned datasets to provide incremental scalability. The aggregation of results produced from multiple partitions is a fundamental building block for the delivery of these services. This paper presents the design and implementation of a programming primitive -- Data Aggregation Call (DAC) -- to exploit partition parallelism for cluster-based Internet services. A DAC request specifies a local processing operator and a global reduction operator, and it aggregates the local processing results from participating nodes through the global reduction operator. Applications may allow a DAC request to return partial aggregation results as a tradeoff between quality and availability. Our architecture design aims at improving interactive responses with sustained throughput for typical cluster environments where platform heterogeneity and software/hardware failures are common. At the cluster level, our load-adaptive reduction tree construction algorithm balances processing and aggregation load across servers while exploiting partition parallelism. Inside each node, we employ an event-driven thread pool design that prevents slow nodes from adversely affecting system throughput under highly concurrent workloads. We further devise a staged timeout scheme that eagerly prunes slow or unresponsive servers from the reduction tree to meet soft deadlines. We have used the DAC primitive to implement several applications: a search engine document retriever, a parallel protein sequence matcher, and an online parallel facial recognizer. Our experimental and simulation results validate the effectiveness of the proposed optimization techniques for reducing response time, improving throughput, and gracefully handling server unresponsiveness. We also demonstrate the ease of use of the DAC primitive and the scalability of our architecture design.
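The shape of a DAC request, as the abstract describes it, can be illustrated with a minimal single-process sketch in Python. This is not the paper's API: the function name `data_aggregation_call`, its parameters, and the sample partitions are all hypothetical, and a thread pool stands in for the participating cluster nodes. The sketch applies a local operator to every partition, folds the results that arrive before a soft deadline with a reduction operator, and reports how many partitions contributed, so a caller can judge the quality of a partial answer. It uses a flat reduction and a single timeout; the load-adaptive reduction tree and the staged timeout scheme are not modeled.

```python
import heapq
import time
from concurrent.futures import ThreadPoolExecutor, wait
from functools import reduce


def data_aggregation_call(partitions, local_op, reduce_op, deadline_s):
    """Hypothetical DAC-style call (illustration only, not the paper's API).

    Runs local_op on each partition in parallel, then folds the results that
    completed within deadline_s seconds using reduce_op.
    Returns (aggregate, number_of_partitions_included).
    """
    if not partitions:
        return None, 0
    pool = ThreadPoolExecutor(max_workers=len(partitions))
    futures = [pool.submit(local_op, p) for p in partitions]
    done, _ = wait(futures, timeout=deadline_s)
    pool.shutdown(wait=False)  # do not block on stragglers
    # Keep only partitions that finished in time and did not fail.
    partials = [f.result() for f in futures
                if f in done and f.exception() is None]
    if not partials:
        return None, 0
    return reduce(reduce_op, partials), len(partials)


def top2(partition):
    """Local operator: the two best scores in one partition."""
    time.sleep(partition["delay_s"])  # simulated node latency
    return heapq.nlargest(2, partition["scores"])


def merge_top2(a, b):
    """Global reduction operator: merge two top-2 lists into one."""
    return heapq.nlargest(2, a + b)


# Assumed sample data: the third partition simulates a slow node.
PARTITIONS = [
    {"scores": [0.9, 0.1, 0.4], "delay_s": 0.0},
    {"scores": [0.7, 0.8], "delay_s": 0.0},
    {"scores": [0.99], "delay_s": 1.0},
]

if __name__ == "__main__":
    # Tight deadline: the slow partition is left out of the aggregate.
    print(data_aggregation_call(PARTITIONS, top2, merge_top2, 0.3))
    # Generous deadline: all three partitions contribute.
    print(data_aggregation_call(PARTITIONS, top2, merge_top2, 3.0))
```

The reduction operator here is associative and commutative, which is what lets partial results be combined in any order or, in a real deployment, at interior nodes of a reduction tree.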
Proceedings of the ACM SIGPLAN symposium on principles and practice of parallel programming (PPoPP 2003) and workshop on partial evaluation and semantics-based program manipulation (PEPM 2003)