research-article

A survey of pipelined workflow scheduling: Models and algorithms

Authors:

Ümit V. Çatalyürek,

Erik Saule

ACM Computing Surveys (CSUR), Volume 45, Issue 4

Article No.: 50, Pages 1 - 36

https://doi.org/10.1145/2501654.2501664

Published: 30 August 2013

Abstract

A large class of applications need to execute the same workflow on different datasets of identical size. Efficient execution of such applications necessitates intelligent distribution of the application components and tasks on a parallel machine, and the execution can be orchestrated by utilizing task, data, pipelined, and/or replicated parallelism. The scheduling problem that encompasses all of these techniques is called pipelined workflow scheduling, and it has been widely studied in the last decade. Multiple models and algorithms have flourished to tackle various programming paradigms, constraints, machine behaviors, or optimization goals. This article surveys the field by summing up and structuring known results and approaches.

Index Terms

A survey of pipelined workflow scheduling: Models and algorithms
1. Computer systems organization
  1. Architectures
    1. Parallel architectures
2. Theory of computation
  1. Design and analysis of algorithms

Information & Contributors

Information

Published In

ACM Computing Surveys Volume 45, Issue 4

August 2013

490 pages

ISSN:0360-0300

EISSN:1557-7341

DOI:10.1145/2501654

Copyright © 2013 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 30 August 2013

Accepted: 01 August 2012

Revised: 01 September 2011

Received: 01 June 2010

Qualifiers

Research-article
Research
Refereed
