DOI: 10.1145/1542452.1542464
research-article

Push-assisted migration of real-time tasks in multi-core processors

Published: 19 June 2009

Abstract

Multicores are becoming ubiquitous, not only in general-purpose but also in embedded computing. This trend reflects the steadily increasing demands that contemporary embedded applications place on processing power. On such platforms, predicting timing behavior to ensure that the deadlines of real-time tasks can be met is becoming increasingly difficult. While real-time multicore scheduling approaches help to assure deadlines based on firm theoretical properties, their reliance on task migration poses a significant challenge to timing predictability in practice. Task migration (a) reduces timing predictability for contemporary multicores due to cache warm-up overheads and (b) increases traffic on the network-on-chip (NoC) interconnect.
This paper puts forth a fundamentally new approach to increase the timing predictability of multicore architectures aimed at task migration in embedded environments. A task migration between two cores imposes cache warm-up overheads on the migration target, which can lead to missed deadlines for tight real-time schedules. We propose novel micro-architectural support to migrate cache lines. Our scheme shows dramatically increased predictability in the presence of cross-core migration.
Experimental results for schedules demonstrate that our scheme enables real-time tasks to meet their deadlines in the presence of task migration. Our results illustrate that increases in execution time due to migration are reduced by our scheme to levels that may prevent deadline misses of real-time tasks that would otherwise occur. Our mechanism imposes an overhead that is a fraction of the task's execution time, yet this overhead can be steered to fill idle slots in the schedule, i.e., it does not contribute to the execution time of the migrated task. Overall, our novel migration scheme provides a unique mechanism capable of significantly increasing timing predictability in the wake of task migration.
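The effect described in the abstract can be illustrated with a toy model. The sketch below is not the paper's mechanism or its evaluation setup. It only assumes that each working-set cache line missing on the target core costs one cold miss, and that lines pushed ahead of time (for example during an idle slot) cost the migrated task nothing. The function names and all numbers are hypothetical.

```python
def migrated_execution_time(base_cycles, working_set_lines, miss_penalty,
                            pushed_lines=0):
    """Execution time of the first job on the target core after a migration.

    Toy model (an assumption, not the paper's analysis): every working-set
    line that was not pushed to the target beforehand costs one cold miss.
    """
    if not 0 <= pushed_lines <= working_set_lines:
        raise ValueError("pushed_lines must be between 0 and working_set_lines")
    cold_misses = working_set_lines - pushed_lines
    return base_cycles + cold_misses * miss_penalty


def meets_deadline(exec_cycles, deadline_cycles):
    """A job meets its deadline if it finishes within the available cycles."""
    return exec_cycles <= deadline_cycles


# Hypothetical numbers chosen only for illustration.
BASE = 10_000      # cycles with a warm cache
LINES = 256        # cache lines in the task's working set
PENALTY = 20       # cycles per cold miss on the target core
DEADLINE = 12_000  # cycles available to the job

# Plain migration: the whole working set is cold on the target core.
cold = migrated_execution_time(BASE, LINES, PENALTY)

# Push-assisted migration: all lines were pushed before the job resumes.
pushed = migrated_execution_time(BASE, LINES, PENALTY, pushed_lines=LINES)

print(cold, meets_deadline(cold, DEADLINE))
print(pushed, meets_deadline(pushed, DEADLINE))
```

In this model the push traffic itself is not charged to the task, which mirrors the abstract's statement that the overhead can be placed in idle slots of the schedule. A more faithful model would also account for the NoC cost of the push.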

Information & Contributors

Information

Published In

LCTES '09: Proceedings of the 2009 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems
June 2009
188 pages
ISBN: 9781605583563
DOI: 10.1145/1542452
  • ACM SIGPLAN Notices, Volume 44, Issue 7
    LCTES '09
    July 2009
    176 pages
    ISSN: 0362-1340
    EISSN: 1558-1160
    DOI: 10.1145/1543136
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. multi-core architectures
  2. real-time systems
  3. task migration
  4. timing analysis

Qualifiers

  • Research-article

Acceptance Rates

LCTES '09 Paper Acceptance Rate 18 of 81 submissions, 22%;
Overall Acceptance Rate 116 of 438 submissions, 26%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 9
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 22 Sep 2024

Citations

Cited By

  • (2022) Terminator: A Secure Coprocessor to Accelerate Real-Time AntiViruses Using Inspection Breakpoints. ACM Transactions on Privacy and Security, 25:2 (1-34). DOI: 10.1145/3494535. Online publication date: 4-Mar-2022
  • (2016) Global Scheduling Not Required: Simple, Near-Optimal Multiprocessor Real-Time Scheduling with Semi-Partitioned Reservations. 2016 IEEE Real-Time Systems Symposium (RTSS) (99-110). DOI: 10.1109/RTSS.2016.019. Online publication date: Nov-2016
  • (2016) Performance/energy aware task migration algorithm for many-core chips. IET Computers & Digital Techniques, 10:4 (165-173). DOI: 10.1049/iet-cdt.2015.0131. Online publication date: 1-Jul-2016
  • (2015) Hard Real-Time scheduling on a multicore platform. 2015 Annual IEEE Systems Conference (SysCon) Proceedings (324-331). DOI: 10.1109/SYSCON.2015.7116771. Online publication date: Apr-2015
  • (2015) Hardware task migration module for improved fault tolerance and predictability. 2015 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS) (197-202). DOI: 10.1109/SAMOS.2015.7363676. Online publication date: Jul-2015
  • (2015) Architecture aware semi partitioned real-time scheduling on multicore platforms. Real-Time Systems, 51:3 (274-313). DOI: 10.1007/s11241-015-9221-4. Online publication date: 1-Jun-2015
  • (2014) Network-on-Chip aware scheduling of hard-real-time tasks. Proceedings of the 9th IEEE International Symposium on Industrial Embedded Systems (SIES 2014) (141-150). DOI: 10.1109/SIES.2014.6871198. Online publication date: Jun-2014
  • (2014) Interference-aware fixed-priority schedulability analysis on multiprocessors. Real-Time Systems, 50:4 (411-455). DOI: 10.1007/s11241-013-9198-9. Online publication date: 1-Jul-2014
  • (2012) Packet Triggered Prediction Based Task Migration for Network-on-Chip. 2012 20th Euromicro International Conference on Parallel, Distributed and Network-based Processing (491-498). DOI: 10.1109/PDP.2012.37. Online publication date: Feb-2012
  • (2012) Semi-Partitioned Hard-Real-Time Scheduling under Locked Cache Migration in Multicore Systems. Proceedings of the 2012 24th Euromicro Conference on Real-Time Systems (331-340). DOI: 10.1109/ECRTS.2012.27. Online publication date: 11-Jul-2012
