K2: a mobile operating system for heterogeneous coherence domains

Published: 24 February 2014

Abstract

Mobile systems-on-chip (SoCs) that incorporate heterogeneous coherence domains promise high energy efficiency for a wide range of mobile applications, yet they are difficult to program. To exploit the architecture, a desirable yet missing capability is to replicate operating system (OS) services over multiple coherence domains with minimal inter-domain communication. In designing such an OS, we set three goals: to ease application development, to simplify OS engineering, and to preserve current OS performance. To this end, we identify a shared-most OS model for multiple coherence domains: it creates per-domain instances of core OS services with no shared state, while enabling the other, extended OS services to share state across domains. To test the model, we build K2, a prototype OS on the TI OMAP4 SoC, reusing most of the Linux 3.4 source. K2 presents a single system image to applications, with its two kernels running on top of the two coherence domains of OMAP4. The two kernels have independent instances of core OS services, such as the page allocator and interrupt management, which K2 coordinates; they share most extended OS services, such as device drivers, whose state K2 keeps coherent transparently. Despite platform constraints and unoptimized code, K2 improves energy efficiency for light OS workloads by 8x-10x, while incurring less than 6% performance overhead for a device driver shared between kernels. Our experience with K2 shows that the shared-most model is promising.
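
The split that the shared-most model draws between core and extended services can be illustrated with a toy sketch. It is not K2's code and does not reproduce K2's coherence mechanism: the domain names `strong` and `weak`, the `dma_channel` key, and the ownership-transfer scheme are all assumptions made for illustration. Each kernel gets a private page allocator (a core service with no shared state), while both kernels reference one driver state object (an extended service) and pull ownership of it before each access.

```python
from dataclasses import dataclass, field


@dataclass
class PageAllocator:
    """Core service: one private instance per domain, no shared state."""
    free_pages: list

    def alloc(self):
        # Hand out the last free page of this domain's private pool.
        return self.free_pages.pop()


@dataclass
class SharedDriverState:
    """Extended service state: a single copy referenced by both kernels."""
    owner: str                      # domain currently holding a valid copy
    data: dict = field(default_factory=dict)
    transfers: int = 0              # inter-domain ownership transfers so far


class Kernel:
    def __init__(self, domain, pages, driver_state):
        self.domain = domain
        self.allocator = PageAllocator(list(pages))  # private to this kernel
        self.driver_state = driver_state             # shared with the peer

    def _acquire(self):
        # Hypothetical coherence step: take ownership before any access.
        if self.driver_state.owner != self.domain:
            self.driver_state.owner = self.domain
            self.driver_state.transfers += 1

    def driver_write(self, key, value):
        self._acquire()
        self.driver_state.data[key] = value

    def driver_read(self, key):
        self._acquire()
        return self.driver_state.data[key]


# Two kernels, one per (hypothetically named) coherence domain.
shared = SharedDriverState(owner="strong")
strong = Kernel("strong", range(0, 4), shared)
weak = Kernel("weak", range(4, 8), shared)

strong.driver_write("dma_channel", 3)                 # owner already "strong"
value_seen_by_weak = weak.driver_read("dma_channel")  # triggers one transfer
page_strong = strong.allocator.alloc()
page_weak = weak.allocator.alloc()
```

In this sketch, allocations never touch the peer domain, whereas driver accesses cost an inter-domain transfer only when the accessing kernel is not the current owner. That is one simple way to picture why keeping core services private helps limit inter-domain communication.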

Published In

ACM SIGARCH Computer Architecture News, Volume 42, Issue 1
ASPLOS '14
March 2014
729 pages
ISSN: 0163-5964
DOI: 10.1145/2654822

ASPLOS '14: Proceedings of the 19th international conference on Architectural support for programming languages and operating systems
February 2014
780 pages
ISBN: 9781450323055
DOI: 10.1145/2541940

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. coherence domains
  2. energy efficiency
  3. heterogeneous architecture
  4. mobile

Qualifiers

  • Research-article
