research-article

Exploring software partitions for fast security processing on a multiprocessor mobile SoC

Published: 01 June 2007
    Abstract

    The functionality of mobile devices, such as cell phones and personal digital assistants (PDAs), has evolved to include various applications where security is a critical concern (secure web transactions, mobile commerce, download and playback of protected audio/video content, connection to corporate private networks, etc.). Security mechanisms (e.g., secure communication protocols) involve cryptographic algorithms and are often computationally intensive, challenging the constrained processing and battery resources of mobile devices. Extensive design effort and aggressive hardware and software optimizations are required to address this challenge. Previous work has addressed the design of hardware architectures (custom accelerators, domain-specific processors, etc.) to accelerate security processing, and many emerging systems-on-chip (SoCs) feature some form of hardware support for security. In this paper, we address the complementary problem of mapping a complex security software library to an SoC platform with security hardware enhancements. We present a systematic methodology for exploring the software architecture for security processing on a commercial heterogeneous multiprocessor SoC for mobile devices. The SoC contains multiple host processors executing applications and a dedicated programmable security processing engine. We developed an exploration methodology to map the code and data of security software libraries onto the platform, with the objective of maximizing the overall application-visible performance.
    The salient features of the methodology include: 1) the use of real performance measurements from a prototyping board, which contains the target platform, to drive the exploration; 2) a new data structure access profiling framework that allows us to accurately model the communication overheads involved in offloading a given set of functions to the security processor; and 3) an exact branch-and-bound-based design space exploration algorithm that determines the best mapping of security library functions and data structures to the host and security processors. We used the proposed framework to map a commercial security library to the target mobile application SoC. The resulting optimized software architecture outperformed several manually designed software architectures, with speed-ups of up to 12.5× for individual cryptographic operations (encryption, hashing) and 2.2-6.2× for applications such as a digital rights management (DRM) agent and a secure sockets layer (SSL) client. We also demonstrate the applicability of our framework to software architecture exploration in other multiprocessor scenarios.
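
    The abstract does not spell out the cost model or the bounding function, so the following is only a minimal sketch of what an exact branch-and-bound mapping search could look like, not the authors' algorithm. It assumes a simplified additive model: each function has a measured cost on the host and on the security processor, and a communication penalty is paid whenever two interacting functions land on different processors. The names `best_mapping`, `exec_cost`, and `comm_cost`, and all numbers, are hypothetical.

```python
from typing import Dict, List, Tuple

HOST, SEC = 0, 1  # illustrative processor labels (assumption, not from the paper)


def best_mapping(
    funcs: List[str],
    exec_cost: Dict[str, Tuple[float, float]],  # name -> (cost on HOST, cost on SEC)
    comm_cost: Dict[Tuple[str, str], float],    # (f, g) -> penalty if f and g are split
) -> Tuple[float, Dict[str, int]]:
    """Exhaustive search with pruning; exact as long as all costs are non-negative."""
    best_cost = float("inf")
    best_assign: Dict[str, int] = {}

    # Optimistic bound for the unassigned suffix: every remaining function
    # runs on its cheaper processor and pays no communication penalty.
    suffix_lb = [0.0] * (len(funcs) + 1)
    for i in range(len(funcs) - 1, -1, -1):
        suffix_lb[i] = suffix_lb[i + 1] + min(exec_cost[funcs[i]])

    def cut_penalty(f: str, proc: int, assign: Dict[str, int]) -> float:
        # Communication paid against already-placed functions on the other processor.
        return sum(
            comm_cost.get((f, g), 0.0) + comm_cost.get((g, f), 0.0)
            for g, p in assign.items()
            if p != proc
        )

    def search(i: int, cost: float, assign: Dict[str, int]) -> None:
        nonlocal best_cost, best_assign
        if cost + suffix_lb[i] >= best_cost:
            return  # bound: this subtree cannot beat the incumbent
        if i == len(funcs):
            best_cost, best_assign = cost, dict(assign)
            return
        f = funcs[i]
        for proc in (HOST, SEC):  # branch: try both placements of f
            step = exec_cost[f][proc] + cut_penalty(f, proc, assign)
            assign[f] = proc
            search(i + 1, cost + step, assign)
            del assign[f]

    search(0, 0.0, {})
    return best_cost, best_assign


# Hypothetical measurements, chosen only to exercise the search.
funcs = ["aes", "sha1", "ssl_record"]
exec_cost = {"aes": (10.0, 2.0), "sha1": (8.0, 3.0), "ssl_record": (4.0, 9.0)}
comm_cost = {("ssl_record", "aes"): 1.0, ("ssl_record", "sha1"): 3.0}
cost, mapping = best_mapping(funcs, exec_cost, comm_cost)
print(cost, mapping)
```

    With these made-up numbers the search offloads the two cryptographic kernels and keeps the record-handling function on the host, because the communication penalty is smaller than the cost of running that function on the security processor. In the methodology described above, the costs would come from board measurements and data structure access profiles rather than hand-picked constants.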


    Cited By

    • (2018) Exploring Partitions Based on Search Space Smoothing for Heterogeneous Multiprocessor System. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, E91-A:9 (2456-2464). DOI: 10.1093/ietfec/e91-a.9.2456. Online publication date: 21-Dec-2018


    Published In

    IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Volume 15, Issue 6
    June 2007
    117 pages

    Publisher

    IEEE Educational Activities Department

    United States


    Author Tags

    1. Embedded processors
    2. performance
    3. security and protection
    4. software partitioning

    Qualifiers

    • Research-article
