
The Instruction-Set Extension Problem: A Survey

Published: 01 May 2011

Abstract

Extending a given instruction set with specialized instructions has become a common technique for speeding up the execution of applications. The computationally intensive portions of an application are identified and partitioned into segments of code that execute in software and segments that execute in hardware, which can speed up the application considerably. Each segment implemented in hardware can then be seen as a specialized, application-specific instruction that extends the given instruction set. Although the literature proposes a number of methodologies for customizing an instruction set, descriptions of the problem consist only of sporadic comparisons limited to isolated problems. This survey presents a unique, detailed description of the problem and an exhaustive overview of the research of the past years in instruction-set extension. It gives a thorough analysis of the issues involved in customizing an instruction set by means of a set of specialized, application-specific instructions. The investigation covers both instruction generation and instruction selection, and different kinds of customization are analyzed in great detail.

[154]
Wolinski, C. and Kuchcinski, K. 2007. Identification of application specific instructions based on sub-graph isomorphism constraints. In Proceedings of the IEEE International Application-specific Systems, Architectures and Processors. 328--333.
[155]
Wolinski, C. and Kuchcinski, K. 2008. Automatic selection of application-specific reconfigurable processor extensions. In Proceedings of the Conference on Design, Automation and Test in Europe (DATE’08). 1214--1219.
[156]
Wong, S., Vassiliadis, S., and Cotofana, S. 2007. Instruction set extension generation with considering physical constraints. In Proceedings of the International Conference on High Performance Embedded Architectures and Compilers. 291--305.
[157]
Ye, Z. A., Moshovos, A., Hauck, S., and Banerjee, P. 2000. CHIMAERA: A high-performance architecture with a tightly-coupled reconfigurable functional unit. In ACM SIGARCH Comput. Archit. News (Special Issue: Proceedings of the 27th annual international symposium on Computer architecture ISCA), 225--235.
[158]
Yu, P. and Mitra, T. 2004. Scalable custom instructions identification for instruction-set extensible processors. In Proceedings of the 2004 International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES’04). 69--78.
[159]
Yu, P. and Mitra, T. 2005. Satisfying real-time constraints with custom instructions. In Proceedings of the 3rd IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS’05). 166--171.
[160]
Yu, P. and Mitra, T. 2007. Disjoint pattern enumeration for custom instructions identification. In Proceedings of the 17th IEEE International Conference on Field Programmable Logic and Applications (FPL’07). Amsterdam, The Netherlands, --.
[161]
Zhao, K., Bian, J., Dong, S., Song, Y., and Goto, S. 2008. Fast custom instruction identification algorithm based on basic convex pattern model for supporting asip automated design. IEICE Trans. Fundam. Electron. Comm. Comput. Sci. E91-A, 6, 1478--1487.

Published In

cover image ACM Transactions on Reconfigurable Technology and Systems
ACM Transactions on Reconfigurable Technology and Systems  Volume 4, Issue 2
May 2011
216 pages
ISSN:1936-7406
EISSN:1936-7414
DOI:10.1145/1968502
Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 May 2011
Accepted: 01 January 2010
Revised: 01 September 2009
Received: 01 May 2008
Published in TRETS Volume 4, Issue 2

Author Tags

  1. HW/SW codesign
  2. Instruction-set
  3. customization
  4. instruction generation
  5. instruction selection
  6. instruction-set extension
  7. reconfigurable architecture

Qualifiers

  • Research-article
  • Research
  • Refereed

Cited By

  • (2024) Using Source-to-Source to Target RISC-V Custom Extensions: UVE Case-Study. Proceedings of the 16th Workshop on Rapid Simulation and Performance Evaluation for Design, 42-50. DOI: 10.1145/3642921.3642930. Online publication date: 18-Jan-2024
  • (2024) Automating application-driven customization of ASIPs. Journal of Systems Architecture: the EUROMICRO Journal 148:C. DOI: 10.1016/j.sysarc.2024.103080. Online publication date: 1-Mar-2024
  • (2023) Evaluation of Special Instruction Implementations in Soft Processors for High-level Synthesis (高位合成可能なソフトプロセッサにおける専用命令実装手法の評価). IEEJ Transactions on Industry Applications 143:2, 94-100. DOI: 10.1541/ieejias.143.94. Online publication date: 1-Feb-2023
  • (2023) Modular Processor Architecture with Cryptography ISA Extensions. 2023 21st IEEE Interregional NEWCAS Conference (NEWCAS), 1-2. DOI: 10.1109/NEWCAS57931.2023.10198046. Online publication date: 26-Jun-2023
  • (2023) Invited Paper: Instruction Set Extensions for Post-Quantum Cryptography. 2023 IEEE/ACM International Conference on Computer Aided Design (ICCAD), 1-6. DOI: 10.1109/ICCAD57390.2023.10323931. Online publication date: 28-Oct-2023
  • (2022) Virtual Prototype driven Design, Implementation and Evaluation of RISC-V Instruction Set Extensions. 2022 25th International Symposium on Design and Diagnostics of Electronic Circuits and Systems (DDECS), 14-19. DOI: 10.1109/DDECS54261.2022.9770108. Online publication date: 6-Apr-2022
  • (2021) A Binary Translation Framework for Automated Hardware Generation. IEEE Micro 41:4, 15-23. DOI: 10.1109/MM.2021.3088670. Online publication date: 1-Jul-2021
  • (2021) A lightweight ISE for ChaCha on RISC-V. 2021 IEEE 32nd International Conference on Application-specific Systems, Architectures and Processors (ASAP), 25-32. DOI: 10.1109/ASAP52443.2021.00011. Online publication date: Jul-2021
  • (2021) Software/Hardware Co-Verification for Custom Instruction Set Processors. IEEE Access 9, 160559-160579. DOI: 10.1109/ACCESS.2021.3131213. Online publication date: 2021
  • (2020) Early Verification of ISA Extension Specifications using Deep Reinforcement Learning. Proceedings of the 2020 on Great Lakes Symposium on VLSI, 297-302. DOI: 10.1145/3386263.3406901. Online publication date: 7-Sep-2020
