2003 Symposium on VLSI Circuits. Digest of Technical Papers (IEEE Cat. No.03CH37408), 2003
2-4 A Design for Digital, Dynamic Clock Deskew. Charles E. Dike, Nasser A. Kurd, Priyadarsan Patra, Javed Barkatullah. Intel Corporation, Hillsboro, OR 97124 ...
2010 International Symposium on Electronic System Design, 2010
Fast design space exploration of complex nano-CMOS mixed-signal circuits is an important problem. In this paper, a design process flow that uses metamodels is introduced. In this flow the most important task is the sampling of the design space. Different sampling techniques for producing an accurate metamodel are investigated, with the goal of minimizing the number of samples required, using a nano-CMOS ring oscillator (RO) as an example. Through SPICE simulations, it is shown that the parasitics have a drastic effect on performance metrics, such as the frequency of oscillation. Alternative sampling techniques, both random, such as Monte Carlo (MC), and uniform, such as Latin Hypercube Sampling (LHS) and Design of Experiments (DOE), are considered and compared for speed and accuracy. Given the time constraints of the circuit design process, this paper can be used as a guideline for which sampling technique will produce the most accurate result while minimizing the design time. All experimental results are presented for a 45 nm technology.
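The difference between the random and stratified sampling styles named in this abstract can be sketched in a few lines. The following is a minimal illustration using only the Python standard library, not the authors' flow: the function names, the unit-cube formulation, and the `scale` helper are assumptions made for this example, and the bounds in the usage note are arbitrary placeholders.

```python
import random


def monte_carlo(n_samples, n_dims, rng):
    """Plain Monte Carlo: every coordinate is drawn independently from [0, 1)."""
    return [[rng.random() for _ in range(n_dims)] for _ in range(n_samples)]


def latin_hypercube(n_samples, n_dims, rng):
    """Latin Hypercube Sampling on the unit cube.

    Each dimension is cut into n_samples equal strata, and every stratum
    receives exactly one point in every dimension.
    """
    columns = []
    for _ in range(n_dims):
        # One random point inside each stratum [k/n, (k+1)/n) ...
        col = [(k + rng.random()) / n_samples for k in range(n_samples)]
        # ... then shuffle so strata are paired randomly across dimensions.
        rng.shuffle(col)
        columns.append(col)
    # Transpose the per-dimension columns into a list of points.
    return [[columns[d][i] for d in range(n_dims)] for i in range(n_samples)]


def scale(points, bounds):
    """Map unit-cube points onto hypothetical (low, high) parameter ranges."""
    return [[lo + u * (hi - lo) for u, (lo, hi) in zip(p, bounds)]
            for p in points]


def strata_covered(points, dim):
    """Count how many of the n equal strata along `dim` contain a point."""
    n = len(points)
    return len({min(int(p[dim] * n), n - 1) for p in points})
```

With `rng = random.Random(0)`, calling `latin_hypercube(10, 3, rng)` places one point in each of the 10 strata along every axis, whereas `monte_carlo` gives no such guarantee and may leave some strata empty. `scale` then maps the unit-cube points onto whatever design-parameter ranges are of interest before each point is handed to a simulator.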
Symposium on Asynchronous Circuits and Systems, 1997
Asynchronous designs have been touted as having potential advantages in average performance, power consumption, modularity, and tolerance of metastability as compared to traditional synchronous logic. While delay-insensitive (DI) asynchronous circuits are theoretically the most desirable type of asynchronous logic because they make the weakest timing assumptions, the complexity of implementing DI circuits in CMOS or similar
Proceedings of the 2003 International Symposium on Low Power Electronics and Design, 2003. ISLPED '03., 2003
ESTIMA: An Architectural-Level Power Estimator for Multi-Ported Pipelined Register Files. Kavel M. Büyüksahin, Technology CAD, Intel Corp, RA3-254, 2501 NW 229th Ave, Hillsboro, OR 97124, USA, kavel@kavel.org ...
... 6.5.2 Time- and Resource-Constrained Scheduling Algorithms ... The fractions indicate the probabilities that the indicated node will be at logic level 1 ... 3.23 A NAND implementation of a half-adder ... 4.2 Mobile platform power breakdown ...
... Jacob Joern Janneck Ravindra Jejurikar Andrew B. Kahng Seiji Kajihara Timothy Kam Matton Kamon Chandra Kashyap Ryan Kastner ... Sinha Sasa Slijepcevic Justin Sobaje Mani Soma ICCAD-2000 REVIEWERS Fabio Somenzi Mandayam Srivas Ankur Srivatsava Karsten ...
..., and priyadarsan.patra@intel.com. Abstract: The design and optimization complexity of analog/mixed-signal (AMS) components causes a significant increase in the design cycle as the technology progresses towards deep nanoscale. This paper presents a two-tier approach to significantly reduce the design cycle time by combining accurate metamodeling and intelligent optimization. The paper first presents metamodeling, in which a surrogate model stands in for a parasitic-aware SPICE model of the circuit in order to simplify the optimization calculations and minimize the design space exploration time. The paper then introduces the Bee Colony Optimization (BCO) algorithm for nano-CMOS AMS circuit optimization. To the best of the authors' knowledge, this is the first research combining metamodels and BCO for AMS design space exploration. The proposed design optimization flow is used on 5 metamodels with 21 design parameters each, corresponding to 5 distinct Figures of Merit (FoMs) to conduct multi o...
Second ACM/IEEE International Symposium on Networks-on-Chip (nocs 2008), 2008
Abstract: With the continuing scaling of CMOS technologies, process variation is becoming a key factor highly impacting system-level power and temperature. Traditional methods of assuming a uniform temperature and no process variation can lead to gross inaccuracies ...
2009 Digest of Technical Papers International Conference on Consumer Electronics, 2009
We introduce a Net-centric Multimedia Processor (NMP) with built-in Digital Rights Management (DRM) facilities to support internet protocol packet processing and video processing without use of the main CPU. Packet classification and scheduling are the two most computationally intensive operations. In this paper we propose an algorithm and architecture that perform classification and scheduling simultaneously, towards a high-performance and power-efficient realization of the NMP. The architecture is prototyped in VHDL and simulated for power, frequency, logic usage and throughput for 4 different logic families in the Xilinx environment.
2008 IEEE 14th International Symposium on High Performance Computer Architecture, 2008
An important correctness issue for emerging multi/many-core shared memory systems is to ensure that the inter-processor communication through shared memory conforms to the memory ordering rules, as specified by the architecture's memory consistency model (1). This presents a significant validation challenge. Growing system complexity makes it increasingly hard to identify all deep-state logic bugs in pre-silicon verification. Further, aggressive technology
Proceedings. IEEE International Conference on Computer Design: VLSI in Computers and Processors, 2002
System structure and a 2 GHz product application result are described for a domino synthesis capability that covers all aspects of domino design, from estimation to silicon-ready layout, with custom-class optimization. The described optimization flow, abstraction modes, and key cost factors deliver power-optimized, noise-correct domino performance on complex logic.
Abstract: High performance circuit techniques such as domino logic have migrated from the microprocessor world into more mainstream ASIC designs, but domino logic comes at a heavy cost in terms of total power dissipation. A set of results related to automated phase assignment for the synthesis of low-power domino circuits is presented: (1) it is demonstrated that the choice of phase assignment at the primary outputs of a circuit can significantly impact power dissipation in the domino block, and (2) a method to determine a phase assignment that minimises power consumption in the final circuit implementation is proposed. Preliminary experimental results on a mixture of public domain benchmarks and real industry circuits show potential power savings as high as 34% over the minimum area realisation of the logic. Furthermore, the low-power synthesised circuits still meet timing constraints.
Abstract: Digital VLSI design courses are a standard component in most electrical and computer engineering curricula. Electronic Design Automation (EDA) or Computer Aided Design (CAD) tools and frameworks are an integral and indispensable part of such courses. In this paper we present our findings during the preparation and setup of such a course, centered around nanoscale CMOS, standard-cell-based design. Practical issues, such as the choice of licensing model, hardware and software platform selection, point tool identification and deployment, as well as the availability of readily usable standard cell libraries, are discussed. In addition, a design flow that incorporates the tools in a comprehensive framework is presented. A sample syllabus and a suggested teaching methodology are also given.
The semiconductor industry is headed towards a new era of scaling and uncertainty with a new key building block for next-generation chips, the high-κ metal-gate transistor. There is a need for statistical characterization of high-κ metal-gate digital gates as a function of process parameter variations to make them available for designers. In this paper, we present a methodology for PVT-aware high-κ metal-gate logic library creation while considering the variability effect in 15 parameters. First, statistical models for GIDL current (I_GIDL), off-current (I_OFF) and drive current (I_ON) are presented at the device level. This is followed by statistical characterization of logic cells at room temperature. Data for subthreshold current (I_sub), I_GIDL, dynamic current (I_dyn) and delay are presented. This is followed by results for PVT-aware characterization of logic cells. To the best of the authors' knowledge, this is the first research which provides a PVT-aware statistical characterization for high-κ metal-gate nano-CMOS based logic gates.
Papers by Priyadarsan Patra