The memory hierarchy of high-performance and embedded processors has been shown to be one of the major energy consumers. For example, the Level-1 (L1) instruction cache (I-Cache) of the StrongARM processor accounts for 27% of the power... more
The memory hierarchy of high-performance and embedded processors has been shown to be one of the major energy consumers. For example, the Level-1 (L1) instruction cache (I-Cache) of the StrongARM processor accounts for 27% of the power dissipation of the whole chip, whereas the instruction fetch unit (IFU) and the I-Cache of Intel's Pentium Pro processor are the single most important power consuming modules with 14% of the total power dissipation [2]. Extrapolating current trends, this portion is likely to increase in the near future, since the devices devoted to the caches occupy an increasingly larger percentage of the total area of the chip. In this paper, we propose a technique that uses an additional mini cache, the LO-Cache, located between the I-Cache and the CPU core. This mechanism can provide the instruction stream to the data path and, when managed properly, it can effectively eliminate the need for high utilization of the more expensive I-Cache. We propose, implement, and evaluate five techniques for dynamic analysis of the program instruction access behavior, which is then used to proactively guide the access of the LO-Cache. The basic idea is that only the most frequently executed portions of the code should be stored in the LO-Cache since this is where the program spends most of its time. We present experimental results to evaluate the effectiveness of our scheme in terms of performance and energy dissipation for a series of SPEC95 benchmarks. We also discuss the performance and energy tradeoffs that are involved in these dynamic schemes. Results for these benchmarks indicate that more than 60% of the dissipated energy in the I-Cache subsystem can be saved
Research Interests: Computer Science, Distributed Computing, Computer Hardware, Energy Management, Low Power Design, and 15 moreMemory, Dynamic Analysis, Cache Memory, Embedded processor, Low Power Electronics, Energy Dissipation, High performance, Chip, Level, Instruction Sets, Electrical And Electronic Engineering, Power Dissipation, Integrated Circuit Design, cache management, and Instruction cache
Next generation video standards have strict and increasing performance demands due to real-time requirements and the trend towards higher frame resolutions and bit rates. Leveraging the advantages of reconfigurable logic and emerging... more
Next generation video standards have strict and increasing performance demands due to real-time requirements and the trend towards higher frame resolutions and bit rates. Leveraging the advantages of reconfigurable logic and emerging multi-core processor architectures to exploit all levels of parallelism of such workloads is necessary to achieve real time functionality at a reasonable cost.
Research Interests:
Wide-angle (fisheye) lenses are often used in virtual reality and computer vision applications to widen the field of view of conventional cameras. Those lenses, however, distort images. For most real-world applications the video stream... more
Wide-angle (fisheye) lenses are often used in virtual reality and computer vision applications to widen the field of view of conventional cameras. Those lenses, however, distort images. For most real-world applications the video stream needs to be transformed, at real-time (20 frames/sec or better), back to the natural-looking, central perspective space. This paper presents the implementation, optimization and characterization of a fisheye lens distortion correction application on three platforms: a conventional, homogeneous ...