- research-article, September 2024
Offloading Datacenter Jobs to RISC-V Hardware for Improved Performance and Power Efficiency
SYSTOR '24: Proceedings of the 17th ACM International Systems and Storage Conference, Pages 39–52, https://doi.org/10.1145/3688351.3689152
The end of Moore's Law has brought significant changes in the architecture of servers used in data centers, increasingly incorporating new ISAs beyond x86-64 as well as diverse accelerators. Further, single-board computers have become increasingly ...
- research-article, January 2024
First Impressions of the NVIDIA Grace CPU Superchip and NVIDIA Grace Hopper Superchip for Scientific Workloads
HPCAsia '24 Workshops: Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region Workshops, Pages 36–44, https://doi.org/10.1145/3636480.3637097
The engineering samples of the NVIDIA Grace CPU Superchip and NVIDIA Grace Hopper Superchips were tested using different benchmarks and scientific applications. The benchmarks include HPCC and HPCG. The real application-based benchmark includes AI-...
- Article, March 2024
Least Information Redundancy Algorithm of Printable Shellcode Encoding for X86
Computer Security. ESORICS 2023 International Workshops, Pages 361–376, https://doi.org/10.1007/978-3-031-54129-2_21
Abstract: Shellcode is a critical element in computer security that exploits vulnerabilities within software systems. Shellcode is written in machine code and often designed to be compact in size, evading detection by security software. Printable shellcode, ...
- research-article, February 2023
Are we ready for broader adoption of ARM in the HPC community: Performance and Energy Efficiency Analysis of Benchmarks and Applications Executed on High-End ARM Systems
- Nikolay A. Simakov,
- Robert L. Deleon,
- Joseph P. White,
- Matthew D. Jones,
- Thomas R. Furlani,
- Eva Siegmann,
- Robert J. Harrison
HPCAsia '23 Workshops: Proceedings of the HPC Asia 2023 Workshops, Pages 78–86, https://doi.org/10.1145/3581576.3581618
A set of benchmarks, including numerical libraries and real-world scientific applications, were run on several modern ARM systems (Amazon Graviton 3/2, Fujitsu A64FX, Ampere Altra, Thunder X2) and compared to x86 systems (Intel and AMD) as well as to ...
- Article, December 2021
At the Bottom of Binary Analysis: Instructions
Abstract: We present here a careful exploration of the set of instructions for the x86 processor architecture. This is a preliminary step towards a systematic comparison of SMT-based retro-engineering tools. The latter arose in the context of binary code ...
- research-article, June 2021
Revamping hardware persistency models: view-based and axiomatic persistency models for Intel-x86 and Armv8
PLDI 2021: Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation, Pages 16–31, https://doi.org/10.1145/3453483.3454027
Non-volatile memory (NVM) is a cutting-edge storage technology that promises the performance of DRAM with the durability of SSD. Recent work has proposed several persistency models for mainstream architectures such as Intel-x86 and Armv8, describing the ...
- research-article, November 2020
RIVER 2.0: an open-source testing framework using AI techniques
LANGETI 2020: Proceedings of the 1st ACM SIGSOFT International Workshop on Languages and Tools for Next-Generation Testing, Pages 13–18, https://doi.org/10.1145/3416504.3424335
This paper presents the latest updates to the RIVER open-source testing platform for x86 programs, focusing on how artificial intelligence (AI) techniques can be used to improve the automated testing processes. It is also important to mention that RIVER ...
- research-article, October 2020
A Retargetable MATLAB-to-C Compiler Exploiting Custom Instructions and Data Parallelism
- Ioannis Latifis,
- Karthick Parashar,
- Grigoris Dimitroulakos,
- Hans Cappelle,
- Christakis Lezos,
- Konstantinos Masselos,
- Francky Catthoor
ACM Transactions on Embedded Computing Systems (TECS), Volume 19, Issue 6, Article No.: 50, Pages 1–27, https://doi.org/10.1145/3391898
This article presents a MATLAB-to-C compiler that exploits custom instructions present in state-of-the-art processor architectures and supports semi-automatic vectorization. A parameterized processor model is used to describe the target instruction set ...
- research-article, June 2020
Benchmarking of state-of-the-art HPC Clusters with a Production CFD Code
PASC '20: Proceedings of the Platform for Advanced Scientific Computing Conference, Article No.: 3, Pages 1–11, https://doi.org/10.1145/3394277.3401847
Computing technologies populating high-performance computing (HPC) clusters are getting more and more diverse, offering a wide range of architectural features. As a consequence, efficient programming of such platforms becomes a complex task. In this ...
- Article, June 2020
Understanding HPC Benchmark Performance on Intel Broadwell and Cascade Lake Processors
Abstract: Hardware platforms in high performance computing are constantly getting more complex to handle even when considering multicore CPUs alone. Numerous features and configuration options in the hardware and the software environment that are relevant ...
BYOC: A "Bring Your Own Core" Framework for Heterogeneous-ISA Research
- Jonathan Balkind,
- Katie Lim,
- Michael Schaffner,
- Fei Gao,
- Grigory Chirkov,
- Ang Li,
- Alexey Lavrov,
- Tri M. Nguyen,
- Yaosheng Fu,
- Florian Zaruba,
- Kunal Gulati,
- Luca Benini,
- David Wentzlaff
ASPLOS '20: Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems, Pages 699–714, https://doi.org/10.1145/3373376.3378479
Heterogeneous architectures and heterogeneous-ISA designs are growing areas of computer architecture and system software research. Unfortunately, this line of research is significantly hindered by the lack of experimental systems and modifiable hardware ...
- research-article, April 2019
uops.info: Characterizing Latency, Throughput, and Port Usage of Instructions on Intel Microarchitectures
ASPLOS '19: Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, Pages 673–686, https://doi.org/10.1145/3297858.3304062
Modern microarchitectures are some of the world's most complex man-made systems. As a consequence, it is increasingly difficult to predict, explain, let alone optimize the performance of software running on such microarchitectures. As a basis for ...
Evaluating Characteristics of CUDA Communication Primitives on High-Bandwidth Interconnects
ICPE '19: Proceedings of the 2019 ACM/SPEC International Conference on Performance Engineering, Pages 209–218, https://doi.org/10.1145/3297663.3310299
Data-intensive applications such as machine learning and analytics have created a demand for faster interconnects to avert the memory bandwidth wall and allow GPUs to be effectively leveraged for lower compute intensity tasks. This has resulted in wide ...
- Article, June 2018
Off-Limits: Abusing Legacy x86 Memory Segmentation to Spy on Enclaved Execution
Abstract: Enclaved execution environments, such as Intel SGX, enable secure, hardware-enforced isolated execution of critical application components without having to trust the underlying operating system or hypervisor. A recent line of research, however, ...
- research-article, March 2018
Statistical Reconstruction of Class Hierarchies in Binaries
ASPLOS '18: Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems, Pages 363–376, https://doi.org/10.1145/3173162.3173202
We address a fundamental problem in reverse engineering of object-oriented code: the reconstruction of a program's class hierarchy from its stripped binary. Existing approaches rely heavily on structural information that is not always available, e.g., ...
Also Published in:
ACM SIGPLAN Notices: Volume 53 Issue 2
- abstract, June 2017
TetrisOS and BreakoutOS: Assembly Language Projects for Computer Organization
ITiCSE '17: Proceedings of the 2017 ACM Conference on Innovation and Technology in Computer Science Education, Pages 88–89, https://doi.org/10.1145/3059009.3072976
TetrisOS and BreakoutOS are projects developed for a sophomore-level computer organization course. Each project teaches a wide range of x86 assembly language topics, including iteration, function calls, data storage, segmentation, communication with ...
- research-article, May 2017
Hardware is the new Software
HotOS '17: Proceedings of the 16th Workshop on Hot Topics in Operating Systems, Pages 132–137, https://doi.org/10.1145/3102980.3103002
Moore's Law may be slowing, but, perhaps as a result, other measures of processor complexity are only accelerating. In recent years, Intel's architects have turned to an alphabet soup of instruction set extensions such as MPX, SGX, MPK, and CET as a way ...
- research-article, March 2017
A mechanism for energy-efficient reuse of decoding and scheduling of x86 instruction streams
Current superscalar x86 processors decompose each CISC instruction (variable-length and with multiple addressing modes) into multiple RISC-like μops at runtime so they can be pipelined and scheduled for concurrent execution. This challenging and power-...
- research-article, June 2016
ARM virtualization: performance and architectural implications
ISCA '16: Proceedings of the 43rd International Symposium on Computer Architecture, Pages 304–316, https://doi.org/10.1109/ISCA.2016.35
ARM servers are becoming increasingly common, making server technologies such as virtualization for ARM of growing importance. We present the first study of ARM virtualization performance on server hardware, including multicore measurements of two ...
Also Published in:
ACM SIGARCH Computer Architecture News: Volume 44 Issue 3
- article, June 2016
Potential analysis of a superscalar core employing a reconfigurable array for improving instruction-level parallelism
Design Automation for Embedded Systems (DAES), Volume 20, Issue 2, Pages 155–169, https://doi.org/10.1007/s10617-016-9174-4
As technology scaling reduces pace and energy efficiency becomes a new important design constraint, superscalar processor designs are reaching their performance limits due to area and power restrictions. As a result, new microarchitectural paradigms ...