Author: Kim, Jangwoo : Search

59 Results for: Author: Kim, Jangwoo

Searched The ACM Guide to Computing Literature (3,766,563 records) | Limit your search to The ACM Full-Text Collection (759,377 records)

Showing 1 - 20 of 59 Results

research-article
Open Access
September 2024
CoolDC: A Cost-Effective Immersion-Cooled Datacenter with Workload-Aware Temperature Scaling
ACM Transactions on Architecture and Code Optimization (TACO), Volume 21, Issue 3, Article No.: 51, Pages 1–27, https://doi.org/10.1145/3664925
For datacenter architects, it is the most important goal to minimize the datacenter’s total cost of ownership for the target performance (i.e., TCO/performance). As the major component of a datacenter is a server farm, the most effective way of reducing ...
Metrics: Total Citations: 0 | Total Downloads: 282 | Last 12 Months: 282 | Last 6 weeks: 110

research-article
Open Access
April 2024
A Fault-Tolerant Million Qubit-Scale Distributed Quantum Computer
ASPLOS '24: Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2, Pages 1–19, https://doi.org/10.1145/3620665.3640388
A million qubit-scale quantum computer is essential to realize the quantum supremacy. Modern large-scale quantum computers integrate multiple quantum computers located in dilution refrigerators (DR) to overcome each DR's unscaling cooling budget. However,...
Metrics: Total Citations: 2 | Total Downloads: 691 | Last 12 Months: 691 | Last 6 weeks: 145

research-article
Open Access
November 2023
Fast, Light-weight, and Accurate Performance Evaluation using Representative Datacenter Behaviors
Middleware '23: Proceedings of the 24th International Middleware Conference, Pages 220–233, https://doi.org/10.1145/3590140.3629117
Datacenters rapidly evolve by adopting new features such as new hardware deployment and software patches. Adopting a new feature requires an accurate evaluation of its impact to minimize the risk to the multi-million dollar computing infrastructure. ...
Metrics: Total Citations: 0 | Total Downloads: 460 | Last 12 Months: 460 | Last 6 weeks: 32

research-article
June 2023
F4T: A Fast and Flexible FPGA-based Full-stack TCP Acceleration Framework
ISCA '23: Proceedings of the 50th Annual International Symposium on Computer Architecture, Article No.: 55, Pages 1–13, https://doi.org/10.1145/3579371.3589090
As complex workloads that run on many servers are pursuing higher networking throughput, more CPU cycles are consumed to support the TCP stack. To mitigate the high CPU burden from executing the compute-intensive TCP, prior works have proposed to ...
Metrics: Total Citations: 1 | Total Downloads: 1,186 | Last 12 Months: 644 | Last 6 weeks: 57

research-article
June 2023
QIsim: Architecting 10+K Qubit QC Interfaces Toward Quantum Supremacy
ISCA '23: Proceedings of the 50th Annual International Symposium on Computer Architecture, Article No.: 1, Pages 1–16, https://doi.org/10.1145/3579371.3589036
A 10+K qubit Quantum-Classical Interface (QCI) is essential to realize the quantum supremacy. However, it is extremely challenging to architect scalable QCIs due to the complex scalability trade-offs regarding operating temperatures, device and wire ...
Metrics: Total Citations: 0 | Total Downloads: 972 | Last 12 Months: 567 | Last 6 weeks: 54

research-article
April 2023
STfusion: Fast and Flexible Multi-NN Execution Using Spatio-Temporal Block Fusion and Memory Management
IEEE Transactions on Computers (ITCO), Volume 72, Issue 4, Pages 1194–1207, https://doi.org/10.1109/TC.2022.3218428
To maximize the cost-effectiveness of neural network (NN) accelerators, architects are actively developing single-chip accelerators which can execute many NNs simultaneously. However, previous approaches fail to achieve full performance potential by ...
Metrics: Total Citations: 0

research-article
Open Access
February 2023
A Fast and Flexible FPGA-based Accelerator for Natural Language Processing Neural Networks
ACM Transactions on Architecture and Code Optimization (TACO), Volume 20, Issue 1, Article No.: 11, Pages 1–24, https://doi.org/10.1145/3564606
Deep neural networks (DNNs) have become key solutions in the natural language processing (NLP) domain. However, the existing accelerators customized for their narrow target models cannot support diverse NLP models. Therefore, naively running complex NLP ...
Metrics: Total Citations: 6 | Total Downloads: 5,695 | Last 12 Months: 3,282 | Last 6 weeks: 339

research-article
December 2023
3D-FPIM: An Extreme Energy-Efficient DNN Acceleration System Using 3D NAND Flash-Based In-Situ PIM Unit
MICRO '22: Proceedings of the 55th Annual IEEE/ACM International Symposium on Microarchitecture, Pages 1359–1376, https://doi.org/10.1109/MICRO56248.2022.00093
The crossbar structure of the nonvolatile memory enables highly parallel and energy-efficient analog matrix-vector-multiply (MVM) operations. To exploit its efficiency, existing works design a mixed-signal deep neural network (DNN) accelerator, which ...
Metrics: Total Citations: 0 | Total Downloads: 37 | Last 12 Months: 37 | Last 6 weeks: 1

research-article
June 2022
XQsim: modeling cross-technology control processors for 10+K qubit quantum computers
ISCA '22: Proceedings of the 49th Annual International Symposium on Computer Architecture, Pages 366–382, https://doi.org/10.1145/3470496.3527417
10+K qubit quantum computer is essential to achieve a true sense of quantum supremacy. With the recent effort towards the large-scale quantum computer, architects have revealed various scalability issues including the constraints in a quantum control ...
Metrics: Total Citations: 2 | Total Downloads: 1,187 | Last 12 Months: 335 | Last 6 weeks: 41

research-article
April 2022
SmartFVM: A Fast, Flexible, and Scalable Hardware-based Virtualization for Commodity Storage Devices
ACM Transactions on Storage (TOS), Volume 18, Issue 2, Article No.: 12, Pages 1–27, https://doi.org/10.1145/3511213
A computational storage device incorporating a computation unit inside or near its storage unit is a highly promising technology to maximize a storage server’s performance. However, to apply such computational storage devices and take their full potential ...
Metrics: Total Citations: 3 | Total Downloads: 921 | Last 12 Months: 206 | Last 6 weeks: 19

research-article
February 2022
CryoWire: wire-driven microarchitecture designs for cryogenic computing
ASPLOS '22: Proceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Pages 903–917, https://doi.org/10.1145/3503222.3507749
Cryogenic computing, which runs a computer device at an extremely low temperature, is promising thanks to its significant reduction of wire resistance as well as leakage current. Recent studies on cryogenic computing have focused on various architectural ...
Metrics: Total Citations: 4 | Total Downloads: 608 | Last 12 Months: 123 | Last 6 weeks: 3

research-article
January 2022
LSim: Fine-Grained Simulation Framework for Large-Scale Performance Evaluation
IEEE Computer Architecture Letters (ICAL), Volume 21, Issue 1, Pages 25–28, https://doi.org/10.1109/LCA.2022.3168831
As large-scale workloads with massive parallelism emerge, the demand for large-scale systems such as datacenters and supercomputers is rising sharply. To accurately design a large-scale system, architects heavily rely on performance modeling at design ...
Metrics: Total Citations: 0

research-article
October 2021
UC-Check: Characterizing Micro-operation Caches in x86 Processors and Implications in Security and Performance
MICRO '21: MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture, Pages 550–564, https://doi.org/10.1145/3466752.3480079
The modern x86 processor (e.g., Intel, AMD) translates CISC-style x86 instructions to RISC-style micro operations (uops) as RISC pipelines are more efficient than CISC pipelines. However, this x86 decoding process requires complex hardware logic (i.e., ...
Metrics: Total Citations: 3 | Total Downloads: 948 | Last 12 Months: 228 | Last 6 weeks: 21

research-article
September 2021
An accurate and fair evaluation methodology for SNN-based inferencing with full-stack hardware design space explorations
Neurocomputing (NEUROC), Volume 455, Issue C, Pages 125–138, https://doi.org/10.1016/j.neucom.2021.05.020
Highlights: Existing SNN evaluations are inaccurate due to limited design point considerations.
Abstract: Artificial Neural Networks (ANNs) achieve high accuracy in various cognitive tasks (i.e., inferences), but often fail to meet power and latency budgets due to intensive computational overheads. To address the challenge, Spiking Neural ...
Metrics: Total Citations: 0

research-article
November 2021
CryoGuard: a near refresh-free robust DRAM design for cryogenic computing
ISCA '21: Proceedings of the 48th Annual International Symposium on Computer Architecture, Pages 637–650, https://doi.org/10.1109/ISCA52012.2021.00056
Cryogenic computing, which runs a computer device at an extremely low temperature, is highly promising thanks to the significant reduction of the wire latency and leakage current. A recently proposed cryogenic DRAM design achieved the promising ...
Metrics: Total Citations: 2 | Total Downloads: 110 | Last 12 Months: 17 | Last 6 weeks: 1

research-article
June 2021
Performance Modeling and Practical Use Cases for Black-Box SSDs
ACM Transactions on Storage (TOS), Volume 17, Issue 2, Article No.: 14, Pages 1–38, https://doi.org/10.1145/3440022
Modern servers are actively deploying Solid-State Drives (SSDs) thanks to their high throughput and low latency. However, current server architects cannot achieve the full performance potential of commodity SSDs, as SSDs are complex devices designed for ...
Metrics: Total Citations: 6 | Total Downloads: 528 | Last 12 Months: 117 | Last 6 weeks: 18

research-article
April 2021
NeuroEngine: a hardware-based event-driven simulation system for advanced brain-inspired computing
ASPLOS '21: Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Pages 975–989, https://doi.org/10.1145/3445814.3446738
Brain-inspired computing aims to understand the cognitive mechanisms of a brain and apply them to advance various areas in computer science. Deep learning is an example to greatly improve the field of pattern recognition and classification by utilizing ...
Metrics: Total Citations: 4 | Total Downloads: 840 | Last 12 Months: 152 | Last 6 weeks: 11

research-article
Free
November 2020
FVM: FPGA-assisted virtual device emulation for fast, scalable, and flexible storage virtualization
OSDI'20: Proceedings of the 14th USENIX Conference on Operating Systems Design and Implementation, Article No.: 54, Pages 955–971
Emerging big-data workloads with massive I/O processing require fast, scalable, and flexible storage virtualization support. Hardware-assisted virtualization can achieve reasonable performance for fast storage devices, but it comes at the expense of ...
Metrics: Total Citations: 1 | Total Downloads: 84 | Last 12 Months: 39 | Last 6 weeks: 13

research-article
November 2020
Scalable multi-FPGA acceleration for large RNNs with full parallelism levels
DAC '20: Proceedings of the 57th ACM/EDAC/IEEE Design Automation Conference, Article No.: 193, Pages 1–6
The increasing size of recurrent neural networks (RNNs) makes it hard to meet the growing demand for real-time AI services. For low-latency RNN serving, FPGA-based accelerators can leverage specialized architectures with optimized dataflow. However, ...
Metrics: Total Citations: 2 | Total Downloads: 60 | Last 12 Months: 8 | Last 6 weeks: 1

research-article
September 2020
A multi-neural network acceleration architecture
ISCA '20: Proceedings of the ACM/IEEE 47th Annual International Symposium on Computer Architecture, Pages 940–953, https://doi.org/10.1109/ISCA45697.2020.00081
A cost-effective multi-tenant neural network execution is becoming one of the most important design goals for modern neural network accelerators. For example, as emerging AI services consist of many heterogeneous neural network executions, a cloud ...
Metrics: Total Citations: 20 | Total Downloads: 518 | Last 12 Months: 82 | Last 6 weeks: 9
