- research-article, April 2024
HiHGNN: Accelerating HGNNs Through Parallelism and Data Reusability Exploitation
- Runzhen Xue,
- Dengke Han,
- Mingyu Yan,
- Mo Zou,
- Xiaocheng Yang,
- Duo Wang,
- Wenming Li,
- Zhimin Tang,
- John Kim,
- Xiaochun Ye,
- Dongrui Fan
IEEE Transactions on Parallel and Distributed Systems (TPDS), Volume 35, Issue 7, July 2024, Pages 1122–1138
https://doi.org/10.1109/TPDS.2024.3394841
Heterogeneous graph neural networks (HGNNs) have emerged as powerful algorithms for processing heterogeneous graphs (HetGs), widely used in many critical fields. To capture both structural and semantic information in HetGs, HGNNs first aggregate the ...
- research-article, February 2024
Improving Utilization of Dataflow Unit for Multi-Batch Processing
ACM Transactions on Architecture and Code Optimization (TACO), Volume 21, Issue 1, Article No.: 17, Pages 1–26
https://doi.org/10.1145/3637906
Dataflow architectures can achieve much better performance and higher efficiency than general-purpose core, approaching the performance of a specialized design while retaining programmability. However, advanced application scenarios place higher demands ...
- research-article, December 2023
MoDSE: A High-Accurate Multiobjective Design Space Exploration Framework for CPU Microarchitectures
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCADICS), Volume 43, Issue 5, May 2024, Pages 1525–1537
https://doi.org/10.1109/TCAD.2023.3340059
To accelerate time-consuming multiobjective design space exploration of CPU microarchitecture, previous work trains prediction models using a set of performance metrics derived from a few simulations, then predicts the rest. Unfortunately, the low ...
- research-article, October 2023
Accelerating Convolutional Neural Networks by Exploiting the Sparsity of Output Activation
- Zhihua Fan,
- Wenming Li,
- Zhen Wang,
- Tianyu Liu,
- Haibin Wu,
- Yanhuan Liu,
- Meng Wu,
- Xinxin Wu,
- Xiaochun Ye,
- Dongrui Fan,
- Ninghui Sun,
- Xuejun An
IEEE Transactions on Parallel and Distributed Systems (TPDS), Volume 34, Issue 12, Dec. 2023, Pages 3253–3265
https://doi.org/10.1109/TPDS.2023.3324934
Deep Convolutional Neural Networks (CNNs) are the most widely used family of machine learning methods that have had a transformative effect on a wide range of applications. Previous studies have made great breakthroughs in accelerating CNNs, but they only ...
- research-article, August 2023
Characterizing and Understanding Defense Methods for GNNs on GPUs
IEEE Computer Architecture Letters (ICAL), Volume 22, Issue 2, July-Dec. 2023, Pages 137–140
https://doi.org/10.1109/LCA.2023.3304638
Graph neural networks (GNNs) are widely deployed in many vital fields, but suffer from adversarial attacks, which seriously compromise the security in these fields. Plenty of defense methods have been proposed to mitigate the impact of these attacks, ...
- short-paper, June 2023
A High-accurate Multi-objective Ensemble Exploration Framework for Design Space of CPU Microarchitecture
GLSVLSI '23: Proceedings of the Great Lakes Symposium on VLSI 2023, June 2023, Pages 379–383
https://doi.org/10.1145/3583781.3590280
To accelerate the time-consuming multi-objective design space exploration of CPU, previous work trains prediction models using a set of cycle per instruction and power performance metrics derived from a few simulations for sampled design points, then ...
- research-article, June 2023
JRouter: A Multi-Terminal Hierarchical Length-Matching Router under Planar Manhattan Routing Model for RSFQ Circuits
GLSVLSI '23: Proceedings of the Great Lakes Symposium on VLSI 2023, June 2023, Pages 515–520
https://doi.org/10.1145/3583781.3590267
Superconducting rapid single-flux-quantum (RSFQ) logic has shown great potential for high-energy-efficient computing systems. To ensure correct operations at ultra-high frequencies, it is necessary to incorporate length-matching constraints into the ...
- research-article, February 2023
Simple and efficient heterogeneous graph neural network
AAAI'23/IAAI'23/EAAI'23: Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence and Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence and Thirteenth Symposium on Educational Advances in Artificial Intelligence, February 2023, Article No.: 1214, Pages 10816–10824
https://doi.org/10.1609/aaai.v37i9.26283
Heterogeneous graph neural networks (HGNNs) have the powerful capability to embed rich structural and semantic information of a heterogeneous graph into node representations. Existing HGNNs inherit many mechanisms from graph neural networks (GNNs) ...
- Article, January 2023
MatGraph: An Energy-Efficient and Flexible CGRA Engine for Matrix-Based Graph Analytics
Algorithms and Architectures for Parallel Processing, Oct 2022, Pages 351–372
https://doi.org/10.1007/978-3-031-22677-9_19
Graph analytics is increasingly important for solving problems in various fields. Matrix-based graph analytics has obtained much attention due to its high performance and ease of optimization. In the general architecture, due to the extremely high ...
- Article, March 2023
GNNSampler: Bridging the Gap Between Sampling Algorithms of GNN and Hardware
Machine Learning and Knowledge Discovery in Databases, Sep 2022, Pages 498–514
https://doi.org/10.1007/978-3-031-26419-1_30
Sampling is a critical operation in Graph Neural Network (GNN) training that helps reduce the cost. Previous literature has explored improving sampling algorithms via mathematical and statistical methods. However, there is a gap between sampling ...
- research-article, July 2022
Domain adaptive person re-identification with memory-based circular ranking
Applied Intelligence (KLU-APIN), Volume 53, Issue 6, Mar 2023, Pages 7007–7021
https://doi.org/10.1007/s10489-022-03602-4
Despite the impressive achievement in supervised person re-identification (re-id), existing supervised approaches mainly focus on exhaustive identity annotations of each image. Their performance will degrade significantly when the test dataset’s ...
- research-article, August 2022
Alleviating datapath conflicts and design centralization in graph analytics acceleration
DAC '22: Proceedings of the 59th ACM/IEEE Design Automation Conference, July 2022, Pages 901–906
https://doi.org/10.1145/3489517.3530524
Previous graph analytics accelerators have achieved great improvement on throughput by alleviating irregular off-chip memory accesses. However, on-chip side datapath conflicts and design centralization have become the critical issues hindering further ...
- research-article, July 2022
Characterization and Implementation of Radar System Applications on a Reconfigurable Dataflow Architecture
- Yinshen Wang,
- Wenming Li,
- Tianyu Liu,
- Liangjiang Zhou,
- Bingnan Wang,
- Zhihua Fan,
- Xiaochun Ye,
- Dongrui Fan,
- Chibiao Ding
IEEE Computer Architecture Letters (ICAL), Volume 21, Issue 2, July-Dec. 2022, Pages 121–124
https://doi.org/10.1109/LCA.2022.3215595
The fast developed and widely used radar system techniques call for novel solutions on hardware design. Under a massive data source and the real-time requirements in radar signal processing scenarios (e.g., Synthetic Aperture Radar (SAR)), reconfigurable ...
- research-article, July 2022
Characterizing and Understanding HGNNs on GPUs
IEEE Computer Architecture Letters (ICAL), Volume 21, Issue 2, July-Dec. 2022, Pages 69–72
https://doi.org/10.1109/LCA.2022.3198281
Heterogeneous graph neural networks (HGNNs) deliver powerful capacity in heterogeneous graph representation learning. The execution of HGNNs is usually accelerated by GPUs. Therefore, characterizing and understanding the execution pattern of HGNNs on GPUs ...
- research-article, July 2022
A synergistic reinforcement learning-based framework design in driving automation
Computers and Electrical Engineering (CENG), Volume 101, Issue C, Jul 2022
https://doi.org/10.1016/j.compeleceng.2022.107989
Autonomous driving, which integrates artificial intelligence and the Internet of Things, has piqued the interest of both academics and industry because of its economic and societal benefits. Rigorous accuracy and latency requirements ...
- research-article, July 2022
Accelerating Data Transfer in Dataflow Architectures Through a Look-Ahead Acknowledgment Mechanism
Journal of Computer Science and Technology (JCST), Volume 37, Issue 4, Jul 2022, Pages 942–959
https://doi.org/10.1007/s11390-020-0555-6
The dataflow architecture, which is characterized by a lack of a redundant unified control logic, has been shown to have an advantage over the control-flow architecture as it improves the computational performance and power efficiency, especially ...
- short-paper, June 2022
HetGraph: A High Performance CPU-CGRA Architecture for Matrix-based Graph Analytics
GLSVLSI '22: Proceedings of the Great Lakes Symposium on VLSI 2022, June 2022, Pages 387–391
https://doi.org/10.1145/3526241.3530382
In this paper, we explore graph analytics on a heterogeneous platform named HetGraph integrating with CPU and a flexible CGRA accelerator called RFU for matrix-based paradigm in this paper. RFU utilizes the lightweight pipeline without data hazards to ...