- research-article, December 2024
Enhancing HLS Performance Prediction on FPGAs Through Multimodal Representation Learning
IEEE Embedded Systems Letters (IESL), Volume 16, Issue 4, Pages 385–388
https://doi.org/10.1109/LES.2024.3446797
The emergence of design space exploration (DSE) technology has reduced the cost of searching for pragma configurations that lead to optimal performance microarchitecture. However, obtaining synthesis reports for a single design candidate can be time-...
- research-article, December 2024
Advancing tracking-by-detection with MultiMap: Towards occlusion-resilient online multiclass strawberry counting
Expert Systems with Applications: An International Journal (EXWA), Volume 255, Issue PB
https://doi.org/10.1016/j.eswa.2024.124587
Abstract: Despite the economic importance and research relevance of strawberries, advances in agricultural engineering for this crop have been hampered by pervasive occlusion challenges. Accurate fruit counting is crucial for both yield prediction and ...
Highlights
- Automated occlusion-robust counting for accurate strawberry yield assessment.
- Enhanced strawberry detection with improved YOLOv5s and attention.
- Developed the MultiMap algorithm for precise counting from tracking results.
- ...
- Article, November 2024
MFNAS: Multi-fidelity Exploration in Neural Architecture Search with Stable Zero-Shot Proxy
PRICAI 2024: Trends in Artificial Intelligence, Pages 348–360
https://doi.org/10.1007/978-981-96-0116-5_29
Abstract: Neural architecture search (NAS) automates the design of neural networks for specific tasks. Recently, zero-shot NAS has attracted much attention. Unlike traditional NAS, which relies on training to rank architectures, zero-shot NAS uses gradients ...
- research-article, November 2024
NebulaFL: Self-Organizing Efficient Multilayer Federated Learning Framework With Adaptive Load Tuning in Heterogeneous Edge Systems
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCADICS), Volume 43, Issue 11, Pages 3358–3369
https://doi.org/10.1109/TCAD.2024.3443715
As a promising edge intelligence technology, federated learning (FL) enables Internet of Things (IoT) devices to train the models collaboratively while ensuring the data privacy and security. Recently, hierarchical FL (HFL) has been designed to promote ...
- research-article, November 2024
Arch2End: Two-Stage Unified System-Level Modeling for Heterogeneous Intelligent Devices
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCADICS), Volume 43, Issue 11, Pages 4154–4165
https://doi.org/10.1109/TCAD.2024.3443706
The surge in intelligent edge computing has propelled the adoption and expansion of the distributed embedded systems (DESs). Numerous scheduling strategies are introduced to improve the DES throughput, such as latency-aware and group-based hierarchical ...
- research-article, November 2024
FlexBCM: Hybrid Block-Circulant Neural Network and Accelerator Co-Search on FPGAs
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCADICS), Volume 43, Issue 11, Pages 3852–3863
https://doi.org/10.1109/TCAD.2024.3439488
Block-circulant matrix (BCM) compression has garnered much attention in the hardware acceleration of convolutional neural networks (CNNs) due to its regularity and efficiency. However, constrained by the difficulty of exploring the compression parameter ...
- research-article, October 2024
Unleashing Network/Accelerator Co-Exploration Potential on FPGAs: A Deeper Joint Search
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCADICS), Volume 43, Issue 10, Pages 3041–3054
https://doi.org/10.1109/TCAD.2024.3391688
Recently, algorithm-hardware (HW) co-exploration for neural networks (NNs) has become the key to obtaining high-quality solutions. However, previous efforts for field-programmable gate arrays (FPGAs) focus on neural architecture search (NAS) while lacking ...
- research-article, November 2024
PowerLens: An Adaptive DVFS Framework for Optimizing Energy Efficiency in Deep Neural Networks
DAC '24: Proceedings of the 61st ACM/IEEE Design Automation Conference, Article No.: 228, Pages 1–6
https://doi.org/10.1145/3649329.3655956
To address the power management challenges in deep neural networks (DNNs), dynamic voltage and frequency scaling (DVFS) technology is garnering attention for its ability to enhance energy efficiency without modifying the structure of DNNs. However, ...
- short-paper, June 2024
Enhancing Long Sequence Input Processing in FPGA-Based Transformer Accelerators through Attention Fusion
GLSVLSI '24: Proceedings of the Great Lakes Symposium on VLSI 2024, Pages 599–603
https://doi.org/10.1145/3649476.3658810
Attention-based transformers have achieved significant performance breakthroughs in natural language processing (NLP) and computer vision (CV) tasks. Meanwhile, the ever-increasing length of today’s input sequences puts much pressure on computing ...
- research-article, June 2024
FedStar: Efficient Federated Learning on Heterogeneous Communication Networks
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCADICS), Volume 43, Issue 6, Pages 1848–1861
https://doi.org/10.1109/TCAD.2023.3346274
The proliferation of multimedia applications and increased computing power of mobile devices have led to the development of personalized artificial intelligent (AI) applications that utilize the massive user-information residing on them. However, the ...
- research-article, March 2024
Enhancing Graph Random Walk Acceleration via Efficient Dataflow and Hybrid Memory Architecture
IEEE Transactions on Computers (ITCO), Volume 73, Issue 3, Pages 887–901
https://doi.org/10.1109/TC.2023.3347674
Graph random walk sampling is becoming increasingly important with the widespread popularity of graph applications. It aims to capture the desirable graph properties by launching multiple walkers to collect feature paths. However, previous research ...
- research-article, February 2024
Emergent communication for numerical concepts generalization
- Enshuai Zhou,
- Yifan Hao,
- Rui Zhang,
- Yuxuan Guo,
- Zidong Du,
- Xishan Zhang,
- Xinkai Song,
- Chao Wang,
- Xuehai Zhou,
- Jiaming Guo,
- Qi Yi,
- Shaohui Peng,
- Di Huang,
- Ruizhi Chen,
- Qi Guo,
- Yunji Chen
AAAI'24/IAAI'24/EAAI'24: Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence and Thirty-Sixth Conference on Innovative Applications of Artificial Intelligence and Fourteenth Symposium on Educational Advances in Artificial Intelligence, Article No.: 1964, Pages 17609–17617
https://doi.org/10.1609/aaai.v38i16.29712
Research on emergent communication has recently gained significant traction as a promising avenue for the linguistic community to unravel human language's origins and explore artificial intelligence's generalization capabilities. Current research has ...
- research-article, February 2024
Ace-Sniper: Cloud–Edge Collaborative Scheduling Framework With DNN Inference Latency Modeling on Heterogeneous Devices
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCADICS), Volume 43, Issue 2, Pages 534–547
https://doi.org/10.1109/TCAD.2023.3314388
The cloud–edge collaborative inference requires efficient scheduling of artificial intelligence (AI) tasks to the appropriate edge intelligence devices. DNN inference latency has become a vital basis for improving scheduling efficiency. However,...
- research-article, January 2024
Flexible and Efficient Memory Swapping Across Mobile Devices With LegoSwap
IEEE Transactions on Parallel and Distributed Systems (TPDS), Volume 35, Issue 1, Pages 140–153
https://doi.org/10.1109/TPDS.2023.3331703
This article presents LegoSwap, a cross-device memory swapping mechanism for mobile devices. It exploits the unbalanced utilization of memory resources across devices. With LegoSwap, remote memory is utilized in a seamless plug-and-play manner. It ...
- research-article, January 2024
Heter-Train: A Distributed Training Framework Based on Semi-Asynchronous Parallel Mechanism for Heterogeneous Intelligent Transportation Systems
- Jiawei Geng,
- Jing Cao,
- Haipeng Jia,
- Zongwei Zhu,
- Hai Fang,
- Chengxi Gao,
- Cheng Ji,
- Gangyong Jia,
- Guangjie Han,
- Xuehai Zhou
IEEE Transactions on Intelligent Transportation Systems (ITS-TRANSACTIONS), Volume 25, Issue 1, Pages 959–972
https://doi.org/10.1109/TITS.2023.3286400
Transportation big data (TBD) are increasingly combined with artificial intelligence to mine novel patterns and information due to the powerful representational capabilities of deep neural networks (DNNs), especially for anti-COVID19 applications. The ...
- research-article, December 2023
Emergent communication for rules reasoning
- Yuxuan Guo,
- Yifan Hao,
- Rui Zhang,
- Enshuai Zhou,
- Zidong Du,
- Xishan Zhang,
- Xinkai Song,
- Yuanbo Wen,
- Yongwei Zhao,
- Xuehai Zhou,
- Jiaming Guo,
- Qi Yi,
- Shaohui Peng,
- Di Huang,
- Ruizhi Chen,
- Qi Guo,
- Yunji Chen
NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems, Article No.: 3004, Pages 68655–68672
Research on emergent communication between deep-learning-based agents has received extensive attention due to its inspiration for linguistics and artificial intelligence. However, previous attempts have hovered around emerging communication under ...
- research-article, December 2023
Algorithm/Hardware Co-Optimization for Sparsity-Aware SpMM Acceleration of GNNs
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCADICS), Volume 42, Issue 12, Pages 4763–4776
https://doi.org/10.1109/TCAD.2023.3281714
In recent years, graph neural networks (GNNs) have achieved impressive performance in various application fields by extracting information from graph-structured data. It contains extensive feature aggregation operations and has become a performance ...
- Article, December 2023
NeuralMAE: Data-Efficient Neural Architecture Predictor with Masked Autoencoder
Abstract: Predictor-based Neural Architecture Search (NAS) offers a promising solution for enhancing the efficiency of traditional NAS methods. However, it is non-trivial to train the predictor with limited architecture evaluations for efficient NAS. While ...
- Work in Progress, January 2024
Work-in-Progress: NAPMAE: Generalized Data-Efficient Neural Architecture Predictor with Masked Autoencoder
CODES/ISSS '23 Companion: Proceedings of the 2023 International Conference on Hardware/Software Codesign and System Synthesis, Pages 48–49
https://doi.org/10.1145/3607888.3608586
Predictor-based Neural Architecture Search (NAS) offers a promising solution for enhancing the efficiency of traditional NAS methods. However, it is non-trivial to train the predictor with limited architecture evaluations for efficient NAS. In this paper,...
- research-article, April 2023
Enabling Fast and Memory-Efficient Acceleration for Pattern Matching Workloads: The Lightweight Automata Processing Engine
IEEE Transactions on Computers (ITCO), Volume 72, Issue 4, Pages 1011–1025
https://doi.org/10.1109/TC.2022.3187338
Growing pattern matching applications are employing finite automata as their basic processing model. These applications match tens to thousands of patterns on a large amount of data, which brings a great challenge to conventional processors. Therefore ...