Showing 1–50 of 88 results for author: Hao, C

Searching in archive cs.

  1. arXiv:2406.13185  [pdf, other]

    cs.CL

    Learnable In-Context Vector for Visual Question Answering

    Authors: Yingzhe Peng, Chenduo Hao, Xu Yang, Jiawei Peng, Xinting Hu, Xin Geng

    Abstract: As language models continue to scale, Large Language Models (LLMs) have exhibited emerging capabilities in In-Context Learning (ICL), enabling them to solve language tasks by prefixing a few in-context demonstrations (ICDs) as context. Inspired by these advancements, researchers have extended these techniques to develop Large Multimodal Models (LMMs) with ICL capabilities. However, applying ICL us…

    Submitted 18 June, 2024; originally announced June 2024.

  2. arXiv:2406.09767  [pdf, other]

    cs.RO

    Language-Guided Manipulation with Diffusion Policies and Constrained Inpainting

    Authors: Ce Hao, Kelvin Lin, Siyuan Luo, Harold Soh

    Abstract: Diffusion policies have demonstrated robust performance in generative modeling, prompting their application in robotic manipulation controlled via language descriptions. In this paper, we introduce a zero-shot, open-vocabulary diffusion policy method for robot manipulation. Using Vision-Language Models (VLMs), our method transforms linguistic task descriptions into actionable keyframes in 3D space…

    Submitted 14 June, 2024; originally announced June 2024.

  3. arXiv:2405.15225  [pdf, other]

    cs.CV

    Unbiased Faster R-CNN for Single-source Domain Generalized Object Detection

    Authors: Yajing Liu, Shijun Zhou, Xiyao Liu, Chunhui Hao, Baojie Fan, Jiandong Tian

    Abstract: Single-source domain generalization (SDG) for object detection is a challenging yet essential task as the distribution bias of the unseen domain degrades the algorithm performance significantly. However, existing methods attempt to extract domain-invariant features, neglecting that the biased data leads the network to learn biased features that are non-causal and poorly generalizable. To this end,…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: CVPR 2024

  4. arXiv:2405.00820  [pdf, other]

    cs.AR cs.LG

    HLSFactory: A Framework Empowering High-Level Synthesis Datasets for Machine Learning and Beyond

    Authors: Stefan Abi-Karam, Rishov Sarkar, Allison Seigler, Sean Lowe, Zhigang Wei, Hanqiu Chen, Nanditha Rao, Lizy John, Aman Arora, Cong Hao

    Abstract: Machine learning (ML) techniques have been applied to high-level synthesis (HLS) flows for quality-of-result (QoR) prediction and design space exploration (DSE). Nevertheless, the scarcity of accessible high-quality HLS datasets and the complexity of building such datasets present challenges. Existing datasets have limitations in terms of benchmark coverage, design space enumeration, vendor extens…

    Submitted 17 May, 2024; v1 submitted 1 May, 2024; originally announced May 2024.

    Comments: Edit to "Section V.E" for proper attribution of open-source HLSyn, AutoDSE, and the Merlin compiler

  5. arXiv:2404.16192  [pdf, other]

    cs.CL cs.CV

    Fusion of Domain-Adapted Vision and Language Models for Medical Visual Question Answering

    Authors: Cuong Nhat Ha, Shima Asaadi, Sanjeev Kumar Karn, Oladimeji Farri, Tobias Heimann, Thomas Runkler

    Abstract: Vision-language models, while effective in general domains and showing strong performance in diverse multi-modal applications like visual question-answering (VQA), struggle to maintain the same level of effectiveness in more specialized domains, e.g., medical. We propose a medical vision-language model that integrates large vision and language models adapted for the medical domain. This model goes…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: Clinical NLP @ NAACL 2024

  6. arXiv:2404.09471  [pdf, other]

    cs.PF cs.AR

    LightningSimV2: Faster and Scalable Simulation for High-Level Synthesis via Graph Compilation and Optimization

    Authors: Rishov Sarkar, Rachel Paul, Cong Hao

    Abstract: High-Level Synthesis (HLS) enables rapid prototyping of complex hardware designs by translating C or C++ code to low-level RTL code. However, the testing and evaluation of HLS designs still typically rely on slow RTL-level simulators that can take hours to provide feedback, especially for complex designs. A recent work, LightningSim, helps to solve this problem by providing a simulation workflow o…

    Submitted 16 April, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

    Comments: 11 pages, 6 figures. Accepted at FCCM 2024

  7. arXiv:2404.03216  [pdf, other]

    cs.CR

    Accurate Low-Degree Polynomial Approximation of Non-polynomial Operators for Fast Private Inference in Homomorphic Encryption

    Authors: Jianming Tong, Jingtian Dang, Anupam Golder, Callie Hao, Arijit Raychowdhury, Tushar Krishna

    Abstract: As machine learning (ML) permeates fields like healthcare, facial recognition, and blockchain, the need to protect sensitive data intensifies. Fully Homomorphic Encryption (FHE) allows inference on encrypted data, preserving the privacy of both data and the ML model. However, it slows down non-secure inference by up to five magnitudes, with a root cause of replacing non-polynomial operators (ReLU…

    Submitted 7 May, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

    Comments: Proceedings of the 5th MLSys Conference, Santa Clara, CA, USA, 2024. Copyright 2024 by the author(s)

  8. arXiv:2403.13365  [pdf, other]

    cs.RO cs.CV

    ManiPose: A Comprehensive Benchmark for Pose-aware Object Manipulation in Robotics

    Authors: Qiaojun Yu, Ce Hao, Junbo Wang, Wenhai Liu, Liu Liu, Yao Mu, Yang You, Hengxu Yan, Cewu Lu

    Abstract: Robotic manipulation in everyday scenarios, especially in unstructured environments, requires skills in pose-aware object manipulation (POM), which adapts robots' grasping and handling according to an object's 6D pose. Recognizing an object's position and orientation is crucial for effective manipulation. For example, if a mug is lying on its side, it's more effective to grasp it by the rim rather…

    Submitted 20 March, 2024; originally announced March 2024.

    Comments: 8 pages, 7 figures, submitted to 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024)

  9. arXiv:2403.03432  [pdf, other]

    cs.CL cs.AI

    Mixture-of-LoRAs: An Efficient Multitask Tuning for Large Language Models

    Authors: Wenfeng Feng, Chuzhan Hao, Yuewei Zhang, Yu Han, Hao Wang

    Abstract: Instruction Tuning has the potential to stimulate or enhance specific capabilities of large language models (LLMs). However, achieving the right balance of data is crucial to prevent catastrophic forgetting and interference between tasks. To address these limitations and enhance training flexibility, we propose the Mixture-of-LoRAs (MoA) architecture which is a novel and parameter-efficient tuning…

    Submitted 5 March, 2024; originally announced March 2024.

    Comments: 10 pages, COLING24 Accepted

  10. arXiv:2402.18922  [pdf, other]

    cs.CV

    A Simple yet Effective Network based on Vision Transformer for Camouflaged Object and Salient Object Detection

    Authors: Chao Hao, Zitong Yu, Xin Liu, Jun Xu, Huanjing Yue, Jingyu Yang

    Abstract: Camouflaged object detection (COD) and salient object detection (SOD) are two distinct yet closely-related computer vision tasks widely studied during the past decades. Though sharing the same purpose of segmenting an image into binary foreground and background regions, their distinction lies in the fact that COD focuses on concealed objects hidden in the image, while SOD concentrates on the most…

    Submitted 29 February, 2024; originally announced February 2024.

    Comments: submitted to IEEE TIP

  11. arXiv:2402.14194  [pdf, other]

    cs.LG cs.RO

    BeTAIL: Behavior Transformer Adversarial Imitation Learning from Human Racing Gameplay

    Authors: Catherine Weaver, Chen Tang, Ce Hao, Kenta Kawamoto, Masayoshi Tomizuka, Wei Zhan

    Abstract: Imitation learning learns a policy from demonstrations without requiring hand-designed reward functions. In many robotic tasks, such as autonomous racing, imitated policies must model complex environment dynamics and human decision-making. Sequence modeling is highly effective in capturing intricate patterns of motion sequences but struggles to adapt to new environments or distribution shifts that…

    Submitted 21 February, 2024; originally announced February 2024.

    Comments: Preprint

  12. arXiv:2402.02457  [pdf, other]

    cs.RO

    A Risk-aware Planning Framework of UGVs in Off-Road Environment

    Authors: Junkai Jiang, Zhenhua Hu, Zihan Xie, Changlong Hao, Hongyu Liu, Wenliang Xu, Yuning Wang, Lei He, Shaobing Xu, Jianqiang Wang

    Abstract: Planning module is an essential component of intelligent vehicle study. In this paper, we address the risk-aware planning problem of UGVs through a global-local planning framework which seamlessly integrates risk assessment methods. In particular, a global planning algorithm named Coarse2fine A* is proposed, which incorporates a potential field approach to enhance the safety of the planning result…

    Submitted 4 February, 2024; originally announced February 2024.

    Comments: 15 pages, 18 figures, submit to T-IV

  13. arXiv:2401.02701  [pdf, ps, other]

    cs.IT eess.SP

    Joint User Association and Power Control for Cell-Free Massive MIMO

    Authors: Chongzheng Hao, Tung Thanh Vu, Hien Quoc Ngo, Minh N. Dao, Xiaoyu Dang, Chenghua Wang, Michail Matthaiou

    Abstract: This work proposes novel approaches that jointly design user equipment (UE) association and power control (PC) in a downlink user-centric cell-free massive multiple-input multiple-output (CFmMIMO) network, where each UE is only served by a set of access points (APs) for reducing the fronthaul signalling and computational complexity. In order to maximize the sum spectral efficiency (SE) of the UEs,…

    Submitted 20 May, 2024; v1 submitted 5 January, 2024; originally announced January 2024.

    Comments: minor revision of the previous version

  14. arXiv:2310.19056  [pdf, other]

    cs.IR cs.AI cs.CL

    MILL: Mutual Verification with Large Language Models for Zero-Shot Query Expansion

    Authors: Pengyue Jia, Yiding Liu, Xiangyu Zhao, Xiaopeng Li, Changying Hao, Shuaiqiang Wang, Dawei Yin

    Abstract: Query expansion, pivotal in search engines, enhances the representation of user information needs with additional terms. While existing methods expand queries using retrieved or generated contextual documents, each approach has notable limitations. Retrieval-based methods often fail to accurately capture search intent, particularly with brief or ambiguous queries. Generation-based methods, utilizi…

    Submitted 28 March, 2024; v1 submitted 29 October, 2023; originally announced October 2023.

    Comments: Accepted to NAACL 2024

  15. arXiv:2309.16264  [pdf, other]

    cs.RO cs.CV

    GAMMA: Generalizable Articulation Modeling and Manipulation for Articulated Objects

    Authors: Qiaojun Yu, Junbo Wang, Wenhai Liu, Ce Hao, Liu Liu, Lin Shao, Weiming Wang, Cewu Lu

    Abstract: Articulated objects like cabinets and doors are widespread in daily life. However, directly manipulating 3D articulated objects is challenging because they have diverse geometrical shapes, semantic categories, and kinetic constraints. Prior works mostly focused on recognizing and manipulating articulated objects with specific joint types. They can either estimate the joint parameters or distinguis…

    Submitted 1 March, 2024; v1 submitted 28 September, 2023; originally announced September 2023.

    Comments: 8 pages, 5 figures, ICRA 2024

  16. arXiv:2308.12028  [pdf, other]

    cs.IR cs.AI

    LKPNR: LLM and KG for Personalized News Recommendation Framework

    Authors: Chen hao, Xie Runfeng, Cui Xiangyang, Yan Zhou, Wang Xin, Xuan Zhanwei, Zhang Kai

    Abstract: Accurately recommending candidate news articles to users is a basic challenge faced by personalized news recommendation systems. Traditional methods are usually difficult to grasp the complex semantic information in news texts, resulting in unsatisfactory recommendation results. Besides, these traditional methods are more friendly to active users with rich historical behaviors. However, they can n…

    Submitted 23 August, 2023; originally announced August 2023.

  17. arXiv:2308.05930  [pdf, other]

    cs.AR cs.DC cs.LG

    INR-Arch: A Dataflow Architecture and Compiler for Arbitrary-Order Gradient Computations in Implicit Neural Representation Processing

    Authors: Stefan Abi-Karam, Rishov Sarkar, Dejia Xu, Zhiwen Fan, Zhangyang Wang, Cong Hao

    Abstract: An increasing number of researchers are finding use for nth-order gradient computations for a wide variety of applications, including graphics, meta-learning (MAML), scientific computing, and most recently, implicit neural representations (INRs). Recent work shows that the gradient of an INR can be used to edit the data it represents directly without needing to convert it back to a discrete repres…

    Submitted 11 August, 2023; originally announced August 2023.

    Comments: 9 pages, 8 figures, 4 tables

  18. arXiv:2306.17727   

    cs.CL

    Improved NL2SQL based on Multi-layer Expert Network

    Authors: Chenduo Hao, Xu Zhang

    Abstract: The Natural Language to SQL (NL2SQL) technique is used to convert natural language queries into executable SQL statements. Typically, slot-filling is employed as a classification method for multi-task cases to achieve this goal. However, slot-filling can result in inaccurate SQL statement generation due to negative migration issues arising from different classification tasks. To overcome this limi…

    Submitted 17 September, 2023; v1 submitted 30 June, 2023; originally announced June 2023.

    Comments: our paper need to be repaired

  19. arXiv:2306.17646  [pdf, other]

    cs.CL

    Feature Representation Learning for NL2SQL Generation Based on Coupling and Decoupling

    Authors: Chenduo Hao, Xu Zhang, Chuanbao Gao, Deyu Zhou

    Abstract: The NL2SQL task involves parsing natural language statements into SQL queries. While most state-of-the-art methods treat NL2SQL as a slot-filling task and use feature representation learning techniques, they overlook explicit correlation features between the SELECT and WHERE clauses and implicit correlation features between sub-tasks within a single clause. To address this issue, we propose the Cl…

    Submitted 30 June, 2023; originally announced June 2023.

    Comments: 12 pages, 4 figures, International Conference on Artificial Neural Networks

  20. arXiv:2306.16699  [pdf, other]

    cs.CV cs.AI cs.AR cs.LG

    Rapid-INR: Storage Efficient CPU-free DNN Training Using Implicit Neural Representation

    Authors: Hanqiu Chen, Hang Yang, Stephen Fitzmeyer, Cong Hao

    Abstract: Implicit Neural Representation (INR) is an innovative approach for representing complex shapes or objects without explicitly defining their geometry or surface structure. Instead, INR represents objects as continuous functions. Previous research has demonstrated the effectiveness of using neural networks as INR for image compression, showcasing comparable performance to traditional methods such as…

    Submitted 23 April, 2024; v1 submitted 29 June, 2023; originally announced June 2023.

    Comments: Accepted by ICCAD 2023

    Journal ref: ICCAD 2023

  21. arXiv:2306.08388  [pdf, other]

    cs.LG cs.AI

    Skill-Critic: Refining Learned Skills for Reinforcement Learning

    Authors: Ce Hao, Catherine Weaver, Chen Tang, Kenta Kawamoto, Masayoshi Tomizuka, Wei Zhan

    Abstract: Hierarchical reinforcement learning (RL) can accelerate long-horizon decision-making by temporally abstracting a policy into multiple levels. Promising results in sparse reward environments have been seen with skills, i.e. sequences of primitive actions. Typically, a skill latent space and policy are discovered from offline data, but the resulting low-level policy can be unreliable due to low-cove…

    Submitted 15 June, 2023; v1 submitted 14 June, 2023; originally announced June 2023.

    Comments: Preprint

  22. arXiv:2306.00704  [pdf, other]

    cs.CV

    DAM-Net: Global Flood Detection from SAR Imagery Using Differential Attention Metric-Based Vision Transformers

    Authors: Tamer Saleh, Xingxing Weng, Shimaa Holail, Chen Hao, Gui-Song Xia

    Abstract: The detection of flooded areas using high-resolution synthetic aperture radar (SAR) imagery is a critical task with applications in crisis and disaster management, as well as environmental resource planning. However, the complex nature of SAR images presents a challenge that often leads to an overestimation of the flood extent. To address this issue, we propose a novel differential attention metri…

    Submitted 1 June, 2023; originally announced June 2023.

    Comments: 16 pages, 11 figures

  23. arXiv:2305.18691  [pdf, other]

    cs.AR cs.CV

    Edge-MoE: Memory-Efficient Multi-Task Vision Transformer Architecture with Task-level Sparsity via Mixture-of-Experts

    Authors: Rishov Sarkar, Hanxue Liang, Zhiwen Fan, Zhangyang Wang, Cong Hao

    Abstract: Computer vision researchers are embracing two promising paradigms: Vision Transformers (ViTs) and Multi-task Learning (MTL), which both show great performance but are computation-intensive, given the quadratic complexity of self-attention in ViT and the need to activate an entire large MTL model for one task. M$^3$ViT is the latest multi-task ViT model that introduces mixture-of-experts (MoE), whe…

    Submitted 13 September, 2023; v1 submitted 29 May, 2023; originally announced May 2023.

    Comments: 11 pages, 12 figures. Accepted at ICCAD 2023

  24. Patch-Mix Contrastive Learning with Audio Spectrogram Transformer on Respiratory Sound Classification

    Authors: Sangmin Bae, June-Woo Kim, Won-Yang Cho, Hyerim Baek, Soyoun Son, Byungjo Lee, Changwan Ha, Kyongpil Tae, Sungnyun Kim, Se-Young Yun

    Abstract: Respiratory sound contains crucial information for the early diagnosis of fatal lung diseases. Since the COVID-19 pandemic, there has been a growing interest in contact-free medical care based on electronic stethoscopes. To this end, cutting-edge deep learning models have been developed to diagnose lung diseases; however, it is still challenging due to the scarcity of medical data. In this study,…

    Submitted 22 November, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: INTERSPEECH 2023, Code URL: https://github.com/raymin0223/patch-mix_contrastive_learning

  25. arXiv:2305.07740  [pdf, other]

    cs.RO eess.SY

    Double-Iterative Gaussian Process Regression for Modeling Error Compensation in Autonomous Racing

    Authors: Shaoshu Su, Ce Hao, Catherine Weaver, Chen Tang, Wei Zhan, Masayoshi Tomizuka

    Abstract: Autonomous racing control is a challenging research problem as vehicles are pushed to their limits of handling to achieve an optimal lap time; therefore, vehicles exhibit highly nonlinear and complex dynamics. Difficult-to-model effects, such as drifting, aerodynamics, chassis weight transfer, and suspension can lead to infeasible and suboptimal trajectories. While offline planning allows optimizi…

    Submitted 26 June, 2023; v1 submitted 12 May, 2023; originally announced May 2023.

    Comments: 8 Pages, 6 Figures, Accepted by IFAC 2023 (The 22nd World Congress of the International Federation of Automatic Control)

  26. arXiv:2304.11219  [pdf, other]

    cs.PF cs.AR

    LightningSim: Fast and Accurate Trace-Based Simulation for High-Level Synthesis

    Authors: Rishov Sarkar, Cong Hao

    Abstract: High-Level Synthesis allows hardware designers to create complex RTL designs using C/C++. The traditional HLS workflow involves iterations of C/C++ simulation for partial functional verification and HLS synthesis for coarse timing estimates. However, neither C/C++ simulation nor HLS synthesis estimates can account for complex behaviors like FIFO interactions and pipeline stalls, thereby obscuring…

    Submitted 21 April, 2023; originally announced April 2023.

    Comments: 11 pages, 7 figures. Accepted at FCCM 2023

  27. arXiv:2304.06831  [pdf]

    cs.AR cs.AI cs.LG

    DGNN-Booster: A Generic FPGA Accelerator Framework For Dynamic Graph Neural Network Inference

    Authors: Hanqiu Chen, Cong Hao

    Abstract: Dynamic Graph Neural Networks (DGNNs) are becoming increasingly popular due to their effectiveness in analyzing and predicting the evolution of complex interconnected graph-based systems. However, hardware deployment of DGNNs still remains a challenge. First, DGNNs do not fully utilize hardware resources because temporal data dependencies cause low hardware parallelism. Additionally, there is curr…

    Submitted 13 April, 2023; originally announced April 2023.

    Comments: This paper is accepted by FCCM 2023

  28. arXiv:2303.16459  [pdf, other]

    cs.AR cs.AI cs.LG

    GNNBuilder: An Automated Framework for Generic Graph Neural Network Accelerator Generation, Simulation, and Optimization

    Authors: Stefan Abi-Karam, Cong Hao

    Abstract: There are plenty of graph neural network (GNN) accelerators being proposed. However, they highly rely on users' hardware expertise and are usually optimized for one specific GNN model, making them challenging for practical use. Therefore, in this work, we propose GNNBuilder, the first automated, generic, end-to-end GNN accelerator generation framework. It features four advantages: (1) GNNBuilder c…

    Submitted 7 August, 2023; v1 submitted 29 March, 2023; originally announced March 2023.

    Comments: 10 pages, 7 figures, 4 tables, 3 listings

  29. arXiv:2303.08256  [pdf, other]

    cs.AR

    Gamora: Graph Learning based Symbolic Reasoning for Large-Scale Boolean Networks

    Authors: Nan Wu, Yingjie Li, Cong Hao, Steve Dai, Cunxi Yu, Yuan Xie

    Abstract: Reasoning high-level abstractions from bit-blasted Boolean networks (BNs) such as gate-level netlists can significantly benefit functional verification, logic minimization, datapath synthesis, malicious logic identification, etc. Mostly, conventional reasoning approaches leverage structural hashing and functional propagation, suffering from limited scalability and inefficient usage of modern compu…

    Submitted 12 June, 2023; v1 submitted 14 March, 2023; originally announced March 2023.

    Comments: This work will appear at 60th Design Automation Conference (DAC'23)

  30. arXiv:2211.09378  [pdf, other]

    cs.RO eess.SY

    Outracing Human Racers with Model-based Planning and Control for Time-trial Racing

    Authors: Ce Hao, Chen Tang, Eric Bergkvist, Catherine Weaver, Liting Sun, Wei Zhan, Masayoshi Tomizuka

    Abstract: Autonomous racing has become a popular sub-topic of autonomous driving in recent years. The goal of autonomous racing research is to develop software to control the vehicle at its limit of handling and achieve human-level racing performance. In this work, we investigate how to approach human expert-level racing performance with model-based planning and control methods using the high-fidelity racin…

    Submitted 25 October, 2023; v1 submitted 17 November, 2022; originally announced November 2022.

    Comments: 16 pages, 13 figures, 3 tables

  31. arXiv:2210.14793  [pdf, other]

    cs.CV

    M$^3$ViT: Mixture-of-Experts Vision Transformer for Efficient Multi-task Learning with Model-Accelerator Co-design

    Authors: Hanxue Liang, Zhiwen Fan, Rishov Sarkar, Ziyu Jiang, Tianlong Chen, Kai Zou, Yu Cheng, Cong Hao, Zhangyang Wang

    Abstract: Multi-task learning (MTL) encapsulates multiple learned tasks in a single model and often lets those tasks learn better jointly. However, when deploying MTL onto those real-world systems that are often resource-constrained or latency-sensitive, two prominent challenges arise: (i) during training, simultaneously optimizing all tasks is often difficult due to gradient conflicts across tasks; (ii) at…

    Submitted 26 October, 2022; originally announced October 2022.

  32. Data-Model-Circuit Tri-Design for Ultra-Light Video Intelligence on Edge Devices

    Authors: Yimeng Zhang, Akshay Karkal Kamath, Qiucheng Wu, Zhiwen Fan, Wuyang Chen, Zhangyang Wang, Shiyu Chang, Sijia Liu, Cong Hao

    Abstract: In this paper, we propose a data-model-hardware tri-design framework for high-throughput, low-cost, and high-accuracy multi-object tracking (MOT) on High-Definition (HD) video stream. First, to enable ultra-light video intelligence, we propose temporal frame-filtering and spatial saliency-focusing approaches to reduce the complexity of massive video data. Second, we exploit structure-aware weight…

    Submitted 17 October, 2022; v1 submitted 16 October, 2022; originally announced October 2022.

    Comments: Accepted to ASP-DAC'23

  33. arXiv:2210.03900  [pdf]

    cs.AR cs.LG

    Bottleneck Analysis of Dynamic Graph Neural Network Inference on CPU and GPU

    Authors: Hanqiu Chen, Yahya Alhinai, Yihan Jiang, Eunjee Na, Cong Hao

    Abstract: Dynamic graph neural network (DGNN) is becoming increasingly popular because of its widespread use in capturing dynamic features in the real world. A variety of dynamic graph neural networks designed from algorithmic perspectives have succeeded in incorporating temporal information into graph processing. Despite the promising algorithmic performance, deploying DGNNs on hardware presents additional…

    Submitted 13 April, 2023; v1 submitted 7 October, 2022; originally announced October 2022.

    Comments: 14 pages main text, 2 pages appendix, 10 figures, accepted by IISWC2022

  34. arXiv:2209.08763  [pdf]

    cs.RO cs.CV

    Decentralized Vehicle Coordination: The Berkeley DeepDrive Drone Dataset

    Authors: Fangyu Wu, Dequan Wang, Minjune Hwang, Chenhui Hao, Jiawei Lu, Jiamu Zhang, Christopher Chou, Trevor Darrell, Alexandre Bayen

    Abstract: Decentralized multiagent planning has been an important field of research in robotics. An interesting and impactful application in the field is decentralized vehicle coordination in understructured road environments. For example, in an intersection, it is useful yet difficult to deconflict multiple vehicles of intersecting paths in absence of a central coordinator. We learn from common sense that,…

    Submitted 22 September, 2022; v1 submitted 19 September, 2022; originally announced September 2022.

    Comments: 6 pages, 10 figures, 1 table

  35. Approximated Doubly Robust Search Relevance Estimation

    Authors: Lixin Zou, Changying Hao, Hengyi Cai, Suqi Cheng, Shuaiqiang Wang, Wenwen Ye, Zhicong Cheng, Simiu Gu, Dawei Yin

    Abstract: Extracting query-document relevance from the sparse, biased clickthrough log is among the most fundamental tasks in the web search system. Prior art mainly learns a relevance judgment model with semantic features of the query and document and ignores directly counterfactual relevance evaluation from the clicking log. Though the learned semantic matching models can provide relevance signals for tai…

    Submitted 16 August, 2022; originally announced August 2022.

    Comments: 10 pages

    Journal ref: CIKM 2022

  36. arXiv:2207.05984  [pdf, other]

    cs.LG cs.AR math.OC

    Unsupervised Learning for Combinatorial Optimization with Principled Objective Relaxation

    Authors: Haoyu Wang, Nan Wu, Hang Yang, Cong Hao, Pan Li

    Abstract: Using machine learning to solve combinatorial optimization (CO) problems is challenging, especially when the data is unlabeled. This work proposes an unsupervised learning framework for CO problems. Our framework follows a standard relaxation-plus-rounding approach and adopts neural networks to parameterize the relaxed solutions so that simple back-propagation can train the model end-to-end. Our k…

    Submitted 22 October, 2022; v1 submitted 13 July, 2022; originally announced July 2022.

    Comments: Code: https://github.com/Graph-COM/CO_ProxyDesign

  37. arXiv:2206.04682  [pdf, other]

    eess.IV cs.CV cs.LG

    RT-DNAS: Real-time Constrained Differentiable Neural Architecture Search for 3D Cardiac Cine MRI Segmentation

    Authors: Qing Lu, Xiaowei Xu, Shunjie Dong, Cong Hao, Lei Yang, Cheng Zhuo, Yiyu Shi

    Abstract: Accurately segmenting temporal frames of cine magnetic resonance imaging (MRI) is a crucial step in various real-time MRI guided cardiac interventions. To achieve fast and accurate visual assistance, there are strict requirements on the maximum latency and minimum throughput of the segmentation framework. State-of-the-art neural networks on this task are mostly hand-crafted to satisfy these constr…

    Submitted 13 June, 2022; v1 submitted 8 June, 2022; originally announced June 2022.

  38. arXiv:2206.03326  [pdf, other]

    cs.LG cs.AR

    Compilation and Optimizations for Efficient Machine Learning on Embedded Systems

    Authors: Xiaofan Zhang, Yao Chen, Cong Hao, Sitao Huang, Yuhong Li, Deming Chen

    Abstract: Deep Neural Networks (DNNs) have achieved great success in a variety of machine learning (ML) applications, delivering high-quality inferencing solutions in computer vision, natural language processing, and virtual reality, etc. However, DNN-based ML applications also bring much increased computational and storage requirements, which are particularly challenging for embedded systems with limited c…

    Submitted 25 August, 2022; v1 submitted 5 June, 2022; originally announced June 2022.

    Comments: This article will appear as a book chapter in a new book: Embedded Machine Learning for Cyber-Physical, IoT, and Edge Computing, Springer Nature

  39. arXiv:2205.07149  [pdf, other]

    cs.RO

    Robotic Computing on FPGAs: Current Progress, Research Challenges, and Opportunities

    Authors: Zishen Wan, Ashwin Lele, Bo Yu, Shaoshan Liu, Yu Wang, Vijay Janapa Reddi, Cong Hao, Arijit Raychowdhury

    Abstract: Robotic computing has reached a tipping point, with a myriad of robots (e.g., drones, self-driving cars, logistic robots) being widely applied in diverse scenarios. The continuous proliferation of robotics, however, critically depends on efficient computing substrates, driven by real-time requirements, robotic size-weight-and-power constraints, cybersecurity considerations, and dynamically changin…

    Submitted 14 May, 2022; originally announced May 2022.

    Comments: 2022 IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS), June 13-15, 2022, Incheon, Korea

  40. arXiv:2204.13852  [pdf, other]

    cs.LG cs.AI cs.DC

    H2H: Heterogeneous Model to Heterogeneous System Mapping with Computation and Communication Awareness

    Authors: Xinyi Zhang, Cong Hao, Peipei Zhou, Alex Jones, Jingtong Hu

    Abstract: The complex nature of real-world problems calls for heterogeneity in both machine learning (ML) models and hardware systems. The heterogeneity in ML models comes from multi-sensor perceiving and multi-task learning, i.e., multi-modality multi-task (MMMT), resulting in diverse deep neural network (DNN) layers and computation patterns. The heterogeneity in systems comes from diverse processing compo…

    Submitted 28 April, 2022; originally announced April 2022.

    Comments: 6 pages

  41. arXiv:2204.13103  [pdf, other]

    cs.DC cs.LG

    FlowGNN: A Dataflow Architecture for Real-Time Workload-Agnostic Graph Neural Network Inference

    Authors: Rishov Sarkar, Stefan Abi-Karam, Yuqi He, Lakshmi Sathidevi, Cong Hao

    Abstract: Graph neural networks (GNNs) have recently exploded in popularity thanks to their broad applicability to graph-related problems such as quantum chemistry, drug discovery, and high energy physics. However, meeting demand for novel GNN models and fast inference simultaneously is challenging due to the gap between developing efficient accelerators and the rapid creation of new GNN models. Prior art f…

    Submitted 19 October, 2022; v1 submitted 27 April, 2022; originally announced April 2022.

    Comments: 13 pages, 10 figures. Accepted at HPCA 2023

  42. arXiv:2201.08916  [pdf, other]

    cs.AR

    Enabling Flexibility for Sparse Tensor Acceleration via Heterogeneity

    Authors: Eric Qin, Raveesh Garg, Abhimanyu Bambhaniya, Michael Pellauer, Angshuman Parashar, Sivasankaran Rajamanickam, Cong Hao, Tushar Krishna

    Abstract: Recently, numerous sparse hardware accelerators for Deep Neural Networks (DNNs), Graph Neural Networks (GNNs), and scientific computing applications have been proposed. A common characteristic among all of these accelerators is that they target tensor algebra (typically matrix multiplications); yet dozens of new accelerators are proposed for every new application. The motivation is that the size a…

    Submitted 21 January, 2022; originally announced January 2022.

  43. arXiv:2201.08475  [pdf, other]

    cs.LG cs.DC

    GenGNN: A Generic FPGA Framework for Graph Neural Network Acceleration

    Authors: Stefan Abi-Karam, Yuqi He, Rishov Sarkar, Lakshmi Sathidevi, Zihang Qiao, Cong Hao

    Abstract: Graph neural networks (GNNs) have recently exploded in popularity thanks to their broad applicability to ubiquitous graph-related problems such as quantum chemistry, drug discovery, and high energy physics. However, meeting demand for novel GNN models and fast inference simultaneously is challenging because of the gap between the difficulty in developing efficient FPGA accelerators and the rapid p…

    Submitted 20 January, 2022; originally announced January 2022.

    Comments: 10 pages, 9 figures. The first three authors contributed equally. Submitted to FCCM 2022

  44. arXiv:2201.08455  [pdf, other]

    cs.LG cs.AR

    LOSTIN: Logic Optimization via Spatio-Temporal Information with Hybrid Graph Models

    Authors: Nan Wu, Jiwon Lee, Yuan Xie, Cong Hao

    Abstract: Despite the stride made by machine learning (ML) based performance modeling, two major concerns that may impede production-ready ML applications in EDA are stringent accuracy requirements and generalization capability. To this end, we propose hybrid graph neural network (GNN) based approaches towards highly accurate quality-of-result (QoR) estimations with great generalization capability, specific…

    Submitted 19 October, 2022; v1 submitted 20 January, 2022; originally announced January 2022.

  45. High-Level Synthesis Performance Prediction using GNNs: Benchmarking, Modeling, and Advancing

    Authors: Nan Wu, Hang Yang, Yuan Xie, Pan Li, Cong Hao

    Abstract: Agile hardware development requires fast and accurate circuit quality evaluation from early design stages. Existing work of high-level synthesis (HLS) performance prediction usually needs extensive feature engineering after the synthesis process. To expedite circuit evaluation from as earlier design stage as possible, we propose a rapid and accurate performance modeling, exploiting the representat…

    Submitted 18 January, 2022; originally announced January 2022.

    Journal ref: Proceedings of the 59th ACM/IEEE Design Automation Conference, July 2022

  46. arXiv:2109.06265  [pdf, other]

    cs.LG

    Program-to-Circuit: Exploiting GNNs for Program Representation and Circuit Translation

    Authors: Nan Wu, Huake He, Yuan Xie, Pan Li, Cong Hao

    Abstract: Circuit design is complicated and requires extensive domain-specific expertise. One major obstacle stuck on the way to hardware agile development is the considerably time-consuming process of accurate circuit quality evaluation. To significantly expedite the circuit evaluation during the translation from behavioral languages to circuit designs, we formulate it as a Program-to-Circuit problem, aimi…

    Submitted 13 September, 2021; originally announced September 2021.

  47. arXiv:2109.03564  [pdf, other]

    cs.CL cs.AI

    NSP-BERT: A Prompt-based Few-Shot Learner Through an Original Pre-training Task--Next Sentence Prediction

    Authors: Yi Sun, Yu Zheng, Chao Hao, Hangping Qiu

    Abstract: Using prompts to utilize language models to perform various downstream tasks, also known as prompt-based learning or prompt-learning, has lately gained significant success in comparison to the pre-train and fine-tune paradigm. Nonetheless, virtually all prompt-based methods are token-level, meaning they all utilize GPT's left-to-right language model or BERT's masked language model to perform cloze…

    Submitted 18 October, 2022; v1 submitted 8 September, 2021; originally announced September 2021.

    Comments: Published at COLING2022, long paper

  48. arXiv:2108.02656  [pdf]

    eess.IV cs.CV

    A Computer-Aided Diagnosis System for Breast Pathology: A Deep Learning Approach with Model Interpretability from Pathological Perspective

    Authors: Wei-Wen Hsu, Yongfang Wu, Chang Hao, Yu-Ling Hou, Xiang Gao, Yun Shao, Xueli Zhang, Tao He, Yanhong Tai

    Abstract: Objective: We develop a computer-aided diagnosis (CAD) system using deep learning approaches for lesion detection and classification on whole-slide images (WSIs) with breast cancer. The deep features being distinguishing in classification from the convolutional neural networks (CNN) are demonstrated in this study to provide comprehensive interpretability for the proposed CAD system using pathologi…

    Submitted 5 August, 2021; originally announced August 2021.

  49. arXiv:2108.01899  [pdf, other]

    cs.LG cs.AI cs.CV cs.NE

    Generic Neural Architecture Search via Regression

    Authors: Yuhong Li, Cong Hao, Pan Li, Jinjun Xiong, Deming Chen

    Abstract: Most existing neural architecture search (NAS) algorithms are dedicated to and evaluated by the downstream tasks, e.g., image classification in computer vision. However, extensive experiments have shown that, prominent neural architectures, such as ResNet in computer vision and LSTM in natural language processing, are generally good at extracting patterns from the input data and perform well on di…

    Submitted 17 November, 2021; v1 submitted 4 August, 2021; originally announced August 2021.

    Comments: To appear at NeurIPS 2021, 10 pages + Reference + Appendix

  50. arXiv:2107.11673  [pdf, other]

    cs.PL cs.AR

    ScaleHLS: A New Scalable High-Level Synthesis Framework on Multi-Level Intermediate Representation

    Authors: Hanchen Ye, Cong Hao, Jianyi Cheng, Hyunmin Jeong, Jack Huang, Stephen Neuendorffer, Deming Chen

    Abstract: High-level synthesis (HLS) has been widely adopted as it significantly improves the hardware design productivity and enables efficient design space exploration (DSE). Existing HLS tools are built using compiler infrastructures largely based on a single-level abstraction, such as LLVM. However, as HLS designs typically come with intrinsic structural or functional hierarchies, different HLS optimiza…

    Submitted 22 December, 2021; v1 submitted 24 July, 2021; originally announced July 2021.

    Comments: Accepted as a conference paper at HPCA'22