Showing 1–50 of 128 results for author: Cheng, B

Searching in archive cs.
  1. arXiv:2408.15165  [pdf, other]

    cs.LG cond-mat.mtrl-sci physics.chem-ph physics.comp-ph

    Latent Ewald summation for machine learning of long-range interactions

    Authors: Bingqing Cheng

    Abstract: Machine learning interatomic potentials (MLIPs) often neglect long-range interactions, such as electrostatic and dispersion forces. In this work, we introduce a straightforward and efficient method to account for long-range interactions by learning a latent variable from local atomic descriptors and applying an Ewald summation to this variable. We demonstrate that in systems including charged, pol…

    Submitted 27 August, 2024; originally announced August 2024.

  2. arXiv:2408.14074  [pdf, other]

    cs.SE

    Abstraction Engineering

    Authors: Nelly Bencomo, Jordi Cabot, Marsha Chechik, Betty H. C. Cheng, Benoit Combemale, Andrzej Wąsowski, Steffen Zschaler

    Abstract: Modern software-based systems operate under rapidly changing conditions and face ever-increasing uncertainty. In response, systems are increasingly adaptive and reliant on artificial-intelligence methods. In addition to the ubiquity of software with respect to users and application areas (e.g., transportation, smart grids, medicine, etc.), these high-impact software systems necessarily draw from m…

    Submitted 26 August, 2024; originally announced August 2024.

  3. arXiv:2408.08189  [pdf, other]

    cs.CV

    FancyVideo: Towards Dynamic and Consistent Video Generation via Cross-frame Textual Guidance

    Authors: Jiasong Feng, Ao Ma, Jing Wang, Bo Cheng, Xiaodan Liang, Dawei Leng, Yuhui Yin

    Abstract: Synthesizing motion-rich and temporally consistent videos remains a challenge in artificial intelligence, especially when dealing with extended durations. Existing text-to-video (T2V) models commonly employ spatial cross-attention for text control, equivalently guiding different frame generations without frame-specific textual guidance. Thus, the model's capacity to comprehend the temporal logic c…

    Submitted 16 August, 2024; v1 submitted 15 August, 2024; originally announced August 2024.

  4. arXiv:2408.07957  [pdf, other]

    cs.MM

    Joint Optimization of Buffer Delay and HARQ for Video Communications

    Authors: Baoping Cheng, Peng Lei, Xiaoyan Xie, Tao Fu, Yukun Zhang, Xiaoming Tao

    Abstract: To improve the quality of experience (QoE) in video communication over lossy networks, this paper presents a transmission method that jointly optimizes buffer delay and Hybrid Automatic Repeat request (HARQ), referred to as BD-HARQ. This method operates on packet groups and employs dynamic buffer delay combined with a HARQ strategy for transmission. By defining the QoE based on metrics such as buffer…

    Submitted 15 August, 2024; originally announced August 2024.

    Comments: 6 pages, 5 figures

  5. arXiv:2408.07397  [pdf, other]

    cs.MA

    Bridging Training and Execution via Dynamic Directed Graph-Based Communication in Cooperative Multi-Agent Systems

    Authors: Zhuohui Zhang, Bin He, Bin Cheng, Gang Li

    Abstract: Multi-agent systems must learn to communicate and understand interactions between agents to achieve cooperative goals in partially observed tasks. However, existing approaches lack a dynamic directed communication mechanism and rely on global states, thus diminishing the role of communication in centralized training. Thus, we propose the transformer-based graph coarsening network (TGCNet), a novel…

    Submitted 14 August, 2024; originally announced August 2024.

    Comments: 9 pages, 7 figures

  6. arXiv:2408.04344  [pdf, other]

    cs.SE

    Semantic-Enhanced Indirect Call Analysis with Large Language Models

    Authors: Baijun Cheng, Cen Zhang, Kailong Wang, Ling Shi, Yang Liu, Haoyu Wang, Yao Guo, Xiangqun Chen

    Abstract: In contemporary software development, the widespread use of indirect calls to achieve dynamic features poses challenges in constructing precise control flow graphs (CFGs), which further impacts the performance of downstream static analysis tasks. To tackle this issue, various types of indirect call analyzers have been proposed. However, they do not fully leverage the semantic information of the pr…

    Submitted 8 August, 2024; originally announced August 2024.

    Comments: Accepted by ASE'24

  7. arXiv:2408.00290  [pdf, other]

    cs.CV cs.AI

    Multi-Modal Parameter-Efficient Fine-tuning via Graph Neural Network

    Authors: Bin Cheng, Jiaxuan Lu

    Abstract: With the advent of the era of foundation models, pre-training and fine-tuning have become common paradigms. Recently, parameter-efficient fine-tuning has garnered widespread attention due to its better balance between the number of learnable parameters and performance. However, some current parameter-efficient fine-tuning methods only model a single modality and lack the utilization of structural…

    Submitted 1 August, 2024; originally announced August 2024.

  8. arXiv:2407.13664  [pdf, other]

    cs.LG

    Decision Focused Causal Learning for Direct Counterfactual Marketing Optimization

    Authors: Hao Zhou, Rongxiao Huang, Shaoming Li, Guibin Jiang, Jiaqi Zheng, Bing Cheng, Wei Lin

    Abstract: Marketing optimization plays an important role in enhancing user engagement in online Internet platforms. Existing studies usually formulate this problem as a budget allocation problem and solve it by utilizing two fully decoupled stages, i.e., machine learning (ML) and operations research (OR). However, the learning objective in ML does not take account of the downstream optimization task in OR, whi…

    Submitted 18 July, 2024; originally announced July 2024.

    Comments: Accepted by KDD 2024

  9. arXiv:2406.17257  [pdf, other]

    cs.CL cs.SD eess.AS

    Leveraging Parameter-Efficient Transfer Learning for Multi-Lingual Text-to-Speech Adaptation

    Authors: Yingting Li, Ambuj Mehrish, Bryan Chew, Bo Cheng, Soujanya Poria

    Abstract: Different languages have distinct phonetic systems and vary in their prosodic features making it challenging to develop a Text-to-Speech (TTS) model that can effectively synthesise speech in multilingual settings. Furthermore, TTS architecture needs to be both efficient enough to capture nuances in multiple languages and efficient enough to be practical for deployment. The standard approach is to…

    Submitted 24 June, 2024; originally announced June 2024.

  10. arXiv:2406.03177  [pdf, other]

    cs.CV

    FAPNet: An Effective Frequency Adaptive Point-based Eye Tracker

    Authors: Xiaopeng Lin, Hongwei Ren, Bojun Cheng

    Abstract: Eye tracking is crucial for human-computer interaction in different domains. Conventional cameras encounter challenges such as power consumption and image quality during different eye movements, prompting the need for advanced solutions with ultra-fast, low-power, and accurate eye trackers. Event cameras, fundamentally designed to capture information about moving objects, exhibit low power consump…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: Accepted by CVPRW 2024 (AIS)

  11. arXiv:2405.19688  [pdf, other]

    cs.CV

    DNPM: A Neural Parametric Model for the Synthesis of Facial Geometric Details

    Authors: Haitao Cao, Baoping Cheng, Qiran Pu, Haocheng Zhang, Bin Luo, Yixiang Zhuang, Juncong Lin, Liyan Chen, Xuan Cheng

    Abstract: Parametric 3D models have enabled a wide variety of computer vision and graphics tasks, such as modeling human faces, bodies and hands. In 3D face modeling, 3DMM is the most widely used parametric model, but can't generate fine geometric details solely from identity and expression inputs. To tackle this limitation, we propose a neural parametric model named DNPM for the facial geometric details, w…

    Submitted 13 June, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

  12. arXiv:2405.14159  [pdf, other]

    cs.CL cs.AI

    Super Tiny Language Models

    Authors: Dylan Hillier, Leon Guertler, Cheston Tan, Palaash Agrawal, Chen Ruirui, Bobby Cheng

    Abstract: The rapid advancement of large language models (LLMs) has led to significant improvements in natural language processing but also poses challenges due to their high computational and energy demands. This paper introduces a series of research efforts focused on Super Tiny Language Models (STLMs), which aim to deliver high performance with significantly reduced parameter counts. We explore innovativ…

    Submitted 26 June, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

    Comments: 11 pages, 4 figures

    ACM Class: I.2.7

  13. arXiv:2405.09057  [pdf, other]

    cs.LG cond-mat.mtrl-sci physics.comp-ph

    Response Matching for generating materials and molecules

    Authors: Bingqing Cheng

    Abstract: Machine learning has recently emerged as a powerful tool for generating new molecular and material structures. The success of state-of-the-art models stems from their ability to incorporate physical symmetries, such as translation, rotation, and periodicity. Here, we present a novel generative method called Response Matching (RM), which leverages the fact that each stable material or molecule exis…

    Submitted 14 May, 2024; originally announced May 2024.

  14. arXiv:2405.06116  [pdf, other]

    cs.CV

    Rethinking Efficient and Effective Point-based Networks for Event Camera Classification and Regression: EventMamba

    Authors: Hongwei Ren, Yue Zhou, Jiadong Zhu, Haotian Fu, Yulong Huang, Xiaopeng Lin, Yuetong Fang, Fei Ma, Hao Yu, Bojun Cheng

    Abstract: Event cameras, drawing inspiration from biological systems, efficiently detect changes in ambient light with low latency and high dynamic range while consuming minimal power. The most current approach to processing event data often involves converting it into frame-based representations, which is well-established in traditional vision. However, this approach neglects the sparsity of event data, lo…

    Submitted 2 July, 2024; v1 submitted 9 May, 2024; originally announced May 2024.

    Comments: Extension Journal of TTPOINT and PEPNet, modify the dataset split method

  15. arXiv:2404.18599  [pdf, other]

    eess.IV cs.CV

    Self-supervised learning for classifying paranasal anomalies in the maxillary sinus

    Authors: Debayan Bhattacharya, Finn Behrendt, Benjamin Tobias Becker, Lennart Maack, Dirk Beyersdorff, Elina Petersen, Marvin Petersen, Bastian Cheng, Dennis Eggert, Christian Betz, Anna Sophie Hoffmann, Alexander Schlaefer

    Abstract: Purpose: Paranasal anomalies, frequently identified in routine radiological screenings, exhibit diverse morphological characteristics. Due to the diversity of anomalies, supervised learning methods require large labelled datasets exhibiting diverse anomaly morphology. Self-supervised learning (SSL) can be used to learn representations from unlabelled data. However, there are no SSL methods designed…

    Submitted 29 April, 2024; originally announced April 2024.

  16. arXiv:2404.12888  [pdf, other]

    cs.CV cs.GR cs.LG

    Learn2Talk: 3D Talking Face Learns from 2D Talking Face

    Authors: Yixiang Zhuang, Baoping Cheng, Yao Cheng, Yuntao Jin, Renshuai Liu, Chengyang Li, Xuan Cheng, Jing Liao, Juncong Lin

    Abstract: Speech-driven facial animation methods usually fall into two main classes, 3D and 2D talking face, both of which have attracted considerable research attention in recent years. However, to the best of our knowledge, the research on 3D talking face does not go as deep as that on 2D talking face in the aspects of lip-synchronization (lip-sync) and speech perception. To mind the gap between the two sub-fields, we prop…

    Submitted 19 April, 2024; originally announced April 2024.

  17. arXiv:2404.11770  [pdf, other]

    cs.CV cs.AI

    Event-Based Eye Tracking. AIS 2024 Challenge Survey

    Authors: Zuowen Wang, Chang Gao, Zongwei Wu, Marcos V. Conde, Radu Timofte, Shih-Chii Liu, Qinyu Chen, Zheng-jun Zha, Wei Zhai, Han Han, Bohao Liao, Yuliang Wu, Zengyu Wan, Zhong Wang, Yang Cao, Ganchao Tan, Jinze Chen, Yan Ru Pei, Sasskia Brüers, Sébastien Crouzet, Douglas McLelland, Oliver Coenen, Baoheng Zhang, Yizhao Gao, Jingyuan Li, et al. (14 additional authors not shown)

    Abstract: This survey reviews the AIS 2024 Event-Based Eye Tracking (EET) Challenge. The task of the challenge focuses on processing eye movement recorded with event cameras and predicting the pupil center of the eye. The challenge emphasizes efficient eye tracking with event cameras to achieve good task accuracy and efficiency trade-off. During the challenge period, 38 participants registered for the Kaggl…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: Qinyu Chen is the corresponding author

  18. arXiv:2404.09000  [pdf, other]

    eess.IV cs.CV cs.LG

    MaSkel: A Model for Human Whole-body X-rays Generation from Human Masking Images

    Authors: Yingjie Xi, Boyuan Cheng, Jingyao Cai, Jian Jun Zhang, Xiaosong Yang

    Abstract: The human whole-body X-rays could offer a valuable reference for various applications, including medical diagnostics, digital animation modeling, and ergonomic design. The traditional method of obtaining X-ray information requires the use of CT (Computed Tomography) scan machines, which emit potentially harmful radiation. Thus it faces a significant limitation for realistic applications because it…

    Submitted 13 April, 2024; originally announced April 2024.

  19. arXiv:2404.07663  [pdf, other]

    cs.DB cs.AI cs.LG

    Interactive Ontology Matching with Cost-Efficient Learning

    Authors: Bin Cheng, Jonathan Fürst, Tobias Jacobs, Celia Garrido-Hidalgo

    Abstract: The creation of high-quality ontologies is crucial for data integration and knowledge-based reasoning, specifically in the context of the rising data economy. However, automatic ontology matchers are often bound to the heuristics they are based on, leaving many matches unidentified. Interactive ontology matching systems involving human experts have been introduced, but they do not solve the fundam…

    Submitted 11 April, 2024; originally announced April 2024.

  20. arXiv:2404.04645  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    HyperTTS: Parameter Efficient Adaptation in Text to Speech using Hypernetworks

    Authors: Yingting Li, Rishabh Bhardwaj, Ambuj Mehrish, Bo Cheng, Soujanya Poria

    Abstract: Neural speech synthesis, or text-to-speech (TTS), aims to transform a signal from the text domain to the speech domain. While developing TTS architectures that train and test on the same set of speakers has seen significant improvements, out-of-domain speaker performance still faces enormous limitations. Domain adaptation on a new set of speakers can be achieved by fine-tuning the whole model for…

    Submitted 6 April, 2024; originally announced April 2024.

  21. arXiv:2404.00569  [pdf, other]

    cs.SD cs.CL eess.AS

    CM-TTS: Enhancing Real Time Text-to-Speech Synthesis Efficiency through Weighted Samplers and Consistency Models

    Authors: Xiang Li, Fan Bu, Ambuj Mehrish, Yingting Li, Jiale Han, Bo Cheng, Soujanya Poria

    Abstract: Neural Text-to-Speech (TTS) systems find broad applications in voice assistants, e-learning, and audiobook creation. The pursuit of modern models, like Diffusion Models (DMs), holds promise for achieving high-fidelity, real-time speech synthesis. Yet, the efficiency of multi-step sampling in Diffusion Models presents challenges. Efforts have been made to integrate GANs with DMs, speeding up infere…

    Submitted 31 March, 2024; originally announced April 2024.

    Comments: Accepted by Findings of NAACL 2024. Code is available at https://github.com/XiangLi2022/CM-TTS

  22. arXiv:2403.19412  [pdf, other]

    cs.CV

    A Simple and Effective Point-based Network for Event Camera 6-DOFs Pose Relocalization

    Authors: Hongwei Ren, Jiadong Zhu, Yue Zhou, Haotian Fu, Yulong Huang, Bojun Cheng

    Abstract: Event cameras exhibit remarkable attributes such as high dynamic range, asynchronicity, and low latency, making them highly suitable for vision tasks that involve high-speed motion in challenging lighting conditions. These cameras implicitly capture movement and depth information in events, making them appealing sensors for Camera Pose Relocalization (CPR) tasks. Nevertheless, existing CPR network…

    Submitted 28 March, 2024; originally announced March 2024.

    Comments: Accepted by CVPR 2024

  23. arXiv:2403.15917  [pdf, other]

    cs.SE

    Who Uses Personas in Requirements Engineering: The Practitioners' Perspective

    Authors: Yi Wang, Chetan Arora, Xiao Liu, Thuong Hoang, Vasudha Malhotra, Ben Cheng, John Grundy

    Abstract: Personas are commonly used in software projects to gain a better understanding of end-users' needs. However, there is a limited understanding of their usage and effectiveness in practice. This paper presents the results of a two-step investigation, comprising interviews with 26 software developers, UI/UX designers, business analysts and product managers and a survey of 203 practitioners, aimed at…

    Submitted 23 March, 2024; originally announced March 2024.

  24. arXiv:2403.00128  [pdf, other]

    cs.RO cs.LG eess.SY

    From Flies to Robots: Inverted Landing in Small Quadcopters with Dynamic Perching

    Authors: Bryan Habas, Bo Cheng

    Abstract: Inverted landing is a routine behavior among a number of animal fliers. However, mastering this feat poses a considerable challenge for robotic fliers, especially to perform dynamic perching with rapid body rotations (or flips) and landing against gravity. Inverted landing in flies has suggested that optical flow senses are closely linked to the precise triggering and control of body flips that l…

    Submitted 29 February, 2024; originally announced March 2024.

    Comments: 17 pages, 19 Figures, Journal paper currently under review

  25. arXiv:2402.15713  [pdf, other]

    cs.CL cs.AI

    Making Pre-trained Language Models Better Continual Few-Shot Relation Extractors

    Authors: Shengkun Ma, Jiale Han, Yi Liang, Bo Cheng

    Abstract: Continual Few-shot Relation Extraction (CFRE) is a practical problem that requires the model to continuously learn novel relations while avoiding forgetting old ones with few labeled training data. The primary challenges are catastrophic forgetting and overfitting. This paper harnesses prompt learning to explore the implicit capabilities of pre-trained language models to address the above two chal…

    Submitted 23 February, 2024; originally announced February 2024.

    Comments: Accepted as COLING2024

  26. arXiv:2402.13252  [pdf, other]

    cs.CV

    Improving Robustness for Joint Optimization of Camera Poses and Decomposed Low-Rank Tensorial Radiance Fields

    Authors: Bo-Yu Cheng, Wei-Chen Chiu, Yu-Lun Liu

    Abstract: In this paper, we propose an algorithm that allows joint refinement of camera pose and scene geometry represented by a decomposed low-rank tensor, using only 2D images as supervision. First, we conduct a pilot study based on a 1D signal and relate our findings to 3D scenarios, where the naive joint pose optimization on voxel-based NeRFs can easily lead to sub-optimal solutions. Moreover, based on th…

    Submitted 20 February, 2024; originally announced February 2024.

    Comments: AAAI 2024. Project page: https://alex04072000.github.io/Joint-TensoRF/

  27. arXiv:2402.07472  [pdf, other]

    physics.comp-ph cs.LG physics.chem-ph

    Cartesian atomic cluster expansion for machine learning interatomic potentials

    Authors: Bingqing Cheng

    Abstract: Machine learning interatomic potentials are revolutionizing large-scale, accurate atomistic modelling in material science and chemistry. Many potentials use atomic cluster expansion or equivariant message passing frameworks. Such frameworks typically use spherical harmonics as angular basis functions, and then use Clebsch-Gordan contraction to maintain rotational symmetry, which may introduce redu…

    Submitted 29 July, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

  28. arXiv:2402.04663  [pdf, other]

    cs.NE

    CLIF: Complementary Leaky Integrate-and-Fire Neuron for Spiking Neural Networks

    Authors: Yulong Huang, Xiaopeng Lin, Hongwei Ren, Haotian Fu, Yue Zhou, Zunchang Liu, Biao Pan, Bojun Cheng

    Abstract: Spiking neural networks (SNNs) are promising brain-inspired energy-efficient models. Compared to conventional deep Artificial Neural Networks (ANNs), SNNs exhibit superior efficiency and capability to process temporal information. However, it remains a challenge to train SNNs due to their undifferentiable spiking mechanism. The surrogate gradients method is commonly used to train SNNs, but often c…

    Submitted 14 July, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

  29. arXiv:2401.11644  [pdf, other]

    cs.CV cs.RO

    Friends Across Time: Multi-Scale Action Segmentation Transformer for Surgical Phase Recognition

    Authors: Bokai Zhang, Jiayuan Meng, Bin Cheng, Dean Biskup, Svetlana Petculescu, Angela Chapman

    Abstract: Automatic surgical phase recognition is a core technology for modern operating rooms and online surgical video assessment platforms. Current state-of-the-art methods use both spatial and temporal information to tackle the surgical phase recognition task. Building on this idea, we propose the Multi-Scale Action Segmentation Transformer (MS-AST) for offline surgical phase recognition and the Multi-S…

    Submitted 21 January, 2024; originally announced January 2024.

  30. arXiv:2401.02737  [pdf, other]

    cs.SE

    The Vulnerability Is in the Details: Locating Fine-grained Information of Vulnerable Code Identified by Graph-based Detectors

    Authors: Baijun Cheng, Kailong Wang, Cuiyun Gao, Xiapu Luo, Yulei Sui, Li Li, Yao Guo, Xiangqun Chen, Haoyu Wang

    Abstract: Vulnerability detection is a crucial component in the software development lifecycle. Existing vulnerability detectors, especially those based on deep learning (DL) models, have achieved high effectiveness. Despite their capability of detecting vulnerable code snippets from given code fragments, the detectors are typically unable to further locate the fine-grained information pertaining to the vul…

    Submitted 21 February, 2024; v1 submitted 5 January, 2024; originally announced January 2024.

  31. arXiv:2401.02686  [pdf, other]

    cs.CR cs.LG cs.SE

    Beyond Fidelity: Explaining Vulnerability Localization of Learning-based Detectors

    Authors: Baijun Cheng, Shengming Zhao, Kailong Wang, Meizhen Wang, Guangdong Bai, Ruitao Feng, Yao Guo, Lei Ma, Haoyu Wang

    Abstract: Vulnerability detectors based on deep learning (DL) models have proven their effectiveness in recent years. However, the shroud of opacity surrounding the decision-making process of these detectors makes it difficult for security analysts to comprehend. To address this, various explanation approaches have been proposed to explain the predictions by highlighting important features, which have been…

    Submitted 21 February, 2024; v1 submitted 5 January, 2024; originally announced January 2024.

    Comments: Accepted by TOSEM

  32. arXiv:2401.01585  [pdf, other]

    cs.DC cs.DM

    On Completely Edge-Independent Spanning Trees in Locally Twisted Cubes

    Authors: Xiaorui Li, Baolei Cheng, Jianxi Fan, Yan Wang, Dajin Wang

    Abstract: A network can contain numerous spanning trees. If two spanning trees $T_i,T_j$ do not share any common edges, $T_i$ and $T_j$ are said to be pairwisely edge-disjoint. For spanning trees $T_1, T_2, ..., T_m$, if every two of them are pairwisely edge-disjoint, they are called completely edge-independent spanning trees (CEISTs for short). CEISTs can facilitate many network functionalities, and constr…

    Submitted 1 August, 2024; v1 submitted 3 January, 2024; originally announced January 2024.

  33. arXiv:2312.00296  [pdf, other]

    cs.LG stat.ML

    Towards Aligned Canonical Correlation Analysis: Preliminary Formulation and Proof-of-Concept Results

    Authors: Biqian Cheng, Evangelos E. Papalexakis, Jia Chen

    Abstract: Canonical Correlation Analysis (CCA) has been widely applied to jointly embed multiple views of data in a maximally correlated latent space. However, the alignment between various data perspectives, which is required by traditional approaches, is unclear in many practical cases. In this work we propose a new framework Aligned Canonical Correlation Analysis (ACCA), to address this challenge by iter…

    Submitted 7 December, 2023; v1 submitted 30 November, 2023; originally announced December 2023.

    Comments: 4 pages, 7 figures, KDD SoCal symposium 2023 (extended version)

  34. arXiv:2310.07572  [pdf, other]

    cs.CV

    Impact of Label Types on Training SWIN Models with Overhead Imagery

    Authors: Ryan Ford, Kenneth Hutchison, Nicholas Felts, Benjamin Cheng, Jesse Lew, Kyle Jackson

    Abstract: Understanding the impact of data set design on model training and performance can help alleviate the costs associated with generating remote sensing and overhead labeled data. This work examined the impact of training shifted window transformers using bounding boxes and segmentation labels, where the latter are more expensive to produce. We examined classification tasks by comparing models trained…

    Submitted 11 October, 2023; originally announced October 2023.

  35. arXiv:2310.07189  [pdf, other]

    cs.CV

    SpikePoint: An Efficient Point-based Spiking Neural Network for Event Cameras Action Recognition

    Authors: Hongwei Ren, Yue Zhou, Yulong Huang, Haotian Fu, Xiaopeng Lin, Jie Song, Bojun Cheng

    Abstract: Event cameras are bio-inspired sensors that respond to local changes in light intensity and feature low latency, high energy efficiency, and high dynamic range. Meanwhile, Spiking Neural Networks (SNNs) have gained significant attention due to their remarkable efficiency and fault tolerance. By synergistically harnessing the energy efficiency inherent in event cameras and the spike-based processin…

    Submitted 23 January, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

    Comments: Accepted by ICLR 2024 (Spotlight)

  36. arXiv:2310.06470  [pdf, other]

    cs.CV

    Focus on Local Regions for Query-based Object Detection

    Authors: Hongbin Xu, Yamei Xia, Shuai Zhao, Bo Cheng

    Abstract: Query-based methods have garnered significant attention in object detection since the advent of DETR, the pioneering query-based detector. However, these methods face challenges like slow convergence and suboptimal performance. Notably, self-attention in object detection often hampers convergence due to its global focus. To address these issues, we propose FoLR, a transformer-like architecture wit…

    Submitted 13 December, 2023; v1 submitted 10 October, 2023; originally announced October 2023.

  37. arXiv:2310.00288  [pdf]

    cs.AR cs.ET eess.SY physics.app-ph

    Parallel in-memory wireless computing

    Authors: Cong Wang, Gong-Jie Ruan, Zai-Zheng Yang, Xing-Jian Yangdong, Yixiang Li, Liang Wu, Yingmeng Ge, Yichen Zhao, Chen Pan, Wei Wei, Li-Bo Wang, Bin Cheng, Zaichen Zhang, Chuan Zhang, Shi-Jun Liang, Feng Miao

    Abstract: Parallel wireless digital communication with ultralow power consumption is critical for emerging edge technologies such as 5G and Internet of Things. However, the physical separation between digital computing units and analogue transmission units in traditional wireless technology leads to high power consumption. Here we report a parallel in-memory wireless computing scheme. The approach combines…

    Submitted 30 September, 2023; originally announced October 2023.

    Journal ref: Nat Electron 6, 381-389 (2023)

  38. TTPOINT: A Tensorized Point Cloud Network for Lightweight Action Recognition with Event Cameras

    Authors: Hongwei Ren, Yue Zhou, Haotian Fu, Yulong Huang, Renjing Xu, Bojun Cheng

    Abstract: Event cameras have gained popularity in computer vision due to their data sparsity, high dynamic range, and low latency. As a bio-inspired sensor, event cameras generate sparse and asynchronous data, which is inherently incompatible with the traditional frame-based method. Alternatively, the point-based method can avoid additional modality transformation and naturally adapt to the sparsity of even…

    Submitted 19 August, 2023; originally announced August 2023.

  39. arXiv:2307.14352  [pdf, other]

    cs.CV

    General Image-to-Image Translation with One-Shot Image Guidance

    Authors: Bin Cheng, Zuhao Liu, Yunbo Peng, Yue Lin

    Abstract: Large-scale text-to-image models pre-trained on massive text-image pairs have recently shown excellent performance in image synthesis. However, images can provide more intuitive visual concepts than plain text. People may ask: how can we integrate the desired visual concept into an existing image, such as our portrait? Current methods are inadequate in meeting this demand as they lack the ability to pres…

    Submitted 20 September, 2023; v1 submitted 20 July, 2023; originally announced July 2023.

    Comments: accepted by ICCV 2023

  40. arXiv:2307.08178  [pdf]

    cs.RO

    Robot motor learning shows emergence of frequency-modulated, robust swimming with an invariant Strouhal-number

    Authors: Hankun Deng, Donghao Li, Colin Nitroy, Andrew Wertz, Shashank Priya, Bo Cheng

    Abstract: Fish locomotion emerges from a diversity of interactions among deformable structures, surrounding fluids and neuromuscular activations, i.e., fluid-structure interactions (FSI) controlled by fish's motor systems. Previous studies suggested that such motor-controlled FSI may possess embodied traits. However, their implications in motor learning, neuromuscular control, gait generation, and swimming…

    Submitted 16 July, 2023; originally announced July 2023.

  41. SigOpt Mulch: An Intelligent System for AutoML of Gradient Boosted Trees

    Authors: Aleksei Sorokin, Xinran Zhu, Eric Hans Lee, Bolong Cheng

    Abstract: Gradient boosted trees (GBTs) are ubiquitous models used by researchers, machine learning (ML) practitioners, and data scientists because of their robust performance, interpretable behavior, and ease-of-use. One critical challenge in training GBTs is the tuning of their hyperparameters. In practice, selecting these hyperparameters is often done manually. Recently, the ML community has advocated fo…

    Submitted 10 July, 2023; originally announced July 2023.

    Journal ref: Knowledge-Based Systems Volume 273, 3 August 2023, 110604

  42. Achieving Diversity in Objective Space for Sample-efficient Search of Multiobjective Optimization Problems

    Authors: Eric Hans Lee, Bolong Cheng, Michael McCourt

    Abstract: Efficiently solving multi-objective optimization problems for simulation optimization of important scientific and engineering applications such as materials design is becoming an increasingly important research topic. This is due largely to the expensive costs associated with said applications, and the resulting need for sample-efficient, multiobjective optimization methods that efficiently explor…

    Submitted 23 June, 2023; originally announced June 2023.

    Journal ref: 2022 Winter Simulation Conference

  43. arXiv:2306.01492  [pdf, other]

    cs.SE

    Multi-Modal Emotion Recognition for Enhanced Requirements Engineering: A Novel Approach

    Authors: Ben Cheng, Chetan Arora, Xiao Liu, Thuong Hoang, Yi Wang, John Grundy

    Abstract: Requirements engineering (RE) plays a crucial role in developing software systems by bridging the gap between stakeholders' needs and system specifications. However, effective communication and elicitation of stakeholder requirements can be challenging, as traditional RE methods often overlook emotional cues. This paper introduces a multi-modal emotion recognition platform (MEmoRE) to enhance the…

    Submitted 2 June, 2023; originally announced June 2023.

  44. arXiv:2304.06274  [pdf, other

    cs.CV

    EWT: Efficient Wavelet-Transformer for Single Image Denoising

    Authors: Juncheng Li, Bodong Cheng, Ying Chen, Guangwei Gao, Tieyong Zeng

    Abstract: Transformer-based image denoising methods have achieved encouraging results in the past year. However, they must use linear operations to model long-range dependencies, which greatly increases model inference time and consumes GPU storage space. Compared with convolutional neural network-based methods, current Transformer-based image denoising methods cannot achieve a balance between performance im…

    Submitted 13 April, 2023; originally announced April 2023.

    Comments: 12 pages, 11 figures

  45. arXiv:2303.17915  [pdf, other

    eess.IV cs.CV

    Multiple Instance Ensembling For Paranasal Anomaly Classification In The Maxillary Sinus

    Authors: Debayan Bhattacharya, Finn Behrendt, Benjamin Tobias Becker, Dirk Beyersdorff, Elina Petersen, Marvin Petersen, Bastian Cheng, Dennis Eggert, Christian Betz, Anna Sophie Hoffmann, Alexander Schlaefer

    Abstract: Paranasal anomalies are commonly discovered during routine radiological screenings and can present with a wide range of morphological features. This diversity can make it difficult for convolutional neural networks (CNNs) to accurately classify these anomalies, especially when working with limited datasets. Additionally, current approaches to paranasal anomaly classification are constrained to ide…

    Submitted 31 March, 2023; originally announced March 2023.

  46. arXiv:2303.14408  [pdf, other

    cs.CV

    VL-SAT: Visual-Linguistic Semantics Assisted Training for 3D Semantic Scene Graph Prediction in Point Cloud

    Authors: Ziqin Wang, Bowen Cheng, Lichen Zhao, Dong Xu, Yang Tang, Lu Sheng

    Abstract: The task of 3D semantic scene graph (3DSSG) prediction in the point cloud is challenging since (1) the 3D point cloud only captures geometric structures with limited semantics compared to 2D images, and (2) long-tailed relation distribution inherently hinders the learning of unbiased prediction. Since 2D images provide rich semantics and scene graphs are in nature coped with languages, in this stu…

    Submitted 25 March, 2023; originally announced March 2023.

    Comments: CVPR2023 Highlight

  47. arXiv:2301.01850  [pdf

    stat.AP cs.AI cs.LG math.PR

    Bayesian Weapon System Reliability Modeling with Cox-Weibull Neural Network

    Authors: Michael Potter, Benny Cheng

    Abstract: We propose to integrate weapon system features (such as weapon system manufacturer, deployment time and location, storage time and location, etc.) into a parameterized Cox-Weibull [1] reliability model via a neural network, like DeepSurv [2], to improve predictive maintenance. In parallel, we develop an alternative Bayesian model by parameterizing the Weibull parameters with a neural network and e…

    Submitted 14 April, 2023; v1 submitted 4 January, 2023; originally announced January 2023.

    Comments: Pre-print with minor revisions, published at The 69th Annual Reliability and Maintainability Symposium, January 23-26, 2023, FL, USA

  48. arXiv:2212.04450  [pdf, other

    physics.chem-ph cond-mat.mtrl-sci cs.LG

    GAUCHE: A Library for Gaussian Processes in Chemistry

    Authors: Ryan-Rhys Griffiths, Leo Klarner, Henry B. Moss, Aditya Ravuri, Sang Truong, Samuel Stanton, Gary Tom, Bojana Rankovic, Yuanqi Du, Arian Jamasb, Aryan Deshwal, Julius Schwartz, Austin Tripp, Gregory Kell, Simon Frieder, Anthony Bourached, Alex Chan, Jacob Moss, Chengzhi Guo, Johannes Durholt, Saudamini Chaurasia, Felix Strieth-Kalthoff, Alpha A. Lee, Bingqing Cheng, Alán Aspuru-Guzik, et al. (2 additional authors not shown)

    Abstract: We introduce GAUCHE, a library for GAUssian processes in CHEmistry. Gaussian processes have long been a cornerstone of probabilistic machine learning, affording particular advantages for uncertainty quantification and Bayesian optimisation. Extending Gaussian processes to chemical representations, however, is nontrivial, necessitating kernels defined over structured inputs such as graphs, strings…

    Submitted 21 February, 2023; v1 submitted 6 December, 2022; originally announced December 2022.

  49. arXiv:2211.08888  [pdf, other

    cs.CV

    ELDA: Using Edges to Have an Edge on Semantic Segmentation Based UDA

    Authors: Ting-Hsuan Liao, Huang-Ru Liao, Shan-Ya Yang, Jie-En Yao, Li-Yuan Tsao, Hsu-Shen Liu, Bo-Wun Cheng, Chen-Hao Chao, Chia-Che Chang, Yi-Chen Lo, Chun-Yi Lee

    Abstract: Many unsupervised domain adaptation (UDA) methods have been proposed to bridge the domain gap by utilizing domain invariant information. Most approaches have chosen depth as such information and achieved remarkable success. Despite their effectiveness, using depth as domain invariant information in UDA tasks may lead to multiple issues, such as excessively high extraction costs and difficulties in…

    Submitted 16 November, 2022; originally announced November 2022.

    Comments: Accepted by BMVC2022. Ting-Hsuan Liao and Huang-Ru Liao contributed equally to this work

  50. arXiv:2210.17313  [pdf, ps, other

    eess.SY cs.AI math.OC

    Discrete Communication and Control Updating in Event-Triggered Consensus

    Authors: Bin Cheng, Yuezu Lv, Zhongkui Li, Zhisheng Duan

    Abstract: This paper studies the consensus control problem faced with three essential demands, namely, discrete control updating for each agent, discrete-time communications among neighboring agents, and the fully distributed fashion of the controller implementation without requiring any global information of the whole network topology. Noting that the existing related results only meeting one or two demand…

    Submitted 26 October, 2022; originally announced October 2022.