Showing 1–50 of 257 results for author: Ma, R

Searching in archive cs.
  1. arXiv:2407.01718  [pdf, other]

    stat.ML cs.LG math.ST

    Entropic Optimal Transport Eigenmaps for Nonlinear Alignment and Joint Embedding of High-Dimensional Datasets

    Authors: Boris Landa, Yuval Kluger, Rong Ma

    Abstract: Embedding high-dimensional data into a low-dimensional space is an indispensable component of data analysis. In numerous applications, it is necessary to align and jointly embed multiple datasets from different studies or experimental conditions. Such datasets may share underlying structures of interest but exhibit individual distortions, resulting in misaligned embeddings using traditional techni…

    Submitted 1 July, 2024; originally announced July 2024.

  2. arXiv:2407.01130  [pdf, other]

    cs.CL

    Cross-Lingual Transfer Learning for Speech Translation

    Authors: Rao Ma, Yassir Fathullah, Mengjie Qian, Siyuan Tang, Mark Gales, Kate Knill

    Abstract: There has been increasing interest in building multilingual foundation models for NLP and speech research. Zero-shot cross-lingual transfer has been demonstrated on a range of NLP tasks where a model fine-tuned on task-specific data in one language yields performance gains in other languages. Here, we explore whether speech-based models exhibit the same transfer capability. Using Whisper as an exa…

    Submitted 1 July, 2024; originally announced July 2024.

  3. arXiv:2407.00412  [pdf, other]

    cs.RO cs.IT cs.MA cs.NI

    C-MASS: Combinatorial Mobility-Aware Sensor Scheduling for Collaborative Perception with Second-Order Topology Approximation

    Authors: Yukuan Jia, Yuxuan Sun, Ruiqing Mao, Zhaojun Nan, Sheng Zhou, Zhisheng Niu

    Abstract: Collaborative Perception (CP) has been a promising solution to address occlusions in the traffic environment by sharing sensor data among collaborative vehicles (CoV) via vehicle-to-everything (V2X) network. With limited wireless bandwidth, CP necessitates task-oriented and receiver-aware sensor scheduling to prioritize important and complementary sensor data. However, due to vehicular mobility, i…

    Submitted 29 June, 2024; originally announced July 2024.

    Comments: 14 pages, 10 figures

  4. arXiv:2406.09876  [pdf, other]

    cs.LG stat.ML

    Sailing in high-dimensional spaces: Low-dimensional embeddings through angle preservation

    Authors: Jonas Fischer, Rong Ma

    Abstract: Low-dimensional embeddings (LDEs) of high-dimensional data are ubiquitous in science and engineering. They allow us to quickly understand the main properties of the data, identify outliers and processing errors, and inform the next steps of data analysis. As such, LDEs have to be faithful to the original high-dimensional data, i.e., they should represent the relationships that are encoded in the d…

    Submitted 14 June, 2024; originally announced June 2024.

  5. arXiv:2406.07852  [pdf, other]

    cs.CV

    DiffPop: Plausibility-Guided Object Placement Diffusion for Image Composition

    Authors: Jiacheng Liu, Hang Zhou, Shida Wei, Rui Ma

    Abstract: In this paper, we address the problem of plausible object placement for the challenging task of realistic image composition. We propose DiffPop, the first framework that utilizes plausibility-guided denoising diffusion probabilistic model to learn the scale and spatial relations among multiple objects and the corresponding scene image. First, we train an unguided diffusion model to directly learn…

    Submitted 11 June, 2024; originally announced June 2024.

  6. arXiv:2406.06022  [pdf, other]

    cs.LG cs.DC

    GraphStorm: all-in-one graph machine learning framework for industry applications

    Authors: Da Zheng, Xiang Song, Qi Zhu, Jian Zhang, Theodore Vasiloudis, Runjie Ma, Houyu Zhang, Zichen Wang, Soji Adeshina, Israt Nisa, Alejandro Mottini, Qingjun Cui, Huzefa Rangwala, Belinda Zeng, Christos Faloutsos, George Karypis

    Abstract: Graph machine learning (GML) is effective in many business applications. However, making GML easy to use and applicable to industry applications with massive datasets remain challenging. We developed GraphStorm, which provides an end-to-end solution for scalable graph construction, graph model training and inference. GraphStorm has the following desirable properties: (a) Easy to use: it can perfor…

    Submitted 10 June, 2024; originally announced June 2024.

    Journal ref: KDD 2024

  7. arXiv:2406.04373  [pdf, other]

    cs.SE cs.AI

    VerilogReader: LLM-Aided Hardware Test Generation

    Authors: Ruiyang Ma, Yuxin Yang, Ziqian Liu, Jiaxi Zhang, Min Li, Junhua Huang, Guojie Luo

    Abstract: Test generation has been a critical and labor-intensive process in hardware design verification. Recently, the emergence of Large Language Model (LLM) with their advanced understanding and inference capabilities, has introduced a novel approach. In this work, we investigate the integration of LLM into the Coverage Directed Test Generation (CDG) process, where the LLM functions as a Verilog Reader.…

    Submitted 3 June, 2024; originally announced June 2024.

  8. arXiv:2406.03086  [pdf, other]

    cs.MA cs.IT cs.LG

    Task-Oriented Wireless Communications for Collaborative Perception in Intelligent Unmanned Systems

    Authors: Sheng Zhou, Yukuan Jia, Ruiqing Mao, Zhaojun Nan, Yuxuan Sun, Zhisheng Niu

    Abstract: Collaborative Perception (CP) has shown great potential to achieve more holistic and reliable environmental perception in intelligent unmanned systems (IUSs). However, implementing CP still faces key challenges due to the characteristics of the CP task and the dynamics of wireless channels. In this article, a task-oriented wireless communication framework is proposed to jointly optimize the commun…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: Accepted by IEEE Network Magazine

  9. arXiv:2406.00276  [pdf]

    cs.LG cs.AI cs.CE physics.data-an

    Non-destructive Degradation Pattern Decoupling for Ultra-early Battery Prototype Verification Using Physics-informed Machine Learning

    Authors: Shengyu Tao, Mengtian Zhang, Zixi Zhao, Haoyang Li, Ruifei Ma, Yunhong Che, Xin Sun, Lin Su, Xiangyu Chen, Zihao Zhou, Heng Chang, Tingwei Cao, Xiao Xiao, Yaojun Liu, Wenjun Yu, Zhongling Xu, Yang Li, Han Hao, Xuan Zhang, Xiaosong Hu, Guangmin Zhou

    Abstract: Manufacturing complexities and uncertainties have impeded the transition from material prototypes to commercial batteries, making prototype verification critical to quality assessment. A fundamental challenge involves deciphering intertwined chemical processes to characterize degradation patterns and their quantitative relationship with battery performance. Here we show that a physics-informed mac…

    Submitted 31 May, 2024; originally announced June 2024.

    ACM Class: J.2; G.3

  10. GaussianPrediction: Dynamic 3D Gaussian Prediction for Motion Extrapolation and Free View Synthesis

    Authors: Boming Zhao, Yuan Li, Ziyu Sun, Lin Zeng, Yujun Shen, Rui Ma, Yinda Zhang, Hujun Bao, Zhaopeng Cui

    Abstract: Forecasting future scenarios in dynamic environments is essential for intelligent decision-making and navigation, a challenge yet to be fully realized in computer vision and robotics. Traditional approaches like video prediction and novel-view synthesis either lack the ability to forecast from arbitrary viewpoints or to predict temporal dynamics. In this paper, we introduce GaussianPrediction, a n…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: Accepted to SIGGRAPH 2024 Conference. Project Page: https://zju3dv.github.io/gaussian-prediction/

  11. arXiv:2405.15305  [pdf, other]

    cs.CV

    Diff3DS: Generating View-Consistent 3D Sketch via Differentiable Curve Rendering

    Authors: Yibo Zhang, Lihong Wang, Changqing Zou, Tieru Wu, Rui Ma

    Abstract: 3D sketches are widely used for visually representing the 3D shape and structure of objects or scenes. However, the creation of 3D sketch often requires users to possess professional artistic skills. Existing research efforts primarily focus on enhancing the ability of interactive sketch generation in 3D virtual systems. In this work, we propose Diff3DS, a novel differentiable rendering framework…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: Project: https://yiboz2001.github.io/Diff3DS/

  12. arXiv:2405.12317  [pdf, other]

    stat.ML cs.LG

    Kernel spectral joint embeddings for high-dimensional noisy datasets using duo-landmark integral operators

    Authors: Xiucai Ding, Rong Ma

    Abstract: Integrative analysis of multiple heterogeneous datasets has become standard practice in many research fields, especially in single-cell genomics and medical informatics. Existing approaches oftentimes suffer from limited power in capturing nonlinear structures, insufficient account of noisiness and effects of high-dimensionality, lack of adaptivity to signals and sample sizes imbalance, and their…

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: 32 pages, 5 figures; comments are welcome

  13. arXiv:2405.10633  [pdf, other]

    cs.LG

    Harnessing Collective Structure Knowledge in Data Augmentation for Graph Neural Networks

    Authors: Rongrong Ma, Guansong Pang, Ling Chen

    Abstract: Graph neural networks (GNNs) have achieved state-of-the-art performance in graph representation learning. Message passing neural networks, which learn representations through recursively aggregating information from each node and its neighbors, are among the most commonly-used GNNs. However, a wealth of structural information of individual nodes and full graphs is often ignored in such process, wh…

    Submitted 17 May, 2024; originally announced May 2024.

  14. arXiv:2405.06134  [pdf, other]

    cs.CL cs.SD eess.AS

    Muting Whisper: A Universal Acoustic Adversarial Attack on Speech Foundation Models

    Authors: Vyas Raina, Rao Ma, Charles McGhee, Kate Knill, Mark Gales

    Abstract: Recent developments in large speech foundation models like Whisper have led to their widespread use in many automatic speech recognition (ASR) applications. These systems incorporate `special tokens' in their vocabulary, such as $\texttt{<endoftext>}$, to guide their language generation process. However, we demonstrate that these tokens can be exploited by adversarial attacks to manipulate the mod…

    Submitted 9 May, 2024; originally announced May 2024.

  15. arXiv:2405.04903  [pdf, other]

    cs.LG

    Imbalanced Graph Classification with Multi-scale Oversampling Graph Neural Networks

    Authors: Rongrong Ma, Guansong Pang, Ling Chen

    Abstract: One main challenge in imbalanced graph classification is to learn expressive representations of the graphs in under-represented (minority) classes. Existing generic imbalanced learning methods, such as oversampling and imbalanced learning loss functions, can be adopted for enabling graph representation learning models to cope with this challenge. However, these methods often directly operate on th…

    Submitted 17 May, 2024; v1 submitted 8 May, 2024; originally announced May 2024.

  16. arXiv:2405.04674  [pdf, other]

    cs.DB

    Towards Accurate and Efficient Document Analytics with Large Language Models

    Authors: Yiming Lin, Madelon Hulsebos, Ruiying Ma, Shreya Shankar, Sepanta Zeigham, Aditya G. Parameswaran, Eugene Wu

    Abstract: Unstructured data formats account for over 80% of the data currently stored, and extracting value from such formats remains a considerable challenge. In particular, current approaches for managing unstructured documents do not support ad-hoc analytical queries on document collections. Moreover, Large Language Models (LLMs) directly applied to the documents themselves, or on portions of documents t…

    Submitted 7 May, 2024; originally announced May 2024.

  17. arXiv:2405.01312  [pdf, other]

    cs.DB cs.CR

    Privacy-Enhanced Database Synthesis for Benchmark Publishing

    Authors: Yongrui Zhong, Yunqing Ge, Jianbin Qin, Shuyuan Zheng, Bo Tang, Yu-Xuan Qiu, Rui Mao, Ye Yuan, Makoto Onizuka, Chuan Xiao

    Abstract: Benchmarking is crucial for evaluating a DBMS, yet existing benchmarks often fail to reflect the varied nature of user workloads. As a result, there is increasing momentum toward creating databases that incorporate real-world user data to more accurately mirror business environments. However, privacy concerns deter users from directly sharing their data, underscoring the importance of creating syn…

    Submitted 2 May, 2024; originally announced May 2024.

  18. arXiv:2405.01202  [pdf, other]

    cs.SE cs.CR

    DLAP: A Deep Learning Augmented Large Language Model Prompting Framework for Software Vulnerability Detection

    Authors: Yanjing Yang, Xin Zhou, Runfeng Mao, Jinwei Xu, Lanxin Yang, Yu Zhangm, Haifeng Shen, He Zhang

    Abstract: Software vulnerability detection is generally supported by automated static analysis tools, which have recently been reinforced by deep learning (DL) models. However, despite the superior performance of DL-based approaches over rule-based ones in research, applying DL approaches to software vulnerability detection in practice remains a challenge due to the complex structure of source code, the bla…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: 15 pages, 8 figures

  19. arXiv:2404.18598  [pdf, other]

    cs.CV cs.GR

    Anywhere: A Multi-Agent Framework for Reliable and Diverse Foreground-Conditioned Image Inpainting

    Authors: Tianyidan Xie, Rui Ma, Qian Wang, Xiaoqian Ye, Feixuan Liu, Ying Tai, Zhenyu Zhang, Zili Yi

    Abstract: Recent advancements in image inpainting, particularly through diffusion modeling, have yielded promising outcomes. However, when tested in scenarios involving the completion of images based on the foreground objects, current methods that aim to inpaint an image in an end-to-end manner encounter challenges such as "over-imagination", inconsistency between foreground and background, and limited dive…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: 16 pages, 9 figures, project page: https://anywheremultiagent.github.io

  20. arXiv:2404.18359  [pdf, other]

    cs.CL cs.AI

    FoundaBench: Evaluating Chinese Fundamental Knowledge Capabilities of Large Language Models

    Authors: Wei Li, Ren Ma, Jiang Wu, Chenya Gu, Jiahui Peng, Jinyang Len, Songyang Zhang, Hang Yan, Dahua Lin, Conghui He

    Abstract: In the burgeoning field of large language models (LLMs), the assessment of fundamental knowledge remains a critical challenge, particularly for models tailored to Chinese language and culture. This paper introduces FoundaBench, a pioneering benchmark designed to rigorously evaluate the fundamental knowledge capabilities of Chinese LLMs. FoundaBench encompasses a diverse array of 3354 multiple-choi…

    Submitted 28 April, 2024; originally announced April 2024.

  21. arXiv:2404.13862  [pdf, other]

    cs.CV

    PGAHum: Prior-Guided Geometry and Appearance Learning for High-Fidelity Animatable Human Reconstruction

    Authors: Hao Wang, Qingshan Xu, Hongyuan Chen, Rui Ma

    Abstract: Recent techniques on implicit geometry representation learning and neural rendering have shown promising results for 3D clothed human reconstruction from sparse video inputs. However, it is still challenging to reconstruct detailed surface geometry and even more difficult to synthesize photorealistic novel views with animated human poses. In this work, we introduce PGAHum, a prior-guided geometry…

    Submitted 22 April, 2024; originally announced April 2024.

  22. arXiv:2404.12458  [pdf]

    cs.AI

    The collective use and evaluation of generative AI tools in digital humanities research: Survey-based results

    Authors: Meredith Dedema, Rongqian Ma

    Abstract: The advent of generative artificial intelligence (GenAI) technologies has revolutionized research, with significant implications for Digital Humanities (DH), a field inherently intertwined with technological progress. This article investigates how digital humanities scholars adopt, practice, as well as critically evaluate, GenAI technologies such as ChatGPT in the research process. Drawing on 76 r…

    Submitted 18 April, 2024; originally announced April 2024.

  23. arXiv:2404.11127  [pdf, other]

    cs.CV

    D-Aug: Enhancing Data Augmentation for Dynamic LiDAR Scenes

    Authors: Jiaxing Zhao, Peng Zheng, Rui Ma

    Abstract: Creating large LiDAR datasets with pixel-level labeling poses significant challenges. While numerous data augmentation methods have been developed to reduce the reliance on manual labeling, these methods predominantly focus on static scenes and they overlook the importance of data augmentation for dynamic scenes, which is critical for autonomous driving. To address this issue, we propose D-Aug, a…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: 4 pages, 4 figures

    ACM Class: I.4.3

  24. arXiv:2404.05553  [pdf, other]

    q-bio.NC cs.AI

    Alljoined1 -- A dataset for EEG-to-Image decoding

    Authors: Jonathan Xu, Bruno Aristimunha, Max Emanuel Feucht, Emma Qian, Charles Liu, Tazik Shahjahan, Martyna Spyra, Steven Zifan Zhang, Nicholas Short, Jioh Kim, Paula Perdomo, Ricky Renfeng Mao, Yashvir Sabharwal, Michael Ahedor Moaz Shoura, Adrian Nestor

    Abstract: We present Alljoined1, a dataset built specifically for EEG-to-Image decoding. Recognizing that an extensive and unbiased sampling of neural responses to visual stimuli is crucial for image reconstruction efforts, we collected data from 8 participants looking at 10,000 natural images each. We have currently gathered 46,080 epochs of brain responses recorded with a 64-channel EEG headset. The datas…

    Submitted 14 May, 2024; v1 submitted 8 April, 2024; originally announced April 2024.

    Comments: 8 Pages, 6 Figures

    ACM Class: I.5.1; I.6.3; I.2.6; K.3.2

  25. GTS: GPU-based Tree Index for Fast Similarity Search

    Authors: Yifan Zhu, Ruiyao Ma, Baihua Zheng, Xiangyu Ke, Lu Chen, Yunjun Gao

    Abstract: Similarity search, the task of identifying objects most similar to a given query object under a specific metric, has gathered significant attention due to its practical applications. However, the absence of coordinate information to accelerate similarity search and the high computational cost of measuring object similarity hinder the efficiency of existing CPU-based methods. Additionally, these me…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: Accepted by SIGMOD 2024

    Journal ref: Proc. ACM Manag. Data, 2(3): 142:1-142:27

  26. arXiv:2403.17297  [pdf, other]

    cs.CL cs.AI

    InternLM2 Technical Report

    Authors: Zheng Cai, Maosong Cao, Haojiong Chen, Kai Chen, Keyu Chen, Xin Chen, Xun Chen, Zehui Chen, Zhi Chen, Pei Chu, Xiaoyi Dong, Haodong Duan, Qi Fan, Zhaoye Fei, Yang Gao, Jiaye Ge, Chenya Gu, Yuzhe Gu, Tao Gui, Aijia Guo, Qipeng Guo, Conghui He, Yingfan Hu, Ting Huang, Tao Jiang, et al. (75 additional authors not shown)

    Abstract: The evolution of Large Language Models (LLMs) like ChatGPT and GPT-4 has sparked discussions on the advent of Artificial General Intelligence (AGI). However, replicating such advancements in open-source models has been challenging. This paper introduces InternLM2, an open-source LLM that outperforms its predecessors in comprehensive evaluations across 6 dimensions and 30 benchmarks, long-context m…

    Submitted 25 March, 2024; originally announced March 2024.

  27. arXiv:2403.16607  [pdf, other]

    cs.LG cs.CV

    Enhancing Industrial Transfer Learning with Style Filter: Cost Reduction and Defect-Focus

    Authors: Chen Li, Ruijie Ma, Xiang Qian, Xiaohao Wang, Xinghui Li

    Abstract: Addressing the challenge of data scarcity in industrial domains, transfer learning emerges as a pivotal paradigm. This work introduces Style Filter, a tailored methodology for industrial contexts. By selectively filtering source domain data before knowledge transfer, Style Filter reduces the quantity of data while maintaining or even enhancing the performance of transfer learning strategy. Offerin…

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: 17 pages, 11 figures, 4 tables

  28. arXiv:2403.15156  [pdf, other]

    cs.RO cs.CV eess.SY

    Infrastructure-Assisted Collaborative Perception in Automated Valet Parking: A Safety Perspective

    Authors: Yukuan Jia, Jiawen Zhang, Shimeng Lu, Baokang Fan, Ruiqing Mao, Sheng Zhou, Zhisheng Niu

    Abstract: Environmental perception in Automated Valet Parking (AVP) has been a challenging task due to severe occlusions in parking garages. Although Collaborative Perception (CP) can be applied to broaden the field of view of connected vehicles, the limited bandwidth of vehicular communications restricts its application. In this work, we propose a BEV feature-based CP network architecture for infrastructur…

    Submitted 22 March, 2024; originally announced March 2024.

    Comments: 7 pages, 7 figures, 4 tables, accepted by IEEE VTC2024-Spring

  29. arXiv:2403.12053   

    cs.MM

    PiGW: A Plug-in Generative Watermarking Framework

    Authors: Rui Ma, Mengxi Guo, Li Yuming, Hengyuan Zhang, Cong Ma, Yuan Li, Xiaodong Xie, Shanghang Zhang

    Abstract: Integrating watermarks into generative images is a critical strategy for protecting intellectual property and enhancing artificial intelligence security. This paper proposes Plug-in Generative Watermarking (PiGW) as a general framework for integrating watermarks into generative images. More specifically, PiGW embeds watermark information into the initial noise using a learnable watermark embedding…

    Submitted 19 March, 2024; v1 submitted 4 January, 2024; originally announced March 2024.

    Comments: Improve experimental content

  30. arXiv:2403.12052  [pdf, other]

    cs.CV

    A Dataset and Benchmark for Copyright Infringement Unlearning from Text-to-Image Diffusion Models

    Authors: Rui Ma, Qiang Zhou, Yizhu Jin, Daquan Zhou, Bangjun Xiao, Xiuyu Li, Yi Qu, Aishani Singh, Kurt Keutzer, Jingtong Hu, Xiaodong Xie, Zhen Dong, Shanghang Zhang, Shiji Zhou

    Abstract: Copyright law confers upon creators the exclusive rights to reproduce, distribute, and monetize their creative works. However, recent progress in text-to-image generation has introduced formidable challenges to copyright enforcement. These technologies enable the unauthorized learning and replication of copyrighted content, artistic creations, and likenesses, leading to the proliferation of unregu…

    Submitted 21 June, 2024; v1 submitted 4 January, 2024; originally announced March 2024.

    Comments: 20 pages, 7 figures, 3 tables

  31. arXiv:2403.11053  [pdf, other]

    cs.CV

    OSTAF: A One-Shot Tuning Method for Improved Attribute-Focused T2I Personalization

    Authors: Ye Wang, Zili Yi, Rui Ma

    Abstract: Personalized text-to-image (T2I) models not only produce lifelike and varied visuals but also allow users to tailor the images to fit their personal taste. These personalization techniques can grasp the essence of a concept through a collection of images, or adjust a pre-trained text-to-image model with a specific image input for subject-driven or attribute-aware guidance. Yet, accurately capturin…

    Submitted 16 March, 2024; originally announced March 2024.

  32. arXiv:2403.10166  [pdf, other]

    cs.CV

    SemanticHuman-HD: High-Resolution Semantic Disentangled 3D Human Generation

    Authors: Peng Zheng, Tao Liu, Zili Yi, Rui Ma

    Abstract: With the development of neural radiance fields and generative models, numerous methods have been proposed for learning 3D human generation from 2D images. These methods allow control over the pose of the generated 3D human and enable rendering from different viewpoints. However, none of these methods explore semantic disentanglement in human image synthesis, i.e., they can not disentangle the gene…

    Submitted 15 March, 2024; originally announced March 2024.

    Comments: 26 pages, 14 figures

    ACM Class: I.2.10

  33. arXiv:2403.10051  [pdf, other]

    cs.DB

    Accelerating Regular Path Queries over Graph Database with Processing-in-Memory

    Authors: Ruoyan Ma, Shengan Zheng, Guifeng Wang, Jin Pu, Yifan Hua, Wentao Wang, Linpeng Huang

    Abstract: Regular path queries (RPQs) in graph databases are bottlenecked by the memory wall. Emerging processing-in-memory (PIM) technologies offer a promising solution to dispatch and execute path matching tasks in parallel within PIM modules. We present Moctopus, a PIM-based data management system for graph databases that supports efficient batch RPQs and graph updates. Moctopus employs a PIM-friendly dy…

    Submitted 15 March, 2024; originally announced March 2024.

  34. arXiv:2403.09987  [pdf, other]

    cs.HC

    Trusting the Search: Unraveling Human Trust in Health Information from Google and ChatGPT

    Authors: Xin Sun, Rongjun Ma, Xiaochang Zhao, Zhuying Li, Janne Lindqvist, Abdallah El Ali, Jos A. Bosch

    Abstract: People increasingly rely on online sources for health information seeking due to their convenience and timeliness, traditionally using search engines like Google as the primary search agent. Recently, the emergence of generative Artificial Intelligence (AI) has made Large Language Model (LLM) powered conversational agents such as ChatGPT a viable alternative for health information search. However,…

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: 24 pages

    ACM Class: F.2.2, I.2.7

  35. arXiv:2403.09747  [pdf, other]

    cs.CL cs.AI

    Re-Search for The Truth: Multi-round Retrieval-augmented Large Language Models are Strong Fake News Detectors

    Authors: Guanghua Li, Wensheng Lu, Wei Zhang, Defu Lian, Kezhong Lu, Rui Mao, Kai Shu, Hao Liao

    Abstract: The proliferation of fake news has had far-reaching implications on politics, the economy, and society at large. While Fake news detection methods have been employed to mitigate this issue, they primarily depend on two essential elements: the quality and relevance of the evidence, and the effectiveness of the verdict prediction mechanism. Traditional methods, which often source information from st…

    Submitted 13 March, 2024; originally announced March 2024.

  36. arXiv:2403.09583  [pdf, other]

    cs.RO

    ExploRLLM: Guiding Exploration in Reinforcement Learning with Large Language Models

    Authors: Runyu Ma, Jelle Luijkx, Zlatan Ajanovic, Jens Kober

    Abstract: In image-based robot manipulation tasks with large observation and action spaces, reinforcement learning struggles with low sample efficiency, slow training speed, and uncertain convergence. As an alternative, large pre-trained foundation models have shown promise in robotic manipulation, particularly in zero-shot and few-shot applications. However, using these models directly is unreliable due to…

    Submitted 15 March, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

    Comments: 8 pages, 8 figures, conference IROS 2024

  37. arXiv:2403.09439  [pdf, other]

    cs.CV cs.AI

    3D-SceneDreamer: Text-Driven 3D-Consistent Scene Generation

    Authors: Frank Zhang, Yibo Zhang, Quan Zheng, Rui Ma, Wei Hua, Hujun Bao, Weiwei Xu, Changqing Zou

    Abstract: Text-driven 3D scene generation techniques have made rapid progress in recent years. Their success is mainly attributed to using existing generative models to iteratively perform image warping and inpainting to generate 3D scenes. However, these methods heavily rely on the outputs of existing models, leading to error accumulation in geometry and appearance that prevent the models from being used i…

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: 11 pages, 7 figures

  38. arXiv:2403.03739  [pdf, other]

    cs.LG cs.AI

    A&B BNN: Add&Bit-Operation-Only Hardware-Friendly Binary Neural Network

    Authors: Ruichen Ma, Guanchao Qiao, Yian Liu, Liwei Meng, Ning Ning, Yang Liu, Shaogang Hu

    Abstract: Binary neural networks utilize 1-bit quantized weights and activations to reduce both the model's storage demands and computational burden. However, advanced binary architectures still incorporate millions of inefficient and nonhardware-friendly full-precision multiplication operations. A&B BNN is proposed to directly remove part of the multiplication operations in a traditional BNN and replace th…

    Submitted 6 March, 2024; originally announced March 2024.

    Comments: CVPR 2024 Accepted

  39. arXiv:2403.02866  [pdf]

    physics.optics cs.ET

    Unlocking Electro-optic Resonant Phase Shifting for Multi-dimensional, Ultra-dynamic Photonic Switches

    Authors: Lingzhi Luo, Rui Ma, Richard V. Penty, Qixiang Cheng

    Abstract: Optical circuit switching is connection-oriented, being deterministic through the reservation of a complete wavelength channel or spatial path for a certain period. However, this comes at a trade-off against link dynamics, and overall capacity can thus be constrained by the time slot reservations, especially for switches with microsecond- to millisecond-scale reconfiguration times. For data-intens…

    Submitted 5 March, 2024; originally announced March 2024.

    Comments: 10 pages

  40. arXiv:2402.14415  [pdf, other]

    cs.CV cs.GR

    TaylorGrid: Towards Fast and High-Quality Implicit Field Learning via Direct Taylor-based Grid Optimization

    Authors: Renyi Mao, Qingshan Xu, Peng Zheng, Ye Wang, Tieru Wu, Rui Ma

    Abstract: Coordinate-based neural implicit representation or implicit fields have been widely studied for 3D geometry representation or novel view synthesis. Recently, a series of efforts have been devoted to accelerating the speed and improving the quality of the coordinate-based implicit field learning. Instead of learning heavy MLPs to predict the neural implicit values for the query coordinates, neural…

    Submitted 22 February, 2024; originally announced February 2024.

  41. arXiv:2402.02985  [pdf, other]

    cs.CV cs.LG

    Applying Unsupervised Semantic Segmentation to High-Resolution UAV Imagery for Enhanced Road Scene Parsing

    Authors: Zihan Ma, Yongshang Li, Ronggui Ma, Chen Liang

    Abstract: There are two challenges presented in parsing road scenes from UAV images: the complexity of processing high-resolution images and the dependency on extensive manual annotations required by traditional supervised deep learning methods to train robust and accurate models. In this paper, a novel unsupervised road parsing framework that leverages advancements in vision language models with fundamenta…

    Submitted 26 April, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

  42. arXiv:2402.02101  [pdf, other]

    cs.CL cs.AI

    Are Large Language Models Good Prompt Optimizers?

    Authors: Ruotian Ma, Xiaolei Wang, Xin Zhou, Jian Li, Nan Du, Tao Gui, Qi Zhang, Xuanjing Huang

    Abstract: LLM-based Automatic Prompt Optimization, which typically utilizes LLMs as Prompt Optimizers to self-reflect and refine prompts, has shown promising performance in recent studies. Despite the success, the underlying mechanism of this approach remains unexplored, and the true effectiveness of LLMs as Prompt Optimizers requires further validation. In this work, we conducted a comprehensive study to u…

    Submitted 3 February, 2024; originally announced February 2024.

  43. arXiv:2402.00084  [pdf, other]

    cs.LG cs.AI cs.CV

    EPSD: Early Pruning with Self-Distillation for Efficient Model Compression

    Authors: Dong Chen, Ning Liu, Yichen Zhu, Zhengping Che, Rui Ma, Fachao Zhang, Xiaofeng Mou, Yi Chang, Jian Tang

    Abstract: Neural network compression techniques, such as knowledge distillation (KD) and network pruning, have received increasing attention. Recent work 'Prune, then Distill' reveals that a pruned student-friendly teacher network can benefit the performance of KD. However, the conventional teacher-student pipeline, which entails cumbersome pre-training of the teacher and complicated compression steps, make…

    Submitted 31 January, 2024; originally announced February 2024.

    Comments: The first two authors are with equal contributions. Paper accepted by AAAI 2024

  44. arXiv:2401.16144  [pdf, other]

    cs.CV cs.AI

    Divide and Conquer: Rethinking the Training Paradigm of Neural Radiance Fields

    Authors: Rongkai Ma, Leo Lebrat, Rodrigo Santa Cruz, Gil Avraham, Yan Zuo, Clinton Fookes, Olivier Salvado

    Abstract: Neural radiance fields (NeRFs) have exhibited potential in synthesizing high-fidelity views of 3D scenes but the standard training paradigm of NeRF presupposes an equal importance for each image in the training set. This assumption poses a significant challenge for rendering specific views presenting intricate geometries, thereby resulting in suboptimal performance. In this paper, we take a closer…

    Submitted 29 January, 2024; originally announced January 2024.

  45. arXiv:2401.10729  [pdf, ps, other]

    cs.DS

    Network Design on Undirected Series-Parallel Graphs

    Authors: Ishan Bansal, Ryan Mao, Avhan Mishra

    Abstract: We study the single pair capacitated network design problem and the budget constrained max flow problem on undirected series-parallel graphs. These problems were well studied on directed series-parallel graphs, but little is known in the context of undirected graphs. The major difference between the cases is that the source and sink of the problem instance do not necessarily coincide with the term…

    Submitted 19 January, 2024; originally announced January 2024.

  46. arXiv:2401.05708  [pdf, other]

    cs.ET

    FeReX: A Reconfigurable Design of Multi-bit Ferroelectric Compute-in-Memory for Nearest Neighbor Search

    Authors: Zhicheng Xu, Che-Kai Liu, Chao Li, Ruibin Mao, Jianyi Yang, Thomas Kämpfe, Mohsen Imani, Can Li, Cheng Zhuo, Xunzhao Yin

    Abstract: Rapid advancements in artificial intelligence have given rise to transformative models, profoundly impacting our lives. These models demand massive volumes of data to operate effectively, exacerbating the data-transfer bottleneck inherent in the conventional von-Neumann architecture. Compute-in-memory (CIM), a novel computing paradigm, tackles these issues by seamlessly embedding in-memory search…

    Submitted 11 January, 2024; originally announced January 2024.

    Comments: 6 pages, 8 figures, 3 tables. Accepted by Design Automation and Test in Europe (DATE) 2024

  47. arXiv:2401.04923  [pdf, other]

    cs.LG cs.CV

    Inconsistency-Based Data-Centric Active Open-Set Annotation

    Authors: Ruiyu Mao, Ouyang Xu, Yunhui Guo

    Abstract: Active learning is a commonly used approach that reduces the labeling effort required to train deep neural networks. However, the effectiveness of current active learning methods is limited by their closed-world assumptions, which assume that all data in the unlabeled pool comes from a set of predefined known classes. This assumption is often not valid in practical situations, as there may be unkn…

    Submitted 9 January, 2024; originally announced January 2024.

    Comments: AAAI 2024

  48. arXiv:2401.03764  [pdf, other]

    cs.CV cs.GR

    3D-SSGAN: Lifting 2D Semantics for 3D-Aware Compositional Portrait Synthesis

    Authors: Ruiqi Liu, Peng Zheng, Ye Wang, Rui Ma

    Abstract: Existing 3D-aware portrait synthesis methods can generate impressive high-quality images while preserving strong 3D consistency. However, most of them cannot support the fine-grained part-level control over synthesized images. Conversely, some GAN-based 2D portrait synthesis methods can achieve clear disentanglement of facial regions, but they cannot preserve view consistency due to a lack of 3D m…

    Submitted 8 January, 2024; originally announced January 2024.

  49. arXiv:2401.03201  [pdf, other]

    cs.CV cs.MM

    3DMIT: 3D Multi-modal Instruction Tuning for Scene Understanding

    Authors: Zeju Li, Chao Zhang, Xiaoyan Wang, Ruilong Ren, Yifan Xu, Ruifei Ma, Xiangde Liu

    Abstract: The remarkable potential of multi-modal large language models (MLLMs) in comprehending both vision and language information has been widely acknowledged. However, the scarcity of 3D scenes-language pairs in comparison to their 2D counterparts, coupled with the inadequacy of existing approaches in understanding of 3D scenes by LLMs, poses a significant challenge. In response, we collect and constru…

    Submitted 16 January, 2024; v1 submitted 6 January, 2024; originally announced January 2024.

    Comments: 9 pages, 5 figures

  50. arXiv:2312.17611  [pdf, other]

    cs.CV

    P2M2-Net: Part-Aware Prompt-Guided Multimodal Point Cloud Completion

    Authors: Linlian Jiang, Pan Chen, Ye Wang, Tieru Wu, Rui Ma

    Abstract: Inferring missing regions from severely occluded point clouds is highly challenging. Especially for 3D shapes with rich geometry and structure details, inherent ambiguities of the unknown parts are existing. Existing approaches either learn a one-to-one mapping in a supervised manner or train a generative model to synthesize the missing points for the completion of 3D point cloud shapes. These met…

    Submitted 29 December, 2023; originally announced December 2023.

    Comments: Best Poster Award of CAD/Graphics 2023