
Showing 1–50 of 223 results for author: Chu, Z

  1. arXiv:2406.19820  [pdf, other]

    cs.CL cs.AI

    BeamAggR: Beam Aggregation Reasoning over Multi-source Knowledge for Multi-hop Question Answering

    Authors: Zheng Chu, Jingchang Chen, Qianglong Chen, Haotian Wang, Kun Zhu, Xiyuan Du, Weijiang Yu, Ming Liu, Bing Qin

    Abstract: Large language models (LLMs) have demonstrated strong reasoning capabilities. Nevertheless, they still suffer from factual errors when tackling knowledge-intensive tasks. Retrieval-augmented reasoning represents a promising approach. However, significant challenges still persist, including inaccurate and insufficient retrieval for complex questions, as well as difficulty in integrating multi-sourc…

    Submitted 28 June, 2024; originally announced June 2024.

    Comments: Accepted to ACL 2024

  2. arXiv:2406.18227  [pdf, other]

    cs.CV cs.CL

    GUIDE: A Guideline-Guided Dataset for Instructional Video Comprehension

    Authors: Jiafeng Liang, Shixin Jiang, Zekun Wang, Haojie Pan, Zerui Chen, Zheng Chu, Ming Liu, Ruiji Fu, Zhongyuan Wang, Bing Qin

    Abstract: There are substantial instructional videos on the Internet, which provide us with tutorials for completing various tasks. Existing instructional video datasets only focus on specific steps at the video level, lacking experiential guidelines at the task level, which can lead to beginners struggling to learn new tasks due to the lack of relevant experience. Moreover, the specific steps without guidelines…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: IJCAI 2024

  3. arXiv:2406.16333  [pdf, other]

    cs.CV cs.AI

    Prompt-Consistency Image Generation (PCIG): A Unified Framework Integrating LLMs, Knowledge Graphs, and Controllable Diffusion Models

    Authors: Yichen Sun, Zhixuan Chu, Zhan Qin, Kui Ren

    Abstract: The rapid advancement of Text-to-Image (T2I) generative models has enabled the synthesis of high-quality images guided by textual descriptions. Despite this significant progress, these models are often susceptible to generating content that contradicts the input text, which poses a challenge to their reliability and practical deployment. To address this problem, we introduce a novel diffusion-based…

    Submitted 24 June, 2024; originally announced June 2024.

  4. arXiv:2406.11434  [pdf, other]

    cs.DB

    DB-GPT-Hub: Towards Open Benchmarking Text-to-SQL Empowered by Large Language Models

    Authors: Fan Zhou, Siqiao Xue, Danrui Qi, Wenhui Shi, Wang Zhao, Ganglin Wei, Hongyang Zhang, Caigai Jiang, Gangwei Jiang, Zhixuan Chu, Faqiang Chen

    Abstract: Large language models (LLMs) have become the dominant paradigm for the challenging task of text-to-SQL. LLM-empowered text-to-SQL methods are typically categorized into prompting-based and tuning approaches. Compared to prompting-based methods, benchmarking fine-tuned LLMs for text-to-SQL is important yet under-explored, partially attributed to the prohibitively high computational cost. In this paper,…

    Submitted 17 June, 2024; originally announced June 2024.

  5. arXiv:2406.04531  [pdf, other]

    cs.SE

    TESTEVAL: Benchmarking Large Language Models for Test Case Generation

    Authors: Wenhan Wang, Chenyuan Yang, Zhijie Wang, Yuheng Huang, Zhaoyang Chu, Da Song, Lingming Zhang, An Ran Chen, Lei Ma

    Abstract: Testing plays a crucial role in the software development cycle, enabling the detection of bugs, vulnerabilities, and other undesirable behaviors. To perform software testing, testers need to write code snippets that execute the program under test. Recently, researchers have recognized the potential of large language models (LLMs) in software testing. However, there remains a lack of fair compariso…

    Submitted 6 June, 2024; originally announced June 2024.

  6. arXiv:2406.03712  [pdf, other]

    cs.CL cs.LG

    A Survey on Medical Large Language Models: Technology, Application, Trustworthiness, and Future Directions

    Authors: Lei Liu, Xiaoyan Yang, Junchi Lei, Xiaoyang Liu, Yue Shen, Zhiqiang Zhang, Peng Wei, Jinjie Gu, Zhixuan Chu, Zhan Qin, Kui Ren

    Abstract: Large language models (LLMs), such as GPT series models, have received substantial attention due to their impressive capabilities for generating and understanding human-level language. More recently, LLMs have emerged as an innovative and powerful adjunct in the medical field, transforming traditional practices and heralding a new era of enhanced healthcare services. This survey provides a compreh…

    Submitted 5 June, 2024; originally announced June 2024.

  7. arXiv:2406.01549  [pdf, other]

    cs.CL cs.AI

    An Information Bottleneck Perspective for Effective Noise Filtering on Retrieval-Augmented Generation

    Authors: Kun Zhu, Xiaocheng Feng, Xiyuan Du, Yuxuan Gu, Weijiang Yu, Haotian Wang, Qianglong Chen, Zheng Chu, Jingchang Chen, Bing Qin

    Abstract: Retrieval-augmented generation integrates the capabilities of large language models with relevant information retrieved from an extensive corpus, yet encounters challenges when confronted with real-world noisy data. One recent solution is to train a filter module to find relevant content, but it achieves only suboptimal noise compression. In this paper, we propose to introduce the information bottlenec…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: ACL24 Main

  8. arXiv:2406.01406  [pdf, other]

    hep-ph nucl-th

    On the role of $J/ψ$ production in electron-ion collisions

    Authors: Zexuan Chu, Jinhui Chen, Xiang-Peng Wang, Hongxi Xing

    Abstract: Within the framework of non-relativistic QCD (NRQCD) effective field theory, we study the leptoproduction of $J/ψ$ at next-to-leading order in perturbative QCD for both unpolarized and polarized electron-ion collisions. We demonstrate that the $J/ψ$-tagged deep inelastic scattering in the future Electron-Ion Collider can serve as a golden channel for reasons including constraining NRQCD lo…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: 6 pages, 4 figures

  9. arXiv:2405.20092  [pdf, other]

    cs.CL cs.SE

    Divide-and-Conquer Meets Consensus: Unleashing the Power of Functions in Code Generation

    Authors: Jingchang Chen, Hongxuan Tang, Zheng Chu, Qianglong Chen, Zekun Wang, Ming Liu, Bing Qin

    Abstract: Despite recent progress made by large language models in code generation, they still struggle with programs that meet complex requirements. Recent work utilizes plan-and-solve decomposition to decrease the complexity and leverage self-tests to refine the generated program. Yet, planning deep-inside requirements in advance can be challenging, and the tests need to be accurate to accomplish self-imp…

    Submitted 30 May, 2024; originally announced May 2024.

  10. arXiv:2405.17792  [pdf, other]

    hep-ex hep-ph

    JUNO Sensitivity to Invisible Decay Modes of Neutrons

    Authors: JUNO Collaboration, Angel Abusleme, Thomas Adam, Kai Adamowicz, Shakeel Ahmad, Rizwan Ahmed, Sebastiano Aiello, Fengpeng An, Qi An, Giuseppe Andronico, Nikolay Anfimov, Vito Antonelli, Tatiana Antoshkina, João Pedro Athayde Marcondes de André, Didier Auguste, Weidong Bai, Nikita Balashov, Wander Baldini, Andrea Barresi, Davide Basilico, Eric Baussan, Marco Bellato, Marco Beretta, Antonio Bergnoli, Daniel Bick, et al. (635 additional authors not shown)

    Abstract: We explore the decay of bound neutrons into invisible particles (e.g., $n\rightarrow 3 ν$ or $nn \rightarrow 2 ν$) in the JUNO liquid scintillator detector. The invisible decay includes two decay modes: $ n \rightarrow { inv} $ and $ nn \rightarrow { inv} $. The invisible decays of $s$-shell neutrons in $^{12}{\rm C}$ will leave a highly excited residual nucleus. Subsequently, some de-excitation mode…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: 28 pages, 7 figures, 4 tables

  11. arXiv:2405.15240  [pdf, other]

    cs.LG cs.CV

    Towards Real World Debiasing: A Fine-grained Analysis On Spurious Correlation

    Authors: Zhibo Wang, Peng Kuang, Zhixuan Chu, Jingyi Wang, Kui Ren

    Abstract: Spurious correlations in training data significantly hinder the generalization capability of machine learning models when faced with distribution shifts in real-world scenarios. To tackle the problem, numerous debiasing approaches have been proposed and benchmarked on datasets intentionally designed with severe biases. However, it remains to be asked: 1. Do existing benchmarks really capture…

    Submitted 30 May, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: 9 pages of main paper, 10 pages of appendix

  12. arXiv:2405.04180  [pdf, other]

    cs.LG cs.CV

    Sora Detector: A Unified Hallucination Detection for Large Text-to-Video Models

    Authors: Zhixuan Chu, Lei Zhang, Yichen Sun, Siqiao Xue, Zhibo Wang, Zhan Qin, Kui Ren

    Abstract: The rapid advancement in text-to-video (T2V) generative models has enabled the synthesis of high-fidelity video content guided by textual descriptions. Despite this significant progress, these models are often susceptible to hallucination, generating content that contradicts the input text, which poses a challenge to their reliability and practical deployment. To address this critical issue, we in…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: arXiv admin note: text overlap with arXiv:2306.08302, arXiv:2403.05131 by other authors

  13. arXiv:2405.04160  [pdf, other]

    cs.CL

    A Causal Explainable Guardrails for Large Language Models

    Authors: Zhixuan Chu, Yan Wang, Longfei Li, Zhibo Wang, Zhan Qin, Kui Ren

    Abstract: Large Language Models (LLMs) have shown impressive performance in natural language tasks, but their outputs can exhibit undesirable attributes or biases. Existing methods for steering LLMs towards desired attributes often assume unbiased representations and rely solely on steering prompts. However, the representations learned from pre-training can introduce semantic biases that influence the steer…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: 23 pages

  14. arXiv:2405.02778  [pdf, other]

    cs.IR

    Improve Temporal Awareness of LLMs for Sequential Recommendation

    Authors: Zhendong Chu, Zichao Wang, Ruiyi Zhang, Yangfeng Ji, Hongning Wang, Tong Sun

    Abstract: Large language models (LLMs) have demonstrated impressive zero-shot abilities in solving a wide range of general-purpose tasks. However, it is empirically found that LLMs fall short in recognizing and utilizing temporal information, resulting in poor performance in tasks that require an understanding of sequential data, such as sequential recommendation. In this paper, we aim to improve temporal awar…

    Submitted 4 May, 2024; originally announced May 2024.

    Comments: 10 pages

  15. arXiv:2404.15687  [pdf, other]

    cs.SE cs.AI cs.CR

    Graph Neural Networks for Vulnerability Detection: A Counterfactual Explanation

    Authors: Zhaoyang Chu, Yao Wan, Qian Li, Yang Wu, Hongyu Zhang, Yulei Sui, Guandong Xu, Hai Jin

    Abstract: Vulnerability detection is crucial for ensuring the security and reliability of software systems. Recently, Graph Neural Networks (GNNs) have emerged as a prominent code embedding approach for vulnerability detection, owing to their ability to capture the underlying semantic structure of source code. However, GNNs face significant challenges in explainability due to their inherently black-box natu…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: This paper was accepted in the proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA 2024)

  16. arXiv:2404.14012  [pdf, other]

    eess.SY

    Coordinated Planning for Stability Enhancement in High IBR-Penetrated Systems

    Authors: Zhongda Chu, Fei Teng

    Abstract: Security and stability challenges in future power systems with high penetration of Inverter-Based Resources (IBRs) have been anticipated as the main barrier to decarbonization. Grid-following IBRs may become unstable under small disturbances in weak grids, while, during transient processes, system stability and protection may be jeopardized due to the lack of sufficient Short-Circuit Current (SCC). To…

    Submitted 22 April, 2024; originally announced April 2024.

  17. arXiv:2404.12675  [pdf, other]

    cs.CR

    ESPM-D: Efficient Sparse Polynomial Multiplication for Dilithium on ARM Cortex-M4 and Apple M2

    Authors: Jieyu Zheng, Hong Zhang, Le Tian, Zhuo Zhang, Hanyu Wei, Zhiwei Chu, Yafang Yang, Yunlei Zhao

    Abstract: Dilithium is a lattice-based digital signature scheme standardized by the NIST post-quantum cryptography (PQC) project. In this study, we focus on developing efficient sparse polynomial multiplication implementations of Dilithium for ARM Cortex-M4 and Apple M2, which are both based on the ARM architecture. The ARM Cortex-M4 is commonly utilized in resource-constrained devices such as sensors. Conv…

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: 19 pages, 1 figure

  18. arXiv:2404.01349  [pdf, other]

    cs.CL cs.AI

    Fairness in Large Language Models: A Taxonomic Survey

    Authors: Zhibo Chu, Zichong Wang, Wenbin Zhang

    Abstract: Large Language Models (LLMs) have demonstrated remarkable success across various domains. However, despite their promising performance in numerous real-world applications, most of these algorithms lack fairness considerations. Consequently, they may lead to discriminatory outcomes against certain communities, particularly marginalized populations, prompting extensive study in fair LLMs. On the oth…

    Submitted 31 March, 2024; originally announced April 2024.

  19. arXiv:2403.19446  [pdf, other]

    cs.LO

    EDA-Driven Preprocessing for SAT Solving

    Authors: Zhengyuan Shi, Tiebing Tang, Sadaf Khan, Hui-Ling Zhen, Mingxuan Yuan, Zhufei Chu, Qiang Xu

    Abstract: Effective formulation of problems into Conjunctive Normal Form (CNF) is critical in modern Boolean Satisfiability (SAT) solving for optimizing solver performance. Addressing the limitations of existing methods, our Electronic Design Automation (EDA)-driven preprocessing framework introduces a novel methodology for preparing SAT instances, leveraging both circuit and CNF formats for enhanced flexib…

    Submitted 28 March, 2024; originally announced March 2024.

  20. arXiv:2403.09233  [pdf, other]

    cs.CV eess.IV

    D-YOLO a robust framework for object detection in adverse weather conditions

    Authors: Zihan Chu

    Abstract: Adverse weather conditions including haze, snow and rain lead to a decline in image quality, which often causes a decline in performance for deep-learning based detection networks. Most existing approaches attempt to rectify hazy images before performing object detection, which increases the complexity of the network and may result in a loss of latent information. To better integrate image rest…

    Submitted 19 March, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

    Comments: Object detection in adverse weather conditions. arXiv admin note: text overlap with arXiv:2209.01373 by other authors

  21. arXiv:2403.07257  [pdf, other]

    cs.AR cs.ET

    The Dawn of AI-Native EDA: Opportunities and Challenges of Large Circuit Models

    Authors: Lei Chen, Yiqi Chen, Zhufei Chu, Wenji Fang, Tsung-Yi Ho, Ru Huang, Yu Huang, Sadaf Khan, Min Li, Xingquan Li, Yu Li, Yun Liang, Jinwei Liu, Yi Liu, Yibo Lin, Guojie Luo, Zhengyuan Shi, Guangyu Sun, Dimitrios Tsaras, Runsheng Wang, Ziyi Wang, Xinming Wei, Zhiyao Xie, Qiang Xu, Chenhao Xue, et al. (14 additional authors not shown)

    Abstract: Within the Electronic Design Automation (EDA) domain, AI-driven solutions have emerged as formidable tools, yet they typically augment rather than redefine existing methodologies. These solutions often repurpose deep learning models from other domains, such as vision, text, and graph analytics, applying them to circuit design without tailoring to the unique complexities of electronic circuits. Suc…

    Submitted 1 May, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

    Comments: The authors are ordered alphabetically. Contact: qxu@cse[dot]cuhk[dot]edu[dot]hk, gluo@pku[dot]edu[dot]cn, yuan.mingxuan@huawei[dot]com

  22. arXiv:2402.15410  [pdf, other]

    hep-ex hep-ph nucl-ex

    Detailed Report on the Measurement of the Positive Muon Anomalous Magnetic Moment to 0.20 ppm

    Authors: D. P. Aguillard, T. Albahri, D. Allspach, A. Anisenkov, K. Badgley, S. Baeßler, I. Bailey, L. Bailey, V. A. Baranov, E. Barlas-Yucel, T. Barrett, E. Barzi, F. Bedeschi, M. Berz, M. Bhattacharya, H. P. Binney, P. Bloom, J. Bono, E. Bottalico, T. Bowcock, S. Braun, M. Bressler, G. Cantatore, R. M. Carey, B. C. K. Casey, et al. (168 additional authors not shown)

    Abstract: We present details on a new measurement of the muon magnetic anomaly, $a_μ= (g_μ-2)/2$. The result is based on positive muon data taken at Fermilab's Muon Campus during the 2019 and 2020 accelerator runs. The measurement uses $3.1$ GeV$/c$ polarized muons stored in a $7.1$-m-radius storage ring with a $1.45$ T uniform magnetic field. The value of $ a_μ$ is determined from the measured difference b…

    Submitted 22 May, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

    Comments: 48 pages, 29 figures; 4 pages of Supplement Material; version accepted for publication in Physical Review D

    Report number: FERMILAB-PUB-24-0084-AD-CSAID-PPD

  23. arXiv:2402.11068  [pdf, other]

    cs.CL cs.AI

    Bridging Causal Discovery and Large Language Models: A Comprehensive Survey of Integrative Approaches and Future Directions

    Authors: Guangya Wan, Yuqi Wu, Mengxuan Hu, Zhixuan Chu, Sheng Li

    Abstract: Causal discovery (CD) and Large Language Models (LLMs) represent two emerging fields of study with significant implications for artificial intelligence. Despite their distinct origins (CD focuses on uncovering cause-effect relationships from data, and LLMs on processing and generating humanlike text), the convergence of these domains offers novel insights and methodologies for understanding complex…

    Submitted 16 February, 2024; originally announced February 2024.

  24. arXiv:2402.07191  [pdf, other]

    cs.LG cs.AI

    GSINA: Improving Subgraph Extraction for Graph Invariant Learning via Graph Sinkhorn Attention

    Authors: Fangyu Ding, Haiyang Wang, Zhixuan Chu, Tianming Li, Zhaoping Hu, Junchi Yan

    Abstract: Graph invariant learning (GIL) has been an effective approach to discovering the invariant relationships between graph data and its labels for different graph learning tasks under various distribution shifts. Many recent endeavors of GIL focus on extracting the invariant subgraph from the input graph for prediction as a regularization strategy to improve the generalization performance of graph lea…

    Submitted 11 February, 2024; originally announced February 2024.

  25. arXiv:2402.06853  [pdf, other]

    cs.CL

    History, Development, and Principles of Large Language Models-An Introductory Survey

    Authors: Zhibo Chu, Shiwen Ni, Zichong Wang, Xi Feng, Min Yang, Wenbin Zhang

    Abstract: Language models serve as a cornerstone in natural language processing (NLP), utilizing mathematical methods to generalize language laws and knowledge for prediction and generation. Over extensive research spanning decades, language modeling has progressed from initial statistical language models (SLMs) to the contemporary landscape of large language models (LLMs). Notably, the swift evolution of L…

    Submitted 11 June, 2024; v1 submitted 9 February, 2024; originally announced February 2024.

  26. arXiv:2402.03628  [pdf, other]

    cs.CL

    Professional Agents -- Evolving Large Language Models into Autonomous Experts with Human-Level Competencies

    Authors: Zhixuan Chu, Yan Wang, Feng Zhu, Lu Yu, Longfei Li, Jinjie Gu

    Abstract: The advent of large language models (LLMs) such as ChatGPT, PaLM, and GPT-4 has catalyzed remarkable advances in natural language processing, demonstrating human-like language fluency and reasoning capacities. This position paper introduces the concept of Professional Agents (PAgents), an application framework harnessing LLM capabilities to create autonomous agents with controllable, specialized,…

    Submitted 5 February, 2024; originally announced February 2024.

    Comments: 14 pages, 1 figure

  27. arXiv:2401.15641  [pdf, other]

    cs.IR cs.CL

    PRE: A Peer Review Based Large Language Model Evaluator

    Authors: Zhumin Chu, Qingyao Ai, Yiteng Tu, Haitao Li, Yiqun Liu

    Abstract: The impressive performance of large language models (LLMs) has attracted considerable attention from the academic and industrial communities. Besides how to construct and train LLMs, how to effectively evaluate and compare the capacity of LLMs has also been well recognized as an important yet difficult problem. Existing paradigms rely on either human annotators or model-based evaluators to evaluat…

    Submitted 3 June, 2024; v1 submitted 28 January, 2024; originally announced January 2024.

    Comments: 11 pages

  28. arXiv:2401.13561  [pdf, other]

    eess.SY

    Pricing of Short Circuit Current in High IBR-Penetrated System

    Authors: Zhongda Chu, Jingyi Wu, Fei Teng

    Abstract: With the growing penetration of Inverter-Based Resources (IBRs) in power systems, stability service markets have emerged to incentivize technologies that ensure power system stability and reliability. Among the various challenges faced in power system operation and stability, a prominent issue arising from the increasing integration of large-scale IBRs is the significant reduction of the Short-Circ…

    Submitted 24 January, 2024; originally announced January 2024.

  29. arXiv:2401.11968  [pdf, other]

    cs.CR

    Effective Intrusion Detection in Heterogeneous Internet-of-Things Networks via Ensemble Knowledge Distillation-based Federated Learning

    Authors: Jiyuan Shen, Wenzhuo Yang, Zhaowei Chu, Jiani Fan, Dusit Niyato, Kwok-Yan Lam

    Abstract: With the rapid development of low-cost consumer electronics and cloud computing, Internet-of-Things (IoT) devices are widely adopted for supporting next-generation distributed systems such as smart cities and industrial control systems. IoT devices are often susceptible to cyber attacks due to their open deployment environment and limited computing capabilities for stringent security controls. Hen…

    Submitted 22 January, 2024; originally announced January 2024.

  30. arXiv:2401.08217  [pdf, other]

    cs.IR

    LLM-Guided Multi-View Hypergraph Learning for Human-Centric Explainable Recommendation

    Authors: Zhixuan Chu, Yan Wang, Qing Cui, Longfei Li, Wenqing Chen, Zhan Qin, Kui Ren

    Abstract: As personalized recommendation systems become vital in the age of information overload, traditional methods relying solely on historical user interactions often fail to fully capture the multifaceted nature of human interests. To enable more human-centric modeling of user preferences, this work proposes a novel explainable recommendation framework, i.e., LLMHG, synergizing the reasoning capabiliti…

    Submitted 29 March, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

    Comments: 14 pages, 5 figures

  31. arXiv:2312.16113  [pdf, other]

    cs.LG cs.AI

    Task-Driven Causal Feature Distillation: Towards Trustworthy Risk Prediction

    Authors: Zhixuan Chu, Mengxuan Hu, Qing Cui, Longfei Li, Sheng Li

    Abstract: Since artificial intelligence has seen tremendous recent successes in many areas, it has sparked great interest in its potential for trustworthy and interpretable risk prediction. However, most models lack causal reasoning and struggle with class imbalance, leading to poor precision and recall. To address this, we propose a Task-Driven Causal Feature Distillation model (TDCFD) to transform origina…

    Submitted 21 January, 2024; v1 submitted 20 December, 2023; originally announced December 2023.

    Comments: Proceedings of the 2024 AAAI Conference on Artificial Intelligence

  32. arXiv:2312.06677  [pdf, other]

    cs.LG cs.AI cs.CL

    Intelligent Virtual Assistants with LLM-based Process Automation

    Authors: Yanchu Guan, Dong Wang, Zhixuan Chu, Shiyu Wang, Feiyue Ni, Ruihua Song, Longfei Li, Jinjie Gu, Chenyi Zhuang

    Abstract: While intelligent virtual assistants like Siri, Alexa, and Google Assistant have become ubiquitous in modern life, they still face limitations in their ability to follow multi-step instructions and accomplish complex goals articulated in natural language. However, recent breakthroughs in large language models (LLMs) show promise for overcoming existing barriers by enhancing natural language proces…

    Submitted 4 December, 2023; originally announced December 2023.

  33. arXiv:2312.04854  [pdf, other]

    cs.CL cs.AI

    Apollo's Oracle: Retrieval-Augmented Reasoning in Multi-Agent Debates

    Authors: Haotian Wang, Xiyuan Du, Weijiang Yu, Qianglong Chen, Kun Zhu, Zheng Chu, Lian Yan, Yi Guan

    Abstract: Multi-agent debate systems are designed to derive accurate and consistent conclusions through adversarial interactions among agents. However, these systems often encounter challenges due to cognitive constraints, manifesting as (1) agents' obstinate adherence to incorrect viewpoints and (2) their propensity to abandon correct viewpoints. These issues are primarily responsible for the ineffectivene…

    Submitted 8 December, 2023; originally announced December 2023.

    Comments: 16 pages, 7 figures

  34. arXiv:2312.00421  [pdf, other]

    cs.LO

    A Semi-Tensor Product based Circuit Simulation for SAT-sweeping

    Authors: Hongyang Pan, Ruibing Zhang, Yinshui Xia, Lunyao Wang, Fan Yang, Xuan Zeng, Zhufei Chu

    Abstract: In recent years, circuit simulators and Boolean satisfiability (SAT) solvers have been tightly integrated to provide efficient logic synthesis and verification. Circuit simulation can generate highly expressive simulation patterns that can either enumerate or filter out most candidates for synthesis. Subsequently, SAT solvers are employed to check those that remain, thereby making the logic synthe…

    Submitted 1 December, 2023; originally announced December 2023.

    Comments: In this 6 page manuscript, we introduce a novel Semi-Tensor Product based circuit simulation for SAT-sweeping in DATE'24

  35. arXiv:2311.17667  [pdf, other]

    cs.CL cs.AI

    TimeBench: A Comprehensive Evaluation of Temporal Reasoning Abilities in Large Language Models

    Authors: Zheng Chu, Jingchang Chen, Qianglong Chen, Weijiang Yu, Haotian Wang, Ming Liu, Bing Qin

    Abstract: Grasping the concept of time is a fundamental facet of human cognition, indispensable for truly comprehending the intricacies of the world. Previous studies typically focus on specific aspects of time, lacking a comprehensive temporal reasoning benchmark. To address this, we propose TimeBench, a comprehensive hierarchical temporal reasoning benchmark that covers a broad spectrum of temporal reason…

    Submitted 28 June, 2024; v1 submitted 29 November, 2023; originally announced November 2023.

    Comments: Accepted to ACL 2024

  36. arXiv:2311.10389  [pdf, other]

    cs.CV

    Two-Factor Authentication Approach Based on Behavior Patterns for Defeating Puppet Attacks

    Authors: Wenhao Wang, Guyue Li, Zhiming Chu, Haobo Li, Daniele Faccio

    Abstract: Fingerprint traits are widely recognized for their unique qualities and security benefits. Despite their extensive use, fingerprint features can be vulnerable to puppet attacks, where attackers manipulate a reluctant but genuine user into completing the authentication process. Defending against such attacks is challenging due to the coexistence of a legitimate identity and an illegitimate intent.…

    Submitted 17 November, 2023; originally announced November 2023.

  37. arXiv:2311.04816  [pdf, other]

    cs.CL cs.AI

    MTGER: Multi-view Temporal Graph Enhanced Temporal Reasoning over Time-Involved Document

    Authors: Zheng Chu, Zekun Wang, Jiafeng Liang, Ming Liu, Bing Qin

    Abstract: The facts and time in the document are intricately intertwined, making temporal reasoning over documents challenging. Previous work models time implicitly, making it difficult to handle such complex relationships. To address this issue, we propose MTGER, a novel Multi-view Temporal Graph Enhanced Temporal Reasoning framework for temporal reasoning over time-involved documents. Concretely, MTGER ex…

    Submitted 8 November, 2023; originally announced November 2023.

    Comments: Findings of EMNLP 2023, long paper

  38. arXiv:2311.03282  [pdf, ps, other]

    cs.IT eess.SP

    Resource Allocation for RIS-Empowered Wireless Communications: Low-Complexity and Robust Designs

    Authors: Ming Zeng, Wanming Hao, Zhangjie Peng, Zheng Chu, Xingwang Li, Changsheng You, Cunhua Pan

    Abstract: This article delves into advancements in resource allocation techniques tailored for systems utilizing reconfigurable intelligent surfaces (RIS), with a primary focus on achieving low-complexity and resilient solutions. The investigation of low-complexity approaches for RIS holds significant relevance, primarily owing to the intricate characteristics inherent in RIS-based systems and the need of d…

    Submitted 6 November, 2023; originally announced November 2023.

    Comments: submitted to IEEE WCM

  39. A New Extrapolation Economy Cascadic Multigrid Method for Image Restoration Problems

    Authors: Zhaoteng Chu, Ziqi Yan, Chenliang Li

    Abstract: In this paper, a new extrapolation economy cascadic multigrid method is proposed to solve the image restoration model. The new method combines the new extrapolation formula and quadratic interpolation to design a nonlinear prolongation operator, which provides more accurate initial values for the fine grid level. An edge-preserving denoising operator is constructed to remove noise and preserve ima…

    Submitted 6 November, 2023; originally announced November 2023.

    MSC Class: 65F10; 65N55

    Journal ref: American Journal of Computational Mathematics, 13, 323-341 (2023)

  40. arXiv:2310.20109  [pdf, other]

    cs.IR

    Multi-Objective Intrinsic Reward Learning for Conversational Recommender Systems

    Authors: Zhendong Chu, Nan Wang, Hongning Wang

    Abstract: Conversational Recommender Systems (CRS) actively elicit user preferences to generate adaptive recommendations. Mainstream reinforcement learning-based CRS solutions heavily rely on handcrafted reward functions, which may not be aligned with user intent in CRS tasks. Therefore, the design of task-specific rewards is critical to facilitate CRS policy learning, which remains largely under-explored i…

    Submitted 30 October, 2023; originally announced October 2023.

    Comments: 11 pages

  41. arXiv:2310.17784  [pdf, other]

    cs.CL cs.AI cs.LG

    Data-Centric Financial Large Language Models

    Authors: Zhixuan Chu, Huaiyu Guo, Xinyuan Zhou, Yijia Wang, Fei Yu, Hong Chen, Wanqing Xu, Xin Lu, Qing Cui, Longfei Li, Jun Zhou, Sheng Li

    Abstract: Large language models (LLMs) show promise for natural language tasks but struggle when applied directly to complex domains like finance. LLMs have difficulty reasoning about and integrating all relevant information. We propose a data-centric approach to enable LLMs to better handle financial tasks. Our key insight is that rather than overloading the LLM with everything at once, it is more effectiv…

    Submitted 13 November, 2023; v1 submitted 7 October, 2023; originally announced October 2023.

  42. arXiv:2310.16790  [pdf, other]

    cs.CL cs.AI cs.LG

    Improving a Named Entity Recognizer Trained on Noisy Data with a Few Clean Instances

    Authors: Zhendong Chu, Ruiyi Zhang, Tong Yu, Rajiv Jain, Vlad I Morariu, Jiuxiang Gu, Ani Nenkova

    Abstract: To achieve state-of-the-art performance, one still needs to train NER models on large-scale, high-quality annotated data, an asset that is both costly and time-intensive to accumulate. In contrast, real-world applications often resort to massive low-quality labeled data through non-expert annotators via crowdsourcing and external knowledge bases via distant supervision as a cost-effective alternat… ▽ More

    Submitted 25 October, 2023; originally announced October 2023.

    Comments: 14 pages

  43. arXiv:2310.11295  [pdf, other]

    cs.CV cs.CG

    CorrTalk: Correlation Between Hierarchical Speech and Facial Activity Variances for 3D Animation

    Authors: Zhaojie Chu, Kailing Guo, Xiaofen Xing, Yilin Lan, Bolun Cai, Xiangmin Xu

    Abstract: Speech-driven 3D facial animation is a challenging cross-modal task that has attracted growing research interest. During speaking activities, the mouth displays strong motions, while the other facial regions typically demonstrate comparatively weak activity levels. Existing approaches often simplify the process by directly mapping single-level speech features to the entire facial animation, which… ▽ More

    Submitted 17 October, 2023; originally announced October 2023.

  44. arXiv:2310.04993  [pdf, other]

    cs.LG

    Prompt-augmented Temporal Point Process for Streaming Event Sequence

    Authors: Siqiao Xue, Yan Wang, Zhixuan Chu, Xiaoming Shi, Caigao Jiang, Hongyan Hao, Gangwei Jiang, Xiaoyun Feng, James Y. Zhang, Jun Zhou

    Abstract: Neural Temporal Point Processes (TPPs) are the prevalent paradigm for modeling continuous-time event sequences, such as user activities on the web and financial transactions. In real-world applications, event data is typically received in a streaming manner, where the distribution of patterns may shift over time. Additionally, privacy and memory constraints are commonly observed in p… ▽ More

    Submitted 13 October, 2023; v1 submitted 7 October, 2023; originally announced October 2023.

    Comments: NeurIPS 2023 camera ready version

  45. arXiv:2310.01728  [pdf, other]

    cs.LG cs.AI

    Time-LLM: Time Series Forecasting by Reprogramming Large Language Models

    Authors: Ming Jin, Shiyu Wang, Lintao Ma, Zhixuan Chu, James Y. Zhang, Xiaoming Shi, Pin-Yu Chen, Yuxuan Liang, Yuan-Fang Li, Shirui Pan, Qingsong Wen

    Abstract: Time series forecasting holds significant importance in many real-world dynamic systems and has been extensively studied. Unlike natural language processing (NLP) and computer vision (CV), where a single large model can tackle multiple tasks, models for time series forecasting are often specialized, necessitating distinct designs for different tasks and applications. While pre-trained foundation mode… ▽ More

    Submitted 29 January, 2024; v1 submitted 2 October, 2023; originally announced October 2023.

    Comments: Accepted by the 12th International Conference on Learning Representations (ICLR 2024)

  46. arXiv:2309.15402  [pdf, other]

    cs.CL cs.AI

    Navigate through Enigmatic Labyrinth A Survey of Chain of Thought Reasoning: Advances, Frontiers and Future

    Authors: Zheng Chu, Jingchang Chen, Qianglong Chen, Weijiang Yu, Tao He, Haotian Wang, Weihua Peng, Ming Liu, Bing Qin, Ting Liu

    Abstract: Reasoning, a fundamental cognitive process integral to human intelligence, has garnered substantial interest within artificial intelligence. Notably, recent studies have revealed that chain-of-thought prompting significantly enhances LLMs' reasoning capabilities, which has attracted widespread attention from both academia and industry. In this paper, we systematically investigate relevant research, su… ▽ More

    Submitted 5 June, 2024; v1 submitted 27 September, 2023; originally announced September 2023.

    Comments: Accepted to ACL 2024

  47. Monotonic Neural Ordinary Differential Equation: Time-series Forecasting for Cumulative Data

    Authors: Zhichao Chen, Leilei Ding, Zhixuan Chu, Yucheng Qi, Jianmin Huang, Hao Wang

    Abstract: Time-Series Forecasting based on Cumulative Data (TSFCD) is a crucial problem in decision-making across various industrial scenarios. However, existing time-series forecasting methods often overlook two important characteristics of cumulative data, namely monotonicity and irregularity, which limit their practical applicability. To address this limitation, we propose a principled approach called Mo… ▽ More

    Submitted 23 September, 2023; originally announced September 2023.

    Comments: Accepted as CIKM'23 Applied Research Track

  48. arXiv:2309.12424  [pdf, other]

    cs.CV

    DualToken-ViT: Position-aware Efficient Vision Transformer with Dual Token Fusion

    Authors: Zhenzhen Chu, Jiayu Chen, Cen Chen, Chengyu Wang, Ziheng Wu, Jun Huang, Weining Qian

    Abstract: Self-attention-based vision transformers (ViTs) have emerged as a highly competitive architecture in computer vision. Unlike convolutional neural networks (CNNs), ViTs are capable of global information sharing. With the development of various structures of ViTs, ViTs are increasingly advantageous for many vision tasks. However, the quadratic complexity of self-attention renders ViTs computationall… ▽ More

    Submitted 21 September, 2023; originally announced September 2023.

  49. arXiv:2309.07109  [pdf, ps, other]

    hep-ex astro-ph.HE hep-ph

    Real-time Monitoring for the Next Core-Collapse Supernova in JUNO

    Authors: Angel Abusleme, Thomas Adam, Shakeel Ahmad, Rizwan Ahmed, Sebastiano Aiello, Muhammad Akram, Abid Aleem, Fengpeng An, Qi An, Giuseppe Andronico, Nikolay Anfimov, Vito Antonelli, Tatiana Antoshkina, Burin Asavapibhop, João Pedro Athayde Marcondes de André, Didier Auguste, Weidong Bai, Nikita Balashov, Wander Baldini, Andrea Barresi, Davide Basilico, Eric Baussan, Marco Bellato, Marco Beretta, Antonio Bergnoli, et al. (606 additional authors not shown)

    Abstract: The core-collapse supernova (CCSN) is considered one of the most energetic astrophysical events in the universe. The early and prompt detection of neutrinos before (pre-SN) and during the supernova (SN) burst presents a unique opportunity for multi-messenger observations of CCSN events. In this study, we describe the monitoring concept and present the sensitivity of the system to pre-SN and SN neu… ▽ More

    Submitted 4 December, 2023; v1 submitted 13 September, 2023; originally announced September 2023.

    Comments: 24 pages, 9 figures, accepted for publication in JCAP

  50. arXiv:2309.06909  [pdf, other]

    eess.SP

    Intelligent Reflective Surface Assisted Integrated Sensing and Wireless Power Transfer

    Authors: Zheng Li, Zhengyu Zhu, Zheng Chu, Yingying Guan, De Mi, Fan Liu, Lie-Liang Yang

    Abstract: Wireless sensing and wireless energy are enablers to pave the way for smart transportation and a greener future. In this paper, an intelligent reflecting surface (IRS) assisted integrated sensing and wireless power transfer (ISWPT) system is investigated, where the transmitter in transportation infrastructure networks sends signals to sense multiple targets and simultaneously to multiple energy ha… ▽ More

    Submitted 30 November, 2023; v1 submitted 13 September, 2023; originally announced September 2023.

    Comments: First, the simulation has some errors and needs to be checked. Second, the author relationship between Zheng Li and Zheng Chu needs to be corrected.

    ACM Class: H.4.3