Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to main content

Showing 1–50 of 498 results for author: Chang, Y

Searching in archive cs. Search in all archives.
.
  1. arXiv:2409.02512  [pdf, other

    cs.LG cs.AI

    Continual Diffuser (CoD): Mastering Continual Offline Reinforcement Learning with Experience Rehearsal

    Authors: Jifeng Hu, Li Shen, Sili Huang, Zhejian Yang, Hechang Chen, Lichao Sun, Yi Chang, Dacheng Tao

    Abstract: Artificial neural networks, especially recent diffusion-based models, have shown remarkable superiority in gaming, control, and QA systems, where the training tasks' datasets are usually static. However, in real-world applications, such as robotic control of reinforcement learning (RL), the tasks are changing, and new tasks arise in a sequential order. This situation poses the new challenge of pla… ▽ More

    Submitted 4 September, 2024; originally announced September 2024.

  2. arXiv:2409.01007  [pdf, other

    cs.AI

    Unlocking the Wisdom of Large Language Models: An Introduction to The Path to Artificial General Intelligence

    Authors: Edward Y. Chang

    Abstract: This booklet, "Unlocking the Wisdom of Large Language Models," serves as an introduction to the comprehensive work "The Path to Artificial General Intelligence." Through a series of nine aphorisms, we distill key insights and principles that underpin the larger exploration of AI's future through adversarial LLM dialogue. We propose this approach as a potential path to realizing artificial general… ▽ More

    Submitted 2 September, 2024; originally announced September 2024.

    ACM Class: I.2.7

  3. arXiv:2408.15777  [pdf, other

    cs.CV

    A Survey on Facial Expression Recognition of Static and Dynamic Emotions

    Authors: Yan Wang, Shaoqi Yan, Yang Liu, Wei Song, Jing Liu, Yang Chang, Xinji Mai, Xiping Hu, Wenqiang Zhang, Zhongxue Gan

    Abstract: Facial expression recognition (FER) aims to analyze emotional states from static images and dynamic sequences, which is pivotal in enhancing anthropomorphic communication among humans, robots, and digital avatars by leveraging AI technologies. As the FER field evolves from controlled laboratory environments to more complex in-the-wild scenarios, advanced methods have been rapidly developed and new… ▽ More

    Submitted 28 August, 2024; originally announced August 2024.

  4. arXiv:2408.14575  [pdf, other

    cs.AI

    EVINCE: Optimizing Adversarial LLM Dialogues via Conditional Statistics and Information Theory

    Authors: Edward Y. Chang

    Abstract: This paper introduces EVINCE (Entropy and Variation IN Conditional Exchanges), a dialogue framework advancing Artificial General Intelligence (AGI) by enhancing versatility, adaptivity, and reasoning in large language models (LLMs). Leveraging adversarial debate and a novel dual entropy theory, EVINCE improves prediction accuracy, robustness, and stability in LLMs by integrating statistical modeli… ▽ More

    Submitted 26 August, 2024; originally announced August 2024.

    Comments: 19 pages, 7 figures, four tables

    ACM Class: I.2.7

  5. arXiv:2408.13464  [pdf, other

    cs.AI cs.CL cs.LG

    Uncovering Biases with Reflective Large Language Models

    Authors: Edward Y. Chang

    Abstract: Biases inherent in human endeavors pose significant challenges for machine learning, particularly in supervised learning that relies on potentially biased "ground truth" data. This reliance, coupled with models' tendency to generalize based on statistical maximal likelihood, can propagate and amplify biases, exacerbating societal issues. To address this, our study proposes a reflective methodology… ▽ More

    Submitted 24 August, 2024; originally announced August 2024.

    Comments: 16 pages, 3 figures, 8 tables

    ACM Class: I.2.7

  6. arXiv:2408.11451  [pdf, other

    cs.AI

    Bidirectional Gated Mamba for Sequential Recommendation

    Authors: Ziwei Liu, Qidong Liu, Yejing Wang, Wanyu Wang, Pengyue Jia, Maolin Wang, Zitao Liu, Yi Chang, Xiangyu Zhao

    Abstract: In various domains, Sequential Recommender Systems (SRS) have become essential due to their superior capability to discern intricate user preferences. Typically, SRS utilize transformer-based architectures to forecast the subsequent item within a sequence. Nevertheless, the quadratic computational complexity inherent in these models often leads to inefficiencies, hindering the achievement of real-… ▽ More

    Submitted 22 August, 2024; v1 submitted 21 August, 2024; originally announced August 2024.

  7. arXiv:2408.10908  [pdf, other

    cs.RO cs.HC

    Enhancing End-to-End Autonomous Driving Systems Through Synchronized Human Behavior Data

    Authors: Yiqun Duan, Zhuoli Zhuang, Jinzhao Zhou, Yu-Cheng Chang, Yu-Kai Wang, Chin-Teng Lin

    Abstract: This paper presents a pioneering exploration into the integration of fine-grained human supervision within the autonomous driving domain to enhance system performance. The current advances in End-to-End autonomous driving normally are data-driven and rely on given expert trials. However, this reliance limits the systems' generalizability and their ability to earn human trust. Addressing this gap,… ▽ More

    Submitted 20 August, 2024; originally announced August 2024.

  8. arXiv:2408.10668  [pdf, other

    cs.CR cs.AI

    Probing the Safety Response Boundary of Large Language Models via Unsafe Decoding Path Generation

    Authors: Haoyu Wang, Bingzhe Wu, Yatao Bian, Yongzhe Chang, Xueqian Wang, Peilin Zhao

    Abstract: Large Language Models (LLMs) are implicit troublemakers. While they provide valuable insights and assist in problem-solving, they can also potentially serve as a resource for malicious activities. Implementing safety alignment could mitigate the risk of LLMs generating harmful responses. We argue that: even when an LLM appears to successfully block harmful queries, there may still be hidden vulner… ▽ More

    Submitted 26 August, 2024; v1 submitted 20 August, 2024; originally announced August 2024.

  9. arXiv:2408.10504  [pdf, other

    cs.AI

    QPO: Query-dependent Prompt Optimization via Multi-Loop Offline Reinforcement Learning

    Authors: Yilun Kong, Hangyu Mao, Qi Zhao, Bin Zhang, Jingqing Ruan, Li Shen, Yongzhe Chang, Xueqian Wang, Rui Zhao, Dacheng Tao

    Abstract: Prompt engineering has demonstrated remarkable success in enhancing the performance of large language models (LLMs) across diverse tasks. However, most existing prompt optimization methods only focus on the task-level performance, overlooking the importance of query-preferred prompts, which leads to suboptimal performances. Additionally, these methods rely heavily on frequent interactions with LLM… ▽ More

    Submitted 19 August, 2024; originally announced August 2024.

  10. arXiv:2408.10099  [pdf, other

    cs.GR

    Neural Representation of Shape-Dependent Laplacian Eigenfunctions

    Authors: Yue Chang, Otman Benchekroun, Maurizio M. Chiaramonte, Peter Yichen Chen, Eitan Grinspun

    Abstract: The eigenfunctions of the Laplace operator are essential in mathematical physics, engineering, and geometry processing. Typically, these are computed by discretizing the domain and performing eigendecomposition, tying the results to a specific mesh. However, this method is unsuitable for continuously-parameterized shapes. We propose a novel representation for eigenfunctions in continuously-param… ▽ More

    Submitted 19 August, 2024; originally announced August 2024.

  11. arXiv:2408.09591  [pdf, other

    cs.DS

    Pre-assignment problem for unique minimum vertex cover on bounded clique-width graphs

    Authors: Shinwoo An, Yeonsu Chang, Kyungjin Cho, O-joung Kwon, Myounghwan Lee, Eunjin Oh, Hyeonjun Shin

    Abstract: Horiyama et al. (AAAI 2024) considered the problem of generating instances with a unique minimum vertex cover under certain conditions. The Pre-assignment for Uniquification of Minimum Vertex Cover problem (shortly PAU-VC) is the problem, for given a graph $G$, to find a minimum set $S$ of vertices in $G$ such that there is a unique minimum vertex cover of $G$ containing $S$. We show that PAU-VC i… ▽ More

    Submitted 22 August, 2024; v1 submitted 18 August, 2024; originally announced August 2024.

    Comments: 19 pages, 3 figures

  12. arXiv:2408.08815  [pdf, other

    cs.LG

    An Empirical Examination of Balancing Strategy for Counterfactual Estimation on Time Series

    Authors: Qiang Huang, Chuizheng Meng, Defu Cao, Biwei Huang, Yi Chang, Yan Liu

    Abstract: Counterfactual estimation from observations represents a critical endeavor in numerous application fields, such as healthcare and finance, with the primary challenge being the mitigation of treatment bias. The balancing strategy aimed at reducing covariate disparities between different treatment groups serves as a universal solution. However, when it comes to the time series data, the effectivenes… ▽ More

    Submitted 16 August, 2024; originally announced August 2024.

    Comments: ICML 2024 Carema Ready Version. 20 Pages, 12 Figures, 10 Tables

  13. arXiv:2408.08500  [pdf, other

    cs.CV

    CoSEC: A Coaxial Stereo Event Camera Dataset for Autonomous Driving

    Authors: Shihan Peng, Hanyu Zhou, Hao Dong, Zhiwei Shi, Haoyue Liu, Yuxing Duan, Yi Chang, Luxin Yan

    Abstract: Conventional frame camera is the mainstream sensor of the autonomous driving scene perception, while it is limited in adverse conditions, such as low light. Event camera with high dynamic range has been applied in assisting frame camera for the multimodal fusion, which relies heavily on the pixel-level spatial alignment between various modalities. Typically, existing multimodal datasets mainly pla… ▽ More

    Submitted 15 August, 2024; originally announced August 2024.

    Comments: This work has been submitted to the IEEE for possible publication

  14. arXiv:2408.07083  [pdf, other

    cs.LG cs.AI

    Masked EEG Modeling for Driving Intention Prediction

    Authors: Jinzhao Zhou, Justin Sia, Yiqun Duan, Yu-Cheng Chang, Yu-Kai Wang, Chin-Teng Lin

    Abstract: Driving under drowsy conditions significantly escalates the risk of vehicular accidents. Although recent efforts have focused on using electroencephalography to detect drowsiness, helping prevent accidents caused by driving in such states, seamless human-machine interaction in driving scenarios requires a more versatile EEG-based system. This system should be capable of understanding a driver's in… ▽ More

    Submitted 7 August, 2024; originally announced August 2024.

  15. arXiv:2408.04679  [pdf, other

    cs.CL cs.AI cs.LG

    Towards Linguistic Neural Representation Learning and Sentence Retrieval from Electroencephalogram Recordings

    Authors: Jinzhao Zhou, Yiqun Duan, Ziyi Zhao, Yu-Cheng Chang, Yu-Kai Wang, Thomas Do, Chin-Teng Lin

    Abstract: Decoding linguistic information from non-invasive brain signals using EEG has gained increasing research attention due to its vast applicational potential. Recently, a number of works have adopted a generative-based framework to decode electroencephalogram (EEG) signals into sentences by utilizing the power generative capacity of pretrained large language models (LLMs). However, this approach has… ▽ More

    Submitted 7 August, 2024; originally announced August 2024.

  16. arXiv:2408.04556  [pdf, other

    cs.CL cs.LG

    Bias-Aware Low-Rank Adaptation: Mitigating Catastrophic Inheritance of Large Language Models

    Authors: Yupeng Chang, Yi Chang, Yuan Wu

    Abstract: Large language models (LLMs) have exhibited remarkable proficiency across a diverse array of natural language processing (NLP) tasks. However, adapting LLMs to downstream applications typically necessitates computationally intensive and memory-demanding fine-tuning procedures. To mitigate these burdens, parameter-efficient fine-tuning (PEFT) techniques have emerged as a promising approach to tailo… ▽ More

    Submitted 8 August, 2024; originally announced August 2024.

    Comments: Work in progress

  17. arXiv:2408.01551  [pdf, other

    cs.SD eess.AS

    PiCoGen2: Piano cover generation with transfer learning approach and weakly aligned data

    Authors: Chih-Pin Tan, Hsin Ai, Yi-Hsin Chang, Shuen-Huei Guan, Yi-Hsuan Yang

    Abstract: Piano cover generation aims to create a piano cover from a pop song. Existing approaches mainly employ supervised learning and the training demands strongly-aligned and paired song-to-piano data, which is built by remapping piano notes to song audio. This would, however, result in the loss of piano information and accordingly cause inconsistencies between the original and remapped piano versions.… ▽ More

    Submitted 2 August, 2024; originally announced August 2024.

    Comments: Accepted at the 25th International Society for Music Information Retrieval Conference (ISMIR), 2024

  18. arXiv:2408.00327  [pdf, other

    cs.AR

    Search-in-Memory (SiM): Reliable, Versatile, and Efficient Data Matching in SSD's NAND Flash Memory Chip for Data Indexing Acceleration

    Authors: Yun-Chih Chen, Yuan-Hao Chang, Tei-Wei Kuo

    Abstract: To index the increasing volume of data, modern data indexes are typically stored on SSDs and cached in DRAM. However, searching such an index has resulted in significant I/O traffic due to limited access locality and inefficient cache utilization. At the heart of index searching is the operation of filtering through vast data spans to isolate a small, relevant subset, which involves basic equality… ▽ More

    Submitted 2 August, 2024; v1 submitted 1 August, 2024; originally announced August 2024.

    Comments: This paper has been accepted for presentation at the The International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS) in September, 2024. An extended abstract of this paper was presented in Design, Automation & Test in Europe Conference & Exhibition (DATE), 2024

  19. arXiv:2407.16252  [pdf, other

    cs.CL cs.AI cs.CV

    LawLuo: A Chinese Law Firm Co-run by LLM Agents

    Authors: Jingyun Sun, Chengxiao Dai, Zhongze Luo, Yangbo Chang, Yang Li

    Abstract: Large Language Models (LLMs) demonstrate substantial potential in delivering legal consultation services to users without a legal background, attributed to their superior text comprehension and generation capabilities. Nonetheless, existing Chinese legal LLMs limit interaction to a single model-user dialogue, unlike the collaborative consultations typical of law firms, where multiple staff members… ▽ More

    Submitted 4 August, 2024; v1 submitted 23 July, 2024; originally announced July 2024.

    Comments: 11 pages, 13 figures, 2 tables

    ACM Class: I.2.1

  20. arXiv:2407.16095  [pdf, other

    cs.RO

    Robotically adjustable kinematics in a wrist-driven orthosis eases grasping across tasks

    Authors: Erin Y. Chang, Andrew I. W. McPherson, Hannah S. Stuart

    Abstract: Without finger function, people with C5-7 spinal cord injury (SCI) regularly utilize wrist extension to passively close the fingers and thumb together for grasping. Wearable assistive grasping devices often focus on this familiar wrist-driven technique to provide additional support and amplify grasp force. Despite recent research advances in modernizing these tools, people with SCI often abandon s… ▽ More

    Submitted 22 July, 2024; originally announced July 2024.

    Comments: 6 pages, 8 figures. Presented at the 2024 International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC)

  21. arXiv:2407.14136  [pdf, other

    cs.CV

    6DoF Head Pose Estimation through Explicit Bidirectional Interaction with Face Geometry

    Authors: Sungho Chun, Ju Yong Chang

    Abstract: This study addresses the nuanced challenge of estimating head translations within the context of six-degrees-of-freedom (6DoF) head pose estimation, placing emphasis on this aspect over the more commonly studied head rotations. Identifying a gap in existing methodologies, we recognized the underutilized potential synergy between facial geometry and head translation. To bridge this gap, we propose… ▽ More

    Submitted 19 July, 2024; originally announced July 2024.

  22. arXiv:2407.12068  [pdf, other

    cs.LG cs.AI

    Learning on Graphs with Large Language Models(LLMs): A Deep Dive into Model Robustness

    Authors: Kai Guo, Zewen Liu, Zhikai Chen, Hongzhi Wen, Wei Jin, Jiliang Tang, Yi Chang

    Abstract: Large Language Models (LLMs) have demonstrated remarkable performance across various natural language processing tasks. Recently, several LLMs-based pipelines have been developed to enhance learning on graphs with text attributes, showcasing promising performance. However, graphs are well-known to be susceptible to adversarial attacks and it remains unclear whether LLMs exhibit robustness in learn… ▽ More

    Submitted 28 July, 2024; v1 submitted 16 July, 2024; originally announced July 2024.

  23. arXiv:2407.10051  [pdf, ps, other

    cs.IT

    The weight distributions of some linear codes derived from Kloosterman sums

    Authors: Mengzhen Zhao, Yanxun Chang

    Abstract: Linear codes with few weights have applications in data storage systems, secret sharing schemes, and authentication codes. In this paper, some kinds of p-ary linear codes with few weights are constructed by use of the given de ning set, where p is a prime. Their weight distributions are determined based on Kloosterman sums over nite elds. In addition, some linear codes we given is minimal.

    Submitted 13 July, 2024; originally announced July 2024.

    Comments: 11 pages

    MSC Class: 94A24; 94B05

  24. arXiv:2407.08377  [pdf, other

    cs.CV

    Long-range Turbulence Mitigation: A Large-scale Dataset and A Coarse-to-fine Framework

    Authors: Shengqi Xu, Run Sun, Yi Chang, Shuning Cao, Xueyao Xiao, Luxin Yan

    Abstract: Long-range imaging inevitably suffers from atmospheric turbulence with severe geometric distortions due to random refraction of light. The further the distance, the more severe the disturbance. Despite existing research has achieved great progress in tackling short-range turbulence, there is less attention paid to long-range turbulence with significant distortions. To address this dilemma and adva… ▽ More

    Submitted 17 July, 2024; v1 submitted 11 July, 2024; originally announced July 2024.

    Comments: This paper is accepted by ECCV 2024

  25. arXiv:2407.05364  [pdf, other

    cs.LG

    PTaRL: Prototype-based Tabular Representation Learning via Space Calibration

    Authors: Hangting Ye, Wei Fan, Xiaozhuang Song, Shun Zheng, He Zhao, Dandan Guo, Yi Chang

    Abstract: Tabular data have been playing a mostly important role in diverse real-world fields, such as healthcare, engineering, finance, etc. With the recent success of deep learning, many tabular machine learning (ML) methods based on deep networks (e.g., Transformer, ResNet) have achieved competitive performance on tabular benchmarks. However, existing deep tabular ML methods suffer from the representatio… ▽ More

    Submitted 15 July, 2024; v1 submitted 7 July, 2024; originally announced July 2024.

    Comments: Accepted by ICLR 2024

  26. arXiv:2407.03925  [pdf, other

    cs.LG

    Reduced-Order Neural Operators: Learning Lagrangian Dynamics on Highly Sparse Graphs

    Authors: Hrishikesh Viswanath, Yue Chang, Julius Berner, Peter Yichen Chen, Aniket Bera

    Abstract: We present a neural operator architecture to simulate Lagrangian dynamics, such as fluid flow, granular flows, and elastoplasticity. Traditional numerical methods, such as the finite element method (FEM), suffer from long run times and large memory consumption. On the other hand, approaches based on graph neural networks are faster but still suffer from long computation times on dense graphs, whic… ▽ More

    Submitted 4 July, 2024; originally announced July 2024.

  27. arXiv:2407.00383  [pdf, other

    cs.LG cs.AI

    FANFOLD: Graph Normalizing Flows-driven Asymmetric Network for Unsupervised Graph-Level Anomaly Detection

    Authors: Rui Cao, Shijie Xue, Jindong Li, Qi Wang, Yi Chang

    Abstract: Unsupervised graph-level anomaly detection (UGAD) has attracted increasing interest due to its widespread application. In recent studies, knowledge distillation-based methods have been widely used in unsupervised anomaly detection to improve model efficiency and generalization. However, the inherent symmetry between the source (teacher) and target (student) networks typically results in consistent… ▽ More

    Submitted 29 June, 2024; originally announced July 2024.

  28. arXiv:2407.00118  [pdf, other

    cs.LG cs.AI

    From Efficient Multimodal Models to World Models: A Survey

    Authors: Xinji Mai, Zeng Tao, Junxiong Lin, Haoran Wang, Yang Chang, Yanlan Kang, Yan Wang, Wenqiang Zhang

    Abstract: Multimodal Large Models (MLMs) are becoming a significant research focus, combining powerful large language models with multimodal learning to perform complex tasks across different data modalities. This review explores the latest developments and challenges in MLMs, emphasizing their potential in achieving artificial general intelligence and as a pathway to world models. We provide an overview of… ▽ More

    Submitted 27 June, 2024; originally announced July 2024.

  29. arXiv:2406.19228  [pdf, other

    cs.CL cs.AI cs.LG

    Tools Fail: Detecting Silent Errors in Faulty Tools

    Authors: Jimin Sun, So Yeon Min, Yingshan Chang, Yonatan Bisk

    Abstract: Tools have become a mainstay of LLMs, allowing them to retrieve knowledge not in their weights, to perform tasks on the web, and even to control robots. However, most ontologies and surveys of tool-use have assumed the core challenge for LLMs is choosing the tool. Instead, we introduce a framework for tools more broadly which guides us to explore a model's ability to detect "silent" tool errors, a… ▽ More

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: 18 pages, 12 figures

  30. arXiv:2406.17386  [pdf, other

    math.OC cs.AI cs.LG

    Double Momentum Method for Lower-Level Constrained Bilevel Optimization

    Authors: Wanli Shi, Yi Chang, Bin Gu

    Abstract: Bilevel optimization (BO) has recently gained prominence in many machine learning applications due to its ability to capture the nested structure inherent in these problems. Recently, many hypergradient methods have been proposed as effective solutions for solving large-scale problems. However, current hypergradient methods for the lower-level constrained bilevel optimization (LCBO) problems need… ▽ More

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: 27pages, 9 figures

    Journal ref: Proceedings of the 41st International Conference on Machine Learning, PMLR 235:44838-44864, 2024

  31. arXiv:2406.15848  [pdf, other

    cs.CV

    Quality-guided Skin Tone Enhancement for Portrait Photography

    Authors: Shiqi Gao, Huiyu Duan, Xinyue Li, Kang Fu, Yicong Peng, Qihang Xu, Yuanyuan Chang, Jia Wang, Xiongkuo Min, Guangtao Zhai

    Abstract: In recent years, learning-based color and tone enhancement methods for photos have become increasingly popular. However, most learning-based image enhancement methods just learn a mapping from one distribution to another based on one dataset, lacking the ability to adjust images continuously and controllably. It is important to enable the learning-based enhancement models to adjust an image contin… ▽ More

    Submitted 22 June, 2024; originally announced June 2024.

  32. arXiv:2406.15119  [pdf, other

    cs.SD cs.AI eess.AS

    Speech Emotion Recognition under Resource Constraints with Data Distillation

    Authors: Yi Chang, Zhao Ren, Zhonghao Zhao, Thanh Tam Nguyen, Kun Qian, Tanja Schultz, Björn W. Schuller

    Abstract: Speech emotion recognition (SER) plays a crucial role in human-computer interaction. The emergence of edge devices in the Internet of Things (IoT) presents challenges in constructing intricate deep learning models due to constraints in memory and computational resources. Moreover, emotional speech data often contains private information, raising concerns about privacy leakage during the deployment… ▽ More

    Submitted 21 June, 2024; originally announced June 2024.

  33. arXiv:2406.14517  [pdf, other

    cs.LG cs.AI cs.CL cs.CR

    PostMark: A Robust Blackbox Watermark for Large Language Models

    Authors: Yapei Chang, Kalpesh Krishna, Amir Houmansadr, John Wieting, Mohit Iyyer

    Abstract: The most effective techniques to detect LLM-generated text rely on inserting a detectable signature -- or watermark -- during the model's decoding process. Most existing watermarking methods require access to the underlying LLM's logits, which LLM API providers are loath to share due to fears of model distillation. As such, these watermarks must be implemented independently by each LLM provider. I… ▽ More

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: preprint; 18 pages, 5 figures

  34. arXiv:2406.13153  [pdf, other

    cs.CV

    SwinStyleformer is a favorable choice for image inversion

    Authors: Jiawei Mao, Guangyi Zhao, Xuesong Yin, Yuanqi Chang

    Abstract: This paper proposes the first pure Transformer structure inversion network called SwinStyleformer, which can compensate for the shortcomings of the CNNs inversion framework by handling long-range dependencies and learning the global structure of objects. Experiments found that the inversion network with the Transformer backbone could not successfully invert the image. The above phenomena arise fro… ▽ More

    Submitted 18 June, 2024; originally announced June 2024.

  35. arXiv:2406.12587  [pdf, other

    cs.CV

    Restorer: Removing Multi-Degradation with All-Axis Attention and Prompt Guidance

    Authors: Jiawei Mao, Juncheng Wu, Yuyin Zhou, Xuesong Yin, Yuanqi Chang

    Abstract: There are many excellent solutions in image restoration.However, most methods require on training separate models to restore images with different types of degradation.Although existing all-in-one models effectively address multiple types of degradation simultaneously, their performance in real-world scenarios is still constrained by the task confusion problem.In this work, we attempt to address t… ▽ More

    Submitted 3 September, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

  36. arXiv:2406.12585  [pdf, other

    cs.CL cs.AI

    Breaking the Ceiling of the LLM Community by Treating Token Generation as a Classification for Ensembling

    Authors: Yao-Ching Yu, Chun-Chih Kuo, Ziqi Ye, Yu-Cheng Chang, Yueh-Se Li

    Abstract: Ensembling multiple models has always been an effective approach to push the limits of existing performance and is widely used in classification tasks by simply averaging the classification probability vectors from multiple classifiers to achieve better accuracy. However, in the thriving open-source Large Language Model (LLM) community, ensembling methods are rare and typically limited to ensembli… ▽ More

    Submitted 18 June, 2024; originally announced June 2024.

  37. arXiv:2406.10272  [pdf, other

    cs.CL cs.LG cs.SD eess.AS

    Connected Speech-Based Cognitive Assessment in Chinese and English

    Authors: Saturnino Luz, Sofia De La Fuente Garcia, Fasih Haider, Davida Fromm, Brian MacWhinney, Alyssa Lanzi, Ya-Ning Chang, Chia-Ju Chou, Yi-Chien Liu

    Abstract: We present a novel benchmark dataset and prediction tasks for investigating approaches to assess cognitive function through analysis of connected speech. The dataset consists of speech samples and clinical information for speakers of Mandarin Chinese and English with different levels of cognitive impairment as well as individuals with normal cognition. These data have been carefully matched by age… ▽ More

    Submitted 18 June, 2024; v1 submitted 11 June, 2024; originally announced June 2024.

    Comments: To appear in Proceedings of Interspeech 2024

    ACM Class: J.3; I.5.4

  38. arXiv:2406.08855  [pdf, other

    cs.RO

    Trajectory Planning for Autonomous Driving in Unstructured Scenarios Based on Graph Neural Network and Numerical Optimization

    Authors: Sumin Zhang, Kuo Li, Rui He, Zhiwei Meng, Yupeng Chang, Xiaosong Jin, Ri Bai

    Abstract: In unstructured environments, obstacles are diverse and lack lane markings, making trajectory planning for intelligent vehicles a challenging task. Traditional trajectory planning methods typically involve multiple stages, including path planning, speed planning, and trajectory optimization. These methods require the manual design of numerous parameters for each stage, resulting in significant wor… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

  39. arXiv:2406.05191  [pdf, other

    cs.CV

    DiffusionPID: Interpreting Diffusion via Partial Information Decomposition

    Authors: Shaurya Dewan, Rushikesh Zawar, Prakanshul Saxena, Yingshan Chang, Andrew Luo, Yonatan Bisk

    Abstract: Text-to-image diffusion models have made significant progress in generating naturalistic images from textual inputs, and demonstrate the capacity to learn and represent complex visual-semantic relationships. While these diffusion models have achieved remarkable success, the underlying mechanisms driving their performance are not yet fully accounted for, with many unanswered questions surrounding w… ▽ More

    Submitted 12 June, 2024; v1 submitted 7 June, 2024; originally announced June 2024.

  40. arXiv:2406.04368  [pdf, other

    cs.CL cs.AI cs.CY

    SocialNLP Fake-EmoReact 2021 Challenge Overview: Predicting Fake Tweets from Their Replies and GIFs

    Authors: Chien-Kun Huang, Yi-Ting Chang, Lun-Wei Ku, Cheng-Te Li, Hong-Han Shuai

    Abstract: This paper provides an overview of the Fake-EmoReact 2021 Challenge, held at the 9th SocialNLP Workshop, in conjunction with NAACL 2021. The challenge requires predicting the authenticity of tweets using reply context and augmented GIF categories from EmotionGIF dataset. We offer the Fake-EmoReact dataset with more than 453k as the experimental materials, where every tweet is labeled with authenti… ▽ More

    Submitted 31 May, 2024; originally announced June 2024.

  41. arXiv:2406.03102  [pdf, other

    cs.LG cs.AI

    DEER: A Delay-Resilient Framework for Reinforcement Learning with Variable Delays

    Authors: Bo Xia, Yilun Kong, Yongzhe Chang, Bo Yuan, Zhiheng Li, Xueqian Wang, Bin Liang

    Abstract: Classic reinforcement learning (RL) frequently confronts challenges in tasks involving delays, which cause a mismatch between received observations and subsequent actions, thereby deviating from the Markov assumption. Existing methods usually tackle this issue with end-to-end solutions using state augmentation. However, these black-box approaches often involve incomprehensible processes and redund… ▽ More

    Submitted 5 June, 2024; originally announced June 2024.

  42. arXiv:2406.01033  [pdf

    cs.CV cs.LG cs.MM

    Generalized Jersey Number Recognition Using Multi-task Learning With Orientation-guided Weight Refinement

    Authors: Yung-Hui Lin, Yu-Wen Chang, Huang-Chia Shih, Takahiro Ogawa

    Abstract: Jersey number recognition (JNR) has always been an important task in sports analytics. Improving recognition accuracy remains an ongoing challenge because images are subject to blurring, occlusion, deformity, and low resolution. Recent research has addressed these problems using number localization and optical character recognition. Some approaches apply player identification schemes to image sequ… ▽ More

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: 10 pages, 6 figures, 5 tables

  43. arXiv:2405.20404  [pdf, other

    cs.CL cs.LG

    XPrompt:Explaining Large Language Model's Generation via Joint Prompt Attribution

    Authors: Yurui Chang, Bochuan Cao, Yujia Wang, Jinghui Chen, Lu Lin

    Abstract: Large Language Models (LLMs) have demonstrated impressive performances in complex text generation tasks. However, the contribution of the input prompt to the generated content still remains obscure to humans, underscoring the necessity of elucidating and explaining the causality between input and output pairs. Existing works for providing prompt-specific explanation often confine model output to b… ▽ More

    Submitted 30 May, 2024; originally announced May 2024.

  44. arXiv:2405.20131  [pdf, other

    cs.CL cs.AI

    Language Models Need Inductive Biases to Count Inductively

    Authors: Yingshan Chang, Yonatan Bisk

    Abstract: Counting is a fundamental example of generalization, whether viewed through the mathematical lens of Peano's axioms defining the natural numbers or the cognitive science literature for children learning to count. The argument holds for both cases that learning to count means learning to count infinitely. While few papers have tried to distill transformer "reasoning" to the simplest case of countin… ▽ More

    Submitted 30 May, 2024; originally announced May 2024.

  45. arXiv:2405.19718  [pdf, other

    cs.CV

    LED: A Large-scale Real-world Paired Dataset for Event Camera Denoising

    Authors: Yuxing Duan, Shihan Peng, Lin Zhu, Wei Zhang, Yi Chang, Sheng Zhong, Luxin Yan

    Abstract: Event camera has significant advantages in capturing dynamic scene information while being prone to noise interference, particularly in challenging conditions like low threshold and low illumination. However, most existing research focuses on gentle situations, hindering event camera applications in realistic complex scenarios. To tackle this limitation and advance the field, we construct a new pa… ▽ More

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: Accepted by CVPR 2024

  46. arXiv:2405.18816  [pdf, other

    cs.CV cs.LG

    Flow Priors for Linear Inverse Problems via Iterative Corrupted Trajectory Matching

    Authors: Yasi Zhang, Peiyu Yu, Yaxuan Zhu, Yingshan Chang, Feng Gao, Ying Nian Wu, Oscar Leong

    Abstract: Generative models based on flow matching have attracted significant attention for their simplicity and superior performance in high-resolution image synthesis. By leveraging the instantaneous change-of-variables formula, one can directly compute image likelihoods from a learned flow, making them enticing candidates as priors for downstream tasks such as inverse problems. In particular, a natural a… ▽ More

    Submitted 29 May, 2024; originally announced May 2024.

  47. arXiv:2405.16755  [pdf, other

    cs.LG cs.AI cs.DB

    CHESS: Contextual Harnessing for Efficient SQL Synthesis

    Authors: Shayan Talaei, Mohammadreza Pourreza, Yu-Chen Chang, Azalia Mirhoseini, Amin Saberi

    Abstract: Utilizing large language models (LLMs) for transforming natural language questions into SQL queries (text-to-SQL) is a promising yet challenging approach, particularly when applied to real-world databases with complex and extensive schemas. In particular, effectively incorporating data catalogs and database values for SQL generation remains an obstacle, leading to suboptimal solutions. We address… ▽ More

    Submitted 27 June, 2024; v1 submitted 26 May, 2024; originally announced May 2024.

  48. arXiv:2405.15808  [pdf, other

    cs.AI

    Ensuring Ground Truth Accuracy in Healthcare with the EVINCE framework

    Authors: Edward Y. Chang

    Abstract: Misdiagnosis is a significant issue in healthcare, leading to harmful consequences for patients. The propagation of mislabeled data through machine learning models into clinical practice is unacceptable. This paper proposes EVINCE, a system designed to 1) improve diagnosis accuracy and 2) rectify misdiagnoses and minimize training data errors. EVINCE stands for Entropy Variation through Informatio… ▽ More

    Submitted 28 May, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

    Comments: 23 pages, 4 tables, 4 figures

    ACM Class: I.2.7

  49. arXiv:2405.14573  [pdf, other

    cs.AI cs.LG

    AndroidWorld: A Dynamic Benchmarking Environment for Autonomous Agents

    Authors: Christopher Rawles, Sarah Clinckemaillie, Yifan Chang, Jonathan Waltz, Gabrielle Lau, Marybeth Fair, Alice Li, William Bishop, Wei Li, Folawiyo Campbell-Ajala, Daniel Toyama, Robert Berry, Divya Tyamagundlu, Timothy Lillicrap, Oriana Riva

    Abstract: Autonomous agents that execute human tasks by controlling computers can enhance human productivity and application accessibility. However, progress in this field will be driven by realistic and reproducible benchmarks. We present AndroidWorld, a fully functional Android environment that provides reward signals for 116 programmatic tasks across 20 real-world Android apps. Unlike existing interactiv… ▽ More

    Submitted 10 June, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

  50. arXiv:2405.13368  [pdf, other

    cs.IT

    Static Deep Q-learning for Green Downlink C-RAN

    Authors: Yuchao Chang, Hongli Wang, Wen Chen, Yonghui Li, Naofal Al-Dhahir

    Abstract: Power saving is a main pillar in the operation of wireless communication systems. In this paper, we investigate cloud radio access network (C-RAN) capability to reduce power consumption based on the user equipment (UE) requirement. Aiming to save the long-term C-RAN energy consumption, an optimization problem is formulated to manage the downlink power without degrading the UE requirement by design… ▽ More

    Submitted 22 May, 2024; originally announced May 2024.