Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to main content

Showing 1–50 of 264 results for author: Tian, J

Searching in archive cs. Search in all archives.
.
  1. arXiv:2407.02452  [pdf, other

    cs.CR

    A Hardware-Friendly Shuffling Countermeasure Against Side-Channel Attacks for Kyber

    Authors: Dejun Xu, Kai Wang, Jing Tian

    Abstract: CRYSTALS-Kyber (a.k.a. Kyber) has been drafted to be standardized as the only key encapsulation mechanism (KEM) scheme by the national institute of standards and technology (NIST) to withstand attacks by large-scale quantum computers. However, the side-channel attack (SCA) on its implementation is still needed to be well considered for the upcoming migration. In this brief, we propose a secure and… ▽ More

    Submitted 2 July, 2024; originally announced July 2024.

  2. arXiv:2407.02253  [pdf, other

    cs.LG cs.CV

    Parameter-Selective Continual Test-Time Adaptation

    Authors: Jiaxu Tian, Fan Lyu

    Abstract: Continual Test-Time Adaptation (CTTA) aims to adapt a pretrained model to ever-changing environments during the test time under continuous domain shifts. Most existing CTTA approaches are based on the Mean Teacher (MT) structure, which contains a student and a teacher model, where the student is updated using the pseudo-labels from the teacher model, and the teacher is then updated by exponential… ▽ More

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: 17pages, 4 figures

  3. arXiv:2407.00837  [pdf, other

    cs.CL cs.AI cs.SD eess.AS

    Towards Robust Speech Representation Learning for Thousands of Languages

    Authors: William Chen, Wangyou Zhang, Yifan Peng, Xinjian Li, Jinchuan Tian, Jiatong Shi, Xuankai Chang, Soumi Maiti, Karen Livescu, Shinji Watanabe

    Abstract: Self-supervised learning (SSL) has helped extend speech technologies to more languages by reducing the need for labeled data. However, models are still far from supporting the world's 7000+ languages. We propose XEUS, a Cross-lingual Encoder for Universal Speech, trained on over 1 million hours of data across 4057 languages, extending the language coverage of SSL models 4-fold. We combine 1 millio… ▽ More

    Submitted 2 July, 2024; v1 submitted 30 June, 2024; originally announced July 2024.

    Comments: Updated affiliations; 20 pages

  4. arXiv:2406.16349  [pdf, other

    cs.LG

    AnnotatedTables: A Large Tabular Dataset with Language Model Annotations

    Authors: Yaojie Hu, Ilias Fountalis, Jin Tian, Nikolaos Vasiloglou

    Abstract: Tabular data is ubiquitous in real-world applications and abundant on the web, yet its annotation has traditionally required human labor, posing a significant scalability bottleneck for tabular machine learning. Our methodology can successfully annotate a large amount of tabular data and can be flexibly steered to generate various types of annotations based on specific research objectives, as we d… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

  5. arXiv:2406.15480  [pdf, other

    cs.CL cs.AI cs.LG

    On Giant's Shoulders: Effortless Weak to Strong by Dynamic Logits Fusion

    Authors: Chenghao Fan, Zhenyi Lu, Wei Wei, Jie Tian, Xiaoye Qu, Dangyang Chen, Yu Cheng

    Abstract: Efficient fine-tuning of large language models for task-specific applications is imperative, yet the vast number of parameters in these models makes their training increasingly challenging. Despite numerous proposals for effective methods, a substantial memory overhead remains for gradient computations during updates. \thm{Can we fine-tune a series of task-specific small models and transfer their… ▽ More

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: submit under review

  6. arXiv:2406.12548  [pdf, other

    cs.CL

    P-Tailor: Customizing Personality Traits for Language Models via Mixture of Specialized LoRA Experts

    Authors: Yuhao Dan, Jie Zhou, Qin Chen, Junfeng Tian, Liang He

    Abstract: Personalized large language models (LLMs) have attracted great attention in many applications, such as intelligent education and emotional support. Most work focuses on controlling the character settings based on the profile (e.g., age, skill, experience, and so on). Conversely, the psychological theory-based personality traits with implicit expression and behavior are not well modeled, limiting t… ▽ More

    Submitted 18 June, 2024; originally announced June 2024.

  7. arXiv:2406.09282  [pdf, other

    cs.CL cs.SD eess.AS

    On the Effects of Heterogeneous Data Sources on Speech-to-Text Foundation Models

    Authors: Jinchuan Tian, Yifan Peng, William Chen, Kwanghee Choi, Karen Livescu, Shinji Watanabe

    Abstract: The Open Whisper-style Speech Model (OWSM) series was introduced to achieve full transparency in building advanced speech-to-text (S2T) foundation models. To this end, OWSM models are trained on 25 public speech datasets, which are heterogeneous in multiple ways. In this study, we advance the OWSM series by introducing OWSM v3.2, which improves on prior models by investigating and addressing the i… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

  8. arXiv:2406.09095  [pdf, other

    cs.CL

    Modeling Comparative Logical Relation with Contrastive Learning for Text Generation

    Authors: Yuhao Dan, Junfeng Tian, Jie Zhou, Ming Yan, Ji Zhang, Qin Chen, Liang He

    Abstract: Data-to-Text Generation (D2T), a classic natural language generation problem, aims at producing fluent descriptions for structured input data, such as a table. Existing D2T works mainly focus on describing the superficial associative relations among entities, while ignoring the deep comparative logical relations, such as A is better than B in a certain aspect with a corresponding opinion, which is… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

  9. arXiv:2406.08641  [pdf, ps, other

    cs.SD cs.CL eess.AS

    ML-SUPERB 2.0: Benchmarking Multilingual Speech Models Across Modeling Constraints, Languages, and Datasets

    Authors: Jiatong Shi, Shih-Heng Wang, William Chen, Martijn Bartelds, Vanya Bannihatti Kumar, Jinchuan Tian, Xuankai Chang, Dan Jurafsky, Karen Livescu, Hung-yi Lee, Shinji Watanabe

    Abstract: ML-SUPERB evaluates self-supervised learning (SSL) models on the tasks of language identification and automatic speech recognition (ASR). This benchmark treats the models as feature extractors and uses a single shallow downstream model, which can be fine-tuned for a downstream task. However, real-world use cases may require different configurations. This paper presents ML-SUPERB~2.0, which is a ne… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: Accepted by Interspeech 2024

  10. arXiv:2406.07725  [pdf, ps, other

    cs.SD eess.AS

    The Interspeech 2024 Challenge on Speech Processing Using Discrete Units

    Authors: Xuankai Chang, Jiatong Shi, Jinchuan Tian, Yuning Wu, Yuxun Tang, Yihan Wu, Shinji Watanabe, Yossi Adi, Xie Chen, Qin Jin

    Abstract: Representing speech and audio signals in discrete units has become a compelling alternative to traditional high-dimensional feature vectors. Numerous studies have highlighted the efficacy of discrete units in various applications such as speech compression and restoration, speech recognition, and speech generation. To foster exploration in this domain, we introduce the Interspeech 2024 Challenge,… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: This manuscript has been accepted by Interspeech2024

  11. arXiv:2406.07001  [pdf, other

    cs.CL cs.AI

    Mitigating Boundary Ambiguity and Inherent Bias for Text Classification in the Era of Large Language Models

    Authors: Zhenyi Lu, Jie Tian, Wei Wei, Xiaoye Qu, Yu Cheng, Wenfeng xie, Dangyang Chen

    Abstract: Text classification is a crucial task encountered frequently in practical scenarios, yet it is still under-explored in the era of large language models (LLMs). This study shows that LLMs are vulnerable to changes in the number and arrangement of options in text classification. Our extensive empirical analyses reveal that the key bottleneck arises from ambiguous decision boundaries and inherent bia… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: ACL2024 findings

  12. arXiv:2406.06025  [pdf, other

    cs.SE cs.CL cs.LG

    RepoQA: Evaluating Long Context Code Understanding

    Authors: Jiawei Liu, Jia Le Tian, Vijay Daita, Yuxiang Wei, Yifeng Ding, Yuhan Katherine Wang, Jun Yang, Lingming Zhang

    Abstract: Recent advances have been improving the context windows of Large Language Models (LLMs). To quantify the real long-context capabilities of LLMs, evaluators such as the popular Needle in a Haystack have been developed to test LLMs over a large chunk of raw texts. While effective, current evaluations overlook the insight of how LLMs work with long-context code, i.e., repositories. To this end, we in… ▽ More

    Submitted 10 June, 2024; originally announced June 2024.

  13. arXiv:2406.06007  [pdf, other

    cs.LG cs.CL cs.CV cs.CY

    CARES: A Comprehensive Benchmark of Trustworthiness in Medical Vision Language Models

    Authors: Peng Xia, Ze Chen, Juanxi Tian, Yangrui Gong, Ruibo Hou, Yue Xu, Zhenbang Wu, Zhiyuan Fan, Yiyang Zhou, Kangyu Zhu, Wenhao Zheng, Zhaoyang Wang, Xiao Wang, Xuchao Zhang, Chetan Bansal, Marc Niethammer, Junzhou Huang, Hongtu Zhu, Yun Li, Jimeng Sun, Zongyuan Ge, Gang Li, James Zou, Huaxiu Yao

    Abstract: Artificial intelligence has significantly impacted medical applications, particularly with the advent of Medical Large Vision Language Models (Med-LVLMs), sparking optimism for the future of automated and personalized healthcare. However, the trustworthiness of Med-LVLMs remains unverified, posing significant risks for future model deployment. In this paper, we introduce CARES and aim to comprehen… ▽ More

    Submitted 10 June, 2024; originally announced June 2024.

  14. arXiv:2406.04647  [pdf, other

    cs.CV

    UVCPNet: A UAV-Vehicle Collaborative Perception Network for 3D Object Detection

    Authors: Yuchao Wang, Peirui Cheng, Pengju Tian, Ziyang Yuan, Liangjin Zhao, Jing Tian, Wensheng Wang, Zhirui Wang, Xian Sun

    Abstract: With the advancement of collaborative perception, the role of aerial-ground collaborative perception, a crucial component, is becoming increasingly important. The demand for collaborative perception across different perspectives to construct more comprehensive perceptual information is growing. However, challenges arise due to the disparities in the field of view between cross-domain agents and th… ▽ More

    Submitted 7 June, 2024; originally announced June 2024.

  15. arXiv:2406.03017  [pdf, other

    cs.CV

    DifAttack++: Query-Efficient Black-Box Adversarial Attack via Hierarchical Disentangled Feature Space in Cross-Domain

    Authors: Jun Liu, Jiantao Zhou, Jiandian Zeng, Jinyu Tian, Zheng Li

    Abstract: This work investigates efficient score-based black-box adversarial attacks with a high Attack Success Rate (\textbf{ASR}) and good generalizability. We design a novel attack method based on a hierarchical DIsentangled Feature space, called \textbf{DifAttack++}, which differs significantly from the existing ones operating over the entire feature space. Specifically, DifAttack++ firstly disentangles… ▽ More

    Submitted 1 July, 2024; v1 submitted 5 June, 2024; originally announced June 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2309.14585 An extension of the AAAI24 paper "DifAttack: Query-Efficient Black-Box Attack via Disentangled Feature Space."

  16. arXiv:2406.02969  [pdf, other

    cs.LG cs.AI cs.CL q-fin.CP q-fin.MF

    Filtered not Mixed: Stochastic Filtering-Based Online Gating for Mixture of Large Language Models

    Authors: Raeid Saqur, Anastasis Kratsios, Florian Krach, Yannick Limmer, Jacob-Junqi Tian, John Willes, Blanka Horvath, Frank Rudzicz

    Abstract: We propose MoE-F -- a formalised mechanism for combining $N$ pre-trained expert Large Language Models (LLMs) in online time-series prediction tasks by adaptively forecasting the best weighting of LLM predictions at every time step. Our mechanism leverages the conditional information in each expert's running performance to forecast the best combination of LLMs for predicting the time series in its… ▽ More

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: 29 pages, 5 Appendix sections

    MSC Class: 60J05; 60G35; 68T20; 68T42; 68T50 ACM Class: I.2.6; I.2.7; G.3

  17. arXiv:2405.20487  [pdf, ps, other

    cs.AI

    Probabilities of Causation for Continuous and Vector Variables

    Authors: Yuta Kawakami, Manabu Kuroki, Jin Tian

    Abstract: Probabilities of causation (PoC) are valuable concepts for explainable artificial intelligence and practical decision-making. PoC are originally defined for scalar binary variables. In this paper, we extend the concept of PoC to continuous treatment and outcome variables, and further generalize PoC to capture causal effects between multiple treatments and multiple outcomes. In addition, we conside… ▽ More

    Submitted 30 May, 2024; originally announced May 2024.

  18. arXiv:2405.15225  [pdf, other

    cs.CV

    Unbiased Faster R-CNN for Single-source Domain Generalized Object Detection

    Authors: Yajing Liu, Shijun Zhou, Xiyao Liu, Chunhui Hao, Baojie Fan, Jiandong Tian

    Abstract: Single-source domain generalization (SDG) for object detection is a challenging yet essential task as the distribution bias of the unseen domain degrades the algorithm performance significantly. However, existing methods attempt to extract domain-invariant features, neglecting that the biased data leads the network to learn biased features that are non-causal and poorly generalizable. To this end,… ▽ More

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: CVPR 2024

  19. arXiv:2405.11754  [pdf, other

    cs.CV

    Versatile Teacher: A Class-aware Teacher-student Framework for Cross-domain Adaptation

    Authors: Runou Yang, Tian Tian, Jinwen Tian

    Abstract: Addressing the challenge of domain shift between datasets is vital in maintaining model performance. In the context of cross-domain object detection, the teacher-student framework, a widely-used semi-supervised model, has shown significant accuracy improvements. However, existing methods often overlook class differences, treating all classes equally, resulting in suboptimal results. Furthermore, t… ▽ More

    Submitted 19 May, 2024; originally announced May 2024.

  20. arXiv:2405.08251  [pdf, other

    cs.CV

    Multimodal Collaboration Networks for Geospatial Vehicle Detection in Dense, Occluded, and Large-Scale Events

    Authors: Xin Wu, Zhanchao Huang, Li Wang, Jocelyn Chanussot, Jiaojiao Tian

    Abstract: In large-scale disaster events, the planning of optimal rescue routes depends on the object detection ability at the disaster scene, with one of the main challenges being the presence of dense and occluded objects. Existing methods, which are typically based on the RGB modality, struggle to distinguish targets with similar colors and textures in crowded environments and are unable to identify obsc… ▽ More

    Submitted 13 May, 2024; originally announced May 2024.

  21. arXiv:2405.06707  [pdf, other

    cs.CL cs.AI

    Hypothesis Testing Prompting Improves Deductive Reasoning in Large Language Models

    Authors: Yitian Li, Jidong Tian, Hao He, Yaohui Jin

    Abstract: Combining different forms of prompts with pre-trained large language models has yielded remarkable results on reasoning tasks (e.g. Chain-of-Thought prompting). However, along with testing on more complex reasoning, these methods also expose problems such as invalid reasoning and fictional reasoning paths. In this paper, we develop \textit{Hypothesis Testing Prompting}, which adds conclusion assum… ▽ More

    Submitted 9 May, 2024; originally announced May 2024.

  22. arXiv:2405.05498  [pdf, other

    cs.SD eess.AS

    The RoyalFlush Automatic Speech Diarization and Recognition System for In-Car Multi-Channel Automatic Speech Recognition Challenge

    Authors: Jingguang Tian, Shuaishuai Ye, Shunfei Chen, Yang Xiang, Zhaohui Yin, Xinhui Hu, Xinkang Xu

    Abstract: This paper presents our system submission for the In-Car Multi-Channel Automatic Speech Recognition (ICMC-ASR) Challenge, which focuses on speaker diarization and speech recognition in complex multi-speaker scenarios. To address these challenges, we develop end-to-end speaker diarization models that notably decrease the diarization error rate (DER) by 49.58\% compared to the official baseline on t… ▽ More

    Submitted 8 May, 2024; originally announced May 2024.

  23. arXiv:2405.04872  [pdf, other

    cs.CL cs.AI cs.LO

    Logical Negation Augmenting and Debiasing for Prompt-based Methods

    Authors: Yitian Li, Jidong Tian, Hao He, Yaohui Jin

    Abstract: Prompt-based methods have gained increasing attention on NLP and shown validity on many downstream tasks. Many works have focused on mining these methods' potential for knowledge extraction, but few explore their ability to make logical reasoning. In this work, we focus on the effectiveness of the prompt-based methods on first-order logical reasoning and find that the bottleneck lies in logical ne… ▽ More

    Submitted 8 May, 2024; originally announced May 2024.

  24. arXiv:2404.19171  [pdf, other

    cs.CV cs.AI

    Explicit Correlation Learning for Generalizable Cross-Modal Deepfake Detection

    Authors: Cai Yu, Shan Jia, Xiaomeng Fu, Jin Liu, Jiahe Tian, Jiao Dai, Xi Wang, Siwei Lyu, Jizhong Han

    Abstract: With the rising prevalence of deepfakes, there is a growing interest in developing generalizable detection methods for various types of deepfakes. While effective in their specific modalities, traditional detection methods fall short in addressing the generalizability of detection across diverse cross-modal deepfakes. This paper aims to explicitly learn potential cross-modal correlation to enhance… ▽ More

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: accepted by ICME 2024

  25. arXiv:2404.15744  [pdf, other

    cs.LG cs.AI cs.CR

    A General Black-box Adversarial Attack on Graph-based Fake News Detectors

    Authors: Peican Zhu, Zechen Pan, Yang Liu, Jiwei Tian, Keke Tang, Zhen Wang

    Abstract: Graph Neural Network (GNN)-based fake news detectors apply various methods to construct graphs, aiming to learn distinctive news embeddings for classification. Since the construction details are unknown for attackers in a black-box scenario, it is unrealistic to conduct the classical adversarial attacks that require a specific adjacency matrix. In this paper, we propose the first general black-box… ▽ More

    Submitted 25 April, 2024; v1 submitted 24 April, 2024; originally announced April 2024.

    Comments: Accepted by IJCAI2024

  26. arXiv:2404.15702  [pdf, other

    cs.CL

    Nyonic Technical Report

    Authors: Junfeng Tian, Rui Wang, Cong Li, Yudong Zhou, Jun Liu, Jun Wang

    Abstract: This report details the development and key achievements of our latest language model designed for custom large language models. The advancements introduced include a novel Online Data Scheduler that supports flexible training data adjustments and curriculum learning. The model's architecture is fortified with state-of-the-art techniques such as Rotary Positional Embeddings, QK-LayerNorm, and a sp… ▽ More

    Submitted 24 April, 2024; originally announced April 2024.

  27. arXiv:2404.05960  [pdf, other

    cs.CV

    EasyTrack: Efficient and Compact One-stream 3D Point Clouds Tracker

    Authors: Baojie Fan, Wuyang Zhou, Kai Wang, Shijun Zhou, Fengyu Xu, Jiandong Tian

    Abstract: Most of 3D single object trackers (SOT) in point clouds follow the two-stream multi-stage 3D Siamese or motion tracking paradigms, which process the template and search area point clouds with two parallel branches, built on supervised point cloud backbones. In this work, beyond typical 3D Siamese or motion tracking, we propose a neat and compact one-stream transformer 3D SOT paradigm from the nove… ▽ More

    Submitted 12 April, 2024; v1 submitted 8 April, 2024; originally announced April 2024.

  28. arXiv:2404.03471  [pdf, other

    cs.CL cs.CY cs.LG

    The Impact of Unstated Norms in Bias Analysis of Language Models

    Authors: Farnaz Kohankhaki, Jacob-Junqi Tian, David Emerson, Laleh Seyyed-Kalantari, Faiza Khan Khattak

    Abstract: Large language models (LLMs), trained on vast datasets, can carry biases that manifest in various forms, from overt discrimination to implicit stereotypes. One facet of bias is performance disparities in LLMs, often harming underprivileged groups, such as racial minorities. A common approach to quantifying bias is to use template-based bias probes, which explicitly state group membership (e.g. Whi… ▽ More

    Submitted 7 April, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

  29. arXiv:2404.02840  [pdf, ps, other

    cs.DC

    A Survey on Error-Bounded Lossy Compression for Scientific Datasets

    Authors: Sheng Di, Jinyang Liu, Kai Zhao, Xin Liang, Robert Underwood, Zhaorui Zhang, Milan Shah, Yafan Huang, Jiajun Huang, Xiaodong Yu, Congrong Ren, Hanqi Guo, Grant Wilkins, Dingwen Tao, Jiannan Tian, Sian Jin, Zizhe Jian, Daoce Wang, MD Hasanur Rahman, Boyuan Zhang, Jon C. Calhoun, Guanpeng Li, Kazutomo Yoshii, Khalid Ayed Alharthi, Franck Cappello

    Abstract: Error-bounded lossy compression has been effective in significantly reducing the data storage/transfer burden while preserving the reconstructed data fidelity very well. Many error-bounded lossy compressors have been developed for a wide range of parallel and distributed use cases for years. These lossy compressors are designed with distinct compression models and design principles, such that each… ▽ More

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: submitted to ACM Computing journal, requited to be 35 pages including references

  30. arXiv:2404.01714  [pdf, other

    cs.LG cs.AI cs.CV math.OC

    Conjugate-Gradient-like Based Adaptive Moment Estimation Optimization Algorithm for Deep Learning

    Authors: Jiawu Tian, Liwei Xu, Xiaowei Zhang, Yongqi Li

    Abstract: Training deep neural networks is a challenging task. In order to speed up training and enhance the performance of deep neural networks, we rectify the vanilla conjugate gradient as conjugate-gradient-like and incorporate it into the generic Adam, and thus propose a new optimization algorithm named CG-like-Adam for deep learning. Specifically, both the first-order and the second-order moment estima… ▽ More

    Submitted 11 May, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

    Comments: 32 pages, 13 figures

  31. arXiv:2403.18100  [pdf

    cs.AI cs.LG

    Driving Intelligent IoT Monitoring and Control through Cloud Computing and Machine Learning

    Authors: Hanzhe Li, Xiangxiang Wang, Yuan Feng, Yaqian Qi, Jingxiao Tian

    Abstract: This article explores how to drive intelligent iot monitoring and control through cloud computing and machine learning. As iot and the cloud continue to generate large and diverse amounts of data as sensor devices in the network, the collected data is sent to the cloud for statistical analysis, prediction, and data analysis to achieve business objectives. However, because the cloud computing model… ▽ More

    Submitted 26 March, 2024; originally announced March 2024.

  32. arXiv:2403.16169  [pdf, other

    cs.CV

    Gaze-guided Hand-Object Interaction Synthesis: Benchmark and Method

    Authors: Jie Tian, Lingxiao Yang, Ran Ji, Yuexin Ma, Lan Xu, Jingyi Yu, Ye Shi, Jingya Wang

    Abstract: Gaze plays a crucial role in revealing human attention and intention, shedding light on the cognitive processes behind human actions. The integration of gaze guidance with the dynamics of hand-object interactions boosts the accuracy of human motion prediction. However, the lack of datasets that capture the intricate relationship and consistency among gaze, hand, and object movements remains a subs… ▽ More

    Submitted 28 March, 2024; v1 submitted 24 March, 2024; originally announced March 2024.

  33. arXiv:2403.16107  [pdf, other

    cs.HC

    Designing Upper-Body Gesture Interaction with and for People with Spinal Muscular Atrophy in VR

    Authors: Jingze Tian, Yingna Wang, Keye Yu, Liyi Xu, Junan Xie, Franklin Mingzhe Li, Yafeng Niu, Mingming Fan

    Abstract: Recent research proposed gaze-assisted gestures to enhance interaction within virtual reality (VR), providing opportunities for people with motor impairments to experience VR. Compared to people with other motor impairments, those with Spinal Muscular Atrophy (SMA) exhibit enhanced distal limb mobility, providing them with more design space. However, it remains unknown what gaze-assisted upper-bod… ▽ More

    Submitted 24 March, 2024; originally announced March 2024.

    Comments: Proceedings of the CHI Conference on Human Factors in Computing Systems (CHI '24), May 11--16, 2024, Honolulu, HI, USA

  34. arXiv:2403.05852  [pdf, other

    cs.CV

    SSF-Net: Spatial-Spectral Fusion Network with Spectral Angle Awareness for Hyperspectral Object Tracking

    Authors: Hanzheng Wang, Wei Li, Xiang-Gen Xia, Qian Du, Jing Tian

    Abstract: Hyperspectral video (HSV) offers valuable spatial, spectral, and temporal information simultaneously, making it highly suitable for handling challenges such as background clutter and visual similarity in object tracking. However, existing methods primarily focus on band regrouping and rely on RGB trackers for feature extraction, resulting in limited exploration of spectral information and difficul… ▽ More

    Submitted 9 March, 2024; originally announced March 2024.

  35. arXiv:2403.04070  [pdf, other

    cs.LG cs.AI cs.CR cs.CV

    Improving Adversarial Training using Vulnerability-Aware Perturbation Budget

    Authors: Olukorede Fakorede, Modeste Atsague, Jin Tian

    Abstract: Adversarial Training (AT) effectively improves the robustness of Deep Neural Networks (DNNs) to adversarial attacks. Generally, AT involves training DNN models with adversarial examples obtained within a pre-defined, fixed perturbation bound. Notably, individual natural examples from which these adversarial examples are crafted exhibit varying degrees of intrinsic vulnerabilities, and as such, cra… ▽ More

    Submitted 6 March, 2024; originally announced March 2024.

    Comments: 19 pages, 2 figures

  36. arXiv:2403.03165  [pdf

    cs.AI

    Leveraging Federated Learning and Edge Computing for Recommendation Systems within Cloud Computing Networks

    Authors: Yaqian Qi, Yuan Feng, Xiangxiang Wang, Hanzhe Li, Jingxiao Tian

    Abstract: To enable large-scale and efficient deployment of artificial intelligence (AI), the combination of AI and edge computing has spawned Edge Intelligence, which leverages the computing and communication capabilities of end devices and edge servers to process data closer to where it is generated. A key technology for edge intelligence is the privacy-protecting machine learning paradigm known as Federa… ▽ More

    Submitted 13 March, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

  37. arXiv:2402.15764  [pdf, other

    cs.CL cs.AI

    Look Before You Leap: Problem Elaboration Prompting Improves Mathematical Reasoning in Large Language Models

    Authors: Haoran Liao, Jidong Tian, Shaohua Hu, Hao He, Yaohui Jin

    Abstract: Large language models (LLMs) still grapple with complex tasks like mathematical reasoning. Despite significant efforts invested in improving prefix prompts or reasoning process, the crucial role of problem context might have been neglected. Accurate recognition of inputs is fundamental for solving mathematical tasks, as ill-formed problems could potentially mislead LLM's reasoning. In this study,… ▽ More

    Submitted 26 March, 2024; v1 submitted 24 February, 2024; originally announced February 2024.

  38. arXiv:2402.11934  [pdf, other

    cs.CL cs.AI

    Team QUST at SemEval-2024 Task 8: A Comprehensive Study of Monolingual and Multilingual Approaches for Detecting AI-generated Text

    Authors: Xiaoman Xu, Xiangrun Li, Taihang Wang, Jianxiang Tian, Ye Jiang

    Abstract: This paper presents the participation of team QUST in Task 8 SemEval 2024. We first performed data augmentation and cleaning on the dataset to enhance model training efficiency and accuracy. In the monolingual task, we evaluated traditional deep-learning methods, multiscale positive-unlabeled framework (MPU), fine-tuning, adapters and ensemble methods. Then, we selected the top-performing models b… ▽ More

    Submitted 19 February, 2024; originally announced February 2024.

  39. arXiv:2402.09240  [pdf, other

    cs.LG cs.CV

    Switch EMA: A Free Lunch for Better Flatness and Sharpness

    Authors: Siyuan Li, Zicheng Liu, Juanxi Tian, Ge Wang, Zedong Wang, Weiyang Jin, Di Wu, Cheng Tan, Tao Lin, Yang Liu, Baigui Sun, Stan Z. Li

    Abstract: Exponential Moving Average (EMA) is a widely used weight averaging (WA) regularization to learn flat optima for better generalizations without extra cost in deep neural network (DNN) optimization. Despite achieving better flatness, existing WA methods might fall into worse final performances or require extra test-time computations. This work unveils the full potential of EMA with a single line of… ▽ More

    Submitted 14 February, 2024; originally announced February 2024.

    Comments: Preprint V1. Source code and models at https://github.com/Westlake-AI/SEMA

  40. arXiv:2401.16658  [pdf, ps, other

    cs.CL eess.AS

    OWSM v3.1: Better and Faster Open Whisper-Style Speech Models based on E-Branchformer

    Authors: Yifan Peng, Jinchuan Tian, William Chen, Siddhant Arora, Brian Yan, Yui Sudo, Muhammad Shakeel, Kwanghee Choi, Jiatong Shi, Xuankai Chang, Jee-weon Jung, Shinji Watanabe

    Abstract: Recent studies have highlighted the importance of fully open foundation models. The Open Whisper-style Speech Model (OWSM) is an initial step towards reproducing OpenAI Whisper using public data and open-source toolkits. However, previous versions of OWSM (v1 to v3) are still based on standard Transformer, which might lead to inferior performance compared to state-of-the-art speech encoder archite… ▽ More

    Submitted 16 June, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

    Comments: Accepted at INTERSPEECH 2024. Webpage: https://www.wavlab.org/activities/2024/owsm/

  41. LESSON: Multi-Label Adversarial False Data Injection Attack for Deep Learning Locational Detection

    Authors: Jiwei Tian, Chao Shen, Buhong Wang, Xiaofang Xia, Meng Zhang, Chenhao Lin, Qian Li

    Abstract: Deep learning methods can not only detect false data injection attacks (FDIA) but also locate attacks of FDIA. Although adversarial false data injection attacks (AFDIA) based on deep learning vulnerabilities have been studied in the field of single-label FDIA detection, the adversarial attack and defense against multi-label FDIA locational detection are still not involved. To bridge this gap, this… ▽ More

    Submitted 29 January, 2024; originally announced January 2024.

    Comments: Accepted by TDSC

  42. arXiv:2401.12983  [pdf

    cs.CL cs.AI physics.ed-ph

    Assessing Large Language Models in Mechanical Engineering Education: A Study on Mechanics-Focused Conceptual Understanding

    Authors: Jie Tian, Jixin Hou, Zihao Wu, Peng Shu, Zhengliang Liu, Yujie Xiang, Beikang Gu, Nicholas Filla, Yiwei Li, Ning Liu, Xianyan Chen, Keke Tang, Tianming Liu, Xianqiao Wang

    Abstract: This study is a pioneering endeavor to investigate the capabilities of Large Language Models (LLMs) in addressing conceptual questions within the domain of mechanical engineering with a focus on mechanics. Our examination involves a manually crafted exam encompassing 126 multiple-choice questions, spanning various aspects of mechanics courses, including Fluid Mechanics, Mechanical Vibration, Engin… ▽ More

    Submitted 13 January, 2024; originally announced January 2024.

    Comments: 30 pages, 7 figures, and 1 table

  43. arXiv:2401.11130  [pdf, ps, other

    cs.LG stat.ML

    Identification and Estimation of Conditional Average Partial Causal Effects via Instrumental Variable

    Authors: Yuta Kawakami, Manabu Kuroki, Jin Tian

    Abstract: There has been considerable recent interest in estimating heterogeneous causal effects. In this paper, we study conditional average partial causal effects (CAPCE) to reveal the heterogeneity of causal effects with continuous treatment. We provide conditions for identifying CAPCE in an instrumental variable setting. Notably, CAPCE is identifiable under a weaker assumption than required by a commonl… ▽ More

    Submitted 30 May, 2024; v1 submitted 20 January, 2024; originally announced January 2024.

  44. arXiv:2401.05676  [pdf, other

    cs.CV

    Exploring Self- and Cross-Triplet Correlations for Human-Object Interaction Detection

    Authors: Weibo Jiang, Weihong Ren, Jiandong Tian, Liangqiong Qu, Zhiyong Wang, Honghai Liu

    Abstract: Human-Object Interaction (HOI) detection plays a vital role in scene understanding, which aims to predict the HOI triplet in the form of <human, object, action>. Existing methods mainly extract multi-modal features (e.g., appearance, object semantics, human pose) and then fuse them together to directly predict HOI triplets. However, most of these methods focus on seeking for self-triplet aggregati… ▽ More

    Submitted 11 January, 2024; originally announced January 2024.

  45. arXiv:2401.05093  [pdf, other

    cs.CV

    SwiMDiff: Scene-wide Matching Contrastive Learning with Diffusion Constraint for Remote Sensing Image

    Authors: Jiayuan Tian, Jie Lei, Jiaqing Zhang, Weiying Xie, Yunsong Li

    Abstract: With recent advancements in aerospace technology, the volume of unlabeled remote sensing image (RSI) data has increased dramatically. Effectively leveraging this data through self-supervised learning (SSL) is vital in the field of remote sensing. However, current methodologies, particularly contrastive learning (CL), a leading SSL method, encounter specific challenges in this domain. Firstly, CL o… ▽ More

    Submitted 10 January, 2024; originally announced January 2024.

  46. arXiv:2401.02708  [pdf, other

    cs.LG cs.AI stat.ML

    TripleSurv: Triplet Time-adaptive Coordinate Loss for Survival Analysis

    Authors: Liwen Zhang, Lianzhen Zhong, Fan Yang, Di Dong, Hui Hui, Jie Tian

    Abstract: A core challenge in survival analysis is to model the distribution of censored time-to-event data, where the event of interest may be a death, failure, or occurrence of a specific event. Previous studies have showed that ranking and maximum likelihood estimation (MLE)loss functions are widely-used for survival analysis. However, ranking loss only focus on the ranking of survival time and does not… ▽ More

    Submitted 5 January, 2024; originally announced January 2024.

    Comments: 9 pages,6 figures

  47. arXiv:2401.02038  [pdf, other

    cs.CL

    Understanding LLMs: A Comprehensive Overview from Training to Inference

    Authors: Yiheng Liu, Hao He, Tianle Han, Xu Zhang, Mengyuan Liu, Jiaming Tian, Yutong Zhang, Jiaqi Wang, Xiaohui Gao, Tianyang Zhong, Yi Pan, Shaochen Xu, Zihao Wu, Zhengliang Liu, Xin Zhang, Shu Zhang, Xintao Hu, Tuo Zhang, Ning Qiang, Tianming Liu, Bao Ge

    Abstract: The introduction of ChatGPT has led to a significant increase in the utilization of Large Language Models (LLMs) for addressing downstream tasks. There's an increasing focus on cost-efficient training and deployment within this context. Low-cost training and deployment of LLMs represent the future development trend. This paper reviews the evolution of large language model training techniques and i… ▽ More

    Submitted 5 January, 2024; v1 submitted 3 January, 2024; originally announced January 2024.

    Comments: 30 pages,6 figures

  48. arXiv:2312.15273  [pdf, other

    cs.CV cs.AI

    Benefit from public unlabeled data: A Frangi filtering-based pretraining network for 3D cerebrovascular segmentation

    Authors: Gen Shi, Hao Lu, Hui Hui, Jie Tian

    Abstract: The precise cerebrovascular segmentation in time-of-flight magnetic resonance angiography (TOF-MRA) data is crucial for clinically computer-aided diagnosis. However, the sparse distribution of cerebrovascular structures in TOF-MRA results in an exceedingly high cost for manual data labeling. The use of unlabeled TOF-MRA data holds the potential to enhance model performance significantly. In this s… ▽ More

    Submitted 23 December, 2023; originally announced December 2023.

    Comments: Under Review

  49. arXiv:2312.10892  [pdf, other

    eess.IV cs.CV q-bio.QM

    Deep Learning-based MRI Reconstruction with Artificial Fourier Transform (AFT)-Net

    Authors: Yanting Yang, Jeffery Siyuan Tian, Matthieu Dagommer, Jia Guo

    Abstract: The deep complex-valued neural network provides a powerful way to leverage complex number operations and representations, which has succeeded in several phase-based applications. However, most previously published networks have not fully accessed the impact of complex-valued networks in the frequency domain. Here, we introduced a unified complex-valued deep learning framework - artificial Fourier… ▽ More

    Submitted 17 December, 2023; originally announced December 2023.

  50. arXiv:2312.10608  [pdf, other

    cs.CV

    Robust 3D Tracking with Quality-Aware Shape Completion

    Authors: Jingwen Zhang, Zikun Zhou, Guangming Lu, Jiandong Tian, Wenjie Pei

    Abstract: 3D single object tracking remains a challenging problem due to the sparsity and incompleteness of the point clouds. Existing algorithms attempt to address the challenges in two strategies. The first strategy is to learn dense geometric features based on the captured sparse point cloud. Nevertheless, it is quite a formidable task since the learned dense geometric features are with high uncertainty… ▽ More

    Submitted 16 December, 2023; originally announced December 2023.

    Comments: A detailed version of the paper accepted by AAAI 2024