
Showing 1–50 of 87 results for author: Ye, L

Searching in archive cs.
  1. arXiv:2406.12753  [pdf, other]

    cs.CL cs.AI

    OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI

    Authors: Zhen Huang, Zengzhi Wang, Shijie Xia, Xuefeng Li, Haoyang Zou, Ruijie Xu, Run-Ze Fan, Lyumanshan Ye, Ethan Chern, Yixin Ye, Yikai Zhang, Yuqing Yang, Ting Wu, Binjie Wang, Shichao Sun, Yang Xiao, Yiyuan Li, Fan Zhou, Steffi Chern, Yiwei Qin, Yan Ma, Jiadi Su, Yixiu Liu, Yuxiang Zheng, Shaoting Zhang, et al. (3 additional authors not shown)

    Abstract: The evolution of Artificial Intelligence (AI) has been significantly accelerated by advancements in Large Language Models (LLMs) and Large Multimodal Models (LMMs), gradually showcasing potential cognitive reasoning abilities in problem-solving and scientific discovery (i.e., AI4Science) once exclusive to human intellect. To comprehensively evaluate current models' performance in cognitive reasoni…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 44 pages

  2. arXiv:2406.03240  [pdf, other]

    cs.SD cs.AI eess.AS

    Generalized Source Tracing: Detecting Novel Audio Deepfake Algorithm with Real Emphasis and Fake Dispersion Strategy

    Authors: Yuankun Xie, Ruibo Fu, Zhengqi Wen, Zhiyong Wang, Xiaopeng Wang, Haonnan Cheng, Long Ye, Jianhua Tao

    Abstract: With the proliferation of deepfake audio, there is an urgent need to investigate their attribution. Current source tracing methods can effectively distinguish in-distribution (ID) categories. However, the rapid evolution of deepfake algorithms poses a critical challenge in the accurate identification of out-of-distribution (OOD) novel deepfake algorithms. In this paper, we propose Real Emphasis an…

    Submitted 8 June, 2024; v1 submitted 5 June, 2024; originally announced June 2024.

    Comments: Accepted by INTERSPEECH 2024

  3. arXiv:2405.13324  [pdf, other]

    cs.LG cs.AI

    Adversarial Training via Adaptive Knowledge Amalgamation of an Ensemble of Teachers

    Authors: Shayan Mohajer Hamidi, Linfeng Ye

    Abstract: Adversarial training (AT) is a popular method for training robust deep neural networks (DNNs) against adversarial attacks. Yet, AT suffers from two shortcomings: (i) the robustness of DNNs trained by AT is highly intertwined with the size of the DNNs, posing challenges in achieving robustness in smaller models; and (ii) the adversarial samples employed during the AT process exhibit poor generaliza…

    Submitted 21 May, 2024; originally announced May 2024.

  4. arXiv:2405.13152  [pdf, other]

    cs.CV cs.AI

    Enhancing Interaction Modeling with Agent Selection and Physical Methods for Trajectory Prediction

    Authors: Shiji Huang, Lei Ye, Min Chen, Wenhai Luo, Chenqi Xu, Deyuan Liang, Dihong Wang

    Abstract: In this study, we address the limitations inherent in most existing vehicle trajectory prediction methodologies that indiscriminately incorporate all agents within a predetermined proximity when accounting for inter-agent interactions. These approaches commonly employ attention-based architecture or graph neural networks for encoding interactions, which introduces three challenges: (i) The indiscr…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: Code: https://github.com/kkk00714/ASPILin

  5. arXiv:2405.06495  [pdf, other]

    cs.HC

    Storypark: Leveraging Large Language Models to Enhance Children Story Learning Through Child-AI collaboration Storytelling

    Authors: Lyumanshan Ye, Jiandong Jiang, Danni Chang, Pengfei Liu

    Abstract: Interactive storytelling has been widely adopted by educators in teaching activities of young children. Such a teaching method combines storytelling with active child participation, benefiting their expressive abilities, creative thinking, and understanding of stories. Interactive storytelling requires facilitators to unidirectionally narrate the story content and encourage children's participatio…

    Submitted 13 May, 2024; v1 submitted 10 May, 2024; originally announced May 2024.

  6. arXiv:2405.04880  [pdf, other]

    cs.SD cs.AI eess.AS

    The Codecfake Dataset and Countermeasures for the Universally Detection of Deepfake Audio

    Authors: Yuankun Xie, Yi Lu, Ruibo Fu, Zhengqi Wen, Zhiyong Wang, Jianhua Tao, Xin Qi, Xiaopeng Wang, Yukun Liu, Haonan Cheng, Long Ye, Yi Sun

    Abstract: With the proliferation of Audio Language Model (ALM) based deepfake audio, there is an urgent need for generalized detection methods. ALM-based deepfake audio currently exhibits widespread, high deception, and type versatility, posing a significant challenge to current audio deepfake detection (ADD) models trained solely on vocoded data. To effectively detect ALM-based deepfake audio, we focus on…

    Submitted 15 May, 2024; v1 submitted 8 May, 2024; originally announced May 2024.

  7. arXiv:2404.18225  [pdf, other]

    cs.RO

    Quadruped robot traversing 3D complex environments with limited perception

    Authors: Yi Cheng, Hang Liu, Guoping Pan, Linqi Ye, Houde Liu, Bin Liang

    Abstract: Traversing 3-D complex environments has always been a significant challenge for legged locomotion. Existing methods typically rely on external sensors such as vision and lidar to preemptively react to obstacles by acquiring environmental information. However, in scenarios like nighttime or dense forests, external sensors often fail to function properly, necessitating robots to rely on propriocepti…

    Submitted 29 April, 2024; v1 submitted 28 April, 2024; originally announced April 2024.

    Comments: 10 pages, 8 figures, submitted to IROS 2024

  8. arXiv:2404.08246  [pdf]

    cs.RO cs.LG

    Agile and versatile bipedal robot tracking control through reinforcement learning

    Authors: Jiayi Li, Linqi Ye, Yi Cheng, Houde Liu, Bin Liang

    Abstract: The remarkable athletic intelligence displayed by humans in complex dynamic movements such as dancing and gymnastics suggests that the balance mechanism in biological beings is decoupled from specific movement patterns. This decoupling allows for the execution of both learned and unlearned movements under certain constraints while maintaining balance through minor whole-body coordination. To repli…

    Submitted 12 April, 2024; originally announced April 2024.

  9. arXiv:2403.18243  [pdf, other]

    cs.AI

    Boosting Conversational Question Answering with Fine-Grained Retrieval-Augmentation and Self-Check

    Authors: Linhao Ye, Zhikai Lei, Jianghao Yin, Qin Chen, Jie Zhou, Liang He

    Abstract: Retrieval-Augmented Generation (RAG) aims to generate more reliable and accurate responses, by augmenting large language models (LLMs) with the external vast and dynamic knowledge. Most previous work focuses on using RAG for single-round question answering, while how to adapt RAG to the complex conversational setting wherein the question is interdependent on the preceding context is not well studi…

    Submitted 27 March, 2024; originally announced March 2024.

  10. arXiv:2403.14718  [pdf, other]

    cs.LG cs.DC

    FedSR: A Semi-Decentralized Federated Learning Algorithm for Non-IIDness in IoT System

    Authors: Jianjun Huang, Lixin Ye, Li Kang

    Abstract: In the Industrial Internet of Things (IoT), a large amount of data will be generated every day. Due to privacy and security issues, it is difficult to collect all these data together to train deep learning models, thus the federated learning, a distributed machine learning paradigm that protects data privacy, has been widely used in IoT. However, in practical federated learning, the data distribut…

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: 11 pages, 10 figures

  11. arXiv:2402.15220  [pdf, other]

    cs.LG cs.CL

    ChunkAttention: Efficient Self-Attention with Prefix-Aware KV Cache and Two-Phase Partition

    Authors: Lu Ye, Ze Tao, Yong Huang, Yang Li

    Abstract: Self-attention is an essential component of large language models (LLMs) but a significant source of inference latency for long sequences. In multi-tenant LLMs serving scenarios, the compute and memory operation cost of self-attention can be optimized by using the probability that multiple LLM requests have shared system prompts in prefixes. In this paper, we introduce ChunkAttention, a prefix-awar…

    Submitted 22 March, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

  12. ConceptThread: Visualizing Threaded Concepts in MOOC Videos

    Authors: Zhiguang Zhou, Li Ye, Lihong Cai, Lei Wang, Yigang Wang, Yongheng Wang, Wei Chen, Yong Wang

    Abstract: Massive Open Online Courses (MOOCs) platforms are becoming increasingly popular in recent years. Online learners need to watch the whole course video on MOOC platforms to learn the underlying new knowledge, which is often tedious and time-consuming due to the lack of a quick overview of the covered knowledge and their structures. In this paper, we propose ConceptThread, a visual analytics approach…

    Submitted 20 January, 2024; originally announced January 2024.

    Comments: 17 pages, 10 figures, 2 tables

  13. arXiv:2401.08732  [pdf, other]

    cs.LG cs.CV cs.IT

    Bayes Conditional Distribution Estimation for Knowledge Distillation Based on Conditional Mutual Information

    Authors: Linfeng Ye, Shayan Mohajer Hamidi, Renhao Tan, En-Hui Yang

    Abstract: It is believed that in knowledge distillation (KD), the role of the teacher is to provide an estimate for the unknown Bayes conditional probability distribution (BCPD) to be used in the student training process. Conventionally, this estimate is obtained by training the teacher using maximum log-likelihood (MLL) method. To improve this estimate for KD, in this paper we introduce the concept of cond…

    Submitted 7 March, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

    Comments: 32 pages, 19 figures, Published as a conference paper at ICLR 2024

    MSC Class: 68T30 ACM Class: I.2.6

    Journal ref: International Conference on Learning Representations 2024 (ICLR)

  14. arXiv:2401.07991  [pdf, other]

    cs.LG cs.CR

    Robustness Against Adversarial Attacks via Learning Confined Adversarial Polytopes

    Authors: Shayan Mohajer Hamidi, Linfeng Ye

    Abstract: Deep neural networks (DNNs) could be deceived by generating human-imperceptible perturbations of clean samples. Therefore, enhancing the robustness of DNNs against adversarial attacks is a crucial task. In this paper, we aim to train robust DNNs by limiting the set of outputs reachable via a norm-bounded perturbation added to a clean sample. We refer to this set as adversarial polytope, and each c…

    Submitted 20 January, 2024; v1 submitted 15 January, 2024; originally announced January 2024.

    Comments: The paper has been accepted in ICASSP 2024

  15. arXiv:2401.01258  [pdf, other]

    math.OC cs.LG eess.SY

    Towards Model-Free LQR Control over Rate-Limited Channels

    Authors: Aritra Mitra, Lintao Ye, Vijay Gupta

    Abstract: Given the success of model-free methods for control design in many problem settings, it is natural to ask how things will change if realistic communication channels are utilized for the transmission of gradients or policies. While the resulting problem has analogies with the formulations studied under the rubric of networked control systems, the rich literature in that area has typically assumed t…

    Submitted 2 January, 2024; originally announced January 2024.

    Comments: 24 pages

  16. arXiv:2309.16706  [pdf, other]

    cs.CR cs.AI cs.LG

    AIR: Threats of Adversarial Attacks on Deep Learning-Based Information Recovery

    Authors: Jinyin Chen, Jie Ge, Shilian Zheng, Linhui Ye, Haibin Zheng, Weiguo Shen, Keqiang Yue, Xiaoniu Yang

    Abstract: A wireless communications system usually consists of a transmitter which transmits the information and a receiver which recovers the original information from the received distorted signal. Deep learning (DL) has been used to improve the performance of the receiver in complicated channel environments and state-of-the-art (SOTA) performance has been achieved. However, its robustness has not been in…

    Submitted 17 August, 2023; originally announced September 2023.

  17. arXiv:2309.09167  [pdf]

    cs.RO

    From Knowing to Doing: Learning Diverse Motor Skills through Instruction Learning

    Authors: Linqi Ye, Jiayi Li, Yi Cheng, Xianhao Wang, Bin Liang, Yan Peng

    Abstract: Recent years have witnessed many successful trials in the robot learning field. For contact-rich robotic tasks, it is challenging to learn coordinated motor skills by reinforcement learning. Imitation learning solves this problem by using a mimic reward to encourage the robot to track a given reference trajectory. However, imitation learning is not so efficient and may constrain the learned motion…

    Submitted 1 November, 2023; v1 submitted 17 September, 2023; originally announced September 2023.

  18. arXiv:2309.09123  [pdf, other]

    cs.LG cs.AI

    Conditional Mutual Information Constrained Deep Learning for Classification

    Authors: En-Hui Yang, Shayan Mohajer Hamidi, Linfeng Ye, Renhao Tan, Beverly Yang

    Abstract: The concepts of conditional mutual information (CMI) and normalized conditional mutual information (NCMI) are introduced to measure the concentration and separation performance of a classification deep neural network (DNN) in the output probability distribution space of the DNN, where CMI and the ratio between CMI and NCMI represent the intra-class concentration and inter-class separation of the D…

    Submitted 16 September, 2023; originally announced September 2023.

  19. arXiv:2309.03036  [pdf, other]

    cs.SD cs.AI eess.AS

    An Efficient Temporary Deepfake Location Approach Based Embeddings for Partially Spoofed Audio Detection

    Authors: Yuankun Xie, Haonan Cheng, Yutian Wang, Long Ye

    Abstract: Partially spoofed audio detection is a challenging task, lying in the need to accurately locate the authenticity of audio at the frame level. To address this issue, we propose a fine-grained partially spoofed audio detection method, namely Temporal Deepfake Location (TDL), which can effectively capture information of both features and locations. Specifically, our approach involves two novel parts:…

    Submitted 21 November, 2023; v1 submitted 6 September, 2023; originally announced September 2023.

  20. arXiv:2309.02232  [pdf, other]

    cs.SD cs.AI eess.AS

    FSD: An Initial Chinese Dataset for Fake Song Detection

    Authors: Yuankun Xie, Jingjing Zhou, Xiaolin Lu, Zhenghao Jiang, Yuxin Yang, Haonan Cheng, Long Ye

    Abstract: Singing voice synthesis and singing voice conversion have significantly advanced, revolutionizing musical experiences. However, the rise of "Deepfake Songs" generated by these technologies raises concerns about authenticity. Unlike Audio DeepFake Detection (ADD), the field of song deepfake detection lacks specialized datasets or methods for song authenticity verification. In this paper, we initial…

    Submitted 6 September, 2023; v1 submitted 5 September, 2023; originally announced September 2023.

    Comments: Submitted to ICASSP 2024

  21. arXiv:2308.08842  [pdf, ps, other]

    cs.FL

    Introducing Divergence for Infinite Probabilistic Models

    Authors: Alain Finkel, Serge Haddad, Lina Ye

    Abstract: Computing the reachability probability in infinite state probabilistic models has been the topic of numerous works. Here we introduce a new property called divergence that when satisfied allows to compute reachability probabilities up to an arbitrary precision. One of the main interest of divergence is that our algorithm does not require the reachability problem to be decidable. Then we stu…

    Submitted 8 October, 2023; v1 submitted 17 August, 2023; originally announced August 2023.

    Comments: 32 pages. Add more details in proofs. arXiv admin note: text overlap with arXiv:2305.19564

  22. arXiv:2308.07072  [pdf, other]

    cs.CV

    Teeth And Root Canals Segmentation Using ZXYFormer With Uncertainty Guidance And Weight Transfer

    Authors: Shangxuan Li, Yu Du, Li Ye, Chichi Li, Yanshu Fang, Cheng Wang, Wu Zhou

    Abstract: This study attempts to segment teeth and root-canals simultaneously from CBCT images, but there are very challenging problems in this process. First, the clinical CBCT image data is very large (e.g., 672 * 688 * 688), and the use of downsampling operation will lose useful information about teeth and root canals. Second, teeth and root canals are very different in morphology, and it is difficult for…

    Submitted 14 August, 2023; originally announced August 2023.

  23. arXiv:2308.02773  [pdf, other]

    cs.CL

    EduChat: A Large-Scale Language Model-based Chatbot System for Intelligent Education

    Authors: Yuhao Dan, Zhikai Lei, Yiyang Gu, Yong Li, Jianghao Yin, Jiaju Lin, Linhao Ye, Zhiyan Tie, Yougen Zhou, Yilei Wang, Aimin Zhou, Ze Zhou, Qin Chen, Jie Zhou, Liang He, Xipeng Qiu

    Abstract: EduChat (https://www.educhat.top/) is a large-scale language model (LLM)-based chatbot system in the education domain. Its goal is to support personalized, fair, and compassionate intelligent education, serving teachers, students, and parents. Guided by theories from psychology and education, it further strengthens educational functions such as open question answering, essay assessment, Socratic t…

    Submitted 4 August, 2023; originally announced August 2023.

  24. Analyzing Robustness of Angluin's L$^*$ Algorithm in Presence of Noise

    Authors: Lina Ye, Igor Khmelnitsky, Serge Haddad, Benoît Barbot, Benedikt Bollig, Martin Leucker, Daniel Neider, Rajarshi Roy

    Abstract: Angluin's L$^*$ algorithm learns the minimal deterministic finite automaton (DFA) of a regular language using membership and equivalence queries. Its probabilistic approximatively correct (PAC) version substitutes an equivalence query by numerous random membership queries to get a high level confidence to the answer. Thus it can be applied to any kind of device and may be viewed as an algorithm fo…

    Submitted 19 March, 2024; v1 submitted 14 June, 2023; originally announced June 2023.

    Comments: arXiv admin note: substantial text overlap with arXiv:2209.10315

    Journal ref: Logical Methods in Computer Science (March 20, 2024) lmcs:11472

  25. arXiv:2306.04121  [pdf, other]

    cs.CV

    Matte Anything: Interactive Natural Image Matting with Segment Anything Models

    Authors: Jingfeng Yao, Xinggang Wang, Lang Ye, Wenyu Liu

    Abstract: Natural image matting algorithms aim to predict the transparency map (alpha-matte) with the trimap guidance. However, the production of trimap often requires significant labor, which limits the widespread application of matting algorithms on a large scale. To address the issue, we propose Matte Anything (MatAny), an interactive natural image matting model that could produce high-quality alpha-matt…

    Submitted 28 February, 2024; v1 submitted 6 June, 2023; originally announced June 2023.

    Comments: 21 pages, code: https://github.com/hustvl/Matte-Anything

  26. arXiv:2305.19564  [pdf, other]

    cs.FL

    About Decisiveness of Dynamic Probabilistic Models

    Authors: Alain Finkel, Serge Haddad, Lina Ye

    Abstract: Decisiveness of infinite Markov chains with respect to some (finite or infinite) target set of states is a key property that allows to compute the reachability probability of this set up to an arbitrary precision. Most of the existing works assume constant weights for defining the probability of a transition in the considered models. However numerous probabilistic modelings require (dynamic) weigh…

    Submitted 31 May, 2023; originally announced May 2023.

    Comments: 20 pages

    ACM Class: G.3; F.1.2

  27. Improving the Generalizability of Trajectory Prediction Models with Frenet-Based Domain Normalization

    Authors: Luyao Ye, Zikang Zhou, Jianping Wang

    Abstract: Predicting the future trajectories of nearby objects plays a pivotal role in Robotics and Automation such as autonomous driving. While learning-based trajectory prediction methods have achieved remarkable performance on public benchmarks, the generalization ability of these approaches remains questionable. The poor generalizability on unseen domains, a well-recognized defect of data-driven approac…

    Submitted 20 December, 2023; v1 submitted 29 May, 2023; originally announced May 2023.

    Comments: This paper was published in 2023 IEEE International Conference on Robotics and Automation (ICRA). New version updated with links to the source code of the Frenet+ strategy

  28. arXiv:2305.14239  [pdf, other]

    cs.CL

    On Learning to Summarize with Large Language Models as References

    Authors: Yixin Liu, Kejian Shi, Katherine S He, Longtian Ye, Alexander R. Fabbri, Pengfei Liu, Dragomir Radev, Arman Cohan

    Abstract: Recent studies have found that summaries generated by large language models (LLMs) are favored by human annotators over the original reference summaries in commonly used summarization datasets. Therefore, we investigate a new learning setting of text summarization models that considers the LLMs as the reference or the gold-standard oracle on these datasets. To examine the standard practices that a…

    Submitted 16 November, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: GitHub Repo: https://github.com/yixinL7/SumLLM

  29. arXiv:2305.10446  [pdf, other]

    cs.CL cs.AI

    Emotion Recognition based on Psychological Components in Guided Narratives for Emotion Regulation

    Authors: Gustave Cortal, Alain Finkel, Patrick Paroubek, Lina Ye

    Abstract: Emotion regulation is a crucial element in dealing with emotional events and has positive effects on mental health. This paper aims to provide a more comprehensive understanding of emotional events by introducing a new French corpus of emotional narratives collected using a questionnaire for emotion regulation. We follow the theoretical framework of the Component Process Model which considers emot…

    Submitted 15 May, 2023; originally announced May 2023.

    Journal ref: Association for Computational Linguistics, May 2023, Dubrovnik, Croatia. pp.72-81

  30. arXiv:2305.06537  [pdf, other]

    cs.RO

    Visuotactile Sensor Enabled Pneumatic Device Towards Compliant Oropharyngeal Swab Sampling

    Authors: Shoujie Li, Mingshan He, Wenbo Ding, Linqi Ye, Xueqian Wang, Junbo Tan, Jinqiu Yuan, Xiao-Ping Zhang

    Abstract: Manual oropharyngeal (OP) swab sampling is an intensive and risky task. In this article, a novel OP swab sampling device of low cost and high compliance is designed by combining the visuo-tactile sensor and the pneumatic actuator-based gripper. Here, a concave visuo-tactile sensor called CoTac is first proposed to address the problems of high cost and poor reliability of traditional multi-axis for…

    Submitted 10 May, 2023; originally announced May 2023.

    Comments: 8 pages

  31. arXiv:2302.04344  [pdf, other]

    stat.ML cs.LG eess.SY

    Learning Dynamical Systems by Leveraging Data from Similar Systems

    Authors: Lei Xin, Lintao Ye, George Chiu, Shreyas Sundaram

    Abstract: We consider the problem of learning the dynamics of a linear system when one has access to data generated by an auxiliary system that shares similar (but not identical) dynamics, in addition to data from the true system. We use a weighted least squares approach, and provide finite sample error bounds of the learned model as a function of the number of samples and various system parameters from the…

    Submitted 24 May, 2024; v1 submitted 8 February, 2023; originally announced February 2023.

    Comments: 15 pages, 9 figures

  32. arXiv:2301.00230  [pdf, other]

    cs.CV

    Disjoint Masking with Joint Distillation for Efficient Masked Image Modeling

    Authors: Xin Ma, Chang Liu, Chunyu Xie, Long Ye, Yafeng Deng, Xiangyang Ji

    Abstract: Masked image modeling (MIM) has shown great promise for self-supervised learning (SSL) yet been criticized for learning inefficiency. We believe the insufficient utilization of training signals should be responsible. To alleviate this issue, we introduce a conceptually simple yet learning-efficient MIM training scheme, termed Disjoint Masking with Joint Distillation (DMJD). For disjoint masking (D…

    Submitted 31 December, 2022; originally announced January 2023.

  33. arXiv:2212.13654  [pdf]

    physics.optics cs.CV eess.IV

    Large-scale single-photon imaging

    Authors: Liheng Bian, Haoze Song, Lintao Peng, Xuyang Chang, Xi Yang, Roarke Horstmeyer, Lin Ye, Tong Qin, Dezhi Zheng, Jun Zhang

    Abstract: Benefiting from its single-photon sensitivity, single-photon avalanche diode (SPAD) array has been widely applied in various fields such as fluorescence lifetime imaging and quantum computing. However, large-scale high-fidelity single-photon imaging remains a big challenge, due to the complex hardware manufacture craft and heavy noise disturbance of SPAD arrays. In this work, we introduce deep lea…

    Submitted 27 December, 2022; originally announced December 2022.

  34. Visual-tactile Fusion for Transparent Object Grasping in Complex Backgrounds

    Authors: Shoujie Li, Haixin Yu, Wenbo Ding, Houde Liu, Linqi Ye, Chongkun Xia, Xueqian Wang, Xiao-Ping Zhang

    Abstract: The accurate detection and grasping of transparent objects are challenging but of significance to robots. Here, a visual-tactile fusion framework for transparent object grasping under complex backgrounds and variant light conditions is proposed, including the grasping position detection, tactile calibration, and visual-tactile fusion based classification. First, a multi-scene synthetic grasping da…

    Submitted 8 June, 2024; v1 submitted 29 November, 2022; originally announced November 2022.

    Comments: Published

    Journal ref: IEEE Transactions on Robotics, 2023

  35. arXiv:2211.12514   

    cs.LG cs.CV

    AugOp: Inject Transformation into Neural Operator

    Authors: Longqing Ye

    Abstract: In this paper, we propose a simple and general approach to augment regular convolution operator by injecting extra group-wise transformation during training and recover it during inference. Extra transformation is carefully selected to ensure it can be merged with regular convolution in each group and will not change the topological structure of regular convolution during inference. Compared with…

    Submitted 20 June, 2023; v1 submitted 23 November, 2022; originally announced November 2022.

    Comments: The results are greatly influenced by random seeds. The conclusion may be wrong

  36. arXiv:2211.06223  [pdf]

    cs.RO eess.SY

    The Simplest Balance Controller for Dynamic Walking

    Authors: Linqi Ye, Xueqian Wang, Houde Liu, Bin Liang

    Abstract: Humans can balance very well during walking, even when perturbed. But it seems difficult to achieve robust walking for bipedal robots. Here we describe the simplest balance controller that leads to robust walking for a linear inverted pendulum (LIP) model. The main idea is to use a linear function of the body velocity to determine the next foot placement, which we call linear foot placement contro…

    Submitted 11 November, 2022; originally announced November 2022.

  37. arXiv:2210.08886  [pdf, other]

    math.OC cs.LG eess.SY

    Learning Decentralized Linear Quadratic Regulator with $\sqrt{T}$ Regret

    Authors: Lintao Ye, Ming Chi, Ruiquan Liao, Vijay Gupta

    Abstract: We propose an online learning algorithm that adaptively designs a decentralized linear quadratic regulator when the system model is unknown a priori and new data samples from a single system trajectory become progressively available. The algorithm uses a disturbance-feedback representation of state-feedback controllers coupled with online convex optimization with memory and delayed feedback. Under…

    Submitted 12 April, 2024; v1 submitted 17 October, 2022; originally announced October 2022.

    Comments: 49 pages, 3 figures

  38. arXiv:2210.05296  [pdf, other]

    cs.CL

    Natural Language Processing for Cognitive Analysis of Emotions

    Authors: Gustave Cortal, Alain Finkel, Patrick Paroubek, Lina Ye

    Abstract: Emotion analysis in texts suffers from two major limitations: annotated gold-standard corpora are mostly small and homogeneous, and emotion identification is often simplified as a sentence-level classification problem. To address these issues, we introduce a new annotation scheme for exploring emotions and their causes, along with a new French dataset composed of autobiographical accounts of an em…

    Submitted 11 October, 2022; originally announced October 2022.

    Journal ref: Semantics, Memory, and EMotion 2022, Sep 2022, Paris, France

  39. Analyzing Robustness of Angluin's L* Algorithm in Presence of Noise

    Authors: Igor Khmelnitsky, Serge Haddad, Lina Ye, Benoît Barbot, Benedikt Bollig, Martin Leucker, Daniel Neider, Rajarshi Roy

    Abstract: Angluin's L* algorithm learns the minimal (complete) deterministic finite automaton (DFA) of a regular language using membership and equivalence queries. Its probabilistic approximatively correct (PAC) version substitutes an equivalence query by a large enough set of random membership queries to get a high level confidence to the answer. Thus it can be applied to any kind of (also non-regular) dev…

    Submitted 21 September, 2022; originally announced September 2022.

    Comments: In Proceedings GandALF 2022, arXiv:2209.09333

    Journal ref: EPTCS 370, 2022, pp. 81-96

  40. arXiv:2208.08042  [pdf, other]

    cs.CL cs.SD eess.AS

    The Conversational Short-phrase Speaker Diarization (CSSD) Task: Dataset, Evaluation Metric and Baselines

    Authors: Gaofeng Cheng, Yifan Chen, Runyan Yang, Qingxuan Li, Zehui Yang, Lingxuan Ye, Pengyuan Zhang, Qingqing Zhang, Lei Xie, Yanmin Qian, Kong Aik Lee, Yonghong Yan

    Abstract: The conversation scenario is one of the most important and most challenging scenarios for speech processing technologies because people in conversation respond to each other in a casual style. Detecting the speech activities of each person in a conversation is vital to downstream tasks, like natural language processing, machine translation, etc. People refer to the detection technology of "who spe…

    Submitted 16 August, 2022; originally announced August 2022.

    Comments: arXiv admin note: text overlap with arXiv:2203.16844

  41. arXiv:2203.16844  [pdf, ps, other]

    cs.CL eess.AS

    Open Source MagicData-RAMC: A Rich Annotated Mandarin Conversational(RAMC) Speech Dataset

    Authors: Zehui Yang, Yifan Chen, Lei Luo, Runyan Yang, Lingxuan Ye, Gaofeng Cheng, Ji Xu, Yaohui Jin, Qingqing Zhang, Pengyuan Zhang, Lei Xie, Yonghong Yan

    Abstract: This paper introduces a high-quality rich annotated Mandarin conversational (RAMC) speech dataset called MagicData-RAMC. The MagicData-RAMC corpus contains 180 hours of conversational speech data recorded from native speakers of Mandarin Chinese over mobile phones with a sampling rate of 16 kHz. The dialogs in MagicData-RAMC are classified into 15 diversified domains and tagged with topic labels,…

    Submitted 31 March, 2022; originally announced March 2022.

    Comments: Paper on submission to Interspeech2022

  42. arXiv:2203.06589  [pdf, other]

    cs.CV cs.LG

    AugShuffleNet: Communicate More, Compute Less

    Authors: Longqing Ye

    Abstract: As a remarkable compact model, ShuffleNetV2 offers a good example of how to design efficient ConvNets, but its limitations are rarely noticed. In this paper, we rethink the design pattern of ShuffleNetV2 and find that the channel-wise redundancy problem still constrains the efficiency improvement of the Shuffle block in the wider ShuffleNetV2. To resolve this issue, we propose another augmented variant of shuffle bl…

    Submitted 21 August, 2022; v1 submitted 13 March, 2022; originally announced March 2022.

  43. arXiv:2111.12305  [pdf, other]

    cs.LG

    Thundernna: a white box adversarial attack

    Authors: Linfeng Ye, Shayan Mohajer Hamidi

    Abstract: Existing work shows that neural networks trained by naive gradient-based optimization are prone to adversarial attacks: adding a small malicious perturbation to an ordinary input is enough to make the network err. At the same time, attacking a neural network is key to improving its robustness. Training against adversarial examples can make neural networks resist some kinds o…

    Submitted 21 January, 2024; v1 submitted 24 November, 2021; originally announced November 2021.

    Comments: 10 pages, 5 figures

    MSC Class: 92B20 ACM Class: I.2.m

  44. arXiv:2110.07112  [pdf, other]

    math.OC cs.LG eess.SY

    On the Sample Complexity of Decentralized Linear Quadratic Regulator with Partially Nested Information Structure

    Authors: Lintao Ye, Hao Zhu, Vijay Gupta

    Abstract: We study the problem of control policy design for decentralized state-feedback linear quadratic control with a partially nested information structure, when the system model is unknown. We propose a model-based learning solution, which consists of two steps. First, we estimate the unknown system model from a single system trajectory of finite length, using least squares estimation. Next, based on t…

    Submitted 27 May, 2022; v1 submitted 13 October, 2021; originally announced October 2021.

  45. arXiv:2106.06778  [pdf, other]

    cs.CV cs.LG

    Dynamic Clone Transformer for Efficient Convolutional Neural Netwoks

    Authors: Longqing Ye

    Abstract: Convolutional networks (ConvNets) have shown impressive capability to solve various vision tasks. Nevertheless, the trade-off between performance and efficiency is still a challenge for a feasible model deployment on resource-constrained platforms. In this paper, we introduce a novel concept termed multi-path fully connected pattern (MPFC) to rethink the interdependencies of topology pattern, accu…

    Submitted 12 June, 2021; originally announced June 2021.

  46. arXiv:2105.12694  [pdf, other]

    cs.CV

    Deep Learning for Weakly-Supervised Object Detection and Object Localization: A Survey

    Authors: Feifei Shao, Long Chen, Jian Shao, Wei Ji, Shaoning Xiao, Lu Ye, Yueting Zhuang, Jun Xiao

    Abstract: Weakly-Supervised Object Detection (WSOD) and Localization (WSOL), i.e., detecting multiple and single instances with bounding boxes in an image using image-level labels, are long-standing and challenging tasks in the CV community. With the success of deep neural networks in object detection, both WSOD and WSOL have received unprecedented attention. Hundreds of WSOD and WSOL methods and numerous t…

    Submitted 26 May, 2021; originally announced May 2021.

    Comments: 13 pages, 4 figures

  47. arXiv:2105.04124  [pdf, other]

    cs.SD eess.AS

    MASS: Multi-task Anthropomorphic Speech Synthesis Framework

    Authors: Jinyin Chen, Linhui Ye, Zhaoyan Ming

    Abstract: Text-to-Speech (TTS) synthesis plays an important role in human-computer interaction. Currently, most TTS technologies focus on the naturalness of speech, namely, making the speech sound human. However, the key tasks of expressing emotion and speaker identity are ignored, which limits the application scenarios of TTS synthesis technology. To make the synthesized speech more reali…

    Submitted 10 May, 2021; originally announced May 2021.

  48. arXiv:2104.10351  [pdf, other]

    cs.CV

    Improving Weakly-supervised Object Localization via Causal Intervention

    Authors: Feifei Shao, Yawei Luo, Li Zhang, Lu Ye, Siliang Tang, Yi Yang, Jun Xiao

    Abstract: The recently emerged weakly supervised object localization (WSOL) methods can learn to localize an object in an image using only image-level labels. Previous works endeavor to perceive the interval objects from the small and sparse discriminative attention map, yet ignore the co-occurrence confounder (e.g., bird and sky), which makes the model inspection (e.g., CAM) hard to distinguish between th…

    Submitted 3 August, 2021; v1 submitted 21 April, 2021; originally announced April 2021.

    Comments: 9 pages, 5 figures. This paper was accepted by ACM Multimedia 2021. The code is available at https://github.com/shaofeifei11/CI-CAM

  49. arXiv:2104.07177  [pdf, other]

    cs.CV cs.AI

    PURE: Passive mUlti-peRson idEntification via Deep Footstep Separation and Recognition

    Authors: Chao Cai, Ruinan Jin, Peng Wang, Liyuan Ye, Hongbo Jiang, Jun Luo

    Abstract: Recently, passive behavioral biometrics (e.g., gesture or footstep) have become promising complements to conventional user identification methods (e.g., face or fingerprint) under special situations, yet existing sensing technologies require lengthy measurement traces and cannot identify multiple users at the same time. To this end, we propose PURE as a passive multi-person identi…

    Submitted 14 April, 2021; originally announced April 2021.

  50. arXiv:2103.11096  [pdf, other]

    cs.RO eess.SY

    An Efficient Calibration Method for Triaxial Gyroscope

    Authors: Li Wang, Tao Zhang, Lin Ye, Jiao Jiao Li, Steven Su

    Abstract: This paper presents an efficient servomotor-aided calibration method for the triaxial gyroscope. The entire calibration process only requires approximately one minute, and does not require high-precision equipment. This method is based on the idea that the measurement of the gyroscope should be equal to the rotation speed of the servomotor. A six-observation experimental design is proposed to mini…

    Submitted 29 July, 2021; v1 submitted 20 March, 2021; originally announced March 2021.