Showing 1–50 of 142 results for author: Qian, H

Searching in archive cs.

  1. arXiv:2407.17476  [pdf, other]

    cs.CY cs.AI

    ORCDF: An Oversmoothing-Resistant Cognitive Diagnosis Framework for Student Learning in Online Education Systems

    Authors: Hong Qian, Shuo Liu, Mingjia Li, Bingdong Li, Zhi Liu, Aimin Zhou

    Abstract: Cognitive diagnosis models (CDMs) are designed to learn students' mastery levels using their response logs. CDMs play a fundamental role in online education systems since they significantly influence downstream applications such as teachers' guidance and computerized adaptive testing. Despite the success achieved by existing CDMs, we find that they suffer from a thorny issue that the learned stude… ▽ More

    Submitted 28 June, 2024; originally announced July 2024.

    Journal ref: KDD 2024

  2. arXiv:2407.15661  [pdf, other]

    cs.CV

    DriveDiTFit: Fine-tuning Diffusion Transformers for Autonomous Driving

    Authors: Jiahang Tu, Wei Ji, Hanbin Zhao, Chao Zhang, Roger Zimmermann, Hui Qian

    Abstract: In autonomous driving, deep models have shown remarkable performance across various visual perception tasks with the demand of high-quality and huge-diversity training datasets. Such datasets are expected to cover various driving scenarios with adverse weather, lighting conditions and diverse moving objects. However, manually collecting these data presents huge challenges and expensive cost. With… ▽ More

    Submitted 22 July, 2024; originally announced July 2024.

  3. arXiv:2407.03813  [pdf, other]

    cs.CV

    PECTP: Parameter-Efficient Cross-Task Prompts for Incremental Vision Transformer

    Authors: Qian Feng, Hanbin Zhao, Chao Zhang, Jiahua Dong, Henghui Ding, Yu-Gang Jiang, Hui Qian

    Abstract: Incremental Learning (IL) aims to learn deep models on sequential tasks continually, where each new task includes a batch of new classes and deep models have no access to task-ID information at the inference time. Recent vast pre-trained models (PTMs) have achieved outstanding performance by prompt technique in practical IL without the old samples (rehearsal-free) and with a memory constraint (mem… ▽ More

    Submitted 4 July, 2024; originally announced July 2024.

  4. arXiv:2407.00487  [pdf, other]

    cs.CL

    It's Morphing Time: Unleashing the Potential of Multiple LLMs via Multi-objective Optimization

    Authors: Bingdong Li, Zixiang Di, Yanting Yang, Hong Qian, Peng Yang, Hao Hao, Ke Tang, Aimin Zhou

    Abstract: In this paper, we introduce a novel approach for large language model merging via black-box multi-objective optimization algorithms. The goal of model merging is to combine multiple models, each excelling in different tasks, into a single model that outperforms any of the individual source models. However, model merging faces two significant challenges: First, existing methods rely heavily on huma… ▽ More

    Submitted 29 June, 2024; originally announced July 2024.

  5. arXiv:2406.18445  [pdf, other]

    cs.LG cs.PF

    An Autotuning-based Optimization Framework for Mixed-kernel SVM Classifications in Smart Pixel Datasets and Heterojunction Transistors

    Authors: Xingfu Wu, Tupendra Oli, Justin H. Qian, Valerie Taylor, Mark C. Hersam, Vinod K. Sangwan

    Abstract: Support Vector Machine (SVM) is a state-of-the-art classification method widely used in science and engineering due to its high accuracy, its ability to deal with high dimensional data, and its flexibility in modeling diverse sources of data. In this paper, we propose an autotuning-based optimization framework to quantify the ranges of hyperparameters in SVMs to identify their optimal choices, and… ▽ More

    Submitted 26 June, 2024; originally announced June 2024.

  6. arXiv:2406.12896  [pdf, other]

    cs.AI cs.CY cs.LG

    Leveraging Pedagogical Theories to Understand Student Learning Process with Graph-based Reasonable Knowledge Tracing

    Authors: Jiajun Cui, Hong Qian, Bo Jiang, Wei Zhang

    Abstract: Knowledge tracing (KT) is a crucial task in intelligent education, focusing on predicting students' performance on given questions to trace their evolving knowledge. The advancement of deep learning in this field has led to deep-learning knowledge tracing (DLKT) models that prioritize high predictive accuracy. However, many existing DLKT methods overlook the fundamental goal of tracking students'… ▽ More

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: Preprint, accepted to appear in SIGKDD 2024, 12 pages. The source code is available at https://github.com/JJCui96/GRKT. Keywords: interpretable knowledge tracing, student behavior modeling, intelligence education

  7. arXiv:2406.03508  [pdf, other]

    cs.LG cs.AI cs.CR

    Mutual Information Guided Backdoor Mitigation for Pre-trained Encoders

    Authors: Tingxu Han, Weisong Sun, Ziqi Ding, Chunrong Fang, Hanwei Qian, Jiaxun Li, Zhenyu Chen, Xiangyu Zhang

    Abstract: Self-supervised learning (SSL) is increasingly attractive for pre-training encoders without requiring labeled data. Downstream tasks built on top of those pre-trained encoders can achieve nearly state-of-the-art performance. The pre-trained encoders by SSL, however, are vulnerable to backdoor attacks as demonstrated by existing studies. Numerous backdoor mitigation techniques are designed for down… ▽ More

    Submitted 11 June, 2024; v1 submitted 5 June, 2024; originally announced June 2024.

  8. arXiv:2405.17984  [pdf, other]

    cs.LG

    Cross-Context Backdoor Attacks against Graph Prompt Learning

    Authors: Xiaoting Lyu, Yufei Han, Wei Wang, Hangwei Qian, Ivor Tsang, Xiangliang Zhang

    Abstract: Graph Prompt Learning (GPL) bridges significant disparities between pretraining and downstream applications to alleviate the knowledge transfer bottleneck in real-world graph learning. While GPL offers superior effectiveness in graph knowledge transfer and computational efficiency, the security risks posed by backdoor poisoning effects embedded in pretrained models remain largely unexplored. Our s… ▽ More

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: Accepted by KDD 2024

  9. arXiv:2405.15318  [pdf, other]

    cs.CL cs.AI

    Are Long-LLMs A Necessity For Long-Context Tasks?

    Authors: Hongjin Qian, Zheng Liu, Peitian Zhang, Kelong Mao, Yujia Zhou, Xu Chen, Zhicheng Dou

    Abstract: The learning and deployment of long-LLMs remains a challenging problem despite recent progress. In this work, we argue that the long-LLMs are not a necessity to solve long-context tasks, as common long-context tasks are short-context solvable, i.e. they can be solved by purely working with oracle short-contexts within the long-context tasks' inputs. On top of this argument, we propose a framewor… ▽ More

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: 18 pages

  10. arXiv:2405.14741  [pdf, other]

    math.OC cs.LG stat.ML

    Bagging Improves Generalization Exponentially

    Authors: Huajie Qian, Donghao Ying, Henry Lam, Wotao Yin

    Abstract: Bagging is a popular ensemble technique to improve the accuracy of machine learning models. It hinges on the well-established rationale that, by repeatedly retraining on resampled data, the aggregated model exhibits lower variance and hence higher stability, especially for discontinuous base learners. In this paper, we provide a new perspective on bagging: By suitably aggregating the base learners… ▽ More

    Submitted 29 May, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

    Comments: Correct author list typo

  11. arXiv:2405.08674  [pdf, other]

    cs.LG cs.AI

    Expensive Multi-Objective Bayesian Optimization Based on Diffusion Models

    Authors: Bingdong Li, Zixiang Di, Yongfan Lu, Hong Qian, Feng Wang, Peng Yang, Ke Tang, Aimin Zhou

    Abstract: Multi-objective Bayesian optimization (MOBO) has shown promising performance on various expensive multi-objective optimization problems (EMOPs). However, effectively modeling complex distributions of the Pareto optimal solutions is difficult with limited function evaluations. Existing Pareto set learning algorithms may exhibit considerable instability in such expensive scenarios, leading to signif… ▽ More

    Submitted 14 May, 2024; originally announced May 2024.

  12. arXiv:2405.08604  [pdf, other]

    cs.LG cs.AI

    Towards Geometry-Aware Pareto Set Learning for Neural Multi-Objective Combinatorial Optimization

    Authors: Yongfan Lu, Zixiang Di, Bingdong Li, Shengcai Liu, Hong Qian, Peng Yang, Ke Tang, Aimin Zhou

    Abstract: Multi-objective combinatorial optimization (MOCO) problems are prevalent in various real-world applications. Most existing neural MOCO methods rely on problem decomposition to transform an MOCO problem into a series of single-objective combinatorial optimization (SOCO) problems. However, these methods often approximate partial regions of the Pareto front and spend excessive time on diversity enhanc… ▽ More

    Submitted 23 May, 2024; v1 submitted 14 May, 2024; originally announced May 2024.

  13. arXiv:2405.01567  [pdf, other]

    cs.SE cs.AI

    CodeFort: Robust Training for Code Generation Models

    Authors: Yuhao Zhang, Shiqi Wang, Haifeng Qian, Zijian Wang, Mingyue Shang, Linbo Liu, Sanjay Krishna Gouda, Baishakhi Ray, Murali Krishna Ramanathan, Xiaofei Ma, Anoop Deoras

    Abstract: Code generation models are not robust to small perturbations, which often lead to inconsistent and incorrect generations and significantly degrade the performance of these models. Improving the robustness of code generation models is crucial to better user experience when these models are deployed in real-world applications. However, existing efforts have not addressed this issue for code generati… ▽ More

    Submitted 11 April, 2024; originally announced May 2024.

  14. arXiv:2404.19553  [pdf, other]

    cs.CL

    Extending Llama-3's Context Ten-Fold Overnight

    Authors: Peitian Zhang, Ninglu Shao, Zheng Liu, Shitao Xiao, Hongjin Qian, Qiwei Ye, Zhicheng Dou

    Abstract: We extend the context length of Llama-3-8B-Instruct from 8K to 80K via QLoRA fine-tuning. The entire training cycle is super efficient, which takes 8 hours on one 8xA800 (80G) GPU machine. The resulting model exhibits superior performance across a broad range of evaluation tasks, such as NIHS, topic retrieval, and long-context language understanding; meanwhile, it also well preserves the original… ▽ More

    Submitted 30 April, 2024; originally announced April 2024.

  15. arXiv:2404.15778  [pdf, other]

    cs.LG cs.CL

    BASS: Batched Attention-optimized Speculative Sampling

    Authors: Haifeng Qian, Sujan Kumar Gonugondla, Sungsoo Ha, Mingyue Shang, Sanjay Krishna Gouda, Ramesh Nallapati, Sudipta Sengupta, Xiaofei Ma, Anoop Deoras

    Abstract: Speculative decoding has emerged as a powerful method to improve latency and throughput in hosting large language models. However, most existing implementations focus on generating a single sequence. Real-world generative AI applications often require multiple responses and how to perform speculative decoding in a batched setting while preserving its latency benefits poses non-trivial challenges.… ▽ More

    Submitted 26 June, 2024; v1 submitted 24 April, 2024; originally announced April 2024.

  16. arXiv:2404.15284  [pdf, other]

    eess.SP cs.AI

    Global 4D Ionospheric STEC Prediction based on DeepONet for GNSS Rays

    Authors: Dijia Cai, Zenghui Shi, Haiyang Fu, Huan Liu, Hongyi Qian, Yun Sui, Feng Xu, Ya-Qiu Jin

    Abstract: The ionosphere is a vitally dynamic charged particle region in the Earth's upper atmosphere, playing a crucial role in applications such as radio communication and satellite navigation. The Slant Total Electron Contents (STEC) is an important parameter for characterizing wave propagation, representing the integrated electron density along the ray of radio signals passing through the ionosphere. Th… ▽ More

    Submitted 12 March, 2024; originally announced April 2024.

  17. Optimal Structure of Receive Beamforming for Over-the-Air Computation

    Authors: Hongbin Zhu, Hua Qian

    Abstract: We investigate fast data aggregation via over-the-air computation (AirComp) over wireless networks. In this scenario, an access point (AP) with multiple antennas aims to recover the arithmetic mean of sensory data from multiple wireless devices. To minimize estimation distortion, we formulate a mean-squared-error (MSE) minimization problem that considers joint optimization of transmit scalars at w… ▽ More

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: Published in IEEE ICASSP 2024

  18. Inductive Cognitive Diagnosis for Fast Student Learning in Web-Based Online Intelligent Education Systems

    Authors: Shuo Liu, Junhao Shen, Hong Qian, Aimin Zhou

    Abstract: Cognitive diagnosis aims to gauge students' mastery levels based on their response logs. Serving as a pivotal module in web-based online intelligent education systems (WOIESs), it plays an upstream and fundamental role in downstream tasks like learning item recommendation and computerized adaptive testing. WOIESs are open learning environments where numerous new students constantly register and com… ▽ More

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: WWW 2024

  19. arXiv:2403.08845  [pdf, other]

    cs.LG cs.AI

    Bifurcated Attention: Accelerating Massively Parallel Decoding with Shared Prefixes in LLMs

    Authors: Ben Athiwaratkun, Sujan Kumar Gonugondla, Sanjay Krishna Gouda, Haifeng Qian, Hantian Ding, Qing Sun, Jun Wang, Jiacheng Guo, Liangfu Chen, Parminder Bhatia, Ramesh Nallapati, Sudipta Sengupta, Bing Xiang

    Abstract: This study introduces bifurcated attention, a method designed to enhance language model inference in shared-context batch decoding scenarios. Our approach addresses the challenge of redundant memory IO costs, a critical factor contributing to latency in high batch sizes and extended context lengths. Bifurcated attention achieves this by strategically dividing the attention mechanism during increme… ▽ More

    Submitted 11 July, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

  20. arXiv:2403.03846  [pdf, other]

    cs.LG

    On the Effectiveness of Distillation in Mitigating Backdoors in Pre-trained Encoder

    Authors: Tingxu Han, Shenghan Huang, Ziqi Ding, Weisong Sun, Yebo Feng, Chunrong Fang, Jun Li, Hanwei Qian, Cong Wu, Quanjun Zhang, Yang Liu, Zhenyu Chen

    Abstract: In this paper, we study a defense against poisoned encoders in SSL called distillation, which is a defense originally used in supervised learning. Distillation aims to distill knowledge from a given model (a.k.a. the teacher net) and transfer it to another (a.k.a. the student net). Now, we use it to distill benign knowledge from poisoned pre-trained encoders and transfer it to a new encoder, resulti… ▽ More

    Submitted 6 March, 2024; originally announced March 2024.

  21. arXiv:2403.01731  [pdf, other]

    cs.CV cs.RO

    RISeg: Robot Interactive Object Segmentation via Body Frame-Invariant Features

    Authors: Howard H. Qian, Yangxiao Lu, Kejia Ren, Gaotian Wang, Ninad Khargonkar, Yu Xiang, Kaiyu Hang

    Abstract: In order to successfully perform manipulation tasks in new environments, such as grasping, robots must be proficient in segmenting unseen objects from the background and/or other objects. Previous works perform unseen object instance segmentation (UOIS) by training deep neural networks on large-scale data to learn RGB/RGB-D feature embeddings, where cluttered environments often result in inaccurat… ▽ More

    Submitted 4 March, 2024; originally announced March 2024.

    Comments: 7 pages, 5 figures, ICRA 2024

  22. arXiv:2402.18264  [pdf, other]

    cs.CL

    Retrieval-based Full-length Wikipedia Generation for Emergent Events

    Authors: Jiebin Zhang, Eugene J. Yu, Qinyu Chen, Chenhao Xiong, Dawei Zhu, Han Qian, Mingbo Song, Xiaoguang Li, Qun Liu, Sujian Li

    Abstract: In today's fast-paced world, the growing demand to quickly generate comprehensive and accurate Wikipedia documents for emerging events is both crucial and challenging. However, previous efforts in Wikipedia generation have often fallen short of meeting real-world requirements. Some approaches focus solely on generating segments of a complete Wikipedia document, while others overlook the importance… ▽ More

    Submitted 28 February, 2024; originally announced February 2024.

  23. arXiv:2402.17988  [pdf, other]

    cs.PL cs.LG cs.SE

    Constrained Decoding for Code Language Models via Efficient Left and Right Quotienting of Context-Sensitive Grammars

    Authors: Daniel Melcer, Nathan Fulton, Sanjay Krishna Gouda, Haifeng Qian

    Abstract: Large Language Models are powerful tools for program synthesis and advanced auto-completion, but come with no guarantee that their output code is syntactically correct. This paper contributes an incremental parser that allows early rejection of syntactically incorrect code, as well as efficient detection of complete programs for fill-in-the-middle (FItM) tasks. We develop Earley-style parsers that… ▽ More

    Submitted 27 February, 2024; originally announced February 2024.

    Comments: 20 pages, Code available at https://github.com/amazon-science/incremental-parsing

  24. arXiv:2402.17563  [pdf, other]

    cs.CV cs.AI cs.LG

    Structure-Guided Adversarial Training of Diffusion Models

    Authors: Ling Yang, Haotian Qian, Zhilong Zhang, Jingwei Liu, Bin Cui

    Abstract: Diffusion models have demonstrated exceptional efficacy in various generative applications. While existing models focus on minimizing a weighted sum of denoising score matching losses for data distribution modeling, their training primarily emphasizes instance-level optimization, overlooking valuable structural information within each mini-batch, indicative of pair-wise relationships among samples… ▽ More

    Submitted 4 March, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

    Comments: Accepted by CVPR 2024

  25. arXiv:2402.13750  [pdf, other]

    cs.IR cs.AI cs.CL

    Breaking the Barrier: Utilizing Large Language Models for Industrial Recommendation Systems through an Inferential Knowledge Graph

    Authors: Qian Zhao, Hao Qian, Ziqi Liu, Gong-Duo Zhang, Lihong Gu

    Abstract: Recommendation systems are widely used in e-commerce websites and online platforms to address information overload. However, existing systems primarily rely on historical data and user feedback, making it difficult to capture user intent transitions. Recently, Knowledge Base (KB)-based models have been proposed to incorporate expert knowledge, but they struggle to adapt to new items and the evolving e-com… ▽ More

    Submitted 21 February, 2024; originally announced February 2024.

    Comments: 9 pages, 5 figures

  26. arXiv:2402.09760  [pdf, other]

    cs.CL cs.AI cs.IR

    Grounding Language Model with Chunking-Free In-Context Retrieval

    Authors: Hongjin Qian, Zheng Liu, Kelong Mao, Yujia Zhou, Zhicheng Dou

    Abstract: This paper presents a novel Chunking-Free In-Context (CFIC) retrieval approach, specifically tailored for Retrieval-Augmented Generation (RAG) systems. Traditional RAG systems often struggle with grounding responses using precise evidence text due to the challenges of processing lengthy documents and filtering out irrelevant content. Commonly employed solutions, such as document chunking and adapt… ▽ More

    Submitted 15 February, 2024; originally announced February 2024.

  27. arXiv:2402.06633  [pdf, other]

    q-fin.ST cs.IR cs.LG

    MDGNN: Multi-Relational Dynamic Graph Neural Network for Comprehensive and Dynamic Stock Investment Prediction

    Authors: Hao Qian, Hongting Zhou, Qian Zhao, Hao Chen, Hongxiang Yao, Jingwei Wang, Ziqi Liu, Fei Yu, Zhiqiang Zhang, Jun Zhou

    Abstract: The stock market is a crucial component of the financial system, but predicting the movement of stock prices is challenging due to the dynamic and intricate relations arising from various aspects such as economic indicators, financial reports, global news, and investor sentiment. Traditional sequential methods and graph-based models have been applied in stock movement prediction, but they have lim… ▽ More

    Submitted 18 January, 2024; originally announced February 2024.

    Comments: 9 pages, 3 figures, accepted by AAAI 2024

  28. arXiv:2402.01666  [pdf, other]

    cs.CY

    A Comprehensive Exploration of Personalized Learning in Smart Education: From Student Modeling to Personalized Recommendations

    Authors: Siyu Wu, Yang Cao, Jiajun Cui, Runze Li, Hong Qian, Bo Jiang, Wei Zhang

    Abstract: With the development of artificial intelligence, personalized learning has attracted much attention as an integral part of intelligent education. China, the United States, the European Union, and others have put forward the importance of personalized learning in recent years, emphasizing the realization of the organic combination of large-scale education and personalized training. The development… ▽ More

    Submitted 15 January, 2024; originally announced February 2024.

    Comments: 82 pages, 5 figures

    MSC Class: 68-02 ACM Class: A.1

  29. arXiv:2401.15399  [pdf, other]

    cs.RO

    Parallel Self-assembly for Modular USVs with Diverse Docking Mechanism Layouts

    Authors: Lianxin Zhang, Yang Jiao, Yihan Huang, Ziyou Wang, Huihuan Qian

    Abstract: Self-assembly enables multi-robot systems to merge diverse capabilities and accomplish tasks beyond the reach of individual robots. Incorporating varied docking mechanism layouts (DMLs) can enhance robot versatility or reduce costs. However, assembling multiple heterogeneous robots with diverse DMLs is still a research gap. This paper addresses this problem by introducing CuBoat, an omnidirection… ▽ More

    Submitted 27 January, 2024; originally announced January 2024.

  30. arXiv:2401.14069  [pdf, other]

    cs.LG

    Neural Sinkhorn Gradient Flow

    Authors: Huminhao Zhu, Fangyikang Wang, Chao Zhang, Hanbin Zhao, Hui Qian

    Abstract: Wasserstein Gradient Flows (WGF) with respect to specific functionals have been widely used in the machine learning literature. Recently, neural networks have been adopted to approximate certain intractable parts of the underlying Wasserstein gradient flow and result in efficient inference procedures. In this paper, we introduce the Neural Sinkhorn Gradient Flow (NSGF) model, which parametrizes th… ▽ More

    Submitted 25 January, 2024; originally announced January 2024.

  31. arXiv:2401.10840  [pdf, other]

    cs.CY cs.AI cs.LG

    Symbolic Cognitive Diagnosis via Hybrid Optimization for Intelligent Education Systems

    Authors: Junhao Shen, Hong Qian, Wei Zhang, Aimin Zhou

    Abstract: Cognitive diagnosis assessment is a fundamental and crucial task for student learning. It models the student-exercise interaction, and discovers the students' proficiency levels on each knowledge attribute. In real-world intelligent education systems, generalization and interpretability of cognitive diagnosis methods are of equal importance. However, most existing methods can hardly make the best… ▽ More

    Submitted 30 December, 2023; originally announced January 2024.

    Journal ref: Published in AAAI 2024

  32. arXiv:2401.06470  [pdf, other]

    cs.IR

    UNEX-RL: Reinforcing Long-Term Rewards in Multi-Stage Recommender Systems with UNidirectional EXecution

    Authors: Gengrui Zhang, Yao Wang, Xiaoshuang Chen, Hongyi Qian, Kaiqiao Zhan, Ben Wang

    Abstract: In recent years, there has been a growing interest in utilizing reinforcement learning (RL) to optimize long-term rewards in recommender systems. Since industrial recommender systems are typically designed as multi-stage systems, RL methods with a single agent face challenges when optimizing multiple stages simultaneously. The reason is that different stages have different observation spaces, and… ▽ More

    Submitted 12 January, 2024; originally announced January 2024.

    Comments: Accepted by AAAI 2024

  33. arXiv:2312.16429  [pdf, other]

    cs.LG cs.AI

    GAD-PVI: A General Accelerated Dynamic-Weight Particle-Based Variational Inference Framework

    Authors: Fangyikang Wang, Huminhao Zhu, Chao Zhang, Hanbin Zhao, Hui Qian

    Abstract: Particle-based Variational Inference (ParVI) methods approximate the target distribution by iteratively evolving finite weighted particle systems. Recent advances of ParVI methods reveal the benefits of accelerated position update strategies and dynamic weight adjustment approaches. In this paper, we propose the first ParVI framework that possesses both accelerated position update and dynamical we… ▽ More

    Submitted 27 December, 2023; originally announced December 2023.

  34. arXiv:2312.16066  [pdf, other]

    cs.SE cs.AI

    A Prompt Learning Framework for Source Code Summarization

    Authors: Weisong Sun, Chunrong Fang, Yudu You, Yuchen Chen, Yi Liu, Chong Wang, Jian Zhang, Quanjun Zhang, Hanwei Qian, Wei Zhao, Yang Liu, Zhenyu Chen

    Abstract: (Source) code summarization is the task of automatically generating natural language summaries for given code snippets. Such summaries play a key role in helping developers understand and maintain source code. Recently, with the successful application of large language models (LLMs) in numerous fields, software engineering researchers have also attempted to adapt LLMs to solve code summarization t… ▽ More

    Submitted 26 December, 2023; originally announced December 2023.

    Comments: submitted to ACM Transactions on Software Engineering and Methodology

    MSC Class: 68-04; 68T30 ACM Class: D.2.3; I.2.2; I.2.4

  35. arXiv:2312.12191  [pdf, other]

    cs.LG cs.AI stat.ML

    CUDC: A Curiosity-Driven Unsupervised Data Collection Method with Adaptive Temporal Distances for Offline Reinforcement Learning

    Authors: Chenyu Sun, Hangwei Qian, Chunyan Miao

    Abstract: Offline reinforcement learning (RL) aims to learn an effective policy from a pre-collected dataset. Most existing works are to develop sophisticated learning algorithms, with less emphasis on improving the data collection process. Moreover, it is even challenging to extend the single-task setting and collect a task-agnostic dataset that allows an agent to perform multiple downstream tasks. In this… ▽ More

    Submitted 19 December, 2023; originally announced December 2023.

    Comments: Accepted at AAAI-24

  36. arXiv:2311.00447  [pdf, other]

    cs.AI

    On the Opportunities of Green Computing: A Survey

    Authors: You Zhou, Xiujing Lin, Xiang Zhang, Maolin Wang, Gangwei Jiang, Huakang Lu, Yupeng Wu, Kai Zhang, Zhe Yang, Kehang Wang, Yongduo Sui, Fengwei Jia, Zuoli Tang, Yao Zhao, Hongxuan Zhang, Tiannuo Yang, Weibo Chen, Yunong Mao, Yi Li, De Bao, Yu Li, Hongrui Liao, Ting Liu, Jingwen Liu, Jinchi Guo, et al. (16 additional authors not shown)

    Abstract: Artificial Intelligence (AI) has achieved significant advancements in technology and research with the development over several decades, and is widely used in many areas including computer vision, natural language processing, time-series analysis, speech synthesis, etc. During the age of deep learning, especially with the rise of Large Language Models, a large majority of researchers' attention… ▽ More

    Submitted 8 November, 2023; v1 submitted 1 November, 2023; originally announced November 2023.

    Comments: 113 pages, 18 figures

  37. arXiv:2309.04676  [pdf, other]

    cs.LG cs.AI stat.ME

    Flexible and Robust Counterfactual Explanations with Minimal Satisfiable Perturbations

    Authors: Yongjie Wang, Hangwei Qian, Yongjie Liu, Wei Guo, Chunyan Miao

    Abstract: Counterfactual explanations (CFEs) exemplify how to minimally modify a feature vector to achieve a different prediction for an instance. CFEs can enhance informational fairness and trustworthiness, and provide suggestions for users who receive adverse predictions. However, recent research has shown that multiple CFEs can be offered for the same instance or instances with slight differences. Multip… ▽ More

    Submitted 9 September, 2023; originally announced September 2023.

    Comments: Accepted by CIKM 2023

  38. arXiv:2308.15711  [pdf, other]

    cs.CL cs.AI

    Optimizing Factual Accuracy in Text Generation through Dynamic Knowledge Selection

    Authors: Hongjin Qian, Zhicheng Dou, Jiejun Tan, Haonan Chen, Haoqi Gu, Ruofei Lai, Xinyu Zhang, Zhao Cao, Ji-Rong Wen

    Abstract: Language models (LMs) have revolutionized the way we interact with information, but they often generate nonfactual text, raising concerns about their reliability. Previous methods use external knowledge as references for text generation to enhance factuality but often struggle with the knowledge mix-up (e.g., entity mismatch) of irrelevant references. Besides, as the length of the output text grows,… ▽ More

    Submitted 29 August, 2023; originally announced August 2023.

    Comments: 15 pages

  39. arXiv:2308.10020  [pdf, other]

    cs.CR

    Enhancing SCF with Privacy-Preserving and Splitting-Enabled E-Bills on Blockchain

    Authors: Hao Yang, Jie Fu, Zhili Cheng, Haifeng Qian

    Abstract: Electronic Bill (E-Bill) is a crucial negotiable instrument in the form of data messages, relying on the Electronic Bill System (EB System). Blockchain technology offers inherent data sharing capabilities, so it is increasingly being adopted by small and medium-sized enterprises (SMEs) in the supply chain to build EB systems. However, the blockchain-based E-Bill still faces significant challenges: t… ▽ More

    Submitted 19 August, 2023; originally announced August 2023.

  40. arXiv:2308.04892  [pdf, other]

    cs.CV eess.IV

    Transmission and Color-guided Network for Underwater Image Enhancement

    Authors: Pan Mu, Jing Fang, Haotian Qian, Cong Bai

    Abstract: In recent years, with the continuous development of the marine industry, underwater image enhancement has attracted plenty of attention. Unfortunately, the propagation of light in water will be absorbed by water bodies and scattered by suspended particles, resulting in color deviation and low contrast. To solve these two problems, we propose an Adaptive Transmission and Dynamic Color guided networ… ▽ More

    Submitted 9 August, 2023; originally announced August 2023.

    Comments: 6 pages; Accepted at IEEE ICME

  41. arXiv:2308.01578  [pdf, other]

    cs.LG cs.AI

    Unsupervised Representation Learning for Time Series: A Review

    Authors: Qianwen Meng, Hangwei Qian, Yong Liu, Yonghui Xu, Zhiqi Shen, Lizhen Cui

    Abstract: Unsupervised representation learning approaches aim to learn discriminative feature representations from unlabeled data, without the requirement of annotating every sample. Enabling unsupervised representation learning is extremely crucial for time series data, due to its unique annotation bottleneck caused by its complex characteristics and lack of visual cues compared with other data modalities.… ▽ More

    Submitted 3 August, 2023; originally announced August 2023.

    Comments: In submission to IEEE

  42. Parallel Self-assembly for a Multi-USV System on Water Surface with Obstacles

    Authors: Lianxin Zhang, Yihan Huang, Zhongzhong Cao, Yang Jiao, Huihuan Qian

    Abstract: Parallel self-assembly is an efficient approach to accelerate the assembly process for modular robots. However, these approaches cannot accommodate complicated environments with obstacles, which restricts their applications. This paper considers the surrounding stationary obstacles and proposes a parallel self-assembly planning algorithm named SAPOA. With this algorithm, modular robots can avoid i… ▽ More

    Submitted 17 March, 2024; v1 submitted 30 June, 2023; originally announced July 2023.

  43. arXiv:2306.16738  [pdf, other]

    cs.LG cs.CR cs.GT

    Towards Optimal Randomized Strategies in Adversarial Example Game

    Authors: Jiahao Xie, Chao Zhang, Weijie Liu, Wensong Bai, Hui Qian

    Abstract: The vulnerability of deep neural network models to adversarial example attacks is a practical challenge in many artificial intelligence applications. A recent line of work shows that the use of randomization in adversarial training is the key to find optimal strategies against adversarial example attacks. However, in a fully randomized setting where both the defender and the attacker can use rando…

    Submitted 29 June, 2023; originally announced June 2023.

    Comments: Extended version of paper https://doi.org/10.1609/aaai.v37i9.26247 which appeared in AAAI 2023

  44. arXiv:2306.06637  [pdf, other]

    cs.LG

    PACER: A Fully Push-forward-based Distributional Reinforcement Learning Algorithm

    Authors: Wensong Bai, Chao Zhang, Yichao Fu, Lingwei Peng, Hui Qian, Bin Dai

    Abstract: In this paper, we propose the first fully push-forward-based Distributional Reinforcement Learning algorithm, called Push-forward-based Actor-Critic EncourageR (PACER). Specifically, PACER establishes a stochastic utility value policy gradient theorem and simultaneously leverages the push-forward operator in the construction of both the actor and the critic. Moreover, based on maximum mean discrep…

    Submitted 11 June, 2023; originally announced June 2023.

  45. arXiv:2306.06624  [pdf, other]

    cs.CL

    RestGPT: Connecting Large Language Models with Real-World RESTful APIs

    Authors: Yifan Song, Weimin Xiong, Dawei Zhu, Wenhao Wu, Han Qian, Mingbo Song, Hailiang Huang, Cheng Li, Ke Wang, Rong Yao, Ye Tian, Sujian Li

    Abstract: Tool-augmented large language models (LLMs) have achieved remarkable progress in tackling a broad range of tasks. However, existing methods are mainly restricted to specifically designed tools and fail to fulfill complex instructions, having great limitations when confronted with real-world scenarios. In this paper, we explore a more realistic scenario by connecting LLMs with RESTful APIs, which a…

    Submitted 26 August, 2023; v1 submitted 11 June, 2023; originally announced June 2023.

    Comments: Add RestBench to evaluate RestGPT

  46. arXiv:2305.18712  [pdf, other]

    cs.CV

    Can We Evaluate Domain Adaptation Models Without Target-Domain Labels?

    Authors: Jianfei Yang, Hanjie Qian, Yuecong Xu, Kai Wang, Lihua Xie

    Abstract: Unsupervised domain adaptation (UDA) involves adapting a model trained on a label-rich source domain to an unlabeled target domain. However, in real-world scenarios, the absence of target-domain labels makes it challenging to evaluate the performance of UDA models. Furthermore, prevailing UDA methods relying on adversarial training and self-training could lead to model degeneration and negative tr…

    Submitted 18 February, 2024; v1 submitted 29 May, 2023; originally announced May 2023.

    Comments: To be published at ICLR 2024, update formula and appendix, project and code available at https://sleepyseal.github.io/TransferScoreWeb/

  47. arXiv:2305.12865  [pdf, other]

    cs.SE cs.AI

    Automatic Code Summarization via ChatGPT: How Far Are We?

    Authors: Weisong Sun, Chunrong Fang, Yudu You, Yun Miao, Yi Liu, Yuekang Li, Gelei Deng, Shenghan Huang, Yuchen Chen, Quanjun Zhang, Hanwei Qian, Yang Liu, Zhenyu Chen

    Abstract: To support software developers in understanding and maintaining programs, various automatic code summarization techniques have been proposed to generate a concise natural language comment for a given code snippet. Recently, the emergence of large language models (LLMs) has led to a great boost in the performance of natural language processing tasks. Among them, ChatGPT is the most popular one whic…

    Submitted 22 May, 2023; originally announced May 2023.

    MSC Class: 68T50 ACM Class: D.2.3

  48. arXiv:2304.11665  [pdf, ps, other]

    cs.LG

    Accelerated Doubly Stochastic Gradient Algorithm for Large-scale Empirical Risk Minimization

    Authors: Zebang Shen, Hui Qian, Tongzhou Mu, Chao Zhang

    Abstract: Nowadays, algorithms with fast convergence, small memory footprints, and low per-iteration complexity are particularly favorable for artificial intelligence applications. In this paper, we propose a doubly stochastic algorithm with a novel accelerating multi-momentum technique to solve large scale empirical risk minimization problem for learning tasks. While enjoying a provably superior convergenc…

    Submitted 23 April, 2023; originally announced April 2023.

    Comments: Accepted to IJCAI 2017. Corresponding author: Hui Qian

  49. arXiv:2304.11653  [pdf, ps, other]

    cs.LG

    An Asynchronous Decentralized Algorithm for Wasserstein Barycenter Problem

    Authors: Chao Zhang, Hui Qian, Jiahao Xie

    Abstract: Wasserstein Barycenter Problem (WBP) has recently received much attention in the field of artificial intelligence. In this paper, we focus on the decentralized setting for WBP and propose an asynchronous decentralized algorithm (A$^2$DWB). A$^2$DWB is induced by a novel stochastic block coordinate descent method to optimize the dual of entropy regularized WBP. To our knowledge, A$^2$DWB is the fir…

    Submitted 23 April, 2023; originally announced April 2023.

  50. arXiv:2304.04358  [pdf, other]

    cs.CL cs.AI

    WebBrain: Learning to Generate Factually Correct Articles for Queries by Grounding on Large Web Corpus

    Authors: Hongjing Qian, Yutao Zhu, Zhicheng Dou, Haoqi Gu, Xinyu Zhang, Zheng Liu, Ruofei Lai, Zhao Cao, Jian-Yun Nie, Ji-Rong Wen

    Abstract: In this paper, we introduce a new NLP task -- generating short factual articles with references for queries by mining supporting evidence from the Web. In this task, called WebBrain, the ultimate goal is to generate a fluent, informative, and factually-correct short article (e.g., a Wikipedia article) for a factual query unseen in Wikipedia. To enable experiments on WebBrain, we construct a large-…

    Submitted 9 April, 2023; originally announced April 2023.

    Comments: Codes in https://github.com/qhjqhj00/WebBrain