
Showing 1–50 of 153 results for author: Zhao, W X

Searching in archive cs.
  1. arXiv:2407.18743  [pdf, other]

    cs.CL

    Towards Effective and Efficient Continual Pre-training of Large Language Models

    Authors: Jie Chen, Zhipeng Chen, Jiapeng Wang, Kun Zhou, Yutao Zhu, Jinhao Jiang, Yingqian Min, Wayne Xin Zhao, Zhicheng Dou, Jiaxin Mao, Yankai Lin, Ruihua Song, Jun Xu, Xu Chen, Rui Yan, Zhewei Wei, Di Hu, Wenbing Huang, Ji-Rong Wen

    Abstract: Continual pre-training (CPT) has been an important approach for adapting language models to specific domains or tasks. To make the CPT approach more traceable, this paper presents a technical report for continually pre-training Llama-3 (8B), which significantly enhances the Chinese language ability and scientific reasoning ability of the backbone model. To enhance the new abilities while retaining…

    Submitted 26 July, 2024; originally announced July 2024.

    Comments: 16 pages, 10 figures, 16 tables

    MSC Class: 68T50; ACM Class: I.2.7

  2. arXiv:2407.10804  [pdf, other]

    cs.CL

    Mix-CPT: A Domain Adaptation Framework via Decoupling Knowledge Learning and Format Alignment

    Authors: Jinhao Jiang, Junyi Li, Wayne Xin Zhao, Yang Song, Tao Zhang, Ji-Rong Wen

    Abstract: Adapting general large language models (LLMs) to specialized domains presents great challenges due to varied data distributions. This adaptation typically requires continual pre-training on massive domain-specific corpora to facilitate knowledge memorization, followed by training to apply this knowledge following human instructions and preferences. However, this method may result in inefficient kn…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: LLM, CPT, knowledge learning, format alignment; work in progress

  3. arXiv:2407.05563  [pdf, other]

    cs.CL

    LLMBox: A Comprehensive Library for Large Language Models

    Authors: Tianyi Tang, Yiwen Hu, Bingqian Li, Wenyang Luo, Zijing Qin, Haoxiang Sun, Jiapeng Wang, Shiyi Xu, Xiaoxue Cheng, Geyang Guo, Han Peng, Bowen Zheng, Yiru Tang, Yingqian Min, Yushuo Chen, Jie Chen, Yuanqian Zhao, Luran Ding, Yuhao Wang, Zican Dong, Chunxuan Xia, Junyi Li, Kun Zhou, Wayne Xin Zhao, Ji-Rong Wen

    Abstract: To facilitate research on large language models (LLMs), this paper presents a comprehensive and unified library, LLMBox, to ease the development, use, and evaluation of LLMs. The library has three main merits: (1) a unified data interface that supports the flexible implementation of various training strategies, (2) a comprehensive evaluation that covers extensive tasks, datasets,…

    Submitted 7 July, 2024; originally announced July 2024.

    Comments: Accepted by ACL 2024 Demo

  4. arXiv:2406.19853  [pdf, other]

    cs.CL cs.AI

    YuLan: An Open-source Large Language Model

    Authors: Yutao Zhu, Kun Zhou, Kelong Mao, Wentong Chen, Yiding Sun, Zhipeng Chen, Qian Cao, Yihan Wu, Yushuo Chen, Feng Wang, Lei Zhang, Junyi Li, Xiaolei Wang, Lei Wang, Beichen Zhang, Zican Dong, Xiaoxue Cheng, Yuhan Chen, Xinyu Tang, Yupeng Hou, Qiangqiang Ren, Xincheng Pang, Shufang Xie, Wayne Xin Zhao, Zhicheng Dou, et al. (13 additional authors not shown)

    Abstract: Large language models (LLMs) have become the foundation of many applications, leveraging their extensive capabilities in processing and understanding natural language. While many open-source LLMs have been released with technical reports, the lack of training details hinders further research and development. This paper presents the development of YuLan, a series of open-source LLMs with 12 billi…

    Submitted 28 June, 2024; originally announced June 2024.

  5. arXiv:2406.14129  [pdf, other]

    cs.CV cs.CL cs.MM

    Towards Event-oriented Long Video Understanding

    Authors: Yifan Du, Kun Zhou, Yuqi Huo, Yifan Li, Wayne Xin Zhao, Haoyu Lu, Zijia Zhao, Bingning Wang, Weipeng Chen, Ji-Rong Wen

    Abstract: With the rapid development of video Multimodal Large Language Models (MLLMs), numerous benchmarks have been proposed to assess their video understanding capability. However, due to the lack of rich events in the videos, these datasets may suffer from the short-cut bias that the answers can be deduced from a few frames, without the need to watch the entire video. To address this issue, we introduce…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: Work in progress

  6. arXiv:2406.14022  [pdf, other]

    cs.LG cs.CL

    Investigating the Pre-Training Dynamics of In-Context Learning: Task Recognition vs. Task Learning

    Authors: Xiaolei Wang, Xinyu Tang, Wayne Xin Zhao, Ji-Rong Wen

    Abstract: The emergence of in-context learning (ICL) is potentially attributed to two major abilities: task recognition (TR) for recognizing the task from demonstrations and utilizing pre-trained priors, and task learning (TL) for learning from demonstrations. However, the relationships between the two abilities and how such relationships affect the emergence of ICL are unclear. In this paper, we take the first…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: work in progress

  7. arXiv:2406.13381  [pdf, other]

    cs.CL

    CoAct: A Global-Local Hierarchy for Autonomous Agent Collaboration

    Authors: Xinming Hou, Mingming Yang, Wenxiang Jiao, Xing Wang, Zhaopeng Tu, Wayne Xin Zhao

    Abstract: Existing LLMs exhibit remarkable performance on various NLP tasks, but still struggle with complex real-world tasks, even when equipped with advanced strategies like CoT and ReAct. In this work, we propose the CoAct framework, which transfers the hierarchical planning and collaboration patterns in human society to LLM systems. Specifically, our CoAct framework involves two agents: (1) A global planning…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: 9 pages, 4 figures

  8. arXiv:2406.12606  [pdf, other]

    cs.CL

    Low-Redundant Optimization for Large Language Model Alignment

    Authors: Zhipeng Chen, Kun Zhou, Wayne Xin Zhao, Jingyuan Wang, Ji-Rong Wen

    Abstract: Large language models (LLMs) still struggle to align with human preferences in complex tasks and scenarios. They are prone to overfitting to unexpected patterns or superficial styles in the training data. We conduct an empirical study that only selects the top-10% most updated parameters in LLMs for alignment training, and see improvements in the convergence process and final performanc…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 14 pages, work in progress

  9. arXiv:2406.12397  [pdf, other]

    cs.CL

    Unveiling the Flaws: Exploring Imperfections in Synthetic Data and Mitigation Strategies for Large Language Models

    Authors: Jie Chen, Yupeng Zhang, Bingning Wang, Wayne Xin Zhao, Ji-Rong Wen, Weipeng Chen

    Abstract: Synthetic data has been proposed as a solution to address the issue of high-quality data scarcity in the training of large language models (LLMs). Studies have shown that synthetic data can effectively improve the performance of LLMs on downstream benchmarks. However, despite its potential benefits, our analysis suggests that there may be inherent flaws in synthetic data. The uniform format of syn…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 15 pages

  10. arXiv:2406.11277  [pdf, other]

    cs.CL

    Small Agent Can Also Rock! Empowering Small Language Models as Hallucination Detector

    Authors: Xiaoxue Cheng, Junyi Li, Wayne Xin Zhao, Hongzhi Zhang, Fuzheng Zhang, Di Zhang, Kun Gai, Ji-Rong Wen

    Abstract: Hallucination detection is a challenging task for large language models (LLMs), and existing studies heavily rely on powerful closed-source LLMs such as GPT-4. In this paper, we propose an autonomous LLM-based agent framework, called HaluAgent, which enables relatively smaller LLMs (e.g., Baichuan2-Chat 7B) to actively select suitable tools for detecting multiple hallucination types such as text, c…

    Submitted 17 June, 2024; originally announced June 2024.

  11. arXiv:2405.19654  [pdf, other]

    cs.AI

    Unlocking the Power of Spatial and Temporal Information in Medical Multimodal Pre-training

    Authors: Jinxia Yang, Bing Su, Wayne Xin Zhao, Ji-Rong Wen

    Abstract: Medical vision-language pre-training methods mainly leverage the correspondence between paired medical images and radiological reports. Although multi-view spatial images and temporal sequences of image-report pairs are available in off-the-shelf multi-modal medical datasets, most existing methods have not thoroughly tapped into such extensive supervision signals. In this paper, we introduce the M…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: Accepted at ICML 2024

  12. arXiv:2405.18009  [pdf, other]

    cs.CL cs.LG

    Exploring Context Window of Large Language Models via Decomposed Positional Vectors

    Authors: Zican Dong, Junyi Li, Xin Men, Wayne Xin Zhao, Bingbing Wang, Zhen Tian, Weipeng Chen, Ji-Rong Wen

    Abstract: Transformer-based large language models (LLMs) typically have a limited context window, resulting in significant performance degradation when processing text beyond the length of the context window. Extensive studies have been proposed to extend the context window and achieve length extrapolation of LLMs, but there is still a lack of in-depth interpretation of these approaches. In this study, we e…

    Submitted 28 May, 2024; originally announced May 2024.

  13. arXiv:2405.14365  [pdf, other]

    cs.CL cs.AI

    JiuZhang3.0: Efficiently Improving Mathematical Reasoning by Training Small Data Synthesis Models

    Authors: Kun Zhou, Beichen Zhang, Jiapeng Wang, Zhipeng Chen, Wayne Xin Zhao, Jing Sha, Zhichao Sheng, Shijin Wang, Ji-Rong Wen

    Abstract: Mathematical reasoning is an important capability of large language models (LLMs) for real-world applications. To enhance this capability, existing work either collects large-scale math-related texts for pre-training, or relies on stronger LLMs (e.g., GPT-4) to synthesize massive math problems. Both types of work generally lead to large costs in training or synthesis. To reduce the cost, based on op…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: 28 pages, SOTA math LLM using Well-trained Data Synthesis LLM

  14. arXiv:2405.12591  [pdf, other]

    cs.CL

    Unlocking Data-free Low-bit Quantization with Matrix Decomposition for KV Cache Compression

    Authors: Peiyu Liu, Ze-Feng Gao, Wayne Xin Zhao, Yipeng Ma, Tao Wang, Ji-Rong Wen

    Abstract: Key-value (KV) caching is an important technique to accelerate the inference of large language models (LLMs), but incurs significant memory overhead. To compress the size of KV cache, existing methods often compromise precision or require extra data for calibration, limiting their practicality in LLM deployment. In this paper, we introduce DecoQuant, a novel data-free low-bit quantization…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: 11 pages, 6 figures

  15. arXiv:2404.11502  [pdf, other]

    cs.CL cs.AI

    Towards Coarse-to-Fine Evaluation of Inference Efficiency for Large Language Models

    Authors: Yushuo Chen, Tianyi Tang, Erge Xiang, Linjiang Li, Wayne Xin Zhao, Jing Wang, Yunpeng Chai, Ji-Rong Wen

    Abstract: In the real world, large language models (LLMs) can serve as assistants that help users accomplish their jobs, and can also support the development of advanced applications. For the wide application of LLMs, inference efficiency is an essential concern; it has been widely studied in existing work, and numerous optimization algorithms and code libraries have been proposed to improve it. Nonetheless…

    Submitted 17 April, 2024; originally announced April 2024.

  16. arXiv:2403.17729  [pdf, other]

    cs.IR cs.LG

    EulerFormer: Sequential User Behavior Modeling with Complex Vector Attention

    Authors: Zhen Tian, Wayne Xin Zhao, Changwang Zhang, Xin Zhao, Zhongrui Ma, Ji-Rong Wen

    Abstract: To capture user preference, transformer models have been widely applied to model sequential user behavior data. The core of transformer architecture lies in the self-attention mechanism, which computes the pairwise attention scores in a sequence. Due to the permutation-equivariant nature, positional encoding is used to enhance the attention between token representations. In this setting, the pairw…

    Submitted 4 April, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

    Comments: Accepted for publication in SIGIR'24

  17. arXiv:2403.14312  [pdf, other]

    cs.CL

    ChainLM: Empowering Large Language Models with Improved Chain-of-Thought Prompting

    Authors: Xiaoxue Cheng, Junyi Li, Wayne Xin Zhao, Ji-Rong Wen

    Abstract: Chain-of-Thought (CoT) prompting can enhance the reasoning capabilities of large language models (LLMs), establishing itself as a primary approach to solving complex reasoning tasks. Existing CoT synthesis approaches usually focus on simpler reasoning tasks and thus result in low-quality and inconsistent CoT prompts. In response to this challenge, we present an empirical investigation of CoT promp…

    Submitted 21 March, 2024; originally announced March 2024.

    Comments: Accepted to LREC-COLING 2024

  18. arXiv:2403.13574  [pdf, other]

    cs.IR cs.AI

    A Large Language Model Enhanced Sequential Recommender for Joint Video and Comment Recommendation

    Authors: Bowen Zheng, Zihan Lin, Enze Liu, Chen Yang, Enyang Bai, Cheng Ling, Wayne Xin Zhao, Ji-Rong Wen

    Abstract: In online video platforms, reading or writing comments on interesting videos has become an essential part of the video watching experience. However, existing video recommender systems mainly model users' interaction behaviors with videos, lacking consideration of comments in user behavior modeling. In this paper, we propose a novel recommendation approach called LSVCR by leveraging user interactio…

    Submitted 20 March, 2024; originally announced March 2024.

  19. arXiv:2403.09792  [pdf, other]

    cs.CV cs.CL

    Images are Achilles' Heel of Alignment: Exploiting Visual Vulnerabilities for Jailbreaking Multimodal Large Language Models

    Authors: Yifan Li, Hangyu Guo, Kun Zhou, Wayne Xin Zhao, Ji-Rong Wen

    Abstract: In this paper, we study the harmlessness alignment problem of multimodal large language models (MLLMs). We conduct a systematic empirical analysis of the harmlessness performance of representative MLLMs and reveal that image input poses an alignment vulnerability for MLLMs. Inspired by this, we propose a novel jailbreak method named HADES, which hides and amplifies the harmfulness of the malic…

    Submitted 14 April, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

    Comments: Work in progress

  20. arXiv:2403.09559  [pdf, other]

    cs.CL cs.CV

    Less is More: Data Value Estimation for Visual Instruction Tuning

    Authors: Zikang Liu, Kun Zhou, Wayne Xin Zhao, Dawei Gao, Yaliang Li, Ji-Rong Wen

    Abstract: Visual instruction tuning is the key to building multimodal large language models (MLLMs), which greatly improves the reasoning capabilities of large language models (LLMs) in vision scenarios. However, existing MLLMs mostly rely on a mixture of multiple highly diverse visual instruction datasets for training (even more than a million instructions), which may introduce data redundancy. To investiga…

    Submitted 21 March, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

  21. arXiv:2403.04399  [pdf, other]

    cs.IR

    The 2nd Workshop on Recommendation with Generative Models

    Authors: Wenjie Wang, Yang Zhang, Xinyu Lin, Fuli Feng, Weiwen Liu, Yong Liu, Xiangyu Zhao, Wayne Xin Zhao, Yang Song, Xiangnan He

    Abstract: The rise of generative models has driven significant advancements in recommender systems, leaving unique opportunities for enhancing users' personalized recommendations. This workshop serves as a platform for researchers to explore and exchange innovative concepts related to the integration of generative models into recommender systems. It primarily focuses on five key perspectives: (i) improving…

    Submitted 7 March, 2024; originally announced March 2024.

  22. arXiv:2402.18166  [pdf, other]

    cs.IR

    Sequence-level Semantic Representation Fusion for Recommender Systems

    Authors: Lanling Xu, Zhen Tian, Bingqian Li, Junjie Zhang, Jinpeng Wang, Mingchen Cai, Wayne Xin Zhao

    Abstract: With the rapid development of recommender systems, there is increasing side information that can be employed to improve the recommendation performance. Specifically, we focus on the utilization of the associated textual data of items (e.g., product title) and study how text features can be effectively fused with ID features in sequential recommendation. However, there exist distinct data charact…

    Submitted 28 February, 2024; originally announced February 2024.

    Comments: 8 pages, 5 figures

  23. arXiv:2402.17564  [pdf, other]

    cs.CL

    Unleashing the Potential of Large Language Models as Prompt Optimizers: An Analogical Analysis with Gradient-based Model Optimizers

    Authors: Xinyu Tang, Xiaolei Wang, Wayne Xin Zhao, Siyuan Lu, Yaliang Li, Ji-Rong Wen

    Abstract: Automatic prompt optimization is an important approach to improving the performance of large language models (LLMs). Recent research demonstrates the potential of using LLMs as prompt optimizers, which can generate improved task prompts via iterative refinement. In this paper, we propose a novel perspective to investigate the design of LLM-based prompt optimizers, by drawing an analogy with gradie…

    Submitted 16 April, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

  24. arXiv:2402.17505  [pdf, other]

    cs.IR cs.CL

    BASES: Large-scale Web Search User Simulation with Large Language Model based Agents

    Authors: Ruiyang Ren, Peng Qiu, Yingqi Qu, Jing Liu, Wayne Xin Zhao, Hua Wu, Ji-Rong Wen, Haifeng Wang

    Abstract: Due to the excellent capacities of large language models (LLMs), it becomes feasible to develop LLM-based agents for reliable user simulation. Considering the scarcity and limitations (e.g., privacy issues) of real user data, in this paper, we conduct large-scale user simulation for web search, to improve the analysis and modeling of user search behavior. Specifically, we propose BASES, a novel user simula…

    Submitted 27 February, 2024; originally announced February 2024.

  25. arXiv:2402.17497  [pdf, other]

    cs.CL cs.IR

    REAR: A Relevance-Aware Retrieval-Augmented Framework for Open-Domain Question Answering

    Authors: Yuhao Wang, Ruiyang Ren, Junyi Li, Wayne Xin Zhao, Jing Liu, Ji-Rong Wen

    Abstract: Considering the limited internal parametric knowledge, retrieval-augmented generation (RAG) has been widely used to extend the knowledge scope of large language models (LLMs). Despite the extensive efforts on RAG research, in existing methods, LLMs cannot precisely assess the relevance of retrieved documents, thus likely leading to misleading or even incorrect utilization of external knowledge (i.…

    Submitted 27 February, 2024; originally announced February 2024.

  26. arXiv:2402.16358  [pdf, other]

    cs.LG cs.CL cs.IR

    An Integrated Data Processing Framework for Pretraining Foundation Models

    Authors: Yiding Sun, Feng Wang, Yutao Zhu, Wayne Xin Zhao, Jiaxin Mao

    Abstract: The ability of foundation models heavily relies on large-scale, diverse, and high-quality pretraining data. In order to improve data quality, researchers and practitioners often have to manually curate datasets from different sources and develop a dedicated data cleansing pipeline for each data repository. Lacking a unified data processing framework, this process is repetitive and cumbersome. T…

    Submitted 23 April, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

    Comments: 6 pages, 2 figures; accepted by SIGIR'24 demo track

  27. arXiv:2402.11163  [pdf, other]

    cs.CL

    KG-Agent: An Efficient Autonomous Agent Framework for Complex Reasoning over Knowledge Graph

    Authors: Jinhao Jiang, Kun Zhou, Wayne Xin Zhao, Yang Song, Chen Zhu, Hengshu Zhu, Ji-Rong Wen

    Abstract: In this paper, we aim to improve the reasoning ability of large language models (LLMs) over knowledge graphs (KGs) to answer complex questions. Inspired by existing methods that design the interaction strategy between LLMs and KG, we propose an autonomous LLM-based agent framework, called KG-Agent, which enables a small LLM to actively make decisions until finishing the reasoning process over KGs.…

    Submitted 16 February, 2024; originally announced February 2024.

    Comments: work in progress; efficient 7B LLM-based agent

  28. arXiv:2401.06081  [pdf, other]

    cs.CL

    Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint

    Authors: Zhipeng Chen, Kun Zhou, Wayne Xin Zhao, Junchen Wan, Fuzheng Zhang, Di Zhang, Ji-Rong Wen

    Abstract: Reinforcement learning (RL) has been widely used in training large language models (LLMs) for preventing unexpected outputs, e.g., reducing harmfulness and errors. However, existing RL methods mostly adopt the instance-level reward, which is unable to provide fine-grained supervision for complex reasoning tasks, and cannot focus on the few key tokens that lead to the incorrectness. To address this, we…

    Submitted 17 June, 2024; v1 submitted 11 January, 2024; originally announced January 2024.

    Comments: 18 pages, Findings of ACL2024

  29. arXiv:2401.04997  [pdf, other]

    cs.IR

    Prompting Large Language Models for Recommender Systems: A Comprehensive Framework and Empirical Analysis

    Authors: Lanling Xu, Junjie Zhang, Bingqian Li, Jinpeng Wang, Mingchen Cai, Wayne Xin Zhao, Ji-Rong Wen

    Abstract: Recently, large language models such as ChatGPT have showcased remarkable abilities in solving general tasks, demonstrating the potential for applications in recommender systems. To assess how effectively LLMs can be used in recommendation tasks, our study primarily focuses on employing LLMs as recommender systems through prompt engineering. We propose a general framework for utilizing LLMs in…

    Submitted 10 January, 2024; originally announced January 2024.

    Comments: 40 pages, under review

  30. arXiv:2401.03563  [pdf, other]

    cs.CL cs.IR

    Data-CUBE: Data Curriculum for Instruction-based Sentence Representation Learning

    Authors: Yingqian Min, Kun Zhou, Dawei Gao, Wayne Xin Zhao, He Hu, Yaliang Li

    Abstract: Recently, multi-task instruction tuning has been applied to sentence representation learning, which endows the capability of generating specific representations with the guidance of task instruction, exhibiting strong generalization ability on new tasks. However, these methods mostly neglect the potential interference problems across different tasks and instances, which may affect the training a…

    Submitted 7 January, 2024; originally announced January 2024.

    Comments: 14 pages, work in progress

  31. arXiv:2401.03205  [pdf, other]

    cs.CL

    The Dawn After the Dark: An Empirical Study on Factuality Hallucination in Large Language Models

    Authors: Junyi Li, Jie Chen, Ruiyang Ren, Xiaoxue Cheng, Wayne Xin Zhao, Jian-Yun Nie, Ji-Rong Wen

    Abstract: In the era of large language models (LLMs), hallucination (i.e., the tendency to generate factually incorrect content) poses a great challenge to the trustworthy and reliable deployment of LLMs in real-world applications. To tackle LLM hallucination, three key questions should be well studied: how to detect hallucinations (detection), why do LLMs hallucinate (source), and what can be done to mitigat…

    Submitted 6 January, 2024; originally announced January 2024.

    Comments: 24 pages, 8 figures, 13 tables

  32. arXiv:2401.00797  [pdf, other]

    cs.IR

    Distillation is All You Need for Practically Using Different Pre-trained Recommendation Models

    Authors: Wenqi Sun, Ruobing Xie, Junjie Zhang, Wayne Xin Zhao, Leyu Lin, Ji-Rong Wen

    Abstract: Pre-trained recommendation models (PRMs) have attracted widespread attention recently. However, their totally different model structure, huge model size and computation cost hinder their application in practical recommender systems. Hence, it is highly essential to explore how to practically utilize PRMs in real-world recommendations. In this paper, we propose a novel joint knowledge distillation…

    Submitted 1 January, 2024; originally announced January 2024.

  33. arXiv:2401.00158  [pdf, other]

    cs.CL cs.AI

    ReasoningLM: Enabling Structural Subgraph Reasoning in Pre-trained Language Models for Question Answering over Knowledge Graph

    Authors: Jinhao Jiang, Kun Zhou, Wayne Xin Zhao, Yaliang Li, Ji-Rong Wen

    Abstract: Question Answering over Knowledge Graph (KGQA) aims to seek answer entities for the natural language question from a large-scale Knowledge Graph (KG). To better perform reasoning on KG, recent work typically adopts a pre-trained language model (PLM) to model the question, and a graph neural network (GNN) based module to perform multi-hop reasoning on the KG. Despite the effectiveness, due to the d…

    Submitted 30 December, 2023; originally announced January 2024.

    Comments: EMNLP-23-Main; simple but effective SOTA on CWQ under a weak-supervised setting

  34. arXiv:2311.15493  [pdf, other]

    cs.IR

    UFIN: Universal Feature Interaction Network for Multi-Domain Click-Through Rate Prediction

    Authors: Zhen Tian, Changwang Zhang, Wayne Xin Zhao, Xin Zhao, Ji-Rong Wen, Zhao Cao

    Abstract: Click-Through Rate (CTR) prediction, which aims to estimate the probability of a user clicking on an item, is a key task in online advertising. Numerous existing CTR models concentrate on modeling the feature interactions within a solitary domain, thereby rendering them inadequate for fulfilling the requisites of multi-domain recommendations in real industrial scenarios. Some recent approaches pro…

    Submitted 26 November, 2023; originally announced November 2023.

  35. arXiv:2311.11351  [pdf, other]

    cs.IR

    Scaling Law of Large Sequential Recommendation Models

    Authors: Gaowei Zhang, Yupeng Hou, Hongyu Lu, Yu Chen, Wayne Xin Zhao, Ji-Rong Wen

    Abstract: Scaling of neural networks has recently shown great potential to improve the model capacity in various fields. Specifically, model performance has a power-law relationship with model size or data size, which provides important guidance for the development of large-scale models. However, there is still limited understanding of the scaling effect of user behavior models in recommender systems, where…

    Submitted 19 November, 2023; originally announced November 2023.

  36. arXiv:2311.09049  [pdf, other]

    cs.IR

    Adapting Large Language Models by Integrating Collaborative Semantics for Recommendation

    Authors: Bowen Zheng, Yupeng Hou, Hongyu Lu, Yu Chen, Wayne Xin Zhao, Ming Chen, Ji-Rong Wen

    Abstract: Recently, large language models (LLMs) have shown great potential in recommender systems, either improving existing recommendation models or serving as the backbone. However, there exists a large semantic gap between LLMs and recommender systems, since items to be recommended are often indexed by discrete identifiers (item ID) out of the LLM's vocabulary. In essence, LLMs capture language semantic…

    Submitted 19 April, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

    Comments: Accepted by ICDE 2024

  37. arXiv:2311.04072  [pdf, other]

    cs.CL

    Beyond Imitation: Leveraging Fine-grained Quality Signals for Alignment

    Authors: Geyang Guo, Ranchi Zhao, Tianyi Tang, Wayne Xin Zhao, Ji-Rong Wen

    Abstract: Alignment with human preference is a desired property of large language models (LLMs). Currently, the main alignment approach is based on reinforcement learning from human feedback (RLHF). Despite the effectiveness of RLHF, it is intricate to implement and train; thus, recent studies explore how to develop alternative alignment approaches based on supervised fine-tuning (SFT). A major limitation of…

    Submitted 15 April, 2024; v1 submitted 7 November, 2023; originally announced November 2023.

  38. arXiv:2311.01964  [pdf, other]

    cs.CL cs.AI

    Don't Make Your LLM an Evaluation Benchmark Cheater

    Authors: Kun Zhou, Yutao Zhu, Zhipeng Chen, Wentong Chen, Wayne Xin Zhao, Xu Chen, Yankai Lin, Ji-Rong Wen, Jiawei Han

    Abstract: Large language models (LLMs) have greatly advanced the frontiers of artificial intelligence, attaining remarkable improvement in model capacity. To assess the model performance, a typical approach is to construct evaluation benchmarks for measuring the ability level of LLMs in different aspects. Although a number of high-quality benchmarks have been released, the concerns about the appropriate…

    Submitted 3 November, 2023; originally announced November 2023.

    Comments: 11 pages

  39. arXiv:2311.01831  [pdf, other]

    cs.IR

    Universal Multi-modal Multi-domain Pre-trained Recommendation

    Authors: Wenqi Sun, Ruobing Xie, Shuqing Bian, Wayne Xin Zhao, Jie Zhou

    Abstract: There is a rapidly growing research interest in modeling user preferences via pre-training multi-domain interactions for recommender systems. However, existing pre-trained multi-domain recommendations mostly select the item texts to be bridges across domains, and simply explore the user behaviors in target domains. Hence, they ignore other informative multi-modal item contents (e.g., visual inform…

    Submitted 3 November, 2023; originally announced November 2023.

  40. arXiv:2311.01487  [pdf, other]

    cs.CV cs.CL

    What Makes for Good Visual Instructions? Synthesizing Complex Visual Reasoning Instructions for Visual Instruction Tuning

    Authors: Yifan Du, Hangyu Guo, Kun Zhou, Wayne Xin Zhao, Jinpeng Wang, Chuyuan Wang, Mingchen Cai, Ruihua Song, Ji-Rong Wen

    Abstract: Visual instruction tuning is an essential approach to improving the zero-shot generalization capability of Multi-modal Large Language Models (MLLMs). A surge of visual instruction datasets with various focuses and characteristics have been proposed recently, enabling MLLMs to achieve surprising results on evaluation benchmarks. To develop more capable MLLMs, in this paper, we aim to investigate a…

    Submitted 2 November, 2023; originally announced November 2023.

    Comments: Work in progress

  41. arXiv:2310.09233  [pdf, other]

    cs.IR cs.CL

    AgentCF: Collaborative Learning with Autonomous Language Agents for Recommender Systems

    Authors: Junjie Zhang, Yupeng Hou, Ruobing Xie, Wenqi Sun, Julian McAuley, Wayne Xin Zhao, Leyu Lin, Ji-Rong Wen

    Abstract: Recently, there has been an emergence of employing LLM-powered agents as believable human proxies, based on their remarkable decision-making capability. However, existing studies mainly focus on simulating human dialogue. Human non-verbal behaviors, such as item clicking in recommender systems, although they implicitly exhibit user preferences and could enhance the modeling of users, have not been d…

    Submitted 13 October, 2023; originally announced October 2023.

  42. arXiv:2310.07301  [pdf, other]

    cs.CL

    Parrot: Enhancing Multi-Turn Instruction Following for Large Language Models

    Authors: Yuchong Sun, Che Liu, Kun Zhou, Jinwen Huang, Ruihua Song, Wayne Xin Zhao, Fuzheng Zhang, Di Zhang, Kun Gai

    Abstract: Humans often interact with large language models (LLMs) in multi-turn interaction to obtain desired answers or more information. However, most existing studies overlook the multi-turn instruction following ability of LLMs, in terms of training dataset, training method, and evaluation benchmark. In this paper, we introduce Parrot, a solution aiming to enhance multi-turn instruction following for LL…

    Submitted 23 May, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

  43. arXiv:2309.13345  [pdf, other]

    cs.CL

    BAMBOO: A Comprehensive Benchmark for Evaluating Long Text Modeling Capacities of Large Language Models

    Authors: Zican Dong, Tianyi Tang, Junyi Li, Wayne Xin Zhao, Ji-Rong Wen

    Abstract: Large language models (LLMs) have achieved dramatic proficiency over NLP tasks with normal length. Recently, multiple studies have committed to extending the context length and enhancing the long text modeling capabilities of LLMs. To comprehensively evaluate the long context ability of LLMs, we propose BAMBOO, a multi-task long context benchmark. BAMBOO has been designed with four principles: com…

    Submitted 19 March, 2024; v1 submitted 23 September, 2023; originally announced September 2023.

    Comments: Accepted for the Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING) 2024

  44. arXiv:2308.12899  [pdf, other]

    cs.LG

    Unified Data Management and Comprehensive Performance Evaluation for Urban Spatial-Temporal Prediction [Experiment, Analysis & Benchmark]

    Authors: Jiawei Jiang, Chengkai Han, Wayne Xin Zhao, Jingyuan Wang

    Abstract: The field of urban spatial-temporal prediction is advancing rapidly with the development of deep learning techniques and the availability of large-scale datasets. However, challenges persist in accessing and utilizing diverse urban spatial-temporal datasets from different sources and stored in different formats, as well as determining effective model structures and components with the proliferatio…

    Submitted 7 March, 2024; v1 submitted 24 August, 2023; originally announced August 2023.

    Comments: 14 pages, 3 figures, VLDB under review

  45. A Survey on Large Language Model based Autonomous Agents

    Authors: Lei Wang, Chen Ma, Xueyang Feng, Zeyu Zhang, Hao Yang, Jingsen Zhang, Zhiyuan Chen, Jiakai Tang, Xu Chen, Yankai Lin, Wayne Xin Zhao, Zhewei Wei, Ji-Rong Wen

    Abstract: Autonomous agents have long been a prominent research focus in both academic and industry communities. Previous research in this field often focuses on training agents with limited knowledge within isolated environments, which diverges significantly from human learning processes, and thus makes the agents hard to achieve human-like decisions. Recently, through the acquisition of vast amounts of we…

    Submitted 3 April, 2024; v1 submitted 22 August, 2023; originally announced August 2023.

    Comments: 35 pages, 5 figures, 3 tables; accepted by Frontiers of Computer Science (FCS), doi: 10.1007/s11704-024-40231-1

  46. arXiv:2308.00240  [pdf, other]

    cs.CL

    Towards Effective Ancient Chinese Translation: Dataset, Model, and Evaluation

    Authors: Geyang Guo, Jiarong Yang, Fengyuan Lu, Jiaxin Qin, Tianyi Tang, Wayne Xin Zhao

    Abstract: Interpreting ancient Chinese has been the key to comprehending vast Chinese literature, tradition, and civilization. In this paper, we propose Erya for ancient Chinese translation. From a dataset perspective, we collect, clean, and classify ancient Chinese materials from various sources, forming the most extensive ancient Chinese resource to date. From a model perspective, we devise Erya training…

    Submitted 31 July, 2023; originally announced August 2023.

    Comments: Accepted by NLPCC 2023

  47. Alleviating the Long-Tail Problem in Conversational Recommender Systems

    Authors: Zhipeng Zhao, Kun Zhou, Xiaolei Wang, Wayne Xin Zhao, Fan Pan, Zhao Cao, Ji-Rong Wen

    Abstract: Conversational recommender systems (CRS) aim to provide the recommendation service via natural language conversations. To develop an effective CRS, high-quality CRS datasets are very crucial. However, existing CRS datasets suffer from the long-tail issue, i.e., a large proportion of items are rarely (or even never) mentioned in the conversations, which are called long-tail items. As a result, the CR…

    Submitted 21 July, 2023; originally announced July 2023.

    Comments: work in progress

  48. arXiv:2307.11019  [pdf, other]

    cs.CL cs.IR

    Investigating the Factual Knowledge Boundary of Large Language Models with Retrieval Augmentation

    Authors: Ruiyang Ren, Yuhao Wang, Yingqi Qu, Wayne Xin Zhao, Jing Liu, Hao Tian, Hua Wu, Ji-Rong Wen, Haifeng Wang

    Abstract: Knowledge-intensive tasks (e.g., open-domain question answering (QA)) require a substantial amount of factual knowledge and often rely on external information for assistance. Recently, large language models (LLMs) (e.g., ChatGPT), have demonstrated impressive prowess in solving a wide range of tasks with world knowledge, including knowledge-intensive tasks. However, it remains unclear how well LLM…

    Submitted 23 July, 2023; v1 submitted 20 July, 2023; originally announced July 2023.

  49. arXiv:2307.08072  [pdf, other]

    cs.CL cs.AI

    Do Emergent Abilities Exist in Quantized Large Language Models: An Empirical Study

    Authors: Peiyu Liu, Zikang Liu, Ze-Feng Gao, Dawei Gao, Wayne Xin Zhao, Yaliang Li, Bolin Ding, Ji-Rong Wen

    Abstract: Despite the superior performance, Large Language Models (LLMs) require significant computational resources for deployment and use. To overcome this issue, quantization methods have been widely applied to reduce the memory footprint of LLMs as well as increasing the inference rate. However, a major challenge is that low-bit quantization methods often lead to performance degradation. It is important…

    Submitted 26 July, 2023; v1 submitted 16 July, 2023; originally announced July 2023.

    Comments: 15 pages, 4 figures

  50. arXiv:2306.14712  [pdf, other]

    cs.IR

    Reciprocal Sequential Recommendation

    Authors: Bowen Zheng, Yupeng Hou, Wayne Xin Zhao, Yang Song, Hengshu Zhu

    Abstract: Reciprocal recommender system (RRS), considering a two-way matching between two parties, has been widely applied in online platforms like online dating and recruitment. Existing RRS models mainly capture static user preferences, which have neglected the evolving user tastes and the dynamic matching relation between the two parties. Although dynamic user modeling has been well-studied in sequential…

    Submitted 26 June, 2023; originally announced June 2023.

    Comments: Accepted by RecSys 2023