Showing 1–50 of 147 results for author: Nie, J

Searching in archive cs.
  1. arXiv:2407.10979  [pdf, ps, other]

    cs.NI

    Diffusion Model-based Incentive Mechanism with Prospect Theory for Edge AIGC Services in 6G IoT

    Authors: Jinbo Wen, Jiangtian Nie, Yue Zhong, Changyan Yi, Xiaohuan Li, Jiangming Jin, Yang Zhang, Dusit Niyato

    Abstract: The fusion of Internet of Things (IoT) with Sixth-Generation (6G) technology has significant potential to revolutionize the IoT landscape. Utilizing the ultra-reliable and low-latency communication capabilities of 6G, 6G-IoT networks can transmit high-quality and diverse data to enhance edge learning. Artificial Intelligence-Generated Content (AIGC) harnesses advanced AI algorithms to automaticall…

    Submitted 10 June, 2024; originally announced July 2024.

  2. arXiv:2407.05238  [pdf, other]

    cs.CV

    P2P: Part-to-Part Motion Cues Guide a Strong Tracking Framework for LiDAR Point Clouds

    Authors: Jiahao Nie, Fei Xie, Sifan Zhou, Xueyi Zhou, Dong-Kyu Chae, Zhiwei He

    Abstract: 3D single object tracking (SOT) methods based on appearance matching has long suffered from insufficient appearance information incurred by incomplete, textureless and semantically deficient LiDAR point clouds. While motion paradigm exploits motion cues instead of appearance matching for tracking, it incurs complex multi-stage processing and segmentation module. In this paper, we first provide in-…

    Submitted 8 July, 2024; v1 submitted 6 July, 2024; originally announced July 2024.

    Comments: The source code and pre-trained models are available at https://github.com/haooozi/P2P

  3. arXiv:2407.05083  [pdf, other]

    cs.SI

    Exploring agent interaction patterns in the comment sections of fake and real news

    Authors: Kailun Zhu, Songtao Peng, Jiaqi Nie, Zhongyuan Ruan, Shanqing Yu, Qi Xuan

    Abstract: User comments on social media have been recognized as a crucial factor in distinguishing between fake and real news, with many studies focusing on the textual content of user reactions. However, the interactions among agents in the comment sections for fake and real news have not been fully explored. In this study, we analyze a dataset comprising both fake and real news from Reddit to investigate…

    Submitted 6 July, 2024; originally announced July 2024.

  4. arXiv:2407.03040  [pdf, other]

    cs.CL cs.AI

    Raw Text is All you Need: Knowledge-intensive Multi-turn Instruction Tuning for Large Language Model

    Authors: Xia Hou, Qifeng Li, Jian Yang, Tongliang Li, Linzheng Chai, Xianjie Wu, Hangyuan Ji, Zhoujun Li, Jixuan Nie, Jingbo Dun, Wenfeng Song

    Abstract: Instruction tuning as an effective technique aligns the outputs of large language models (LLMs) with human preference. But how to generate the seasonal multi-turn dialogues from raw documents for instruction tuning still requires further exploration. In this paper, we present a novel framework named R2S that leverages the CoD-Chain of Dialogue logic to guide large language models (LLMs) in generat…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: 11 pages, 3 figures

    MSC Class: 68T50 ACM Class: I.2.7

  5. arXiv:2407.02719  [pdf, other]

    cs.CL

    Boosting Biomedical Concept Extraction by Rule-Based Data Augmentation

    Authors: Qiwei Shao, Fengran Mo, Jian-Yun Nie

    Abstract: Document-level biomedical concept extraction is the task of identifying biomedical concepts mentioned in a given document. Recent advancements have adapted pre-trained language models for this task. However, the scarcity of domain-specific data and the deviation of concepts from their canonical names often hinder these models' effectiveness. To tackle this issue, we employ MetaMapLite, an existing…

    Submitted 2 July, 2024; originally announced July 2024.

  6. arXiv:2406.18868  [pdf, other]

    cs.CV

    Advancing Cross-domain Discriminability in Continual Learning of Vision-Language Models

    Authors: Yicheng Xu, Yuxin Chen, Jiahao Nie, Yusong Wang, Huiping Zhuang, Manabu Okumura

    Abstract: Continual learning (CL) with Vision-Language Models (VLMs) has overcome the constraints of traditional CL, which only focuses on previously encountered classes. During the CL of VLMs, we need not only to prevent the catastrophic forgetting on incrementally learned knowledge but also to preserve the zero-shot ability of VLMs. However, existing methods require additional reference datasets to mainta…

    Submitted 26 June, 2024; originally announced June 2024.

  7. Unifying Graph Convolution and Contrastive Learning in Collaborative Filtering

    Authors: Yihong Wu, Le Zhang, Fengran Mo, Tianyu Zhu, Weizhi Ma, Jian-Yun Nie

    Abstract: Graph-based models and contrastive learning have emerged as prominent methods in Collaborative Filtering (CF). While many existing models in CF incorporate these methods in their design, there seems to be a limited depth of analysis regarding the foundational principles behind them. This paper bridges graph convolution, a pivotal element of graph-based models, with contrastive learning through a t…

    Submitted 21 June, 2024; v1 submitted 20 June, 2024; originally announced June 2024.

    Comments: KDD 2024

  8. arXiv:2406.12718  [pdf, other]

    cs.CV cs.AI cs.CL

    AGLA: Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention

    Authors: Wenbin An, Feng Tian, Sicong Leng, Jiahao Nie, Haonan Lin, QianYing Wang, Guang Dai, Ping Chen, Shijian Lu

    Abstract: Despite their great success across various multimodal tasks, Large Vision-Language Models (LVLMs) are facing a prevalent problem with object hallucinations, where the generated textual responses are inconsistent with ground-truth objects in the given image. This paper investigates various LVLMs and pinpoints attention deficiency toward discriminative local image features as one root cause of objec…

    Submitted 21 June, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

  9. arXiv:2406.09121  [pdf, other]

    cs.CV

    MMRel: A Relation Understanding Dataset and Benchmark in the MLLM Era

    Authors: Jiahao Nie, Gongjie Zhang, Wenbin An, Yap-Peng Tan, Alex C. Kot, Shijian Lu

    Abstract: Despite the recent advancements in Multi-modal Large Language Models (MLLMs), understanding inter-object relations, i.e., interactions or associations between distinct objects, remains a major challenge for such models. This issue significantly hinders their advanced reasoning capabilities and is primarily due to the lack of large-scale, high-quality, and diverse multi-modal data essential for tra…

    Submitted 13 June, 2024; originally announced June 2024.

  10. arXiv:2406.05013  [pdf, other]

    cs.IR cs.CL

    CHIQ: Contextual History Enhancement for Improving Query Rewriting in Conversational Search

    Authors: Fengran Mo, Abbas Ghaddar, Kelong Mao, Mehdi Rezagholizadeh, Boxing Chen, Qun Liu, Jian-Yun Nie

    Abstract: In this paper, we study how open-source large language models (LLMs) can be effectively deployed for improving query rewriting in conversational search, especially for ambiguous queries. We introduce CHIQ, a two-step method that leverages the capabilities of LLMs to resolve ambiguities in the conversation history before query rewriting. This approach contrasts with prior studies that predominantly…

    Submitted 7 June, 2024; originally announced June 2024.

  11. arXiv:2406.03249  [pdf, other]

    cs.LG

    Near-field Beamforming for Extremely Large-scale MIMO Based on Unsupervised Deep Learning

    Authors: Jiali Nie, Yuanhao Cui, Zhaohui Yang, Weijie Yuan, Xiaojun Jing

    Abstract: Extremely Large-scale Array (ELAA) is considered a frontier technology for future communication systems, pivotal in improving wireless systems' rate and spectral efficiency. However, as ELAA employs a multitude of antennas operating at higher frequencies, users are typically situated in the near-field region where the spherical wavefront propagates. This inevitably leads to a significant increase…

    Submitted 5 June, 2024; originally announced June 2024.

  12. arXiv:2405.16671  [pdf, other]

    cs.LG cs.AI

    Mixture of Experts Using Tensor Products

    Authors: Zhan Su, Fengran Mo, Prayag Tiwari, Benyou Wang, Jian-Yun Nie, Jakob Grue Simonsen

    Abstract: In multi-task learning, the conventional approach involves training a model on multiple tasks simultaneously. However, the training signals from different tasks can interfere with one another, potentially leading to negative transfer. To mitigate this, we investigate if modular language models can facilitate positive transfer and systematic generalization. Specifically, we propose a novel…

    Submitted 26 May, 2024; originally announced May 2024.

  13. arXiv:2405.15829  [pdf, other]

    cs.LG cs.AI

    Spatio-temporal Value Semantics-based Abstraction for Dense Deep Reinforcement Learning

    Authors: Jihui Nie, Dehui Du, Jiangnan Zhao

    Abstract: Intelligent Cyber-Physical Systems (ICPS) represent a specialized form of Cyber-Physical System (CPS) that incorporates intelligent components, notably Convolutional Neural Networks (CNNs) and Deep Reinforcement Learning (DRL), to undertake multifaceted tasks encompassing perception, decision-making, and control. The utilization of DRL for decision-making facilitates dynamic interaction with the e…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: 24 pages, 7 figures, conference

    MSC Class: 68N30 ACM Class: D.2.4

  14. arXiv:2405.13325  [pdf, other]

    cs.CL cs.AI cs.IR

    DEGAP: Dual Event-Guided Adaptive Prefixes for Templated-Based Event Argument Extraction with Slot Querying

    Authors: Guanghui Wang, Dexi Liu, Jian-Yun Nie, Qizhi Wan, Rong Hu, Xiping Liu, Wanlong Liu, Jiaming Liu

    Abstract: Recent advancements in event argument extraction (EAE) involve incorporating useful auxiliary information into models during training and inference, such as retrieved instances and event templates. These methods face two challenges: (1) the retrieval results may be irrelevant and (2) templates are developed independently for each event without considering their possible relationship. In this work,…

    Submitted 15 June, 2024; v1 submitted 21 May, 2024; originally announced May 2024.

  15. arXiv:2405.10936  [pdf, other]

    cs.CL cs.AI

    A Survey on Large Language Models with Multilingualism: Recent Advances and New Frontiers

    Authors: Kaiyu Huang, Fengran Mo, Hongliang Li, You Li, Yuanchi Zhang, Weijian Yi, Yulong Mao, Jinchen Liu, Yuzhuang Xu, Jinan Xu, Jian-Yun Nie, Yang Liu

    Abstract: The rapid development of Large Language Models (LLMs) demonstrates remarkable multilingual capabilities in natural language processing, attracting global attention in both academia and industry. To mitigate potential discrimination and enhance the overall usability and accessibility for diverse language user groups, it is important for the development of language-fair technology. Despite the break…

    Submitted 17 May, 2024; originally announced May 2024.

    Comments: 54 pages, Work in Progress

  16. arXiv:2405.09487  [pdf, other]

    cs.CV

    Color Space Learning for Cross-Color Person Re-Identification

    Authors: Jiahao Nie, Shan Lin, Alex C. Kot

    Abstract: The primary color profile of the same identity is assumed to remain consistent in typical Person Re-identification (Person ReID) tasks. However, this assumption may be invalid in real-world situations and images hold variant color profiles, because of cross-modality cameras or identity with different clothing. To address this issue, we propose Color Space Learning (CSL) for those Cross-Color Perso…

    Submitted 15 May, 2024; originally announced May 2024.

    Comments: Accepted by ICME 2024 (Oral)

  17. arXiv:2405.03228  [pdf, other]

    cs.LG

    TED: Accelerate Model Training by Internal Generalization

    Authors: Jinying Xiao, Ping Li, Jie Nie

    Abstract: Large language models have demonstrated strong performance in recent years, but the high cost of training drives the need for efficient methods to compress dataset sizes. We propose TED pruning, a method that addresses the challenge of overfitting under high pruning ratios by quantifying the model's ability to improve performance on pruned data while fitting retained data, known as Internal Genera…

    Submitted 6 May, 2024; originally announced May 2024.

  18. arXiv:2404.17199  [pdf, other]

    cs.CV

    Few-shot Calligraphy Style Learning

    Authors: Fangda Chen, Jiacheng Nie, Lichuan Jiang, Zhuoer Zeng

    Abstract: We introduced "Presidifussion," a novel approach to learning and replicating the unique style of calligraphy of President Xu, using a pretrained diffusion model adapted through a two-stage training process. Initially, our model is pretrained on a diverse dataset containing works from various calligraphers. This is followed by fine-tuning on a smaller, specialized dataset of President Xu's calligra…

    Submitted 26 April, 2024; originally announced April 2024.

  19. arXiv:2404.13940  [pdf, other]

    cs.CL

    A User-Centric Benchmark for Evaluating Large Language Models

    Authors: Jiayin Wang, Fengran Mo, Weizhi Ma, Peijie Sun, Min Zhang, Jian-Yun Nie

    Abstract: Large Language Models (LLMs) are essential tools to collaborate with users on different tasks. Evaluating their performance to serve users' needs in real-world scenarios is important. While many benchmarks have been created, they mainly focus on specific predefined model abilities. Few have covered the intended utilization of LLMs by real users. To address this oversight, we propose benchmarking L…

    Submitted 22 April, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

  20. arXiv:2404.09431  [pdf, other]

    cs.CV

    VFMM3D: Releasing the Potential of Image by Vision Foundation Model for Monocular 3D Object Detection

    Authors: Bonan Ding, Jin Xie, Jing Nie, Jiale Cao

    Abstract: Due to its cost-effectiveness and widespread availability, monocular 3D object detection, which relies solely on a single camera during inference, holds significant importance across various applications, including autonomous driving and robotics. Nevertheless, directly predicting the coordinates of objects in 3D space from monocular images poses challenges. Therefore, an effective solution involv…

    Submitted 14 April, 2024; originally announced April 2024.

    Comments: 10 pages, 5 figures

  21. arXiv:2404.01780  [pdf, other]

    astro-ph.IM astro-ph.GA cs.CV

    CSST Strong Lensing Preparation: a Framework for Detecting Strong Lenses in the Multi-color Imaging Survey by the China Survey Space Telescope (CSST)

    Authors: Xu Li, Ruiqi Sun, Jiameng Lv, Peng Jia, Nan Li, Chengliang Wei, Zou Hu, Xinzhong Er, Yun Chen, Zhang Ban, Yuedong Fang, Qi Guo, Dezi Liu, Guoliang Li, Lin Lin, Ming Li, Ran Li, Xiaobo Li, Yu Luo, Xianmin Meng, Jundan Nie, Zhaoxiang Qi, Yisheng Qiu, Li Shao, Hao Tian, et al. (7 additional authors not shown)

    Abstract: Strong gravitational lensing is a powerful tool for investigating dark matter and dark energy properties. With the advent of large-scale sky surveys, we can discover strong lensing systems on an unprecedented scale, which requires efficient tools to extract them from billions of astronomical objects. The existing mainstream lens-finding tools are based on machine learning algorithms and applied to…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: The paper is accepted by the AJ. The complete code could be downloaded with DOI of: 10.12149/101393. Comments are welcome

  22. arXiv:2404.00611  [pdf, ps, other]

    cs.CV

    Object-level Copy-Move Forgery Image Detection based on Inconsistency Mining

    Authors: Jingyu Wang, Niantai Jing, Ziyao Liu, Jie Nie, Yuxin Qi, Chi-Hung Chi, Kwok-Yan Lam

    Abstract: In copy-move tampering operations, perpetrators often employ techniques, such as blurring, to conceal tampering traces, posing significant challenges to the detection of object-level targets with intact structures. Focus on these challenges, this paper proposes an Object-level Copy-Move Forgery Image Detection based on Inconsistency Mining (IMNet). To obtain complete object-level targets, we custo…

    Submitted 3 April, 2024; v1 submitted 31 March, 2024; originally announced April 2024.

    Comments: 4 pages, 2 figures, Accepted to WWW 2024

  23. arXiv:2403.19983  [pdf, other]

    eess.IV cs.CV

    A multi-stage semi-supervised learning for ankle fracture classification on CT images

    Authors: Hongzhi Liu, Guicheng Li, Jiacheng Nie, Hui Tang, Chunfeng Yang, Qianjin Feng, Hailin Xu, Yang Chen

    Abstract: Because of the complicated mechanism of ankle injury, it is very difficult to diagnose ankle fracture in clinic. In order to simplify the process of fracture diagnosis, an automatic diagnosis model of ankle fracture was proposed. Firstly, a tibia-fibula segmentation network is proposed for the joint tibiofibular region of the ankle joint, and the corresponding segmentation dataset is established o…

    Submitted 29 March, 2024; originally announced March 2024.

  24. arXiv:2403.15285  [pdf, other]

    cs.NI cs.CR cs.HC cs.LG

    Blockchain-based Pseudonym Management for Vehicle Twin Migrations in Vehicular Edge Metaverse

    Authors: Jiawen Kang, Xiaofeng Luo, Jiangtian Nie, Tianhao Wu, Haibo Zhou, Yonghua Wang, Dusit Niyato, Shiwen Mao, Shengli Xie

    Abstract: Driven by the great advances in metaverse and edge computing technologies, vehicular edge metaverses are expected to disrupt the current paradigm of intelligent transportation systems. As highly computerized avatars of Vehicular Metaverse Users (VMUs), the Vehicle Twins (VTs) deployed in edge servers can provide valuable metaverse services to improve driving safety and on-board satisfaction for th…

    Submitted 22 March, 2024; originally announced March 2024.

    Comments: 14 pages, 9 figures

  25. arXiv:2403.12690  [pdf, other]

    cs.LG

    LNPT: Label-free Network Pruning and Training

    Authors: Jinying Xiao, Ping Li, Zhe Tang, Jie Nie

    Abstract: Pruning before training enables the deployment of neural networks on smart devices. By retaining weights conducive to generalization, pruned networks can be accommodated on resource-constrained smart devices. It is commonly held that the distance on weight norms between the initialized and the fully-trained networks correlates with generalization performance. However, as we have uncovered, inconsi…

    Submitted 20 March, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

    Comments: 8 pages,7 figures

  26. arXiv:2403.12688  [pdf, other]

    cs.LG

    SEVEN: Pruning Transformer Model by Reserving Sentinels

    Authors: Jinying Xiao, Ping Li, Jie Nie, Zhe Tang

    Abstract: Large-scale Transformer models (TM) have demonstrated outstanding performance across various tasks. However, their considerable parameter size restricts their applicability, particularly on mobile devices. Due to the dynamic and intricate nature of gradients on TM compared to Convolutional Neural Networks, commonly used pruning methods tend to retain weights with larger gradient noise. This result…

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: 9 pages,6 figures

  27. arXiv:2403.11335  [pdf, other]

    cs.IR cs.CL

    ConvSDG: Session Data Generation for Conversational Search

    Authors: Fengran Mo, Bole Yi, Kelong Mao, Chen Qu, Kaiyu Huang, Jian-Yun Nie

    Abstract: Conversational search provides a more convenient interface for users to search by allowing multi-turn interaction with the search engine. However, the effectiveness of the conversational dense retrieval methods is limited by the scarcity of training data required for their fine-tuning. Thus, generating more training conversational sessions with relevant labels could potentially improve search perf…

    Submitted 17 March, 2024; originally announced March 2024.

    Comments: Accepted by WWW 2024 Workshop

  28. arXiv:2403.10779  [pdf, other]

    cs.CL

    LLM-based Conversational AI Therapist for Daily Functioning Screening and Psychotherapeutic Intervention via Everyday Smart Devices

    Authors: Jingping Nie, Hanya Shao, Yuang Fan, Qijia Shao, Haoxuan You, Matthias Preindl, Xiaofan Jiang

    Abstract: Despite the global mental health crisis, access to screenings, professionals, and treatments remains high. In collaboration with licensed psychotherapists, we propose a Conversational AI Therapist with psychotherapeutic Interventions (CaiTI), a platform that leverages large language models (LLM)s and smart devices to enable better mental health self-care. CaiTI can screen the day-to-day functionin…

    Submitted 15 March, 2024; originally announced March 2024.

  29. arXiv:2403.04066  [pdf, ps, other]

    cs.CV

    LoDisc: Learning Global-Local Discriminative Features for Self-Supervised Fine-Grained Visual Recognition

    Authors: Jialu Shi, Zhiqiang Wei, Jie Nie, Lei Huang

    Abstract: Self-supervised contrastive learning strategy has attracted remarkable attention due to its exceptional ability in representation learning. However, current contrastive learning tends to learn global coarse-grained representations of the image that benefit generic object recognition, whereas such coarse-grained features are insufficient for fine-grained visual recognition. In this paper, we presen…

    Submitted 6 March, 2024; originally announced March 2024.

    Comments: 11 pages, submitted

    MSC Class: 68U10 ACM Class: I.4

  30. arXiv:2403.02545  [pdf, other]

    cs.LG cs.AI

    Wukong: Towards a Scaling Law for Large-Scale Recommendation

    Authors: Buyun Zhang, Liang Luo, Yuxin Chen, Jade Nie, Xi Liu, Daifeng Guo, Yanli Zhao, Shen Li, Yuchen Hao, Yantao Yao, Guna Lakshminarayanan, Ellie Dingqiao Wen, Jongsoo Park, Maxim Naumov, Wenlin Chen

    Abstract: Scaling laws play an instrumental role in the sustainable improvement in model quality. Unfortunately, recommendation models to date do not exhibit such laws similar to those observed in the domain of large language models, due to the inefficiencies of their upscaling mechanisms. This limitation poses significant challenges in adapting these models to increasingly more complex real-world datasets.…

    Submitted 4 June, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

    Comments: 12 pages

  31. arXiv:2402.16608  [pdf, other]

    cs.CL cs.IR

    PAQA: Toward ProActive Open-Retrieval Question Answering

    Authors: Pierre Erbacher, Jian-Yun Nie, Philippe Preux, Laure Soulier

    Abstract: Conversational systems have made significant progress in generating natural language responses. However, their potential as conversational search systems is currently limited due to their passive role in the information-seeking process. One major limitation is the scarcity of datasets that provide labelled ambiguous questions along with a supporting corpus of documents and relevant clarifying ques…

    Submitted 26 February, 2024; originally announced February 2024.

  32. arXiv:2402.11626  [pdf, other]

    cs.CL cs.IR

    Metacognitive Retrieval-Augmented Large Language Models

    Authors: Yujia Zhou, Zheng Liu, Jiajie Jin, Jian-Yun Nie, Zhicheng Dou

    Abstract: Retrieval-augmented generation have become central in natural language processing due to their efficacy in generating factual content. While traditional methods employ single-time retrieval, more recent approaches have shifted towards multi-time retrieval for multi-hop reasoning tasks. However, these strategies are bound by predefined reasoning steps, potentially leading to inaccuracies in respons…

    Submitted 18 February, 2024; originally announced February 2024.

    Comments: Accepted by WWW 2024

  33. arXiv:2401.16659  [pdf, other]

    cs.IR cs.CL

    History-Aware Conversational Dense Retrieval

    Authors: Fengran Mo, Chen Qu, Kelong Mao, Tianyu Zhu, Zhan Su, Kaiyu Huang, Jian-Yun Nie

    Abstract: Conversational search facilitates complex information retrieval by enabling multi-turn interactions between users and the system. Supporting such interactions requires a comprehensive understanding of the conversational inputs to formulate a good search query based on historical information. In particular, the search query should include the relevant information from the previous conversation turn…

    Submitted 28 May, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

    Comments: Accepted to Findings of ACL 2024

  34. arXiv:2401.11471  [pdf, other]

    cs.DC cs.AI

    LR-CNN: Lightweight Row-centric Convolutional Neural Network Training for Memory Reduction

    Authors: Zhigang Wang, Hangyu Yang, Ning Wang, Chuanfei Xu, Jie Nie, Zhiqiang Wei, Yu Gu, Ge Yu

    Abstract: In the last decade, Convolutional Neural Network with a multi-layer architecture has advanced rapidly. However, training its complex network is very space-consuming, since a lot of intermediate data are preserved across layers, especially when processing high-dimension inputs with a big batch size. That poses great challenges to the limited memory capacity of current accelerators (e.g., GPUs). Exi…

    Submitted 21 January, 2024; originally announced January 2024.

  35. arXiv:2401.11469  [pdf, other]

    cs.DC

    Accelerating Heterogeneous Tensor Parallelism via Flexible Workload Control

    Authors: Zhigang Wang, Xu Zhang, Ning Wang, Chuanfei Xu, Jie Nie, Zhiqiang Wei, Yu Gu, Ge Yu

    Abstract: Transformer-based models are becoming deeper and larger recently. For better scalability, an underlying training solution in industry is to split billions of parameters (tensors) into many tasks and then run them across homogeneous accelerators (e.g., GPUs). However, such dedicated compute cluster is prohibitively expensive in academia and moderate companies. An economic replacement is to aggregat…

    Submitted 21 January, 2024; originally announced January 2024.

    Comments: 13 pages

  36. arXiv:2401.11204  [pdf, other]

    cs.CV

    Towards Category Unification of 3D Single Object Tracking on Point Clouds

    Authors: Jiahao Nie, Zhiwei He, Xudong Lv, Xueyi Zhou, Dong-Kyu Chae, Fei Xie

    Abstract: Category-specific models are provenly valuable methods in 3D single object tracking (SOT) regardless of Siamese or motion-centric paradigms. However, such over-specialized model designs incur redundant parameters, thus limiting the broader applicability of 3D SOT task. This paper first introduces unified models that can simultaneously track objects across all categories using a single network with…

    Submitted 20 January, 2024; originally announced January 2024.

    Comments: Accepted by ICLR2024 (poster)

  37. arXiv:2401.09680  [pdf, ps, other]

    cs.AI cs.GT

    Tiny Multi-Agent DRL for Twins Migration in UAV Metaverses: A Multi-Leader Multi-Follower Stackelberg Game Approach

    Authors: Jiawen Kang, Yue Zhong, Minrui Xu, Jiangtian Nie, Jinbo Wen, Hongyang Du, Dongdong Ye, Xumin Huang, Dusit Niyato, Shengli Xie

    Abstract: The synergy between Unmanned Aerial Vehicles (UAVs) and metaverses is giving rise to an emerging paradigm named UAV metaverses, which create a unified ecosystem that blends physical and virtual spaces, transforming drone interaction and virtual exploration. UAV Twins (UTs), as the digital twins of UAVs that revolutionize UAV applications by making them more immersive, realistic, and informative, a…

    Submitted 8 April, 2024; v1 submitted 17 January, 2024; originally announced January 2024.

  38. arXiv:2401.08407  [pdf, other]

    cs.CV

    Cross-Domain Few-Shot Segmentation via Iterative Support-Query Correspondence Mining

    Authors: Jiahao Nie, Yun Xing, Gongjie Zhang, Pei Yan, Aoran Xiao, Yap-Peng Tan, Alex C. Kot, Shijian Lu

    Abstract: Cross-Domain Few-Shot Segmentation (CD-FSS) poses the challenge of segmenting novel categories from a distinct domain using only limited exemplars. In this paper, we undertake a comprehensive study of CD-FSS and uncover two crucial insights: (i) the necessity of a fine-tuning stage to effectively transfer the learned meta-knowledge across domains, and (ii) the overfitting risk during the naïve fin…

    Submitted 13 March, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

    Comments: Accepted by CVPR 2024

  39. arXiv:2401.08329  [pdf, other]

    cs.HC

    Understanding User Experience in Large Language Model Interactions

    Authors: Jiayin Wang, Weizhi Ma, Peijie Sun, Min Zhang, Jian-Yun Nie

    Abstract: In the rapidly evolving landscape of large language models (LLMs), most research has primarily viewed them as independent individuals, focusing on assessing their capabilities through standardized benchmarks and enhancing their general intelligence. This perspective, however, tends to overlook the vital role of LLMs as user-centric services in human-AI collaboration. This gap in research becomes i…

    Submitted 16 January, 2024; originally announced January 2024.

    Comments: 15 pages + 3 page references + 2 page Appendix

  40. arXiv:2401.06311  [pdf, other]

    cs.IR

    Exploring the Best Practices of Query Expansion with Large Language Models

    Authors: Le Zhang, Yihong Wu, Qian Yang, Jian-Yun Nie

    Abstract: Large Language Models (LLMs) are foundational in language technologies, particularly in information retrieval (IR). Previous studies have utilized LLMs for query expansion, achieving notable improvements in IR. In this paper, we thoroughly explore the best practice of leveraging LLMs for query expansion. To this end, we introduce a training-free, straightforward yet effective framework called Mult…

    Submitted 29 June, 2024; v1 submitted 11 January, 2024; originally announced January 2024.

  41. arXiv:2401.03205  [pdf, other]

    cs.CL

    The Dawn After the Dark: An Empirical Study on Factuality Hallucination in Large Language Models

    Authors: Junyi Li, Jie Chen, Ruiyang Ren, Xiaoxue Cheng, Wayne Xin Zhao, Jian-Yun Nie, Ji-Rong Wen

    Abstract: In the era of large language models (LLMs), hallucination (i.e., the tendency to generate factually incorrect content) poses great challenge to trustworthy and reliable deployment of LLMs in real-world applications. To tackle the LLM hallucination, three key questions should be well studied: how to detect hallucinations (detection), why do LLMs hallucinate (source), and what can be done to mitigat…

    Submitted 6 January, 2024; originally announced January 2024.

    Comments: 24 pages, 8 figures, 13 tables

  42. arXiv:2312.16180  [pdf, other]

    cs.SD cs.AI cs.CL cs.LG

    Investigating salient representations and label Variance in Dimensional Speech Emotion Analysis

    Authors: Vikramjit Mitra, Jingping Nie, Erdrin Azemi

    Abstract: Representations derived from models such as BERT (Bidirectional Encoder Representations from Transformers) and HuBERT (Hidden units BERT), have helped to achieve state-of-the-art performance in dimensional speech emotion recognition. Despite their large dimensionality, and even though these representations are not tailored for emotion recognition tasks, they are frequently used to train large spee…

    Submitted 16 December, 2023; originally announced December 2023.

    Comments: 5 pages

    Journal ref: ICASSP 2024

  43. arXiv:2312.12063  [pdf, other]

    cs.NI cs.AI cs.GT

    Resource-efficient Generative Mobile Edge Networks in 6G Era: Fundamentals, Framework and Case Study

    Authors: Bingkun Lai, Jinbo Wen, Jiawen Kang, Hongyang Du, Jiangtian Nie, Changyan Yi, Dong In Kim, Shengli Xie

    Abstract: As the next-generation wireless communication system, Sixth-Generation (6G) technologies are emerging, enabling various mobile edge networks that can revolutionize wireless communication and connectivity. By integrating Generative Artificial Intelligence (GAI) with mobile edge networks, generative mobile edge networks possess immense potential to enhance the intelligence and efficiency of wireless…

    Submitted 19 December, 2023; originally announced December 2023.

  44. arXiv:2311.17335  [pdf, other]

    cs.CV cs.MM

    eMotions: A Large-Scale Dataset for Emotion Recognition in Short Videos

    Authors: Xuecheng Wu, Heli Sun, Junxiao Xue, Ruofan Zhai, Xiangyan Kong, Jiayu Nie, Liang He

    Abstract: Nowadays, short videos (SVs) are essential to information acquisition and sharing in our life. The prevailing use of SVs to spread emotions leads to the necessity of emotion recognition in SVs. Considering the lack of SVs emotion data, we introduce a large-scale dataset named eMotions, comprising 27,996 videos. Meanwhile, we alleviate the impact of subjectivities on labeling quality by emphasizing…

    Submitted 28 November, 2023; originally announced November 2023.

  45. arXiv:2311.06119  [pdf, other]

    cs.IR

    Augmenting Ad-Hoc IR Dataset for Interactive Conversational Search

    Authors: Pierre Erbacher, Jian-Yun Nie, Philippe Preux, Laure Soulier

    Abstract: A peculiarity of conversational search systems is that they involve mixed-initiatives such as system-generated query clarifying questions. Evaluating those systems at a large scale on the end task of IR is very challenging, requiring adequate datasets containing such interactions. However, current datasets only focus on either traditional ad-hoc IR tasks or query clarification tasks, the latter be…

    Submitted 10 November, 2023; originally announced November 2023.

  46. arXiv:2311.05374  [pdf, other]

    cs.CL cs.AI

    TencentLLMEval: A Hierarchical Evaluation of Real-World Capabilities for Human-Aligned LLMs

    Authors: Shuyi Xie, Wenlin Yao, Yong Dai, Shaobo Wang, Donlin Zhou, Lifeng Jin, Xinhua Feng, Pengzhi Wei, Yujie Lin, Zhichao Hu, Dong Yu, Zhengyou Zhang, Jing Nie, Yuhong Liu

    Abstract: Large language models (LLMs) have shown impressive capabilities across various natural language tasks. However, evaluating their alignment with human preferences remains a challenge. To this end, we propose a comprehensive human evaluation framework to assess LLMs' proficiency in following instructions on diverse real-world tasks. We construct a hierarchical task tree encompassing 7 major areas co…

    Submitted 9 November, 2023; originally announced November 2023.

  47. Collaboration and Transition: Distilling Item Transitions into Multi-Query Self-Attention for Sequential Recommendation

    Authors: Tianyu Zhu, Yansong Shi, Yuan Zhang, Yihong Wu, Fengran Mo, Jian-Yun Nie

    Abstract: Modern recommender systems employ various sequential modules such as self-attention to learn dynamic user interests. However, these methods are less effective in capturing collaborative and transitional signals within user interaction sequences. First, the self-attention architecture uses the embedding of a single item as the attention query, making it challenging to capture collaborative signals.…

    Submitted 25 December, 2023; v1 submitted 2 November, 2023; originally announced November 2023.

    Comments: WSDM 2024 Oral Presentation

  48. arXiv:2310.18382  [pdf, other]

    cs.LG cs.GT cs.NI

    From Generative AI to Generative Internet of Things: Fundamentals, Framework, and Outlooks

    Authors: Jinbo Wen, Jiangtian Nie, Jiawen Kang, Dusit Niyato, Hongyang Du, Yang Zhang, Mohsen Guizani

    Abstract: Generative Artificial Intelligence (GAI) possesses the capabilities of generating realistic data and facilitating advanced decision-making. By integrating GAI into modern Internet of Things (IoT), Generative Internet of Things (GIoT) is emerging and holds immense potential to revolutionize various aspects of society, enabling more efficient and intelligent IoT applications, such as smart surveilla…

    Submitted 23 January, 2024; v1 submitted 26 October, 2023; originally announced October 2023.

  49. arXiv:2310.13265  [pdf, other]

    cs.CL

    MoqaGPT : Zero-Shot Multi-modal Open-domain Question Answering with Large Language Model

    Authors: Le Zhang, Yihong Wu, Fengran Mo, Jian-Yun Nie, Aishwarya Agrawal

    Abstract: Multi-modal open-domain question answering typically requires evidence retrieval from databases across diverse modalities, such as images, tables, passages, etc. Even Large Language Models (LLMs) like GPT-4 fall short in this task. To enable LLMs to tackle the task in a zero-shot manner, we introduce MoqaGPT, a straightforward and flexible framework. Using a divide-and-conquer strategy that bypass…

    Submitted 20 October, 2023; originally announced October 2023.

    Comments: Accepted into EMNLP2023 Findings

  50. arXiv:2310.12168  [pdf, other]

    cs.LG cs.AI cs.CV

    RK-core: An Established Methodology for Exploring the Hierarchical Structure within Datasets

    Authors: Yao Lu, Yutian Huang, Jiaqi Nie, Zuohui Chen, Qi Xuan

    Abstract: Recently, the field of machine learning has undergone a transition from model-centric to data-centric. The advancements in diverse learning tasks have been propelled by the accumulation of more extensive datasets, subsequently facilitating the training of larger models on these datasets. However, these datasets remain relatively under-explored. To this end, we introduce a pioneering approach known…

    Submitted 10 October, 2023; originally announced October 2023.