
Showing 1–50 of 58 results for author: Shu, L

Searching in archive cs.
  1. arXiv:2406.06592  [pdf, other]

    cs.CL cs.LG

    Improve Mathematical Reasoning in Language Models by Automated Process Supervision

    Authors: Liangchen Luo, Yinxiao Liu, Rosanne Liu, Samrat Phatale, Harsh Lara, Yunxuan Li, Lei Shu, Yun Zhu, Lei Meng, Jiao Sun, Abhinav Rastogi

    Abstract: Complex multi-step reasoning tasks, such as solving mathematical problems or generating code, remain a significant hurdle for even the most advanced large language models (LLMs). Verifying LLM outputs with an Outcome Reward Model (ORM) is a standard inference-time technique aimed at enhancing the reasoning performance of LLMs. However, this still proves insufficient for reasoning tasks with a leng… ▽ More

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: 18 pages, 5 figures, 1 table

  2. arXiv:2405.16178  [pdf, other]

    cs.CL

    Accelerating Inference of Retrieval-Augmented Generation via Sparse Context Selection

    Authors: Yun Zhu, Jia-Chen Gu, Caitlin Sikora, Ho Ko, Yinxiao Liu, Chu-Cheng Lin, Lei Shu, Liangchen Luo, Lei Meng, Bang Liu, Jindong Chen

    Abstract: Large language models (LLMs) augmented with retrieval exhibit robust performance and extensive versatility by incorporating external contexts. However, the input length grows linearly in the number of retrieved documents, causing a dramatic increase in latency. In this paper, we propose a novel paradigm named Sparse RAG, which seeks to cut computation costs through sparsity. Specifically, Sparse R… ▽ More

    Submitted 25 May, 2024; originally announced May 2024.

  3. arXiv:2405.07429  [pdf, other]

    cs.RO

    JointLoc: A Real-time Visual Localization Framework for Planetary UAVs Based on Joint Relative and Absolute Pose Estimation

    Authors: Xubo Luo, Xue Wan, Yixing Gao, Yaolin Tian, Wei Zhang, Leizheng Shu

    Abstract: Visual localization of unmanned aerial vehicles (UAVs) in planetary scenes aims to estimate the absolute pose of the UAV in the world coordinate system through satellite maps and images captured by on-board cameras. However, since planetary scenes often lack significant landmarks and there are modal differences between satellite maps and UAV images, the accuracy and real-time performance of UAV positioning… ▽ More

    Submitted 12 May, 2024; originally announced May 2024.

    Comments: 8 pages

  4. arXiv:2403.09030  [pdf]

    cs.SD cs.LG eess.AS

    An AI-Driven Approach to Wind Turbine Bearing Fault Diagnosis from Acoustic Signals

    Authors: Zhao Wang, Xiaomeng Li, Na Li, Longlong Shu

    Abstract: This study aimed to develop a deep learning model for the classification of bearing faults in wind turbine generators from acoustic signals. A convolutional LSTM model was successfully constructed and trained by using audio data from five predefined fault types for both training and validation. To create the dataset, raw audio signal data was collected and processed in frames to capture time and f… ▽ More

    Submitted 13 March, 2024; originally announced March 2024.

  5. arXiv:2401.07382  [pdf, other]

    cs.CL cs.AI

    Beyond Sparse Rewards: Enhancing Reinforcement Learning with Language Model Critique in Text Generation

    Authors: Meng Cao, Lei Shu, Lei Yu, Yun Zhu, Nevan Wichers, Yinxiao Liu, Lei Meng

    Abstract: Reinforcement learning (RL) can align language models with non-differentiable reward signals, such as human preferences. However, a major challenge arises from the sparsity of these reward signals - typically, there is only a single reward for an entire output. This sparsity of rewards can lead to inefficient and unstable learning. To address this challenge, our paper introduces a novel framework… ▽ More

    Submitted 19 February, 2024; v1 submitted 14 January, 2024; originally announced January 2024.

  6. arXiv:2311.16344  [pdf, other]

    cs.CV cs.GR

    Spatially Adaptive Cloth Regression with Implicit Neural Representations

    Authors: Lei Shu, Vinicius Azevedo, Barbara Solenthaler, Markus Gross

    Abstract: The accurate representation of fine-detailed cloth wrinkles poses significant challenges in computer graphics. The inherently non-uniform structure of cloth wrinkles mandates the employment of intricate discretization strategies, which are frequently characterized by high computational demands and complex methodologies. Addressing this, the research introduced in this paper elucidates a novel anis… ▽ More

    Submitted 27 November, 2023; originally announced November 2023.

    Comments: 16 pages, 13 figures

    MSC Class: 68T07 ACM Class: I.3.0

  7. arXiv:2311.09204  [pdf, other]

    cs.CL cs.AI

    Fusion-Eval: Integrating Assistant Evaluators with LLMs

    Authors: Lei Shu, Nevan Wichers, Liangchen Luo, Yun Zhu, Yinxiao Liu, Jindong Chen, Lei Meng

    Abstract: Evaluating natural language systems poses significant challenges, particularly in the realms of natural language understanding and high-level reasoning. In this paper, we introduce 'Fusion-Eval', an innovative approach that leverages Large Language Models (LLMs) to integrate insights from various assistant evaluators. The LLM is given the example to evaluate along with scores from the assistant ev… ▽ More

    Submitted 6 June, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

  8. arXiv:2311.09179  [pdf, other]

    cs.CL

    SiRA: Sparse Mixture of Low Rank Adaptation

    Authors: Yun Zhu, Nevan Wichers, Chu-Cheng Lin, Xinyi Wang, Tianlong Chen, Lei Shu, Han Lu, Canoee Liu, Liangchen Luo, Jindong Chen, Lei Meng

    Abstract: Parameter Efficient Tuning has been a prominent approach to adapting Large Language Models to downstream tasks. Most previous work considers adding dense trainable parameters, where all parameters are used to adapt to a given task. We found this less effective empirically: using LoRA as an example, introducing more trainable parameters does not help. Motivated by this, we investigate the imp… ▽ More

    Submitted 15 November, 2023; originally announced November 2023.

  9. arXiv:2310.04815  [pdf, other]

    cs.LG

    Critique Ability of Large Language Models

    Authors: Liangchen Luo, Zi Lin, Yinxiao Liu, Lei Shu, Yun Zhu, Jingbo Shang, Lei Meng

    Abstract: Critical thinking is essential for rational decision-making and problem-solving. This skill hinges on the ability to provide precise and reasoned critiques and is a hallmark of human intelligence. In the era of large language models (LLMs), this study explores the ability of LLMs to deliver accurate critiques across various tasks. We are interested in this topic as a capable critic model could not… ▽ More

    Submitted 7 October, 2023; originally announced October 2023.

  10. arXiv:2308.11807  [pdf, other]

    cs.CL

    Towards an On-device Agent for Text Rewriting

    Authors: Yun Zhu, Yinxiao Liu, Felix Stahlberg, Shankar Kumar, Yu-hui Chen, Liangchen Luo, Lei Shu, Renjie Liu, Jindong Chen, Lei Meng

    Abstract: Large Language Models (LLMs) have demonstrated impressive capabilities for text rewriting. Nonetheless, the large sizes of these models make them impractical for on-device inference, which would otherwise allow for enhanced privacy and economical inference. Creating a smaller yet potent language model for text rewriting presents a formidable challenge because it requires balancing the need for a s… ▽ More

    Submitted 22 August, 2023; originally announced August 2023.

  11. arXiv:2305.15685  [pdf, other]

    cs.CL cs.AI

    RewriteLM: An Instruction-Tuned Large Language Model for Text Rewriting

    Authors: Lei Shu, Liangchen Luo, Jayakumar Hoskere, Yun Zhu, Yinxiao Liu, Simon Tong, Jindong Chen, Lei Meng

    Abstract: Large Language Models (LLMs) have demonstrated impressive capabilities in creative tasks such as storytelling and E-mail generation. However, as LLMs are primarily trained on final text results rather than intermediate revisions, it might be challenging for them to perform text rewriting tasks. Most studies in the rewriting tasks focus on a particular transformation type within the boundaries of s… ▽ More

    Submitted 19 December, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

    Journal ref: AAAI 2024

  12. arXiv:2304.11658  [pdf, other]

    cs.LG

    Capturing Fine-grained Semantics in Contrastive Graph Representation Learning

    Authors: Lin Shu, Chuan Chen, Zibin Zheng

    Abstract: Graph contrastive learning defines a contrastive task to pull similar instances close and push dissimilar instances away. It learns discriminative node embeddings without supervised labels, which has attracted increasing attention in the past few years. Nevertheless, existing methods of graph contrastive learning ignore the differences between diverse semantics existing in graphs, which learn coarse-… ▽ More

    Submitted 23 April, 2023; originally announced April 2023.

  13. arXiv:2301.08986  [pdf, other]

    cs.CL cs.AI cs.LG cs.NE

    Adapting a Language Model While Preserving its General Knowledge

    Authors: Zixuan Ke, Yijia Shao, Haowei Lin, Hu Xu, Lei Shu, Bing Liu

    Abstract: Domain-adaptive pre-training (or DA-training for short), also known as post-training, aims to train a pre-trained general-purpose language model (LM) using an unlabeled corpus of a particular domain to adapt the LM so that end-tasks in the domain can give improved performances. However, existing DA-training methods are in some sense blind as they do not explicitly identify what knowledge in the LM… ▽ More

    Submitted 21 January, 2023; originally announced January 2023.

    Comments: EMNLP 2022

  14. arXiv:2210.05549  [pdf, other]

    cs.CL cs.AI cs.LG cs.NE

    Continual Training of Language Models for Few-Shot Learning

    Authors: Zixuan Ke, Haowei Lin, Yijia Shao, Hu Xu, Lei Shu, Bing Liu

    Abstract: Recent work on applying large language models (LMs) achieves impressive performance in many NLP applications. Adapting or post-training an LM using an unlabeled domain corpus can produce even better performance for end-tasks in the domain. This paper proposes the problem of continually extending an LM by incrementally post-training the LM with a sequence of unlabeled domain corpora to expand its knowl… ▽ More

    Submitted 11 October, 2022; originally announced October 2022.

    Journal ref: EMNLP 2022

  15. arXiv:2208.13685  [pdf, other]

    cs.LG cs.CR

    FedEgo: Privacy-preserving Personalized Federated Graph Learning with Ego-graphs

    Authors: Taolin Zhang, Chuan Chen, Yaomin Chang, Lin Shu, Zibin Zheng

    Abstract: As special information carriers containing both structure and feature information, graphs are widely used in graph mining, e.g., Graph Neural Networks (GNNs). However, in some practical scenarios, graph data are stored separately in multiple distributed parties, which may not be directly shared due to conflicts of interest. Hence, federated graph neural networks are proposed to address such data s… ▽ More

    Submitted 9 September, 2022; v1 submitted 29 August, 2022; originally announced August 2022.

    Comments: 25 pages, submitted to ACM Transactions on Knowledge Discovery from Data (TKDD)

  16. arXiv:2203.13238  [pdf, other]

    cs.CV cs.AI

    Open-set Recognition via Augmentation-based Similarity Learning

    Authors: Sepideh Esmaeilpour, Lei Shu, Bing Liu

    Abstract: The primary assumption of conventional supervised learning or classification is that the test samples are drawn from the same distribution as the training samples, which is called closed set learning or classification. In many practical scenarios, this is not the case because there are unknowns or unseen class samples in the test data, which is called the open set scenario, and the unknowns need t… ▽ More

    Submitted 21 August, 2022; v1 submitted 24 March, 2022; originally announced March 2022.

  17. arXiv:2202.02976  [pdf, other]

    cs.CL cs.AI cs.LG

    Measuring and Reducing Model Update Regression in Structured Prediction for NLP

    Authors: Deng Cai, Elman Mansimov, Yi-An Lai, Yixuan Su, Lei Shu, Yi Zhang

    Abstract: Recent advances in deep learning have led to the rapid adoption of machine learning-based NLP models in a wide range of applications. Despite the continuous gain in accuracy, backward compatibility is also an important aspect for industrial applications, yet it has received little research attention. Backward compatibility requires that the new model does not regress on cases that were correctly handled… ▽ More

    Submitted 8 October, 2022; v1 submitted 7 February, 2022; originally announced February 2022.

    Comments: NeurIPS2022

  18. arXiv:2202.01924  [pdf, other]

    cs.CL cs.AI

    Zero-Shot Aspect-Based Sentiment Analysis

    Authors: Lei Shu, Hu Xu, Bing Liu, Jiahua Chen

    Abstract: Aspect-based sentiment analysis (ABSA) typically requires in-domain annotated data for supervised training/fine-tuning. It is a big challenge to scale ABSA to a large number of new domains. This paper aims to train a unified model that can perform zero-shot ABSA without using any annotated data for a new domain. We propose a method called contrastive post-training on review Natural Language Infere… ▽ More

    Submitted 14 February, 2022; v1 submitted 3 February, 2022; originally announced February 2022.

  19. arXiv:2112.10021  [pdf, other]

    cs.CL cs.AI cs.LG cs.NE

    Continual Learning with Knowledge Transfer for Sentiment Classification

    Authors: Zixuan Ke, Bing Liu, Hao Wang, Lei Shu

    Abstract: This paper studies continual learning (CL) for sentiment classification (SC). In this setting, the CL system learns a sequence of SC tasks incrementally in a neural network, where each task builds a classifier to classify the sentiment of reviews of a particular product category or domain. Two natural questions are: Can the system transfer the knowledge learned in the past from the previous tasks… ▽ More

    Submitted 18 December, 2021; originally announced December 2021.

    Journal ref: ECML-PKDD 2020

  20. arXiv:2112.02714  [pdf, other]

    cs.CL cs.AI cs.LG cs.NE

    CLASSIC: Continual and Contrastive Learning of Aspect Sentiment Classification Tasks

    Authors: Zixuan Ke, Bing Liu, Hu Xu, Lei Shu

    Abstract: This paper studies continual learning (CL) of a sequence of aspect sentiment classification (ASC) tasks in a particular CL setting called domain incremental learning (DIL). Each task is from a different domain or product. The DIL setting is particularly suited to ASC because in testing the system need not know the task/domain to which the test data belongs. To our knowledge, this setting has not b… ▽ More

    Submitted 5 December, 2021; originally announced December 2021.

    Journal ref: EMNLP 2021

  21. arXiv:2112.02706  [pdf, other]

    cs.CL cs.AI cs.LG cs.NE

    Achieving Forgetting Prevention and Knowledge Transfer in Continual Learning

    Authors: Zixuan Ke, Bing Liu, Nianzu Ma, Hu Xu, Lei Shu

    Abstract: Continual learning (CL) learns a sequence of tasks incrementally with the goal of achieving two main objectives: overcoming catastrophic forgetting (CF) and encouraging knowledge transfer (KT) across tasks. However, most existing techniques focus only on overcoming CF and have no mechanism to encourage KT, and thus do not do well in KT. Although several papers have tried to deal with both CF and K… ▽ More

    Submitted 5 December, 2021; originally announced December 2021.

    Journal ref: NeurIPS 2021

  22. arXiv:2111.04198  [pdf, other]

    cs.CL

    TaCL: Improving BERT Pre-training with Token-aware Contrastive Learning

    Authors: Yixuan Su, Fangyu Liu, Zaiqiao Meng, Tian Lan, Lei Shu, Ehsan Shareghi, Nigel Collier

    Abstract: Masked language models (MLMs) such as BERT and RoBERTa have revolutionized the field of Natural Language Understanding in the past few years. However, existing pre-trained MLMs often output an anisotropic distribution of token representations that occupies a narrow subset of the entire representation space. Such token representations are not ideal, especially for tasks that demand discriminative s… ▽ More

    Submitted 28 April, 2022; v1 submitted 7 November, 2021; originally announced November 2021.

    Comments: Camera-ready for NAACL 2022

  23. arXiv:2111.02724  [pdf]

    cs.CV cs.AI

    Tea Chrysanthemum Detection under Unstructured Environments Using the TC-YOLO Model

    Authors: Chao Qi, Junfeng Gao, Simon Pearson, Helen Harman, Kunjie Chen, Lei Shu

    Abstract: Tea chrysanthemum detection at its flowering stage is one of the key components for selective chrysanthemum harvesting robot development. However, it is a challenge to detect flowering chrysanthemums under unstructured field environments given the variations on illumination, occlusion and object scale. In this context, we propose a highly fused and lightweight deep learning architecture based on Y… ▽ More

    Submitted 4 November, 2021; originally announced November 2021.

  24. arXiv:2109.14739  [pdf, other]

    cs.CL

    Multi-Task Pre-Training for Plug-and-Play Task-Oriented Dialogue System

    Authors: Yixuan Su, Lei Shu, Elman Mansimov, Arshit Gupta, Deng Cai, Yi-An Lai, Yi Zhang

    Abstract: Pre-trained language models have been recently shown to benefit task-oriented dialogue (TOD) systems. Despite their success, existing methods often formulate this task as a cascaded generation problem which can lead to error accumulation across different sub-tasks and greater data annotation overhead. In this study, we present PPTOD, a unified plug-and-play model for task-oriented dialogue. In add… ▽ More

    Submitted 1 March, 2022; v1 submitted 29 September, 2021; originally announced September 2021.

    Comments: Camera-ready for ACL2022 main conference

  25. Zero-Shot Out-of-Distribution Detection Based on the Pre-trained Model CLIP

    Authors: Sepideh Esmaeilpour, Bing Liu, Eric Robertson, Lei Shu

    Abstract: In an out-of-distribution (OOD) detection problem, samples of known classes (also called in-distribution classes) are used to train a special classifier. In testing, the classifier can (1) classify the test samples of known classes to their respective classes and also (2) detect samples that do not belong to any of the known classes (i.e., they belong to some unknown or OOD classes). This paper stu… ▽ More

    Submitted 22 March, 2022; v1 submitted 6 September, 2021; originally announced September 2021.

  26. arXiv:2011.00169  [pdf, other]

    cs.CL

    Understanding Pre-trained BERT for Aspect-based Sentiment Analysis

    Authors: Hu Xu, Lei Shu, Philip S. Yu, Bing Liu

    Abstract: This paper analyzes the pre-trained hidden representations learned from reviews on BERT for tasks in aspect-based sentiment analysis (ABSA). Our work is motivated by the recent progress in BERT-based language models for ABSA. However, it is not clear how the general proxy task of (masked) language model trained on unlabeled corpus without annotations of aspects or opinions can provide important fe… ▽ More

    Submitted 30 October, 2020; originally announced November 2020.

    Comments: COLING 2020

  27. arXiv:2009.12046  [pdf, other]

    cs.CL

    Controllable Text Generation with Focused Variation

    Authors: Lei Shu, Alexandros Papangelis, Yi-Chia Wang, Gokhan Tur, Hu Xu, Zhaleh Feizollahi, Bing Liu, Piero Molino

    Abstract: This work introduces Focused-Variation Network (FVN), a novel model to control language generation. The main problems in previous controlled language generation models range from the difficulty of generating text according to the given attributes, to the lack of diversity of the generated texts. FVN addresses these issues by learning disjoint discrete latent spaces for each attribute inside codebo… ▽ More

    Submitted 25 September, 2020; originally announced September 2020.

  28. arXiv:2004.13816  [pdf, other]

    cs.CL

    DomBERT: Domain-oriented Language Model for Aspect-based Sentiment Analysis

    Authors: Hu Xu, Bing Liu, Lei Shu, Philip S. Yu

    Abstract: This paper focuses on learning domain-oriented language models driven by end tasks, which aims to combine the worlds of both general-purpose language models (such as ELMo and BERT) and domain-specific language understanding. We propose DomBERT, an extension of BERT to learn from both in-domain corpus and relevant domain corpora. This helps in learning domain language models with low-resources. Exp… ▽ More

    Submitted 28 April, 2020; originally announced April 2020.

  29. Towards Smart Wireless Communications via Intelligent Reflecting Surfaces: A Contemporary Survey

    Authors: Shimin Gong, Xiao Lu, Dinh Thai Hoang, Dusit Niyato, Lei Shu, Dong In Kim, Ying-Chang Liang

    Abstract: This paper presents a literature review on recent applications and design aspects of the intelligent reflecting surface (IRS) in the future wireless networks. Conventionally, the network optimization has been limited to transmission control at two endpoints, i.e., end users and network controller. The fading wireless channel is uncontrollable and becomes one of the main limiting factors for perfor… ▽ More

    Submitted 19 May, 2020; v1 submitted 16 December, 2019; originally announced December 2019.

    Comments: 31 pages, 10 figures, 6 tables, 203 references

    Journal ref: IEEE Communications Surveys & Tutorials, June 2020

  30. arXiv:1911.01460  [pdf, ps, other]

    cs.CL

    A Failure of Aspect Sentiment Classifiers and an Adaptive Re-weighting Solution

    Authors: Hu Xu, Bing Liu, Lei Shu, Philip S. Yu

    Abstract: Aspect-based sentiment classification (ASC) is an important task in fine-grained sentiment analysis. Deep supervised ASC approaches typically model this task as a pair-wise classification task that takes an aspect and a sentence containing the aspect and outputs the polarity of the aspect in that sentence. However, we discovered that many existing approaches fail to learn an effective ASC classifi… ▽ More

    Submitted 4 November, 2019; originally announced November 2019.

  31. arXiv:1911.01172  [pdf]

    cs.LG stat.ML

    Fast-UAP: An Algorithm for Speeding up Universal Adversarial Perturbation Generation with Orientation of Perturbation Vectors

    Authors: Jiazhu Dai, Le Shu

    Abstract: Convolutional neural networks (CNN) have become one of the most popular machine learning tools and are being applied in various tasks, however, CNN models are vulnerable to universal perturbations, which are usually human-imperceptible but can cause natural images to be misclassified with high probability. One of the state-of-the-art algorithms to generate universal perturbations is known as UAP.… ▽ More

    Submitted 6 January, 2020; v1 submitted 4 November, 2019; originally announced November 2019.

    Comments: 9 pages, 7 figures, 1 table, 1 algorithm

    MSC Class: I.2.0 ACM Class: I.2.0

  32. arXiv:1908.11546  [pdf, other]

    cs.CL

    Modeling Multi-Action Policy for Task-Oriented Dialogues

    Authors: Lei Shu, Hu Xu, Bing Liu, Piero Molino

    Abstract: Dialogue management (DM) plays a key role in the quality of the interaction with the user in a task-oriented dialogue system. In most existing approaches, the agent predicts only one DM policy action per turn. This significantly limits the expressive power of the conversational agent and introduces unwanted turns of interactions that may challenge users' patience. Longer conversations also lead to… ▽ More

    Submitted 30 August, 2019; originally announced August 2019.

    Comments: 7

  33. arXiv:1908.02402  [pdf, other]

    cs.CL cs.AI cs.HC cs.LG

    Flexibly-Structured Model for Task-Oriented Dialogues

    Authors: Lei Shu, Piero Molino, Mahdi Namazifar, Hu Xu, Bing Liu, Huaixiu Zheng, Gokhan Tur

    Abstract: This paper proposes a novel end-to-end architecture for task-oriented dialogue systems. It is based on a simple and practical yet very effective sequence-to-sequence approach, where language understanding and state tracking tasks are modeled jointly with a structured copy-augmented sequential decoder and a multi-label decoder for each slot. The policy engine and language generation tasks are model… ▽ More

    Submitted 6 August, 2019; originally announced August 2019.

  34. arXiv:1905.06407  [pdf, other]

    cs.CL cs.LG

    Controlled CNN-based Sequence Labeling for Aspect Extraction

    Authors: Lei Shu, Hu Xu, Bing Liu

    Abstract: One key task of fine-grained sentiment analysis on reviews is to extract aspects or features that users have expressed opinions on. This paper focuses on supervised aspect extraction using a modified CNN called controlled CNN (Ctrl). The modified CNN has two types of control modules. Through asynchronous parameter updating, it prevents over-fitting and boosts CNN's performance significantly. This… ▽ More

    Submitted 29 May, 2019; v1 submitted 15 May, 2019; originally announced May 2019.

  35. arXiv:1904.02232  [pdf, other]

    cs.CL

    BERT Post-Training for Review Reading Comprehension and Aspect-based Sentiment Analysis

    Authors: Hu Xu, Bing Liu, Lei Shu, Philip S. Yu

    Abstract: Question-answering plays an important role in e-commerce as it allows potential customers to actively seek crucial information about products or services to help their purchase decision making. Inspired by the recent success of machine reading comprehension (MRC) on formal documents, this paper explores the potential of turning customer reviews into a large source of knowledge that can be exploite… ▽ More

    Submitted 3 May, 2019; v1 submitted 3 April, 2019; originally announced April 2019.

    Comments: accepted by NAACL 2019

  36. arXiv:1902.00821  [pdf, ps, other]

    cs.CL

    Review Conversational Reading Comprehension

    Authors: Hu Xu, Bing Liu, Lei Shu, Philip S. Yu

    Abstract: Inspired by conversational reading comprehension (CRC), this paper studies a novel task of leveraging reviews as a source to build an agent that can answer multi-turn questions from potential consumers of online businesses. We first build a review CRC dataset and then propose a novel task-aware pre-tuning step running between language model (e.g., BERT) pre-training and domain-specific fine-tuning… ▽ More

    Submitted 6 November, 2019; v1 submitted 2 February, 2019; originally announced February 2019.

  37. arXiv:1809.06004  [pdf, other]

    cs.CL cs.AI cs.LG

    Open-world Learning and Application to Product Classification

    Authors: Hu Xu, Bing Liu, Lei Shu, P. Yu

    Abstract: Classic supervised learning makes the closed-world assumption, meaning that classes seen in testing must have been seen in training. However, in the dynamic world, new or unseen class examples may appear constantly. A model working in such an environment must be able to reject unseen classes (not seen or used in training). If enough data is collected for the unseen classes, the system should incre… ▽ More

    Submitted 1 March, 2019; v1 submitted 16 September, 2018; originally announced September 2018.

    Comments: accepted by The Web Conference (WWW 2019) Previous title: Learning to Accept New Classes without Training

  38. arXiv:1805.09991  [pdf, other]

    cs.CL

    Lifelong Domain Word Embedding via Meta-Learning

    Authors: Hu Xu, Bing Liu, Lei Shu, Philip S. Yu

    Abstract: Learning high-quality domain word embeddings is important for achieving good performance in many NLP tasks. General-purpose embeddings trained on large-scale corpora are often sub-optimal for domain-specific applications. However, domain-specific tasks often do not have large in-domain corpora for training high-quality domain embeddings. In this paper, we propose a novel lifelong learning setting… ▽ More

    Submitted 25 May, 2018; originally announced May 2018.

    Comments: 7 pages

    Journal ref: IJCAI 2018

  39. arXiv:1805.04601  [pdf, other]

    cs.CL

    Double Embeddings and CNN-based Sequence Labeling for Aspect Extraction

    Authors: Hu Xu, Bing Liu, Lei Shu, Philip S. Yu

    Abstract: One key task of fine-grained sentiment analysis of product reviews is to extract product aspects or features that users have expressed opinions on. This paper focuses on supervised aspect extraction using deep learning. Unlike other highly sophisticated supervised deep learning models, this paper proposes a novel and yet simple CNN model employing two types of pre-trained embeddings for aspect ext… ▽ More

    Submitted 11 May, 2018; originally announced May 2018.

    Comments: ACL 2018

  40. arXiv:1804.07942  [pdf, other]

    cs.CL

    Generative Stock Question Answering

    Authors: Zhaopeng Tu, Yong Jiang, Xiaojiang Liu, Lei Shu, Shuming Shi

    Abstract: We study the problem of stock related question answering (StockQA): automatically generating answers to stock related questions, just like professional stock analysts providing action recommendations on stocks upon users' requests. StockQA is quite different from previous QA tasks since (1) the answers in StockQA are natural language sentences (rather than entities or values) and due to the dynami… ▽ More

    Submitted 20 September, 2018; v1 submitted 21 April, 2018; originally announced April 2018.

    Comments: data: http://ai.tencent.com/ailab/nlp/data/stockQA.tar.gz

  41. arXiv:1804.00976  [pdf, ps, other]

    math.DS cs.DM

    On Attractors of Isospectral Compressions of Networks

    Authors: Leonid Bunimovich, Longmei Shu

    Abstract: In the recently developed theory of isospectral transformations of networks, isospectral compressions are performed with respect to some chosen characteristic (attribute) of nodes (or edges) of networks. Each isospectral compression (when a certain characteristic is fixed) defines a dynamical system on the space of all networks. It is shown that any orbit of such a dynamical system which starts at an… ▽ More

    Submitted 30 March, 2018; originally announced April 2018.

    Comments: arXiv admin note: text overlap with arXiv:1802.03410

    MSC Class: 05C50; 15A18

  42. arXiv:1801.05609  [pdf, other]

    cs.LG cs.AI

    Unseen Class Discovery in Open-world Classification

    Authors: Lei Shu, Hu Xu, Bing Liu

    Abstract: This paper concerns open-world classification, where the classifier not only needs to classify test examples into seen classes that have appeared in training but also reject examples from unseen or novel classes that have not appeared in training. Specifically, this paper focuses on discovering the hidden unseen classes of the rejected examples. Clearly, without prior knowledge this is difficult.… ▽ More

    Submitted 17 January, 2018; originally announced January 2018.

  43. arXiv:1712.02186  [pdf, other]

    cs.CL

    Product Function Need Recognition via Semi-supervised Attention Network

    Authors: Hu Xu, Sihong Xie, Lei Shu, Philip S. Yu

    Abstract: Functionality is of utmost importance to customers when they purchase products. However, it is unclear to customers whether a product can really satisfy their needs on functions. Further, missing functions may be intentionally hidden by the manufacturers or the sellers. As a result, a customer needs to spend a fair amount of time before purchasing or just purchase the product on his/her own risk.… ▽ More

    Submitted 6 December, 2017; originally announced December 2017.

  44. arXiv:1712.02016  [pdf, other]

    cs.CL

    Dual Attention Network for Product Compatibility and Function Satisfiability Analysis

    Authors: Hu Xu, Sihong Xie, Lei Shu, Philip S. Yu

    Abstract: Product compatibility and their functionality are of utmost importance to customers when they purchase products, and to sellers and manufacturers when they sell products. Due to the huge number of products available online, it is infeasible to enumerate and test the compatibility and functionality of every product. In this paper, we address two closely related problems: product compatibility analy… ▽ More

    Submitted 5 December, 2017; originally announced December 2017.

  45. arXiv:1709.08716  [pdf, other]

    cs.CL

    DOC: Deep Open Classification of Text Documents

    Authors: Lei Shu, Hu Xu, Bing Liu

    Abstract: Traditional supervised learning makes the closed-world assumption that the classes appeared in the test data must have appeared in training. This also applies to text learning or text classification. As learning is used increasingly in dynamic open environments where some new/test documents may not belong to any of the training classes, identifying these novel documents during classification prese…

    Submitted 25 September, 2017; originally announced September 2017.

    Comments: accepted at EMNLP 2017

  46. arXiv:1707.00323  [pdf, other]

    math.NA cs.CG

    An improved isogeometric analysis method for trimmed geometries

    Authors: Jinlan Xu, Ningning Sun, Laixin Shu, Timon Rabczuk, Gang Xu

    Abstract: Trimming techniques are efficient ways to generate complex geometries in Computer-Aided Design (CAD). In this paper, an improved isogeometric analysis (IGA) method for trimmed geometries is proposed. We will show that the proposed method reduces the numerical error of physical solution by 50% for simple trimmed geometries, and the condition number of stiffness matrix is also decreased. Furthermore,…

    Submitted 2 July, 2017; originally announced July 2017.

  47. arXiv:1705.10030  [pdf, ps, other]

    cs.CL

    Supervised Complementary Entity Recognition with Augmented Key-value Pairs of Knowledge

    Authors: Hu Xu, Lei Shu, Philip S. Yu

    Abstract: Extracting opinion targets is an important task in sentiment analysis on product reviews and complementary entities (products) are one important type of opinion targets that may work together with the reviewed product. In this paper, we address the problem of Complementary Entity Recognition (CER) as a supervised sequence labeling with the capability of expanding domain knowledge as key-value pair…

    Submitted 28 May, 2017; originally announced May 2017.

  48. arXiv:1705.00251  [pdf, ps, other]

    cs.CL

    Lifelong Learning CRF for Supervised Aspect Extraction

    Authors: Lei Shu, Hu Xu, Bing Liu

    Abstract: This paper makes a focused contribution to supervised aspect extraction. It shows that if the system has performed aspect extraction from many past domains and retained their results as knowledge, Conditional Random Fields (CRF) can leverage this knowledge in a lifelong learning manner to extract in a new domain markedly better than the traditional CRF without using this prior knowledge. The key i…

    Submitted 29 April, 2017; originally announced May 2017.

    Comments: Accepted at ACL 2017. arXiv admin note: text overlap with arXiv:1612.07940

  49. arXiv:1612.07940  [pdf, ps, other]

    cs.CL cs.LG

    Supervised Opinion Aspect Extraction by Exploiting Past Extraction Results

    Authors: Lei Shu, Bing Liu, Hu Xu, Annice Kim

    Abstract: One of the key tasks of sentiment analysis of product reviews is to extract product aspects or features that users have expressed opinions on. In this work, we focus on using supervised sequence labeling as the base approach to performing the task. Although several extraction methods using sequence labeling methods such as Conditional Random Fields (CRF) and Hidden Markov Models (HMM) have been pr…

    Submitted 23 December, 2016; originally announced December 2016.

    Comments: 10 pages

  50. arXiv:1612.04499  [pdf, other]

    cs.CL

    Mining Compatible/Incompatible Entities from Question and Answering via Yes/No Answer Classification using Distant Label Expansion

    Authors: Hu Xu, Lei Shu, Jingyuan Zhang, Philip S. Yu

    Abstract: Product Community Question Answering (PCQA) provides useful information about products and their features (aspects) that may not be well addressed by product descriptions and reviews. We observe that a product's compatibility issues with other products are frequently discussed in PCQA and such issues are more frequently addressed in accessories, i.e., via a yes/no question "Does this mouse work wi…

    Submitted 14 December, 2016; originally announced December 2016.

    Comments: 9 pages, 1 figure