
Showing 1–50 of 51 results for author: Kurohashi, S

  1. arXiv:2407.03963  [pdf, other]

    cs.CL cs.AI

    LLM-jp: A Cross-organizational Project for the Research and Development of Fully Open Japanese LLMs

    Authors: LLM-jp: Akiko Aizawa, Eiji Aramaki, Bowen Chen, Fei Cheng, Hiroyuki Deguchi, Rintaro Enomoto, Kazuki Fujii, Kensuke Fukumoto, Takuya Fukushima, Namgi Han, Yuto Harada, Chikara Hashimoto, Tatsuya Hiraoka, Shohei Hisada, Sosuke Hosokawa, Lu Jie, Keisuke Kamata, Teruhito Kanazawa, Hiroki Kanezashi, Hiroshi Kataoka, Satoru Katsumata, Daisuke Kawahara, Seiya Kawano, et al. (57 additional authors not shown)

    Abstract: This paper introduces LLM-jp, a cross-organizational project for the research and development of Japanese large language models (LLMs). LLM-jp aims to develop open-source and strong Japanese LLMs, and as of this writing, more than 1,500 participants from academia and industry are working together for this purpose. This paper presents the background of the establishment of LLM-jp, summaries of its…

    Submitted 4 July, 2024; originally announced July 2024.

  2. arXiv:2405.13233  [pdf, other]

    cs.CL

    MELD-ST: An Emotion-aware Speech Translation Dataset

    Authors: Sirou Chen, Sakiko Yahata, Shuichiro Shimizu, Zhengdong Yang, Yihang Li, Chenhui Chu, Sadao Kurohashi

    Abstract: Emotion plays a crucial role in human conversation. This paper underscores the significance of considering emotion in speech translation. We present the MELD-ST dataset for the emotion-aware speech translation task, comprising English-to-Japanese and English-to-German language pairs. Each language pair includes about 10,000 utterances annotated with emotion labels from the MELD dataset. Baseline e…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: 9 pages. Accepted to ACL 2024 Findings. Dataset: https://huggingface.co/datasets/ku-nlp/MELD-ST

  3. arXiv:2403.19259  [pdf, other]

    cs.CL

    J-CRe3: A Japanese Conversation Dataset for Real-world Reference Resolution

    Authors: Nobuhiro Ueda, Hideko Habe, Yoko Matsui, Akishige Yuguchi, Seiya Kawano, Yasutomo Kawanishi, Sadao Kurohashi, Koichiro Yoshino

    Abstract: Understanding expressions that refer to the physical world is crucial for such human-assisting systems in the real world, as robots that must perform actions that are expected by users. In real-world reference resolution, a system must ground the verbal information that appears in user interactions to the visual information observed in egocentric views. To this end, we propose a multimodal referen…

    Submitted 28 March, 2024; originally announced March 2024.

    Comments: LREC-COLING 2024

  4. arXiv:2403.18504  [pdf]

    cs.CL

    AcTED: Automatic Acquisition of Typical Event Duration for Semi-supervised Temporal Commonsense QA

    Authors: Felix Virgo, Fei Cheng, Lis Kanashiro Pereira, Masayuki Asahara, Ichiro Kobayashi, Sadao Kurohashi

    Abstract: We propose a voting-driven semi-supervised approach to automatically acquire the typical duration of an event and use it as pseudo-labeled data. The human evaluation demonstrates that our pseudo labels exhibit surprisingly high accuracy and balanced coverage. In the temporal commonsense QA task, experimental results show that using only pseudo examples of 400 events, we achieve performance compara…

    Submitted 27 March, 2024; originally announced March 2024.

  5. arXiv:2403.03690  [pdf]

    cs.CL cs.AI

    Rapidly Developing High-quality Instruction Data and Evaluation Benchmark for Large Language Models with Minimal Human Effort: A Case Study on Japanese

    Authors: Yikun Sun, Zhen Wan, Nobuhiro Ueda, Sakiko Yahata, Fei Cheng, Chenhui Chu, Sadao Kurohashi

    Abstract: The creation of instruction data and evaluation benchmarks for serving Large language models often involves enormous human annotation. This issue becomes particularly pronounced when rapidly developing such resources for a non-English language like Japanese. Instead of following the popular practice of directly translating existing English resources into Japanese (e.g., Japanese-Alpaca), we propos…

    Submitted 6 March, 2024; originally announced March 2024.

    Comments: COLING 2024. Our code is available here: self-instruct data (https://github.com/hitoshizuku7/awesome-Ja-self-instruct) and evaluation benchmark (https://github.com/ku-nlp/ja-vicuna-qa-benchmark)

  6. arXiv:2402.13522  [pdf, other]

    cs.CL

    RecMind: Japanese Movie Recommendation Dialogue with Seeker's Internal State

    Authors: Takashi Kodama, Hirokazu Kiyomaru, Yin Jou Huang, Sadao Kurohashi

    Abstract: Humans pay careful attention to the interlocutor's internal state in dialogues. For example, in recommendation dialogues, we make recommendations while estimating the seeker's internal state, such as his/her level of knowledge and interest. Since there are no existing annotated resources for the analysis, we constructed RecMind, a Japanese movie recommendation dialogue dataset with annotations of…

    Submitted 20 February, 2024; originally announced February 2024.

  7. arXiv:2311.03696  [pdf, other]

    cs.CL

    Bilingual Corpus Mining and Multistage Fine-Tuning for Improving Machine Translation of Lecture Transcripts

    Authors: Haiyue Song, Raj Dabre, Chenhui Chu, Atsushi Fujita, Sadao Kurohashi

    Abstract: Lecture transcript translation helps learners understand online courses, however, building a high-quality lecture machine translation system lacks publicly available parallel corpora. To address this, we examine a framework for parallel corpus mining, which provides a quick and effective way to mine a parallel corpus from publicly available lectures on Coursera. To create the parallel corpora, we…

    Submitted 6 November, 2023; originally announced November 2023.

    Comments: Submitted to the Journal of Information Processing (JIP). arXiv admin note: text overlap with arXiv:1912.11739

  8. arXiv:2310.20236  [pdf, other]

    cs.CL

    Dynamically Updating Event Representations for Temporal Relation Classification with Multi-category Learning

    Authors: Fei Cheng, Masayuki Asahara, Ichiro Kobayashi, Sadao Kurohashi

    Abstract: Temporal relation classification is a pair-wise task for identifying the relation of a temporal link (TLINK) between two mentions, i.e. event, time, and document creation time (DCT). It leads to two crucial limits: 1) Two TLINKs involving a common mention do not share information. 2) Existing models with independent classifiers for each TLINK category (E2E, E2T, and E2D) hinder from using the whol…

    Submitted 31 October, 2023; originally announced October 2023.

    Comments: EMNLP 2020 Findings

  9. arXiv:2310.20201  [pdf, other]

    cs.CL

    Video-Helpful Multimodal Machine Translation

    Authors: Yihang Li, Shuichiro Shimizu, Chenhui Chu, Sadao Kurohashi, Wei Li

    Abstract: Existing multimodal machine translation (MMT) datasets consist of images and video captions or instructional video subtitles, which rarely contain linguistic ambiguity, making visual information ineffective in generating appropriate translations. Recent work has constructed an ambiguous subtitles dataset to alleviate this problem but is still limited to the problem that videos do not necessarily c…

    Submitted 31 October, 2023; originally announced October 2023.

    Comments: Accepted by EMNLP 2023 Main Conference (long paper)

  10. arXiv:2310.03328  [pdf, other]

    cs.CL

    Reformulating Domain Adaptation of Large Language Models as Adapt-Retrieve-Revise

    Authors: Zhen Wan, Yating Zhang, Yexiang Wang, Fei Cheng, Sadao Kurohashi

    Abstract: While large language models (LLMs) like GPT-4 have recently demonstrated astonishing zero-shot capabilities in general domain tasks, they often generate content with hallucinations in specific domains such as Chinese law, hindering their application in these areas. This is typically due to the absence of training data that encompasses such a specific domain, preventing GPT-4 from acquiring in-doma…

    Submitted 12 October, 2023; v1 submitted 5 October, 2023; originally announced October 2023.

    Comments: Under submission to ICLR 2024

  11. SelfSeg: A Self-supervised Sub-word Segmentation Method for Neural Machine Translation

    Authors: Haiyue Song, Raj Dabre, Chenhui Chu, Sadao Kurohashi, Eiichiro Sumita

    Abstract: Sub-word segmentation is an essential pre-processing step for Neural Machine Translation (NMT). Existing work has shown that neural sub-word segmenters are better than Byte-Pair Encoding (BPE), however, they are inefficient as they require parallel corpora, days to train and hours to decode. This paper introduces SelfSeg, a self-supervised neural sub-word segmentation method that is much faster to…

    Submitted 31 July, 2023; originally announced July 2023.

    Comments: Accepted to TALLIP journal

  12. arXiv:2305.16896  [pdf, other]

    cs.CL cs.AI cs.LG

    MultiTool-CoT: GPT-3 Can Use Multiple External Tools with Chain of Thought Prompting

    Authors: Tatsuro Inaba, Hirokazu Kiyomaru, Fei Cheng, Sadao Kurohashi

    Abstract: Large language models (LLMs) have achieved impressive performance on various reasoning tasks. To further improve the performance, we propose MultiTool-CoT, a novel framework that leverages chain-of-thought (CoT) prompting to incorporate multiple external tools, such as a calculator and a knowledge retriever, during the reasoning process. We apply MultiTool-CoT to the Task 2 dataset of NumGLUE, whi…

    Submitted 26 May, 2023; originally announced May 2023.

    Comments: ACL2023. Our code is available at https://github.com/InabaTatsuro/MultiTool-CoT

  13. arXiv:2305.10190  [pdf, other]

    cs.CL

    Variable-length Neural Interlingua Representations for Zero-shot Neural Machine Translation

    Authors: Zhuoyuan Mao, Haiyue Song, Raj Dabre, Chenhui Chu, Sadao Kurohashi

    Abstract: The language-independency of encoded representations within multilingual neural machine translation (MNMT) models is crucial for their generalization ability on zero-shot translation. Neural interlingua representations have been shown as an effective method for achieving this. However, fixed-length neural interlingua representations introduced in previous work can limit its flexibility and represe…

    Submitted 17 May, 2023; originally announced May 2023.

    Comments: Accepted to Multi3Generation workshop (held in conjunction with EAMT 2023)

  14. arXiv:2305.09312  [pdf, other]

    cs.CL

    Exploring the Impact of Layer Normalization for Zero-shot Neural Machine Translation

    Authors: Zhuoyuan Mao, Raj Dabre, Qianying Liu, Haiyue Song, Chenhui Chu, Sadao Kurohashi

    Abstract: This paper studies the impact of layer normalization (LayerNorm) on zero-shot translation (ZST). Recent efforts for ZST often utilize the Transformer architecture as the backbone, with LayerNorm at the input of layers (PreNorm) set as the default. However, Xu et al. (2019) has revealed that PreNorm carries the risk of overfitting the training data. Based on this, we hypothesize that PreNorm may ov…

    Submitted 16 May, 2023; originally announced May 2023.

    Comments: Accepted to ACL 2023 main conference

  15. arXiv:2305.09210  [pdf, other]

    cs.CL

    Towards Speech Dialogue Translation Mediating Speakers of Different Languages

    Authors: Shuichiro Shimizu, Chenhui Chu, Sheng Li, Sadao Kurohashi

    Abstract: We present a new task, speech dialogue translation mediating speakers of different languages. We construct the SpeechBSD dataset for the task and conduct baseline experiments. Furthermore, we consider context to be an important aspect that needs to be addressed in this task and propose two ways of utilizing context, namely monolingual context and bilingual context. We conduct cascaded speech trans…

    Submitted 22 May, 2023; v1 submitted 16 May, 2023; originally announced May 2023.

    Comments: 11 pages, 4 figures. Accepted to ACL 2023 Findings. Dataset: https://github.com/ku-nlp/speechBSD

  16. arXiv:2305.08371  [pdf, other]

    cs.CL

    SuperDialseg: A Large-scale Dataset for Supervised Dialogue Segmentation

    Authors: Junfeng Jiang, Chengzhang Dong, Sadao Kurohashi, Akiko Aizawa

    Abstract: Dialogue segmentation is a crucial task for dialogue systems allowing a better understanding of conversational texts. Despite recent progress in unsupervised dialogue segmentation methods, their performances are limited by the lack of explicit supervised signals for training. Furthermore, the precise definition of segmentation points in conversations still remains as a challenging problem, increas…

    Submitted 15 October, 2023; v1 submitted 15 May, 2023; originally announced May 2023.

    Comments: Accepted as a Long Paper at EMNLP 2023 (main)

  17. arXiv:2305.07475  [pdf, other]

    cs.CL

    Comprehensive Solution Program Centric Pretraining for Table-and-Text Hybrid Numerical Reasoning

    Authors: Qianying Liu, Dongsheng Yang, Wenjie Zhong, Fei Cheng, Sadao Kurohashi

    Abstract: Numerical reasoning over table-and-text hybrid passages, such as financial reports, poses significant challenges and has numerous potential applications. Noise and irrelevant variables in the model input have been a hindrance to its performance. Additionally, coarse-grained supervision of the whole solution program has impeded the model's ability to learn the underlying numerical reasoning process…

    Submitted 12 May, 2023; originally announced May 2023.

    Comments: 11 pages

  18. arXiv:2305.02105  [pdf, other]

    cs.CL

    GPT-RE: In-context Learning for Relation Extraction using Large Language Models

    Authors: Zhen Wan, Fei Cheng, Zhuoyuan Mao, Qianying Liu, Haiyue Song, Jiwei Li, Sadao Kurohashi

    Abstract: In spite of the potential for ground-breaking achievements offered by large language models (LLMs) (e.g., GPT-3), they still lag significantly behind fully-supervised baselines (e.g., fine-tuned BERT) in relation extraction (RE). This is due to the two major shortcomings of LLMs in RE: (1) low relevance regarding entity and relation in retrieved demonstrations for in-context learning; and (2) the…

    Submitted 8 December, 2023; v1 submitted 3 May, 2023; originally announced May 2023.

    Comments: Accepted by EMNLP 2023 Main Conference (long paper)

  19. arXiv:2211.16022  [pdf, other]

    cs.CL

    Textual Enhanced Contrastive Learning for Solving Math Word Problems

    Authors: Yibin Shen, Qianying Liu, Zhuoyuan Mao, Fei Cheng, Sadao Kurohashi

    Abstract: Solving math word problems is the task that analyses the relation of quantities and requires an accurate understanding of contextual natural language information. Recent studies show that current models rely on shallow heuristics to predict solutions and could be easily misled by small textual perturbations. To address this problem, we propose a Textual Enhanced Contrastive Learning framework, whi…

    Submitted 29 November, 2022; originally announced November 2022.

    Comments: Findings of EMNLP 2022

  20. arXiv:2210.11800  [pdf, other]

    cs.CL

    Rescue Implicit and Long-tail Cases: Nearest Neighbor Relation Extraction

    Authors: Zhen Wan, Qianying Liu, Zhuoyuan Mao, Fei Cheng, Sadao Kurohashi, Jiwei Li

    Abstract: Relation extraction (RE) has achieved remarkable progress with the help of pre-trained language models. However, existing RE models are usually incapable of handling two situations: implicit expressions and long-tail relation types, caused by language complexity and data sparsity. In this paper, we introduce a simple enhancement of RE using $k$ nearest neighbors ($k$NN-RE). $k$NN-RE allows the mod…

    Submitted 30 January, 2023; v1 submitted 21 October, 2022; originally announced October 2022.

    Comments: EMNLP 2022 (short paper)

  21. arXiv:2210.07017  [pdf, other]

    cs.CL

    ComSearch: Equation Searching with Combinatorial Strategy for Solving Math Word Problems with Weak Supervision

    Authors: Qianying Liu, Wenyu Guan, Jianhao Shen, Fei Cheng, Sadao Kurohashi

    Abstract: Previous studies have introduced a weakly-supervised paradigm for solving math word problems requiring only the answer value annotation. While these methods search for correct value equation candidates as pseudo labels, they search among a narrow sub-space of the enormous equation space. To address this problem, we propose a novel search algorithm with combinatorial strategy ComSearch, wh…

    Submitted 7 March, 2023; v1 submitted 13 October, 2022; originally announced October 2022.

    Comments: EACL 2023 long paper, 14 pages

  22. arXiv:2209.10310  [pdf, other]

    cs.CL

    Seeking Diverse Reasoning Logic: Controlled Equation Expression Generation for Solving Math Word Problems

    Authors: Yibin Shen, Qianying Liu, Zhuoyuan Mao, Zhen Wan, Fei Cheng, Sadao Kurohashi

    Abstract: To solve Math Word Problems, human students leverage diverse reasoning logic that reaches different possible equation solutions. However, the mainstream sequence-to-sequence approach of automatic solvers aims to decode a fixed solution equation supervised by human annotation. In this paper, we propose a controlled equation generation solver by leveraging a set of control codes to guide the model t…

    Submitted 29 November, 2022; v1 submitted 21 September, 2022; originally announced September 2022.

    Comments: AACL 2022 short paper

  23. EMS: Efficient and Effective Massively Multilingual Sentence Embedding Learning

    Authors: Zhuoyuan Mao, Chenhui Chu, Sadao Kurohashi

    Abstract: Massively multilingual sentence representation models, e.g., LASER, SBERT-distill, and LaBSE, help significantly improve cross-lingual downstream tasks. However, the use of a large amount of data or inefficient model architectures results in heavy computation to train a new model according to our preferred languages and domains. To resolve this issue, we introduce efficient and effective massively…

    Submitted 30 May, 2024; v1 submitted 31 May, 2022; originally announced May 2022.

    Comments: This work is a multilingual extension of arXiv:2105.13856. This work has been accepted by IEEE/ACM Transactions on Audio, Speech, and Language Processing (DOI: 10.1109/TASLP.2024.3402064). Copyright has been transferred

  24. arXiv:2205.08770  [pdf, other]

    cs.CL

    Relation Extraction with Weighted Contrastive Pre-training on Distant Supervision

    Authors: Zhen Wan, Fei Cheng, Qianying Liu, Zhuoyuan Mao, Haiyue Song, Sadao Kurohashi

    Abstract: Contrastive pre-training on distant supervision has shown remarkable effectiveness in improving supervised relation extraction tasks. However, the existing methods ignore the intrinsic noise of distant supervision during the pre-training stage. In this paper, we propose a weighted contrastive learning method by leveraging the supervised data to estimate the reliability of pre-training instances an…

    Submitted 10 February, 2023; v1 submitted 18 May, 2022; originally announced May 2022.

    Comments: EACL 2023 (Findings)

  25. arXiv:2204.12165  [pdf, other]

    cs.CL

    When do Contrastive Word Alignments Improve Many-to-many Neural Machine Translation?

    Authors: Zhuoyuan Mao, Chenhui Chu, Raj Dabre, Haiyue Song, Zhen Wan, Sadao Kurohashi

    Abstract: Word alignment has proven to benefit many-to-many neural machine translation (NMT). However, high-quality ground-truth bilingual dictionaries were used for pre-editing in previous methods, which are unavailable for most language pairs. Meanwhile, the contrastive objective can implicitly utilize automatically learned word alignment, which has not been explored in many-to-many NMT. This work propose…

    Submitted 26 April, 2022; originally announced April 2022.

    Comments: NAACL 2022 findings

  26. arXiv:2204.03855  [pdf, other]

    eess.AS cs.CL

    Hierarchical Softmax for End-to-End Low-resource Multilingual Speech Recognition

    Authors: Qianying Liu, Zhuo Gong, Zhengdong Yang, Yuhang Yang, Sheng Li, Chenchen Ding, Nobuaki Minematsu, Hao Huang, Fei Cheng, Chenhui Chu, Sadao Kurohashi

    Abstract: Low-resource speech recognition has been long-suffering from insufficient training data. In this paper, we propose an approach that leverages neighboring languages to improve low-resource scenario performance, founded on the hypothesis that similar linguistic units in neighboring languages exhibit comparable term frequency distributions, which enables us to construct a Huffman tree for performing…

    Submitted 30 April, 2023; v1 submitted 8 April, 2022; originally announced April 2022.

    Comments: 7 pages, ICASSP 2023

  27. Linguistically-driven Multi-task Pre-training for Low-resource Neural Machine Translation

    Authors: Zhuoyuan Mao, Chenhui Chu, Sadao Kurohashi

    Abstract: In the present study, we propose novel sequence-to-sequence pre-training objectives for low-resource machine translation (NMT): Japanese-specific sequence to sequence (JASS) for language pairs involving Japanese as the source or target language, and English-specific sequence to sequence (ENSS) for language pairs involving English. JASS focuses on masking and reordering Japanese linguistic units kn…

    Submitted 20 January, 2022; originally announced January 2022.

    Comments: An extension of work arXiv:2005.03361

    Journal ref: TALLIP Volume 21, Issue 4, July 2022

  28. arXiv:2201.08054  [pdf, other]

    cs.CL

    VISA: An Ambiguous Subtitles Dataset for Visual Scene-Aware Machine Translation

    Authors: Yihang Li, Shuichiro Shimizu, Weiqi Gu, Chenhui Chu, Sadao Kurohashi

    Abstract: Existing multimodal machine translation (MMT) datasets consist of images and video captions or general subtitles, which rarely contain linguistic ambiguity, making visual information not so effective to generate appropriate translations. We introduce VISA, a new dataset that consists of 40k Japanese-English parallel sentence pairs and corresponding video clips with the following key features: (1)…

    Submitted 26 May, 2022; v1 submitted 20 January, 2022; originally announced January 2022.

    Comments: Accepted by LREC2022

  29. arXiv:2111.05805  [pdf, other]

    cs.CL

    Cross-lingual Adaption Model-Agnostic Meta-Learning for Natural Language Understanding

    Authors: Qianying Liu, Fei Cheng, Sadao Kurohashi

    Abstract: Meta learning with auxiliary languages has demonstrated promising improvements for cross-lingual natural language processing. However, previous studies sample the meta-training and meta-testing data from the same language, which limits the ability of the model for cross-lingual transfer. In this paper, we propose XLA-MAML, which performs direct cross-lingual adaption in the meta-learning stage. We…

    Submitted 10 November, 2021; originally announced November 2021.

    Comments: 11 pages

  30. arXiv:2111.04261  [pdf, other]

    cs.CL cs.AI

    JaMIE: A Pipeline Japanese Medical Information Extraction System

    Authors: Fei Cheng, Shuntaro Yada, Ribeka Tanaka, Eiji Aramaki, Sadao Kurohashi

    Abstract: We present an open-access natural language processing toolkit for Japanese medical information extraction. We first propose a novel relation annotation schema for investigating the medical and temporal relations between medical entities in Japanese medical reports. We experiment with the practical annotation scenarios by separately annotating two different types of reports. We design a pipeline sy…

    Submitted 7 November, 2021; originally announced November 2021.

    Comments: 8 pages

  31. arXiv:2105.13856  [pdf, other]

    cs.CL cs.AI

    Lightweight Cross-Lingual Sentence Representation Learning

    Authors: Zhuoyuan Mao, Prakhar Gupta, Pei Wang, Chenhui Chu, Martin Jaggi, Sadao Kurohashi

    Abstract: Large-scale models for learning fixed-dimensional cross-lingual sentence representations like LASER (Artetxe and Schwenk, 2019b) lead to significant improvement in performance on downstream tasks. However, further increases and modifications based on such large-scale models are usually impractical due to memory limitations. In this work, we introduce a lightweight dual-transformer architecture wit…

    Submitted 27 May, 2022; v1 submitted 28 May, 2021; originally announced May 2021.

    Comments: ACL 2021 main conference; modified Eq. (2)

  32. arXiv:2104.09833  [pdf, other]

    cs.CL

    Frustratingly Easy Edit-based Linguistic Steganography with a Masked Language Model

    Authors: Honai Ueoka, Yugo Murawaki, Sadao Kurohashi

    Abstract: With advances in neural language models, the focus of linguistic steganography has shifted from edit-based approaches to generation-based ones. While the latter's payload capacity is impressive, generating genuine-looking texts remains challenging. In this paper, we revisit edit-based linguistic steganography, with the idea that a masked language model offers an off-the-shelf solution. The propose…

    Submitted 20 April, 2021; originally announced April 2021.

    Comments: 7 pages, 4 figures

  33. arXiv:2012.03118  [pdf, other]

    cs.CL

    Modeling and Utilizing User's Internal State in Movie Recommendation Dialogue

    Authors: Takashi Kodama, Ribeka Tanaka, Sadao Kurohashi

    Abstract: Intelligent dialogue systems are expected as a new interface between humans and machines. Such an intelligent dialogue system should estimate the user's internal state (UIS) in dialogues and change its response appropriately according to the estimation result. In this paper, we model the UIS in dialogues, taking movie recommendation dialogues as examples, and construct a dialogue system that chang…

    Submitted 5 December, 2020; originally announced December 2020.

  34. arXiv:2010.01556  [pdf, other]

    cs.CL

    Reverse Operation based Data Augmentation for Solving Math Word Problems

    Authors: Qianying Liu, Wenyu Guan, Sujian Li, Fei Cheng, Daisuke Kawahara, Sadao Kurohashi

    Abstract: Automatically solving math word problems is a critical task in the field of natural language processing. Recent models have reached their performance bottleneck and require more high-quality data for training. We propose a novel data augmentation method that reverses the mathematical logic of math word problems to produce new high-quality math problems and introduce new knowledge points that can b…

    Submitted 10 November, 2021; v1 submitted 4 October, 2020; originally announced October 2020.

    Comments: 11 pages. Accepted by IEEE Transactions on Audio, Speech and Language Processing

  35. arXiv:2009.07503  [pdf, other]

    cs.CL cs.AI cs.LG

    Minimize Exposure Bias of Seq2Seq Models in Joint Entity and Relation Extraction

    Authors: Ranran Haoran Zhang, Qianying Liu, Aysa Xuemo Fan, Heng Ji, Daojian Zeng, Fei Cheng, Daisuke Kawahara, Sadao Kurohashi

    Abstract: Joint entity and relation extraction aims to extract relation triplets from plain text directly. Prior work leverages Sequence-to-Sequence (Seq2Seq) models for triplet sequence generation. However, Seq2Seq enforces an unnecessary order on the unordered triplets and involves a large decoding length associated with error accumulation. These introduce exposure bias, which may cause the models overfit…

    Submitted 6 October, 2020; v1 submitted 16 September, 2020; originally announced September 2020.

    Comments: EMNLP 2020 Findings

  36. arXiv:2008.01523  [pdf, other]

    cs.CL

    A System for Worldwide COVID-19 Information Aggregation

    Authors: Akiko Aizawa, Frederic Bergeron, Junjie Chen, Fei Cheng, Katsuhiko Hayashi, Kentaro Inui, Hiroyoshi Ito, Daisuke Kawahara, Masaru Kitsuregawa, Hirokazu Kiyomaru, Masaki Kobayashi, Takashi Kodama, Sadao Kurohashi, Qianying Liu, Masaki Matsubara, Yusuke Miyao, Atsuyuki Morishima, Yugo Murawaki, Kazumasa Omura, Haiyue Song, Eiichiro Sumita, Shinji Suzuki, Ribeka Tanaka, Yu Tanaka, Masashi Toyoda, et al. (4 additional authors not shown)

    Abstract: The global pandemic of COVID-19 has made the public pay close attention to related news, covering various domains, such as sanitation, treatment, and effects on education. Meanwhile, the COVID-19 condition is very different among the countries (e.g., policies and development of the epidemic), and thus citizens would be interested in news in foreign countries. We build a system for worldwide COVID-…

    Submitted 11 October, 2020; v1 submitted 27 July, 2020; originally announced August 2020.

    Comments: Accepted to EMNLP 2020 Workshop NLP-COVID

  37. arXiv:2005.03361  [pdf, other]

    cs.CL

    JASS: Japanese-specific Sequence to Sequence Pre-training for Neural Machine Translation

    Authors: Zhuoyuan Mao, Fabien Cromieres, Raj Dabre, Haiyue Song, Sadao Kurohashi

    Abstract: Neural machine translation (NMT) needs large parallel corpora for state-of-the-art translation quality. Low-resource NMT is typically addressed by transfer learning which leverages large monolingual or parallel corpora for pre-training. Monolingual pre-training approaches such as MASS (MAsked Sequence to Sequence) are extremely effective in boosting NMT quality for languages with small parallel co…

    Submitted 7 May, 2020; originally announced May 2020.

    Comments: LREC 2020

  38. arXiv:2001.08353  [pdf, other]

    cs.CL cs.LG

    Pre-training via Leveraging Assisting Languages and Data Selection for Neural Machine Translation

    Authors: Haiyue Song, Raj Dabre, Zhuoyuan Mao, Fei Cheng, Sadao Kurohashi, Eiichiro Sumita

    Abstract: Sequence-to-sequence (S2S) pre-training using large monolingual data is known to improve performance for various S2S NLP tasks in low-resource settings. However, large monolingual corpora might not always be available for the languages of interest (LOI). To this end, we propose to exploit monolingual corpora of other languages to complement the scarcity of monolingual corpora for the LOI. A case s…

    Submitted 22 January, 2020; originally announced January 2020.

    Comments: Work in progress. Submitted to a conference

  39. arXiv:1912.11739  [pdf, other]

    cs.CL cs.AI cs.LG

    Coursera Corpus Mining and Multistage Fine-Tuning for Improving Lectures Translation

    Authors: Haiyue Song, Raj Dabre, Atsushi Fujita, Sadao Kurohashi

    Abstract: Lectures translation is a case of spoken language translation and there is a lack of publicly available parallel corpora for this purpose. To address this, we examine a language independent framework for parallel corpus mining which is a quick and effective way to mine a parallel corpus from publicly available lectures at Coursera. Our approach determines sentence alignments, relying on machine tr…

    Submitted 13 January, 2020; v1 submitted 25 December, 2019; originally announced December 2019.

    Comments: 10 pages, 1 figure, 9 tables, under review by LREC2020

  40. Emotion helps Sentiment: A Multi-task Model for Sentiment and Emotion Analysis

    Authors: Abhishek Kumar, Asif Ekbal, Daisuke Kawahara, Sadao Kurohashi

    Abstract: In this paper, we propose a two-layered multi-task attention based neural network that performs sentiment analysis through emotion analysis. The proposed approach is based on Bidirectional Long Short-Term Memory and uses Distributional Thesaurus as a source of external knowledge to improve the sentiment and emotion prediction. The proposed system has two levels of attention to hierarchically build…

    Submitted 28 November, 2019; originally announced November 2019.

    Comments: Accepted in the Proceedings of The 2019 IEEE International Joint Conference on Neural Networks (IJCNN 2019)

  41. arXiv:1911.09709  [pdf, other]

    cs.CL cs.AI

    Automatically Neutralizing Subjective Bias in Text

    Authors: Reid Pryzant, Richard Diehl Martinez, Nathan Dass, Sadao Kurohashi, Dan Jurafsky, Diyi Yang

    Abstract: Texts like news, encyclopedias, and some social media strive for objectivity. Yet bias in the form of inappropriate subjectivity - introducing attitudes via framing, presupposing truth, and casting doubt - remains ubiquitous. This kind of bias erodes our collective trust and fuels social conflict. To address this issue, we introduce a novel testbed for natural language generation: automatically br…

    Submitted 12 December, 2019; v1 submitted 21 November, 2019; originally announced November 2019.

    Comments: To appear at AAAI 2020

  42. arXiv:1909.00694  [pdf, other]

    cs.CL

    Minimally Supervised Learning of Affective Events Using Discourse Relations

    Authors: Jun Saito, Yugo Murawaki, Sadao Kurohashi

    Abstract: Recognizing affective events that trigger positive or negative sentiment has a wide range of natural language processing applications but remains a challenging problem mainly because the polarity of an event is not necessarily predictable from its constituent words. In this paper, we propose to propagate affective polarity using discourse relations. Our method is simple and only requires a very sm…

    Submitted 28 December, 2019; v1 submitted 2 September, 2019; originally announced September 2019.

    Comments: 8 pages, 1 figure. EMNLP2019 (short paper)

  43. arXiv:1905.02851  [pdf, other]

    cs.IR cs.CL

    FAQ Retrieval using Query-Question Similarity and BERT-Based Query-Answer Relevance

    Authors: Wataru Sakata, Tomohide Shibata, Ribeka Tanaka, Sadao Kurohashi

    Abstract: Frequently Asked Question (FAQ) retrieval is an important task where the objective is to retrieve an appropriate Question-Answer (QA) pair from a database based on a user's query. We propose a FAQ retrieval system that considers the similarity between a user's query and a question as well as the relevance between the query and an answer. Although a common approach to FAQ retrieval is to construct…

    Submitted 23 May, 2019; v1 submitted 7 May, 2019; originally announced May 2019.

    Comments: Accepted in SIGIR 2019 (short paper), camera ready, 4 pages

  44. arXiv:1808.01216  [pdf, other]

    cs.CL

    A Multi-task Ensemble Framework for Emotion, Sentiment and Intensity Prediction

    Authors: Md Shad Akhtar, Deepanway Ghosal, Asif Ekbal, Pushpak Bhattacharyya, Sadao Kurohashi

    Abstract: In this paper, we address three problems of emotion and sentiment analysis through a multi-task ensemble framework, i.e., "emotion classification & intensity", "valence, arousal & dominance for emotion" and "valence & arousal for sentiment". The underlying problems cover two granularities (i.e. coarse-grained and fine-grained) and a diverse range of domains (i.e. tweets, Facebook posts, news headline…

    Submitted 15 October, 2018; v1 submitted 3 August, 2018; originally announced August 2018.

  45. arXiv:1806.00971  [pdf, other]

    cs.CL

    Neural Adversarial Training for Semi-supervised Japanese Predicate-argument Structure Analysis

    Authors: Shuhei Kurita, Daisuke Kawahara, Sadao Kurohashi

    Abstract: Japanese predicate-argument structure (PAS) analysis involves zero anaphora resolution, which is notoriously difficult. To improve the performance of Japanese PAS analysis, it is straightforward to increase the size of corpora annotated with PAS. However, since it is prohibitively expensive, it is promising to take advantage of a large amount of raw corpora. In this paper, we propose a novel Japan…

    Submitted 4 June, 2018; v1 submitted 4 June, 2018; originally announced June 2018.

    Comments: Accepted by ACL-2018. 9 pages, 3 figures

  46. arXiv:1805.07819  [pdf, other]

    cs.CL

    Knowledge-enriched Two-layered Attention Network for Sentiment Analysis

    Authors: Abhishek Kumar, Daisuke Kawahara, Sadao Kurohashi

    Abstract: We propose a novel two-layered attention network based on Bidirectional Long Short-Term Memory for sentiment analysis. The novel two-layered attention network takes advantage of the external knowledge bases to improve the sentiment prediction. It uses the Knowledge Graph Embedding generated using the WordNet. We build our model by combining the two-layered attention network with the supervised mod…

    Submitted 15 June, 2018; v1 submitted 20 May, 2018; originally announced May 2018.

    Comments: Accepted in NAACL 2018

  47. arXiv:1710.01025  [pdf, ps, other]

    cs.CL

    MMCR4NLP: Multilingual Multiway Corpora Repository for Natural Language Processing

    Authors: Raj Dabre, Sadao Kurohashi

    Abstract: Multilinguality is gradually becoming ubiquitous in the sense that more and more researchers have successfully shown that using additional languages helps improve the results in many Natural Language Processing tasks. Multilingual Multiway Corpora (MMC) contain the same sentence in multiple languages. Such corpora have been primarily used for Multi-Source and Pivot Language Machine Translation but…

    Submitted 14 February, 2019; v1 submitted 3 October, 2017; originally announced October 2017.

    Comments: V2: Fixed broken urls V1: 4 pages, Language Resources Paper, Submitted to LREC 2018, parallel corpora, multilingual multiway corpora, machine translation, resource

  48. arXiv:1702.06135  [pdf, other]

    cs.CL

    Enabling Multi-Source Neural Machine Translation By Concatenating Source Sentences In Multiple Languages

    Authors: Raj Dabre, Fabien Cromieres, Sadao Kurohashi

    Abstract: In this paper, we explore a simple solution to "Multi-Source Neural Machine Translation" (MSNMT) which only relies on preprocessing an N-way multilingual corpus without modifying the Neural Machine Translation (NMT) architecture or training procedure. We simply concatenate the source sentences to form a single long multi-source input sentence while keeping the target side sentence as it is and trai…

    Submitted 3 March, 2019; v1 submitted 20 February, 2017; originally announced February 2017.

    Comments: Official version of manuscript which was accepted in MT Summit 2017

  49. arXiv:1701.03214  [pdf, ps, other]

    cs.CL

    An Empirical Comparison of Simple Domain Adaptation Methods for Neural Machine Translation

    Authors: Chenhui Chu, Raj Dabre, Sadao Kurohashi

    Abstract: In this paper, we propose a novel domain adaptation method named "mixed fine tuning" for neural machine translation (NMT). We combine two existing approaches namely fine tuning and multi domain NMT. We first train an NMT model on an out-of-domain parallel corpus, and then fine tune it on a parallel corpus which is a mix of the in-domain and out-of-domain corpora. All corpora are augmented with art…

    Submitted 3 February, 2017; v1 submitted 11 January, 2017; originally announced January 2017.

    Comments: 6 pages

  50. Reading Comprehension using Entity-based Memory Network

    Authors: Xun Wang, Katsuhito Sudoh, Masaaki Nagata, Tomohide Shibata, Daisuke Kawahara, Sadao Kurohashi

    Abstract: This paper introduces a novel neural network model for question answering, the entity-based memory network. It enhances neural networks' ability to represent and calculate information over a long period by keeping records of entities contained in text. The core component is a memory pool which comprises entities' states. These entities' states are continuously updated according to the…

    Submitted 1 February, 2017; v1 submitted 12 December, 2016; originally announced December 2016.