
Showing 1–20 of 20 results for author: Ma, M D

  1. arXiv:2407.05250  [pdf, other]

    cs.CL

    CLIMB: A Benchmark of Clinical Bias in Large Language Models

    Authors: Yubo Zhang, Shudi Hou, Mingyu Derek Ma, Wei Wang, Muhao Chen, Jieyu Zhao

    Abstract: Large language models (LLMs) are increasingly applied to clinical decision-making. However, their potential to exhibit bias poses significant risks to clinical equity. Currently, there is a lack of benchmarks that systematically evaluate such clinical bias in LLMs. While in downstream tasks, some biases of LLMs can be avoided such as by instructing the model to answer "I'm not sure...", the intern…

    Submitted 6 July, 2024; originally announced July 2024.

  2. arXiv:2407.01231  [pdf, other]

    cs.CL cs.AI

    MIRAI: Evaluating LLM Agents for Event Forecasting

    Authors: Chenchen Ye, Ziniu Hu, Yihe Deng, Zijie Huang, Mingyu Derek Ma, Yanqiao Zhu, Wei Wang

    Abstract: Recent advancements in Large Language Models (LLMs) have empowered LLM agents to autonomously collect world information, over which to conduct reasoning to solve complex problems. Given this capability, increasing interests have been put into employing LLM agents for predicting international events, which can influence decision-making and shape policy development on an international scale. Despite…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: 66 pages, 8 figures, 6 tables; Website: https://mirai-llm.github.io/

  3. arXiv:2406.09923  [pdf, other]

    cs.CL cs.AI cs.LG

    CliBench: Multifaceted Evaluation of Large Language Models in Clinical Decisions on Diagnoses, Procedures, Lab Tests Orders and Prescriptions

    Authors: Mingyu Derek Ma, Chenchen Ye, Yu Yan, Xiaoxuan Wang, Peipei Ping, Timothy S Chang, Wei Wang

    Abstract: The integration of Artificial Intelligence (AI), especially Large Language Models (LLMs), into the clinical diagnosis process offers significant potential to improve the efficiency and accessibility of medical care. While LLMs have shown some promise in the medical domain, their application in clinical diagnosis remains underexplored, especially in real-world clinical practice, where highly sophis…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: Project page: https://clibench.github.io

  4. arXiv:2406.09411  [pdf, other]

    cs.CV cs.AI cs.CL

    MuirBench: A Comprehensive Benchmark for Robust Multi-image Understanding

    Authors: Fei Wang, Xingyu Fu, James Y. Huang, Zekun Li, Qin Liu, Xiaogeng Liu, Mingyu Derek Ma, Nan Xu, Wenxuan Zhou, Kai Zhang, Tianyi Lorena Yan, Wenjie Jacky Mo, Hsiang-Hui Liu, Pan Lu, Chunyuan Li, Chaowei Xiao, Kai-Wei Chang, Dan Roth, Sheng Zhang, Hoifung Poon, Muhao Chen

    Abstract: We introduce MuirBench, a comprehensive benchmark that focuses on robust multi-image understanding capabilities of multimodal LLMs. MuirBench consists of 12 diverse multi-image tasks (e.g., scene understanding, ordering) that involve 10 categories of multi-image relations (e.g., multiview, temporal relations). Comprising 11,264 images and 2,600 multiple-choice questions, MuirBench is created in a…

    Submitted 1 July, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

    Comments: typos corrected, references added, Project Page: https://muirbench.github.io/

  5. arXiv:2403.02586  [pdf, other]

    cs.CL

    Improving Event Definition Following For Zero-Shot Event Detection

    Authors: Zefan Cai, Po-Nien Kung, Ashima Suvarna, Mingyu Derek Ma, Hritik Bansal, Baobao Chang, P. Jeffrey Brantingham, Wei Wang, Nanyun Peng

    Abstract: Existing approaches on zero-shot event detection usually train models on datasets annotated with known event types, and prompt them with unseen event definitions. These approaches yield sporadic successes, yet generally fall short of expectations. In this work, we aim to improve zero-shot event detection by training models to better follow event definitions. We hypothesize that a diverse set of ev…

    Submitted 4 March, 2024; originally announced March 2024.

  6. arXiv:2401.12255  [pdf, other]

    cs.CR cs.AI cs.CL cs.LG

    Instructional Fingerprinting of Large Language Models

    Authors: Jiashu Xu, Fei Wang, Mingyu Derek Ma, Pang Wei Koh, Chaowei Xiao, Muhao Chen

    Abstract: The exorbitant cost of training Large language models (LLMs) from scratch makes it essential to fingerprint the models to protect intellectual property via ownership authentication and to ensure downstream users and developers comply with their license terms (e.g. restricting commercial use). In this study, we present a pilot study on LLM fingerprinting as a form of very lightweight instruction tu…

    Submitted 3 April, 2024; v1 submitted 21 January, 2024; originally announced January 2024.

    Comments: Accepted at NAACL 2024; 30 pages

  7. arXiv:2311.09630  [pdf, other]

    cs.CL cs.CY cs.SI

    Decoding Susceptibility: Modeling Misbelief to Misinformation Through a Computational Approach

    Authors: Yanchen Liu, Mingyu Derek Ma, Wenna Qin, Azure Zhou, Jiaao Chen, Weiyan Shi, Wei Wang, Diyi Yang

    Abstract: Susceptibility to misinformation describes the degree of belief in unverifiable claims, a latent aspect of individuals' mental processes that is not observable. Existing susceptibility studies heavily rely on self-reported beliefs, which can be subject to bias, expensive to collect, and challenging to scale for downstream applications. To address these limitations, in this work, we propose a compu…

    Submitted 16 February, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

  8. arXiv:2310.08795  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG

    Mitigating Bias for Question Answering Models by Tracking Bias Influence

    Authors: Mingyu Derek Ma, Jiun-Yu Kao, Arpit Gupta, Yu-Hsiang Lin, Wenbo Zhao, Tagyoung Chung, Wei Wang, Kai-Wei Chang, Nanyun Peng

    Abstract: Models of various NLP tasks have been shown to exhibit stereotypes, and the bias in the question answering (QA) models is especially harmful as the output answers might be directly consumed by the end users. There have been datasets to evaluate bias in QA models, while bias mitigation technique for the QA models is still under-explored. In this work, we propose BMBI, an approach to mitigate the bi…

    Submitted 17 June, 2024; v1 submitted 12 October, 2023; originally announced October 2023.

    Comments: To appear at NAACL 2024 main conference

  9. arXiv:2310.02529  [pdf, other]

    cs.SI cs.AI cs.HC

    MIDDAG: Where Does Our News Go? Investigating Information Diffusion via Community-Level Information Pathways

    Authors: Mingyu Derek Ma, Alexander K. Taylor, Nuan Wen, Yanchen Liu, Po-Nien Kung, Wenna Qin, Shicheng Wen, Azure Zhou, Diyi Yang, Xuezhe Ma, Nanyun Peng, Wei Wang

    Abstract: We present MIDDAG, an intuitive, interactive system that visualizes the information propagation paths on social media triggered by COVID-19-related news articles accompanied by comprehensive insights, including user/community susceptibility level, as well as events and popular opinions raised by the crowd while propagating the information. Besides discovering information flow patterns among users,…

    Submitted 20 February, 2024; v1 submitted 3 October, 2023; originally announced October 2023.

    Comments: To appear at AAAI'24. System demo video and more info: info-pathways.github.io

  10. arXiv:2305.15090  [pdf, other]

    cs.CL cs.AI

    STAR: Boosting Low-Resource Information Extraction by Structure-to-Text Data Generation with Large Language Models

    Authors: Mingyu Derek Ma, Xiaoxuan Wang, Po-Nien Kung, P. Jeffrey Brantingham, Nanyun Peng, Wei Wang

    Abstract: Information extraction tasks such as event extraction require an in-depth understanding of the output structure and sub-task dependencies. They heavily rely on task-specific training data in the form of (passage, target structure) pairs to obtain reasonable performance. However, obtaining such data through human annotation is costly, leading to a pressing need for low-resource information extracti…

    Submitted 20 February, 2024; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: To appear at AAAI'24. More info is at https://derek.ma/STAR

  11. arXiv:2305.14710  [pdf, other]

    cs.CL cs.AI cs.CR cs.LG

    Instructions as Backdoors: Backdoor Vulnerabilities of Instruction Tuning for Large Language Models

    Authors: Jiashu Xu, Mingyu Derek Ma, Fei Wang, Chaowei Xiao, Muhao Chen

    Abstract: We investigate security concerns of the emergent instruction tuning paradigm, that models are trained on crowdsourced datasets with task instructions to achieve superior performance. Our studies demonstrate that an attacker can inject backdoors by issuing very few malicious instructions (~1000 tokens) and control model behavior through data poisoning, without even the need to modify data instances…

    Submitted 3 April, 2024; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: NAACL 2024

  12. arXiv:2301.10915  [pdf, other]

    cs.CL cs.AI

    Parameter-Efficient Low-Resource Dialogue State Tracking by Prompt Tuning

    Authors: Mingyu Derek Ma, Jiun-Yu Kao, Shuyang Gao, Arpit Gupta, Di Jin, Tagyoung Chung, Nanyun Peng

    Abstract: Dialogue state tracking (DST) is an important step in dialogue management to keep track of users' beliefs. Existing works fine-tune all language model (LM) parameters to tackle the DST task, which requires significant data and computing resources for training and hosting. The cost grows exponentially in the real-world deployment where dozens of fine-tuned LM are used for different domains and task…

    Submitted 29 May, 2023; v1 submitted 25 January, 2023; originally announced January 2023.

    Comments: In INTERSPEECH 2023 and the Second Workshop on Efficient Natural Language and Speech Processing (ENLSP) at NeurIPS 2022

  13. arXiv:2212.10786  [pdf, other]

    cs.CL cs.IR cs.LG

    Multi-hop Evidence Retrieval for Cross-document Relation Extraction

    Authors: Keming Lu, I-Hung Hsu, Wenxuan Zhou, Mingyu Derek Ma, Muhao Chen

    Abstract: Relation Extraction (RE) has been extended to cross-document scenarios because many relations are not simply described in a single document. This inevitably brings the challenge of efficient open-space evidence retrieval to support the inference of cross-document relations, along with the challenge of multi-hop reasoning on top of entities and evidence scattered in an open set of documents. To com…

    Submitted 4 June, 2023; v1 submitted 21 December, 2022; originally announced December 2022.

    Comments: ACL 2023 (Findings)

  14. arXiv:2212.10784  [pdf, other]

    cs.CL cs.AI q-bio.QM

    Can NLI Provide Proper Indirect Supervision for Low-resource Biomedical Relation Extraction?

    Authors: Jiashu Xu, Mingyu Derek Ma, Muhao Chen

    Abstract: Two key obstacles in biomedical relation extraction (RE) are the scarcity of annotations and the prevalence of instances without explicitly pre-defined labels due to low annotation coverage. Existing approaches, which treat biomedical RE as a multi-class classification task, often result in poor generalization in low-resource settings and do not have the ability to make selective prediction on unk…

    Submitted 19 October, 2023; v1 submitted 21 December, 2022; originally announced December 2022.

    Comments: 16 pages; ACL 2023; code in https://github.com/luka-group/NLI_as_Indirect_Supervision

  15. arXiv:2209.05635  [pdf, other]

    cs.LG cs.AI

    Bending the Future: Autoregressive Modeling of Temporal Knowledge Graphs in Curvature-Variable Hyperbolic Spaces

    Authors: Jihoon Sohn, Mingyu Derek Ma, Muhao Chen

    Abstract: Recently there is an increasing scholarly interest in time-varying knowledge graphs, or temporal knowledge graphs (TKG). Previous research suggests diverse approaches to TKG reasoning that uses historical information. However, less attention has been given to the hierarchies within such information at different timestamps. Given that TKG is a sequence of knowledge graphs based on time, the chronol…

    Submitted 12 September, 2022; originally announced September 2022.

    Comments: 11 pages, 2 figures, In the 4th Conference on Automated Knowledge Base Construction (AKBC) 2022

  16. arXiv:2208.07989  [pdf, other]

    cs.CL cs.AI cs.LG

    DICE: Data-Efficient Clinical Event Extraction with Generative Models

    Authors: Mingyu Derek Ma, Alexander K. Taylor, Wei Wang, Nanyun Peng

    Abstract: Event extraction for the clinical domain is an under-explored research area. The lack of training data along with the high volume of domain-specific terminologies with vague entity boundaries makes the task especially challenging. In this paper, we introduce DICE, a robust and data-efficient generative model for clinical event extraction. DICE frames event extraction as a conditional generation pr…

    Submitted 25 May, 2023; v1 submitted 16 August, 2022; originally announced August 2022.

    Comments: In ACL 2023; 18 pages, 4 figures, 12 tables

  17. arXiv:2205.09837  [pdf, other]

    cs.CL cs.LG

    Summarization as Indirect Supervision for Relation Extraction

    Authors: Keming Lu, I-Hung Hsu, Wenxuan Zhou, Mingyu Derek Ma, Muhao Chen

    Abstract: Relation extraction (RE) models have been challenged by their reliance on training data with expensive annotations. Considering that summarization tasks aim at acquiring concise expressions of synoptical information from the longer context, these tasks naturally align with the objective of RE, i.e., extracting a kind of synoptical information that describes the relation of entity mentions. We pres…

    Submitted 21 October, 2022; v1 submitted 19 May, 2022; originally announced May 2022.

    Comments: Accepted by EMNLP 2022

  18. arXiv:2109.10500  [pdf, other]

    cs.CL cs.AI

    HyperExpan: Taxonomy Expansion with Hyperbolic Representation Learning

    Authors: Mingyu Derek Ma, Muhao Chen, Te-Lin Wu, Nanyun Peng

    Abstract: Taxonomies are valuable resources for many applications, but the limited coverage due to the expensive manual curation process hinders their general applicability. Prior works attempt to automatically expand existing taxonomies to improve their coverage by learning concept embeddings in Euclidean space, while taxonomies, inherently hierarchical, more naturally align with the geometric properties o…

    Submitted 21 September, 2021; originally announced September 2021.

    Comments: To appear in Findings of ACL: EMNLP 2021

  19. arXiv:2101.04922  [pdf, other]

    cs.CL cs.AI cs.HC

    EventPlus: A Temporal Event Understanding Pipeline

    Authors: Mingyu Derek Ma, Jiao Sun, Mu Yang, Kung-Hsiang Huang, Nuan Wen, Shikhar Singh, Rujun Han, Nanyun Peng

    Abstract: We present EventPlus, a temporal event understanding pipeline that integrates various state-of-the-art event understanding components including event trigger and type detection, event argument detection, event duration and temporal relation extraction. Event information, especially event temporal knowledge, is a type of common sense knowledge that helps people understand how stories evolve and pro…

    Submitted 25 April, 2021; v1 submitted 13 January, 2021; originally announced January 2021.

    Comments: To appear at NAACL 2021 (Demonstrations)

  20. arXiv:1907.03975  [pdf, ps, other]

    cs.CL

    Implicit Discourse Relation Identification for Open-domain Dialogues

    Authors: Mingyu Derek Ma, Kevin K. Bowden, Jiaqi Wu, Wen Cui, Marilyn Walker

    Abstract: Discourse relation identification has been an active area of research for many years, and the challenge of identifying implicit relations remains largely an unsolved task, especially in the context of an open-domain dialogue system. Previous work primarily relies on a corpora of formal text which is inherently non-dialogic, i.e., news and journals. This data however is not suitable to handle the n…

    Submitted 8 July, 2019; originally announced July 2019.

    Comments: To appear in Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (ACL2019)