Showing 1–5 of 5 results for author: Weerasooriya, T C

Searching in archive cs.

  1. arXiv:2406.06559  [pdf, other]

    cs.CL cs.AI cs.LG

    Harnessing Business and Media Insights with Large Language Models

    Authors: Yujia Bao, Ankit Parag Shah, Neeru Narang, Jonathan Rivers, Rajeev Maksey, Lan Guan, Louise N. Barrere, Shelley Evenson, Rahul Basole, Connie Miao, Ankit Mehta, Fabien Boulay, Su Min Park, Natalie E. Pearson, Eldhose Joy, Tiger He, Sumiran Thakur, Koustav Ghosal, Josh On, Phoebe Morrison, Tim Major, Eva Siqi Wang, Gina Escobar, Jiaheng Wei, Tharindu Cyril Weerasooriya, et al. (8 additional authors not shown)

    Abstract: This paper introduces Fortune Analytics Language Model (FALM). FALM empowers users with direct access to comprehensive business analysis, including market trends, company performance metrics, and expert insights. Unlike generic LLMs, FALM leverages a curated knowledge base built from professional journalism, enabling it to deliver precise and in-depth answers to intricate business questions. Users…

    Submitted 2 June, 2024; originally announced June 2024.

  2. arXiv:2307.10189  [pdf, other]

    cs.IR cs.CL cs.SI

    Subjective Crowd Disagreements for Subjective Data: Uncovering Meaningful CrowdOpinion with Population-level Learning

    Authors: Tharindu Cyril Weerasooriya, Sarah Luger, Saloni Poddar, Ashiqur R. KhudaBukhsh, Christopher M. Homan

    Abstract: Human-annotated data plays a critical role in the fairness of AI systems, including those that deal with life-altering decisions or moderating human-created web/social media content. Conventionally, annotator disagreements are resolved before any learning takes place. However, researchers are increasingly identifying annotator disagreement as pervasive and meaningful. They also question the perfor…

    Submitted 7 July, 2023; originally announced July 2023.

    Comments: Accepted for Publication at ACL 2023

  3. arXiv:2301.12534  [pdf, other]

    cs.CL cs.CY cs.LG

    Vicarious Offense and Noise Audit of Offensive Speech Classifiers: Unifying Human and Machine Disagreement on What is Offensive

    Authors: Tharindu Cyril Weerasooriya, Sujan Dutta, Tharindu Ranasinghe, Marcos Zampieri, Christopher M. Homan, Ashiqur R. KhudaBukhsh

    Abstract: Offensive speech detection is a key component of content moderation. However, what is offensive can be highly subjective. This paper investigates how machine and human moderators disagree on what is offensive when it comes to real-world social web political discourse. We show that (1) there is extensive disagreement among the moderators (humans and machines); and (2) human and large-language-model…

    Submitted 9 November, 2023; v1 submitted 29 January, 2023; originally announced January 2023.

    Comments: Accepted to appear at EMNLP 2023

  4. arXiv:2106.10600  [pdf, other]

    cs.AI cs.SI

    Improving Label Quality by Jointly Modeling Items and Annotators

    Authors: Tharindu Cyril Weerasooriya, Alexander G. Ororbia, Christopher M. Homan

    Abstract: We propose a fully Bayesian framework for learning ground truth labels from noisy annotators. Our framework ensures scalability by factoring a generative, Bayesian soft clustering model over label distributions into the classic Dawid and Skene joint annotator-data model. Earlier research along these lines has neither fully incorporated label distributions nor explored clustering by annotators on…

    Submitted 19 June, 2021; originally announced June 2021.

  5. arXiv:2003.07406  [pdf, other]

    cs.LG stat.ML

    Neighborhood-based Pooling for Population-level Label Distribution Learning

    Authors: Tharindu Cyril Weerasooriya, Tong Liu, Christopher M. Homan

    Abstract: Supervised machine learning often requires human-annotated data. While annotator disagreement is typically interpreted as evidence of noise, population-level label distribution learning (PLDL) treats the collection of annotations for each data item as a sample of the opinions of a population of human annotators, among whom disagreement may be proper and expected, even with no noise present. From t…

    Submitted 29 April, 2020; v1 submitted 16 March, 2020; originally announced March 2020.

    Journal ref: Proceedings of the 24th European Conference on Artificial Intelligence 2020