- poster, December 2024
LaMuCo: Large-Scale Multilingual Conversation Speech Recognition Challenge
MMAsia '24 Workshops: Proceedings of the 6th ACM International Conference on Multimedia in Asia Workshops, Article No.: 19, Pages 1–3
https://doi.org/10.1145/3700410.3702135
Magic Data, in collaboration with M3Oriental, has jointly initiated the “Large-scale Multilingual Speech Recognition Challenge.” Centered on multilingualism, this challenge seeks to explore and develop advanced multilingual speech dialogue systems that ...
- Article, November 2024
Navigating Text-to-Image Generative Bias Across Indic Languages
Abstract: This research investigates biases in text-to-image (TTI) models for the Indic languages widely spoken across India. It evaluates and compares the generative performance and cultural relevance of leading TTI models in these languages against their ...
- research-article, September 2024
Using Explainable AI (XAI) for Identification of Subjectivity in Hate Speech Annotations for Low-Resource Languages
OASIS '24: Proceedings of the 4th International Workshop on Open Challenges in Online Social Networks, Pages 10–17
https://doi.org/10.1145/3677117.3685006
The proliferation of hate speech on digital platforms has become a significant issue, and automated content moderation systems built on machine learning are a proposed solution. However, they face challenges in multilingual and low-resource settings due ...
- Article, September 2024
CATMuS Medieval: A Multilingual Large-Scale Cross-Century Dataset in Latin Script for Handwritten Text Recognition and Beyond
- Thibault Clérice,
- Ariane Pinche,
- Malamatenia Vlachou-Efstathiou,
- Alix Chagué,
- Jean-Baptiste Camps,
- Matthias Gille Levenson,
- Olivier Brisville-Fertin,
- Federico Boschetti,
- Franz Fischer,
- Michael Gervers,
- Agnès Boutreux,
- Avery Manton,
- Simon Gabay,
- Patricia O’Connor,
- Wouter Haverals,
- Mike Kestemont,
- Caroline Vandyck,
- Benjamin Kiessling
Document Analysis and Recognition - ICDAR 2024, Pages 174–194
https://doi.org/10.1007/978-3-031-70543-4_11
Abstract: The surge in digitisation initiatives by Cultural Heritage institutions has facilitated online accessibility to numerous historical manuscripts. However, a substantial portion of these documents exists solely as images, lacking machine-readable ...
- research-article, August 2024
Enhancing E-commerce Spelling Correction with Fine-Tuned Transformer Models
KDD '24: Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Pages 4928–4938
https://doi.org/10.1145/3637528.3671625
In the realm of e-commerce, the process of search stands as the primary point of interaction for users, wielding a profound influence on the platform's revenue generation. Notably, spelling correction assumes a pivotal role in shaping the user's search ...
- short-paper, July 2024
On Backbones and Training Regimes for Dense Retrieval in African Languages
SIGIR '24: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, Pages 2564–2568
https://doi.org/10.1145/3626772.3657952
The effectiveness of dense retrieval models trained with multilingual language models as backbones has been demonstrated in multilingual and cross-lingual information retrieval contexts. The optimal choice of a backbone model for a given retrieval task ...
- Article, September 2024
A Multilingual NLP Framework for Offshore Installations
Natural Language Processing and Information Systems, Pages 367–377
https://doi.org/10.1007/978-3-031-70242-6_35
Abstract: This paper presents a novel multilingual NLP framework tailored for Brazilian offshore operations, combining data engineering, alignment techniques, and information extraction to offer safety recommendations based on a decade of Brazilian offshore ...
- research-article, May 2024
Intersectional Factors that Influence K-2 Students' Computer Science Learning
- Sharin Jacob,
- Benjamin Gillen,
- Santiago Ojeda-Ramirez,
- Clare Baek,
- Carlos Barrera,
- Diana Franklin,
- Mark Warschauer
RESPECT 2024: Proceedings of the 2024 on RESPECT Annual Conference, Pages 21–29
https://doi.org/10.1145/3653666.3656091
Understanding issues of intersectionality in education is vital for creating equitable learning environments. Intersectionality emphasizes the complexity of students' identities, including race, gender, and socioeconomic status, and how they interact to ...
- research-article, July 2024
AISG's Online Safety Prize Challenge: Detecting Harmful Social Bias in Multimodal Memes
- Ying Ying Lim,
- Ming Shan Hee,
- Xun Wei Yee,
- Weng Kuan Yau,
- Xinming Sim,
- Wesley Tay,
- Wee Siong Ng,
- See-Kiong Ng,
- Roy Ka-Wei Lee
WWW '24: Companion Proceedings of the ACM Web Conference 2024, Pages 1884–1891
https://doi.org/10.1145/3589335.3665993
Identifying internet memes that perpetuate harmful social biases is a significant challenge due to the memes' associated cultural references and multilingualism. This challenge is particularly apparent in Singapore, where multiple languages and diverse ...
- short-paper, May 2024
Programming Language Models in Multilingual Settings
ICSE-Companion '24: Proceedings of the 2024 IEEE/ACM 46th International Conference on Software Engineering: Companion Proceedings, Pages 204–206
https://doi.org/10.1145/3639478.3639787
Large language models have become increasingly utilized in programming contexts. However, due to the recent emergence of this trend, some aspects have been overlooked. We propose a research approach that investigates the inner mechanics of transformer ...
- research-article, March 2024
Improved BIO-Based Chinese Automatic Abstract-Generation Model
ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP), Volume 23, Issue 3, Article No.: 39, Pages 1–16
https://doi.org/10.1145/3643695
With its unique information-filtering function, text summarization technology has become a significant aspect of search engines and question-and-answer systems. However, existing models that include the copy mechanism often lack the ability to extract ...
- Article, October 2023
End-to-End Multilingual Text Recognition Based on Byte Modeling
Abstract: Nowadays, multilingual text recognition is more and more widely used in computer vision. However, in practical applications, the independent modeling of each language cannot make full use of the information between different languages and consumes ...
- Article, September 2023
CML-TTS: A Multilingual Dataset for Speech Synthesis in Low-Resource Languages
- Frederico S. Oliveira,
- Edresson Casanova,
- Arnaldo Candido Junior,
- Anderson S. Soares,
- Arlindo R. Galvão Filho
Abstract: In this paper, we present CML-TTS, a recursive acronym for CML-Multi-Lingual-TTS, a new Text-to-Speech (TTS) dataset developed at the Center of Excellence in Artificial Intelligence (CEIA) of the Federal University of Goias (UFG). CML-TTS is based ...
- research-article, October 2023
Towards Better Multilingual Code Search through Cross-Lingual Contrastive Learning
Internetware '23: Proceedings of the 14th Asia-Pacific Symposium on Internetware, Pages 22–32
https://doi.org/10.1145/3609437.3609439
Recent advances in deep learning have significantly improved the understanding of source code by leveraging large amounts of open-source software data. Thanks to the larger amount of data, code representation models trained with multilingual datasets ...
- short-paper, July 2023
A Transformer-Based Substitute Recommendation Model Incorporating Weakly Supervised Customer Behavior Data
SIGIR '23: Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, Pages 3325–3329
https://doi.org/10.1145/3539618.3591847
The substitute-based recommendation is widely used in E-commerce to provide better alternatives to customers. However, existing research typically uses customer behavior signals like co-view and view-but-purchase-another to capture the substitute ...
- research-article, July 2023
A Case For Supporting Translanguaging in Technology
DIS '23 Companion: Companion Publication of the 2023 ACM Designing Interactive Systems Conference, Pages 228–231
https://doi.org/10.1145/3563703.3596627
Monolingual ideology prevails in technology. It is further dominated by the English language. The problem arising from monolingual ideology in technology is pronounced for individuals with limited digital literacy and English familiarity. We present a ...
- abstract, May 2023
A Dataset of Underrepresented Languages in Eye Tracking Research
ETRA '23: Proceedings of the 2023 Symposium on Eye Tracking Research and Applications, Article No.: 31, Pages 1–2
https://doi.org/10.1145/3588015.3590128
A number of factors come together to limit the diversity of eye-tracking research, where the majority of papers are conducted with stimuli in the English language. Studying eye movement over other languages is important considering that each language ...
- short-paper, April 2023
Transfer Learning for Multilingual Abusive Meme Detection
WebSci '23: Proceedings of the 15th ACM Web Science Conference 2023, Pages 245–250
https://doi.org/10.1145/3578503.3583607
The exponential growth of social media platforms has permitted people to connect worldwide. However, it has also fueled the elevation of several harmful and abusive content on the Internet. Repeated exposure to abusive content may lead to psychological ...
- research-article, April 2023
MetaTroll: Few-shot Detection of State-Sponsored Trolls with Transformer Adapters
WWW '23: Proceedings of the ACM Web Conference 2023, Pages 1743–1753
https://doi.org/10.1145/3543507.3583417
State-sponsored trolls are the main actors of influence campaigns on social media and automatic troll detection is important to combat misinformation at scale. Existing troll detection models are developed based on training data for known campaigns (...