DOI: 10.1145/3614419.3644028
Research Article · Open Access

Accuracy and Fairness for Web-Based Content Analysis under Temporal Shifts and Delayed Labeling

Published: 21 May 2024

Abstract

Web-based content analysis tasks, such as labeling toxicity, misinformation, or spam, often rely on machine learning models to achieve cost and scale efficiencies. As these models impact real human lives, ensuring their accuracy and fairness is critical. However, maintaining the performance of these models over time can be challenging due to temporal shifts in the application context and in the sub-populations represented. Furthermore, there is often a delay in obtaining human expert labels for the raw data, which hinders the timely adaptation and safe deployment of the models. To overcome these challenges, we propose a novel approach that, in settings where unlabeled data becomes available earlier than the corresponding labels, anticipates future data distributions in order to estimate the future distribution of labels per sub-population and adapt the model preemptively. We evaluate our approach using multiple temporally shifting datasets and consider bias based on racial, political, and demographic identities. We find that the proposed approach yields promising performance with respect to both accuracy and fairness. Our paper contributes to the web science literature by proposing a novel method for enhancing the quality and equity of web-based content analysis using machine learning. Experimental code and datasets are publicly available at https://github.com/Behavioral-Informatics-Lab/FAIRCAST.
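The anticipatory idea described in the abstract can be sketched as follows. This is a minimal illustration under simple assumptions, not the authors' FAIRCAST implementation: forecast each sub-population's label prevalence from its labeled history (here with a plain linear trend), then importance-weight the current labeled sample toward that forecast so the model trains against the anticipated, rather than the observed, label distribution per group.

```python
import numpy as np

def forecast_prevalence(history):
    """Fit a linear trend to a group's historical positive-label rates
    and extrapolate one step ahead, clipped to [0, 1]."""
    t = np.arange(len(history))
    slope, intercept = np.polyfit(t, history, 1)
    return float(np.clip(intercept + slope * len(history), 0.0, 1.0))

def reweight(labels, groups, forecasts):
    """Importance weights that shift the current labeled sample toward
    the forecast label distribution, separately per sub-population."""
    weights = np.ones(len(labels), dtype=float)
    for g, p_next in forecasts.items():
        mask = groups == g
        p_now = labels[mask].mean()
        # Positives scaled by p_next/p_now, negatives by (1-p_next)/(1-p_now),
        # so the weighted group prevalence matches the forecast.
        weights[mask & (labels == 1)] = p_next / p_now
        weights[mask & (labels == 0)] = (1 - p_next) / (1 - p_now)
    return weights

# Toy example: prevalence for group "a" is rising, group "b" is roughly flat.
hist = {"a": [0.10, 0.15, 0.20, 0.25], "b": [0.40, 0.41, 0.39, 0.40]}
forecasts = {g: forecast_prevalence(np.array(h)) for g, h in hist.items()}

labels = np.array([1, 0, 0, 0, 1, 1, 0, 1])
groups = np.array(["a", "a", "a", "a", "b", "b", "b", "b"])
w = reweight(labels, groups, forecasts)
```

The resulting weights could be passed as `sample_weight` to any standard classifier's `fit` method. Group and variable names here are hypothetical; the actual method in the paper estimates future distributions rather than assuming a linear trend.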

Supplemental Material

MP4 File - PaperSession-5_Hate_Speech_einzeln_Donnerstag_240605_AbdulazizAlmuzaini
Accuracy and Fairness for Web-Based Content Analysis under Temporal Shifts and Delayed Labeling



    Published In

    WEBSCI '24: Proceedings of the 16th ACM Web Science Conference
    May 2024
    395 pages
    This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery, New York, NY, United States

    Badges

    • Best Paper

    Author Tags

    1. algorithmic fairness
    2. continual learning
    3. distribution shifts
    4. domain adaptation
    5. temporal shifts

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Data Availability

    PaperSession-5_Hate_Speech_einzeln_Donnerstag_240605_AbdulazizAlmuzaini: Accuracy and Fairness for Web-Based Content Analysis under Temporal Shifts and Delayed Labeling https://dl.acm.org/doi/10.1145/3614419.3644028#PaperSession-5_Hate_Speech_einzeln_Donnerstag_240605_AbdulazizAlmuzaini.mp4

    Conference

Websci '24: 16th ACM Web Science Conference
May 21–24, 2024
Stuttgart, Germany

    Acceptance Rates

    Overall Acceptance Rate 245 of 933 submissions, 26%

