Search | arXiv e-print repository

Adversarial Robustness of VAEs across Intersectional Subgroups

Authors: Chethan Krishnamurthy Ramanaik, Arjun Roy, Eirini Ntoutsi

Abstract: Despite advancements in Autoencoders (AEs) for tasks like dimensionality reduction, representation learning and data generation, they remain vulnerable to adversarial attacks. Variational Autoencoders (VAEs), with their probabilistic approach to disentangling latent spaces, show stronger resistance to such perturbations compared to deterministic AEs; however, their resilience against adversarial inputs is still a concern. This study evaluates the robustness of VAEs against non-targeted adversarial attacks by optimizing minimal sample-specific perturbations to cause maximal damage across diverse demographic subgroups (combinations of age and gender). We investigate two questions: whether there are robustness disparities among subgroups, and what factors contribute to these disparities, such as data scarcity and representation entanglement. Our findings reveal that robustness disparities exist but are not always correlated with the size of the subgroup. By using downstream gender and age classifiers and examining latent embeddings, we highlight the vulnerability of subgroups like older women, who are prone to misclassification due to adversarial perturbations pushing their representations toward those of other subgroups.

Submitted 4 July, 2024; originally announced July 2024.

arXiv:2404.15385 [pdf, ps, other]

Sum of Group Error Differences: A Critical Examination of Bias Evaluation in Biometric Verification and a Dual-Metric Measure

Authors: Alaa Elobaid, Nathan Ramoly, Lara Younes, Symeon Papadopoulos, Eirini Ntoutsi, Ioannis Kompatsiaris

Abstract: Biometric Verification (BV) systems often exhibit accuracy disparities across different demographic groups, leading to biases in BV applications. Assessing and quantifying these biases is essential for ensuring the fairness of BV systems. However, existing bias evaluation metrics in BV have limitations, such as focusing exclusively on match or non-match error rates, overlooking bias on demographic groups with performance levels falling between the best and worst performance levels, and neglecting the magnitude of the bias present. This paper presents an in-depth analysis of the limitations of current bias evaluation metrics in BV and, through experimental analysis, demonstrates their contextual suitability, merits, and limitations. Additionally, it introduces a novel general-purpose bias evaluation measure for BV, the "Sum of Group Error Differences (SEDG)". Our experimental results on controlled synthetic datasets demonstrate the effectiveness of demographic bias quantification when using existing metrics and our own proposed measure. We discuss the applicability of the bias evaluation metrics in a set of simulated demographic bias scenarios and provide scenario-based metric recommendations. Our code is publicly available under https://github.com/alaaobeid/SEDG.

Submitted 23 April, 2024; originally announced April 2024.

arXiv:2404.02629 [pdf, other]

Effector: A Python package for regional explanations

Authors: Vasilis Gkolemis, Christos Diou, Eirini Ntoutsi, Theodore Dalamagas, Bernd Bischl, Julia Herbinger, Giuseppe Casalicchio

Abstract: Global feature effect methods explain a model by outputting one plot per feature. The plot shows the average effect of the feature on the output, like the effect of age on the annual income. However, average effects may be misleading when derived from local effects that are heterogeneous, i.e., they significantly deviate from the average. To decrease the heterogeneity, regional effects provide multiple plots per feature, each representing the average effect within a specific subspace. For interpretability, subspaces are defined as hyperrectangles defined by a chain of logical rules, like age's effect on annual income separately for males and females and different levels of professional experience. We introduce Effector, a Python library dedicated to regional feature effects. Effector implements well-established global effect methods, assesses the heterogeneity of each method and, based on that, provides regional effects. Effector automatically detects subspaces where regional effects have reduced heterogeneity. All global and regional effect methods share a common API, facilitating comparisons between them. Moreover, the library's interface is extensible so new methods can be easily added and benchmarked. The library has been thoroughly tested, ships with many tutorials (https://xai-effector.github.io/) and is available under an open-source license at PyPI (https://pypi.org/project/effector/) and GitHub (https://github.com/givasile/effector).

Submitted 3 April, 2024; originally announced April 2024.

Comments: 33 pages, 17 figures

arXiv:2402.10756 [pdf, other]

Towards Cohesion-Fairness Harmony: Contrastive Regularization in Individual Fair Graph Clustering

Authors: Siamak Ghodsi, Seyed Amjad Seyedi, Eirini Ntoutsi

Abstract: Conventional fair graph clustering methods face two primary challenges: i) They prioritize balanced clusters at the expense of cluster cohesion by imposing rigid constraints, ii) Existing methods of both individual and group-level fairness in graph partitioning mostly rely on eigen decompositions and thus, generally lack interpretability. To address these issues, we propose iFairNMTF, an individual Fairness Nonnegative Matrix Tri-Factorization model with contrastive fairness regularization that achieves balanced and cohesive clusters. By introducing fairness regularization, our model allows for customizable accuracy-fairness trade-offs, thereby enhancing user autonomy without compromising the interpretability provided by nonnegative matrix tri-factorization. Experimental evaluations on real and synthetic datasets demonstrate the superior flexibility of iFairNMTF in achieving fairness and clustering performance.

Submitted 16 February, 2024; originally announced February 2024.

Comments: To be published in "The 28th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2024)"

arXiv:2311.00523 [pdf, other]

Learning impartial policies for sequential counterfactual explanations using Deep Reinforcement Learning

Authors: E. Panagiotou, E. Ntoutsi

Abstract: In the field of explainable Artificial Intelligence (XAI), sequential counterfactual (SCF) examples are often used to alter the decision of a trained classifier by implementing a sequence of modifications to the input instance. Although certain test-time algorithms aim to optimize for each new instance individually, recently Reinforcement Learning (RL) methods have been proposed that seek to learn policies for discovering SCFs, thereby enhancing scalability. As is typical in RL, the formulation of the RL problem, including the specification of state space, actions, and rewards, can often be ambiguous. In this work, we identify shortcomings in existing methods that can result in policies with undesired properties, such as a bias towards specific actions. We propose to use the output probabilities of the classifier to create a more informative reward, to mitigate this effect.

Submitted 1 November, 2023; originally announced November 2023.

Comments: Accepted at the ECML PKDD 2023 Workshop: Explainable Artificial Intelligence From Static to Dynamic

arXiv:2310.13746 [pdf, other]

FairBranch: Fairness Conflict Correction on Task-group Branches for Fair Multi-Task Learning

Authors: Arjun Roy, Christos Koutlis, Symeon Papadopoulos, Eirini Ntoutsi

Abstract: The generalization capacity of Multi-Task Learning (MTL) becomes limited when unrelated tasks negatively impact each other by updating shared parameters with conflicting gradients, resulting in negative transfer and a reduction in MTL accuracy compared to single-task learning (STL). Recently, there has been an increasing focus on the fairness of MTL models, necessitating the optimization of both accuracy and fairness for individual tasks. Similarly to how negative transfer affects accuracy, task-specific fairness considerations can adversely influence the fairness of other tasks when there is a conflict of fairness loss gradients among jointly learned tasks, termed bias transfer. To address both negative and bias transfer in MTL, we introduce a novel method called FairBranch. FairBranch branches the MTL model by assessing the similarity of learned parameters, grouping related tasks to mitigate negative transfer. Additionally, it incorporates fairness loss gradient conflict correction between adjoining task-group branches to address bias transfer within these task groups. Our experiments in tabular and visual MTL problems demonstrate that FairBranch surpasses state-of-the-art MTL methods in terms of both fairness and accuracy.

Submitted 20 October, 2023; originally announced October 2023.

arXiv:2309.12215 [pdf, other]

Regionally Additive Models: Explainable-by-design models minimizing feature interactions

Authors: Vasilis Gkolemis, Anargiros Tzerefos, Theodore Dalamagas, Eirini Ntoutsi, Christos Diou

Abstract: Generalized Additive Models (GAMs) are widely used explainable-by-design models in various applications. GAMs assume that the output can be represented as a sum of univariate functions, referred to as components. However, this assumption fails in ML problems where the output depends on multiple features simultaneously. In these cases, GAMs fail to capture the interaction terms of the underlying function, leading to subpar accuracy. To (partially) address this issue, we propose Regionally Additive Models (RAMs), a novel class of explainable-by-design models. RAMs identify subregions within the feature space where interactions are minimized. Within these regions, it is more accurate to express the output as a sum of univariate functions (components). Consequently, RAMs fit one component per subregion of each feature instead of one component per feature. This approach yields a more expressive model compared to GAMs while retaining interpretability. The RAM framework consists of three steps. Firstly, we train a black-box model. Secondly, using Regional Effect Plots, we identify subregions where the black-box model exhibits near-local additivity. Lastly, we fit a GAM component for each identified subregion. We validate the effectiveness of RAMs through experiments on both synthetic and real-world datasets. The results confirm that RAMs offer improved expressiveness compared to GAMs while maintaining interpretability.

Submitted 21 September, 2023; originally announced September 2023.

Comments: Accepted at ECMLPKDD 2023 Workshop Uncertainty meets Explainability

arXiv:2309.11193 [pdf, other]

RHALE: Robust and Heterogeneity-aware Accumulated Local Effects

Authors: Vasilis Gkolemis, Theodore Dalamagas, Eirini Ntoutsi, Christos Diou

Abstract: Accumulated Local Effects (ALE) is a widely-used explainability method for isolating the average effect of a feature on the output, because it handles cases with correlated features well. However, it has two limitations. First, it does not quantify the deviation of instance-level (local) effects from the average (global) effect, known as heterogeneity. Second, for estimating the average effect, it partitions the feature domain into user-defined, fixed-sized bins, where different bin sizes may lead to inconsistent ALE estimations. To address these limitations, we propose Robust and Heterogeneity-aware ALE (RHALE). RHALE quantifies the heterogeneity by considering the standard deviation of the local effects and automatically determines an optimal variable-size bin-splitting. In this paper, we prove that to achieve an unbiased approximation of the standard deviation of local effects within each bin, bin splitting must follow a set of sufficient conditions. Based on these conditions, we propose an algorithm that automatically determines the optimal partitioning, balancing the estimation bias and variance. Through evaluations on synthetic and real datasets, we demonstrate the superiority of RHALE compared to other methods, including the advantages of automatic bin splitting, especially in cases with correlated features.

Submitted 20 September, 2023; originally announced September 2023.

Comments: Accepted at ECAI 2023 (European Conference on Artificial Intelligence)

arXiv:2306.01699 [pdf, other]

Affinity Clustering Framework for Data Debiasing Using Pairwise Distribution Discrepancy

Authors: Siamak Ghodsi, Eirini Ntoutsi

Abstract: Group imbalance, resulting from inadequate or unrepresentative data collection methods, is a primary cause of representation bias in datasets. Representation bias can exist with respect to different groups of one or more protected attributes and might lead to prejudicial and discriminatory outcomes toward certain groups of individuals in cases where a learning model is trained on such biased data. This paper presents MASC, a data augmentation approach that leverages affinity clustering to balance the representation of non-protected and protected groups of a target dataset by utilizing instances of the same protected attributes from similar datasets that are categorized in the same cluster as the target dataset by sharing instances of the protected attribute. The proposed method involves constructing an affinity matrix by quantifying distribution discrepancies between dataset pairs and transforming them into a symmetric pairwise similarity matrix. A non-parametric spectral clustering is then applied to this affinity matrix, automatically categorizing the datasets into an optimal number of clusters. We perform a step-by-step experiment as a demo of our method to show the procedure of the proposed data augmentation method and evaluate and discuss its performance. A comparison with other data augmentation methods, both pre- and post-augmentation, is conducted, along with a model evaluation analysis of each method. Our method can handle non-binary protected attributes so, in our experiments, bias is measured in a non-binary protected attribute setup w.r.t. racial groups distribution for two separate minority groups in comparison with the majority group before and after debiasing. Empirical results imply that our method of augmenting dataset biases using real (genuine) data from similar contexts can effectively debias the target datasets comparably to existing data augmentation strategies.

Submitted 2 June, 2023; originally announced June 2023.

Comments: 15 pages plus 2 pages of references, 3 figures, 2 tables, and 1 algorithm

arXiv:2302.07733 [pdf, other]

Explaining text classifiers through progressive neighborhood approximation with realistic samples

Authors: Yi Cai, Arthur Zimek, Eirini Ntoutsi, Gerhard Wunder

Abstract: The importance of neighborhood construction in local explanation methods has already been highlighted in the literature, and several attempts have been made to improve neighborhood quality for high-dimensional data, for example, texts, by adopting generative models. Although the generators produce more realistic samples, the intuitive sampling approaches in the existing solutions leave the latent space underexplored. To overcome this problem, our work, focusing on local model-agnostic explanations for text classifiers, proposes a progressive approximation approach that refines the neighborhood of a to-be-explained decision with a careful two-stage interpolation using counterfactuals as landmarks. We explicitly specify the two properties that should be satisfied by generative models, the reconstruction ability and the locality-preserving property, to guide the selection of generators for local explanation methods. Moreover, noticing the opacity of generative models during the study, we propose another method that implements progressive neighborhood approximation with probability-based editions as an alternative to the generator-based solution. The explanation results from both methods consist of word-level and instance-level explanations benefiting from the realistic neighborhood. Through exhaustive experiments, we qualitatively and quantitatively demonstrate the effectiveness of the two proposed methods.

Submitted 11 February, 2023; originally announced February 2023.

arXiv:2302.05995 [pdf, other]

Multi-dimensional discrimination in Law and Machine Learning -- A comparative overview

Authors: Arjun Roy, Jan Horstmann, Eirini Ntoutsi

Abstract: AI-driven decision-making can lead to discrimination against certain individuals or social groups based on protected characteristics/attributes such as race, gender, or age. The domain of fairness-aware machine learning focuses on methods and algorithms for understanding, mitigating, and accounting for bias in AI/ML models. Still, thus far, the vast majority of the proposed methods assess fairness based on a single protected attribute, e.g. only gender or race. In reality, though, human identities are multi-dimensional, and discrimination can occur based on more than one protected characteristic, leading to the so-called "multi-dimensional discrimination" or "multi-dimensional fairness" problem. While well-elaborated in legal literature, the multi-dimensionality of discrimination is less explored in the machine learning community. Recent approaches in this direction mainly follow the so-called intersectional fairness definition from the legal domain, whereas other notions like additive and sequential discrimination are less studied or not considered thus far. In this work, we overview the different definitions of multi-dimensional discrimination/fairness in the legal domain as well as how they have been transferred/operationalized (if) in the fairness-aware machine learning domain. By juxtaposing these two domains, we draw the connections, identify the limitations, and point out open research directions.

Submitted 12 February, 2023; originally announced February 2023.

arXiv:2301.03421 [pdf]

doi 10.1007/978-981-99-0026-8_2

A review of clustering models in educational data science towards fairness-aware learning

Authors: Tai Le Quy, Gunnar Friege, Eirini Ntoutsi

Abstract: Ensuring fairness is essential for every education system. Machine learning is increasingly supporting the education system and educational data science (EDS) domain, from decision support to educational activities and learning analytics. However, the machine learning-based decisions can be biased because the algorithms may generate the results based on students' protected attributes such as race or gender. Clustering is an important machine learning technique to explore student data in order to support the decision-maker, as well as support educational activities, such as group assignments. Therefore, ensuring high-quality clustering models along with satisfying fairness constraints are important requirements. This chapter comprehensively surveys clustering models and their fairness in EDS. We especially focus on investigating the fair clustering models applied in educational activities. These models are believed to be practical tools for analyzing students' data and ensuring fairness in EDS.

Submitted 9 January, 2023; originally announced January 2023.

Comments: This is a preprint of the following chapter: Tai Le Quy, Gunnar Friege, Eirini Ntoutsi, A review of clustering models in educational data science towards fairness-aware learning, published in Educational Data Science: Essentials, Approaches, and Tendencies, edited by Alejandro Peña-Ayala, 2023, Springer. https://link.springer.com/book/9789819900251

Journal ref: Educational Data Science: Essentials, Approaches, and Tendencies, 2023

arXiv:2209.09975 [pdf, other]

Power of Explanations: Towards automatic debiasing in hate speech detection

Authors: Yi Cai, Arthur Zimek, Gerhard Wunder, Eirini Ntoutsi

Abstract: Hate speech detection is a common downstream application of natural language processing (NLP) in the real world. In spite of the increasing accuracy, current data-driven approaches could easily learn biases from the imbalanced data distributions originating from humans. The deployment of biased models could further enhance the existing social biases. But unlike handling tabular data, defining and mitigating biases in text classifiers, which deal with unstructured data, are more challenging. A popular solution for improving machine learning fairness in NLP is to conduct the debiasing process with a list of potentially discriminated words given by human annotators. In addition to suffering from the risks of overlooking the biased terms, exhaustively identifying bias with human annotators is unsustainable since discrimination is variable among different datasets and may evolve over time. To this end, we propose an automatic misuse detector (MiD) relying on an explanation method for detecting potential bias. And built upon that, an end-to-end debiasing framework with the proposed staged correction is designed for text classifiers without any external resources required.

Submitted 7 September, 2022; originally announced September 2022.

Comments: IEEE DSAA'22

arXiv:2209.08309 [pdf, other]

AdaCC: Cumulative Cost-Sensitive Boosting for Imbalanced Classification

Authors: Vasileios Iosifidis, Symeon Papadopoulos, Bodo Rosenhahn, Eirini Ntoutsi

Abstract: Class imbalance poses a major challenge for machine learning as most supervised learning models might exhibit bias towards the majority class and under-perform in the minority class. Cost-sensitive learning tackles this problem by treating the classes differently, formulated typically via a user-defined fixed misclassification cost matrix provided as input to the learner. Such parameter tuning is a challenging task that requires domain knowledge and moreover, wrong adjustments might lead to overall predictive performance deterioration. In this work, we propose a novel cost-sensitive boosting approach for imbalanced data that dynamically adjusts the misclassification costs over the boosting rounds in response to the model's performance instead of using a fixed misclassification cost matrix. Our method, called AdaCC, is parameter-free as it relies on the cumulative behavior of the boosting model in order to adjust the misclassification costs for the next boosting round and comes with theoretical guarantees regarding the training error. Experiments on 27 real-world datasets from different domains with high class imbalance demonstrate the superiority of our method over 12 state-of-the-art cost-sensitive boosting approaches exhibiting consistent improvements in different measures, for instance, in the range of [0.3%-28.56%] for AUC, [3.4%-21.4%] for balanced accuracy, [4.8%-45%] for gmean and [7.4%-85.5%] for recall.

Submitted 17 September, 2022; originally announced September 2022.

Comments: 30 pages

arXiv:2208.10625 [pdf, other]

doi 10.1007/978-3-031-23618-1_8

Evaluation of group fairness measures in student performance prediction problems

Authors: Tai Le Quy, Thi Huyen Nguyen, Gunnar Friege, Eirini Ntoutsi

Abstract: Predicting students' academic performance is one of the key tasks of educational data mining (EDM). Traditionally, the high forecasting quality of such models was deemed critical. More recently, the issues of fairness and discrimination w.r.t. protected attributes, such as gender or race, have gained attention. Although there are several fairness-aware learning approaches in EDM, a comparative evaluation of these measures is still missing. In this paper, we evaluate different group fairness measures for student performance prediction problems on various educational datasets and fairness-aware learning models. Our study shows that the choice of the fairness measure is important, as is the choice of the grade threshold.

Submitted 22 August, 2022; originally announced August 2022.

Comments: SoGood2022 - The 7th Workshop on Data Science for Social Good - ECML PKDD 2022

Journal ref: Machine Learning and Principles and Practice of Knowledge Discovery in Databases. ECML PKDD 2022

arXiv:2206.11436 [pdf, other]

Context matters for fairness -- a case study on the effect of spatial distribution shifts

Authors: Siamak Ghodsi, Harith Alani, Eirini Ntoutsi

Abstract: With the ever-growing involvement of data-driven AI-based decision-making technologies in our daily social lives, the fairness of these systems is becoming a crucial phenomenon. However, an important and often challenging aspect in utilizing such systems is to distinguish validity for the range of their application especially under distribution shifts, i.e., when a model is deployed on data with a different distribution than the training set. In this paper, we present a case study on the newly released American Census datasets, a reconstruction of the popular Adult dataset, to illustrate the importance of context for fairness and show how remarkably spatial distribution shifts can affect predictive- and fairness-related performance of a model. The problem persists for fairness-aware learning models with the effects of context-specific fairness interventions differing across the states and different population groups. Our study suggests that robustness to distribution shifts is necessary before deploying a model to another context.

Submitted 24 June, 2022; v1 submitted 22 June, 2022; originally announced June 2022.

arXiv:2206.09895 [pdf, other]

Multiple Fairness and Cardinality constraints for Students-Topics Grouping Problem

Authors: Tai Le Quy, Gunnar Friege, Eirini Ntoutsi

Abstract: Group work is a prevalent activity in educational settings, where students are often divided into topic-specific groups based on their preferences. The grouping should reflect the students' aspirations as much as possible. Usually, the resulting groups should also be balanced in terms of protected attributes like gender or race since studies indicate that students might learn better in a diverse group. Moreover, balancing the group cardinalities is also an essential requirement for fair workload distribution across the groups. In this paper, we introduce the multi-fair capacitated (MFC) grouping problem that fairly partitions students into non-overlapping groups while ensuring balanced group cardinalities (with a lower bound and an upper bound), and maximizing the diversity of members in terms of protected attributes. We propose two approaches: a heuristic method and a knapsack-based method to obtain the MFC grouping. The experiments on a real dataset and a semi-synthetic dataset show that our proposed methods can satisfy students' preferences well and deliver balanced and diverse groups regarding cardinality and the protected attribute, respectively.

Submitted 20 June, 2022; originally announced June 2022.

Comments: 15 pages, 4 figures, 1 table

Journal ref: The 27th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2023)

arXiv:2206.08403 [pdf, other]

Learning to Teach Fairness-aware Deep Multi-task Learning

Authors: Arjun Roy, Eirini Ntoutsi

Abstract: Fairness-aware learning mainly focuses on single task learning (STL). The fairness implications of multi-task learning (MTL) have only recently been considered and a seminal approach has been proposed that considers the fairness-accuracy trade-off for each task and the performance trade-off among different tasks. Instead of a rigid fairness-accuracy trade-off formulation, we propose a flexible approach that learns how to be fair in an MTL setting by selecting which objective (accuracy or fairness) to optimize at each step. We introduce the L2T-FMT algorithm, a teacher-student network trained collaboratively; the student learns to solve the fair MTL problem while the teacher instructs the student to learn from either accuracy or fairness, depending on what is harder to learn for each task. Moreover, this dynamic selection of which objective to use at each step for each task reduces the number of trade-off weights from 2T to T, where T is the number of tasks. Our experiments on three real datasets show that L2T-FMT improves on both fairness (12-19%) and accuracy (up to 2%) over state-of-the-art approaches.

Submitted 16 June, 2022; originally announced June 2022.

Comments: Accepted to be published in the Proceedings of the "European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECMLPKDD)", Sept. 19th to 23rd 2022

arXiv:2204.08027 [pdf, other]

Attention Mechanism based Cognition-level Scene Understanding

Authors: Xuejiao Tang, Tai Le Quy, Eirini Ntoutsi, Kea Turner, Vasile Palade, Israat Haque, Peng Xu, Chris Brown, Wenbin Zhang

Abstract: Given a question-image input, the Visual Commonsense Reasoning (VCR) model can predict an answer with the corresponding rationale, which requires inference ability from the real world. The VCR task, which calls for exploiting the multi-source information as well as learning different levels of understanding and extensive commonsense knowledge, is a cognition-level scene understanding task. The VCR task has aroused researchers' interest due to its wide range of applications, including visual question answering, automated vehicle systems, and clinical decision support. Previous approaches to solving the VCR task generally rely on pre-training or exploiting memory with long dependency relationship encoded models. However, these approaches suffer from a lack of generalizability and losing information in long sequences. In this paper, we propose a parallel attention-based cognitive VCR network PAVCR, which fuses visual-textual information efficiently and encodes semantic information in parallel to enable the model to capture rich information for cognition-level inference. Extensive experiments show that the proposed model yields significant improvements over existing methods on the benchmark VCR dataset. Moreover, the proposed model provides intuitive interpretation into visual commonsense reasoning.

Submitted 18 April, 2022; v1 submitted 17 April, 2022; originally announced April 2022.

Comments: arXiv admin note: text overlap with arXiv:2108.02924, arXiv:2107.01671

arXiv:2201.01148 [pdf, other]

Parity-based Cumulative Fairness-aware Boosting

Authors: Vasileios Iosifidis, Arjun Roy, Eirini Ntoutsi

Abstract: Data-driven AI systems can lead to discrimination on the basis of protected attributes like gender or race. One reason for this behavior is the encoded societal biases in the training data (e.g., females are underrepresented), which is aggravated in the presence of unbalanced class distributions (e.g., "granted" is the minority class). State-of-the-art fairness-aware machine learning approaches focus on preserving the overall classification accuracy while improving fairness. In the presence of class-imbalance, such methods may further aggravate the problem of discrimination by denying an already underrepresented group (e.g., females) the fundamental rights of equal social privileges (e.g., equal credit opportunity). To this end, we propose AdaFair, a fairness-aware boosting ensemble that changes the data distribution at each round, taking into account not only the class errors but also the fairness-related performance of the model defined cumulatively based on the partial ensemble. In addition to the in-training boosting of the group discriminated over each round, AdaFair directly tackles imbalance during the post-training phase by optimizing the number of ensemble learners for balanced error performance (BER). AdaFair can facilitate different parity-based fairness notions and effectively mitigate discriminatory outcomes. Our experiments show that our approach can achieve parity in terms of statistical parity, equal opportunity, and disparate mistreatment while maintaining good predictive performance for all classes.

Submitted 4 January, 2022; originally announced January 2022.

Comments: arXiv admin note: text overlap with arXiv:1909.08982
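
The abstract above mentions choosing the number of ensemble learners for balanced error performance (BER). As a hedged illustration, the sketch below uses the common definition of balanced error rate, the mean of the false negative rate and the false positive rate, and selects the ensemble prefix with the lowest value. The function names and the representation of partial-ensemble predictions as precomputed lists are assumptions for this example. It is not AdaFair's actual selection procedure, which, according to the abstract, also accounts for fairness-related performance.

```python
from typing import List, Sequence


def balanced_error_rate(labels: Sequence[int], preds: Sequence[int]) -> float:
    """Mean of the false negative rate and the false positive rate."""
    pos = [p for y, p in zip(labels, preds) if y == 1]
    neg = [p for y, p in zip(labels, preds) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present in labels")
    fnr = sum(1 for p in pos if p == 0) / len(pos)
    fpr = sum(1 for p in neg if p == 1) / len(neg)
    return 0.5 * (fnr + fpr)


def best_prefix_size(labels: Sequence[int],
                     prefix_preds: List[Sequence[int]]) -> int:
    """Return the 1-based ensemble size with the lowest balanced error rate.

    prefix_preds[k] holds the predictions of the partial ensemble made of the
    first k + 1 learners (an assumed input format). Ties go to the smaller size.
    """
    scores = [balanced_error_rate(labels, preds) for preds in prefix_preds]
    return scores.index(min(scores)) + 1
```

Unlike plain accuracy, this score does not reward a classifier for predicting only the majority class: such a classifier has a balanced error rate of 0.5 regardless of how skewed the class distribution is.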

arXiv:2110.00530 [pdf, other]

doi 10.1002/widm.1452

A survey on datasets for fairness-aware machine learning

Authors: Tai Le Quy, Arjun Roy, Vasileios Iosifidis, Wenbin Zhang, Eirini Ntoutsi

Abstract: As decision-making increasingly relies on Machine Learning (ML) and (big) data, the issue of fairness in data-driven Artificial Intelligence (AI) systems is receiving increasing attention from both research and industry. A large variety of fairness-aware machine learning solutions have been proposed which involve fairness-related interventions in the data, learning algorithms and/or model outputs. However, a vital part of proposing new approaches is evaluating them empirically on benchmark datasets that represent realistic and diverse settings. Therefore, in this paper, we overview real-world datasets used for fairness-aware machine learning. We focus on tabular data as the most common data representation for fairness-aware machine learning. We start our analysis by identifying relationships between the different attributes, particularly w.r.t. protected attributes and class attribute, using a Bayesian network. For a deeper understanding of bias in the datasets, we investigate the interesting relationships using exploratory analysis.

Submitted 21 January, 2022; v1 submitted 1 October, 2021; originally announced October 2021.

Comments: 56 pages, 36 figures, 20 tables

arXiv:2109.15004 [pdf, other]

doi 10.1109/DSAA53316.2021.9564153

XPROAX-Local explanations for text classification with progressive neighborhood approximation

Authors: Yi Cai, Arthur Zimek, Eirini Ntoutsi

Abstract: The importance of the neighborhood for training a local surrogate model to approximate the local decision boundary of a black box classifier has been already highlighted in the literature. Several attempts have been made to construct a better neighborhood for high dimensional data, like texts, by using generative autoencoders. However, existing approaches mainly generate neighbors by selecting purely at random from the latent space and struggle under the curse of dimensionality to learn a good local decision boundary. To overcome this problem, we propose a progressive approximation of the neighborhood using counterfactual instances as initial landmarks and a careful 2-stage sampling approach to refine counterfactuals and generate factuals in the neighborhood of the input instance to be explained. Our work focuses on textual data and our explanations consist of both word-level explanations from the original instance (intrinsic) and the neighborhood (extrinsic) and factual- and counterfactual-instances discovered during the neighborhood generation process that further reveal the effect of altering certain parts in the input text. Our experiments on real-world datasets demonstrate that our method outperforms the competitors in terms of usefulness and stability (for the qualitative part) and completeness, compactness and correctness (for the quantitative part).

Submitted 30 September, 2021; originally announced September 2021.

Comments: IEEE DSAA '21

arXiv:2108.06231 [pdf, other]

Online Fairness-Aware Learning with Imbalanced Data Streams

Authors: Vasileios Iosifidis, Wenbin Zhang, Eirini Ntoutsi

Abstract: Data-driven learning algorithms are employed in many online applications, in which data become available over time, like network monitoring, stock price prediction, job applications, etc. The underlying data distribution might evolve over time calling for model adaptation as new instances arrive and old instances become obsolete. In such dynamic environments, the so-called data streams, fairness-aware learning cannot be considered as a one-off requirement, but rather it should comprise a continual requirement over the stream. Recent fairness-aware stream classifiers ignore the problem of class imbalance, which manifests in many real-life applications, and mitigate discrimination mainly because they "reject" minority instances at large due to their inability to effectively learn all classes. In this work, we propose \ours, an online fairness-aware approach that maintains a valid and fair classifier over the stream. \ours is an online boosting approach that changes the training distribution in an online fashion by monitoring the stream's class imbalance and tweaks its decision boundary to mitigate discriminatory outcomes over the stream. Experiments on 8 real-world and 1 synthetic datasets from different domains with varying class imbalance demonstrate the superiority of our method over state-of-the-art fairness-aware stream approaches with a range (relative) increase [11.2%-14.2%] in balanced accuracy, [22.6%-31.8%] in gmean, [42.5%-49.6%] in recall, [14.3%-25.7%] in kappa and [89.4%-96.6%] in statistical parity (fairness).

Submitted 13 August, 2021; originally announced August 2021.

Comments: 21 pages, extension of Discovery Science paper

arXiv:2108.02924 [pdf, other]

Interpretable Visual Understanding with Cognitive Attention Network

Authors: Xuejiao Tang, Wenbin Zhang, Yi Yu, Kea Turner, Tyler Derr, Mengyu Wang, Eirini Ntoutsi

Abstract: While image understanding on recognition-level has achieved remarkable advancements, reliable visual scene understanding requires comprehensive image understanding not only on recognition-level but also on cognition-level, which calls for exploiting the multi-source information as well as learning different levels of understanding and extensive commonsense knowledge. In this paper, we propose a novel Cognitive Attention Network (CAN) for visual commonsense reasoning to achieve interpretable visual understanding. Specifically, we first introduce an image-text fusion module to fuse information from images and text collectively. Second, a novel inference module is designed to encode commonsense among image, query and response. Extensive experiments on the large-scale Visual Commonsense Reasoning (VCR) benchmark dataset demonstrate the effectiveness of our approach. The implementation is publicly available at https://github.com/tanjatang/CAN

Submitted 7 December, 2023; v1 submitted 5 August, 2021; originally announced August 2021.

Comments: ICANN21

arXiv:2107.07919 [pdf, other]

A Survey on Bias in Visual Datasets

Authors: Simone Fabbrizzi, Symeon Papadopoulos, Eirini Ntoutsi, Ioannis Kompatsiaris

Abstract: Computer Vision (CV) has achieved remarkable results, outperforming humans in several tasks. Nonetheless, it may result in significant discrimination if not handled properly as CV systems highly depend on the data they are fed with and can learn and amplify biases within such data. Thus, the problems of understanding and discovering biases are of utmost importance. Yet, there is no comprehensive survey on bias in visual datasets. Hence, this work aims to: i) describe the biases that might manifest in visual datasets; ii) review the literature on methods for bias discovery and quantification in visual datasets; iii) discuss existing attempts to collect bias-aware visual datasets. A key conclusion of our study is that the problem of bias discovery and quantification in visual datasets is still open, and there is room for improvement in terms of both methods and the range of biases that can be addressed. Moreover, there is no such thing as a bias-free dataset, so scientists and practitioners must become aware of the biases in their datasets and make them explicit. To this end, we propose a checklist to spot different types of bias during visual dataset collection.

Submitted 23 June, 2022; v1 submitted 16 July, 2021; originally announced July 2021.

arXiv:2104.13312 [pdf, other]

Multi-fairness under class-imbalance

Authors: Arjun Roy, Vasileios Iosifidis, Eirini Ntoutsi

Abstract: Recent studies showed that datasets used in fairness-aware machine learning for multiple protected attributes (referred to as multi-discrimination hereafter) are often imbalanced. The class-imbalance problem is more severe for the often underrepresented protected group (e.g. female, non-white, etc.) in the critical minority class. Still, existing methods focus only on the overall error-discrimination trade-off, ignoring the imbalance problem, thus amplifying the prevalent bias in the minority classes. Therefore, solutions are needed to solve the combined problem of multi-discrimination and class-imbalance. To this end, we introduce a new fairness measure, Multi-Max Mistreatment (MMM), which considers both (multi-attribute) protected group and class membership of instances to measure discrimination. To solve the combined problem, we propose a boosting approach that incorporates MMM-costs in the distribution update and post-training selects the optimal trade-off among accurate, balanced, and fair solutions. The experimental results show the superiority of our approach against state-of-the-art methods in producing the best balanced performance across groups and classes and the best accuracy for the protected groups in the minority class.

Submitted 21 June, 2022; v1 submitted 27 April, 2021; originally announced April 2021.

arXiv:2104.12116 [pdf, other]

Fair-Capacitated Clustering

Authors: Tai Le Quy, Arjun Roy, Gunnar Friege, Eirini Ntoutsi

Abstract: Traditionally, clustering algorithms focus on partitioning the data into groups of similar instances. The similarity objective, however, is not sufficient in applications where a fair representation of the groups in terms of protected attributes, like gender or race, is required for each cluster. Moreover, in many applications, to make the clusters useful for the end-user, a balanced cardinality among the clusters is required. Our motivation comes from the education domain where studies indicate that students might learn better in diverse student groups and, of course, groups of similar cardinality are more practical, e.g., for group assignments. To this end, we introduce the fair-capacitated clustering problem that partitions the data into clusters of similar instances while ensuring cluster fairness and balancing cluster cardinalities. We propose a two-step solution to the problem: i) we rely on fairlets to generate minimal sets that satisfy the fair constraint and ii) we propose two approaches, namely hierarchical clustering and partitioning-based clustering, to obtain the fair-capacitated clustering. The hierarchical approach embeds the additional cardinality requirements during the merging step while the partitioning-based one alters the assignment step using a knapsack problem formulation to satisfy the additional requirements. Our experiments on four educational datasets show that our approaches deliver well-balanced clusters in terms of both fairness and cardinality while maintaining a good clustering quality.

Submitted 28 April, 2021; v1 submitted 25 April, 2021; originally announced April 2021.

Comments: 10 pages, 5 figures, 14th International Conference on Educational Data Mining - EDM 2021 (short paper)

Journal ref: 14th International Conference on Educational Data Mining - EDM 2021

arXiv:2104.05592 [pdf, other]

doi 10.1007/978-3-030-86520-7_42

Consequence-aware Sequential Counterfactual Generation

Authors: Philip Naumann, Eirini Ntoutsi

Abstract: Counterfactuals have become a popular technique nowadays for interacting with black-box machine learning models and understanding how to change a particular instance to obtain a desired outcome from the model. However, most existing approaches assume instant materialization of these changes, ignoring that they may require effort and a specific order of application. Recently, methods have been proposed that also consider the order in which actions are applied, leading to the so-called sequential counterfactual generation problem. In this work, we propose a model-agnostic method for sequential counterfactual generation. We formulate the task as a multi-objective optimization problem and present a genetic algorithm approach to find optimal sequences of actions leading to the counterfactuals. Our cost model considers not only the direct effect of an action, but also its consequences. Experimental results show that compared to state-of-the-art, our approach generates less costly solutions, is more efficient and provides the user with a diverse set of solutions to choose from.

Submitted 2 July, 2021; v1 submitted 12 April, 2021; originally announced April 2021.

Comments: 16 pages, 6 figures, Accepted for publication in the research track at ECML-PKDD 2021

arXiv:2104.02055 [pdf, other]

Data augmentation for dealing with low sampling rates in NILM

Authors: Tai Le Quy, Sergej Zerr, Eirini Ntoutsi, Wolfgang Nejdl

Abstract: Data have an important role in evaluating the performance of NILM algorithms. The best performance of NILM algorithms is achieved with high-quality evaluation data. However, many existing real-world data sets come with a low sampling quality, and often with gaps, lacking data for some recording periods. As a result, in such data, NILM algorithms can hardly recognize devices and estimate their power consumption properly. An important step towards improving the performance of these energy disaggregation methods is to improve the quality of the data sets. In this paper, we carry out experiments using several methods to increase the sampling rate of low sampling rate data. Our results show that augmentation of low-frequency data can support the considered NILM algorithms in estimating appliances' consumption with a higher F-score measurement.

Submitted 30 March, 2021; originally announced April 2021.

Comments: 10 pages, 3 figures, 6 tables

arXiv:2012.14791 [pdf, ps, other]

doi 10.1109/BigData50022.2020.9378101

Drift-Aware Multi-Memory Model for Imbalanced Data Streams

Authors: Amir Abolfazli, Eirini Ntoutsi

Abstract: Online class imbalance learning deals with data streams that are affected by both concept drift and class imbalance. Online learning tries to find a trade-off between exploiting previously learned information and incorporating new information into the model. This requires both the incremental update of the model and the ability to unlearn outdated information. The improper use of unlearning, however, can lead to the retroactive interference problem, a phenomenon that occurs when newly learned information interferes with the old information and impedes the recall of previously learned information. The problem becomes more severe when the classes are not equally represented, resulting in the removal of minority information from the model. In this work, we propose the Drift-Aware Multi-Memory Model (DAM3), which addresses the class imbalance problem in online learning for memory-based models. DAM3 mitigates class imbalance by incorporating an imbalance-sensitive drift detector, preserving a balanced representation of classes in the model, and resolving retroactive interference using a working memory that prevents the forgetting of old information. We show through experiments on real-world and synthetic datasets that the proposed method mitigates class imbalance and outperforms the state-of-the-art methods.

Submitted 29 December, 2020; originally announced December 2020.

arXiv:2004.02173 [pdf, other]

FairNN- Conjoint Learning of Fair Representations for Fair Decisions

Authors: Tongxin Hu, Vasileios Iosifidis, Wentong Liao, Hang Zhang, Michael YingYang, Eirini Ntoutsi, Bodo Rosenhahn

Abstract: In this paper, we propose FairNN, a neural network that performs joint feature representation and classification for fairness-aware learning. Our approach optimizes a multi-objective loss function that (a) learns a fair representation by suppressing protected attributes, (b) maintains the information content by minimizing a reconstruction loss, and (c) allows for solving a classification task in a fair manner by minimizing the classification error and respecting the equalized odds-based fairness regularizer. Our experiments on a variety of datasets demonstrate that such a joint approach is superior to separate treatment of unfairness in representation learning or supervised learning. Additionally, our regularizers can be adaptively weighted to balance the different components of the loss function, thus allowing for a very general framework for conjoint fair representation learning and decision making.

Submitted 11 April, 2020; v1 submitted 5 April, 2020; originally announced April 2020.

Comments: Code will be available

arXiv:2002.00695 [pdf, other]

FAE: A Fairness-Aware Ensemble Framework

Authors: Vasileios Iosifidis, Besnik Fetahu, Eirini Ntoutsi

Abstract: Automated decision making based on big data and machine learning (ML) algorithms can result in discriminatory decisions against certain protected groups defined upon personal data like gender, race, sexual orientation etc. Such algorithms designed to discover patterns in big data might not only pick up any encoded societal biases in the training data, but even worse, they might reinforce such biases resulting in more severe discrimination. The majority of thus far proposed fairness-aware machine learning approaches focus solely on the pre-, in- or post-processing steps of the machine learning process, that is, input data, learning algorithms or derived models, respectively. However, the fairness problem cannot be isolated to a single step of the ML process. Rather, discrimination is often a result of complex interactions between big data and algorithms, and therefore, a more holistic approach is required. The proposed FAE (Fairness-Aware Ensemble) framework combines fairness-related interventions at both pre- and post-processing steps of the data analysis process. In the pre-processing step, we tackle the problems of under-representation of the protected group (group imbalance) and of class-imbalance by generating balanced training samples. In the post-processing step, we tackle the problem of class overlapping by shifting the decision boundary in the direction of fairness.

Submitted 3 February, 2020; originally announced February 2020.

Comments: 6 pages

Journal ref: IEEE International Conference on Big Data, 2019

arXiv:2001.09762 [pdf, other]

Bias in Data-driven AI Systems -- An Introductory Survey

Authors: Eirini Ntoutsi, Pavlos Fafalios, Ujwal Gadiraju, Vasileios Iosifidis, Wolfgang Nejdl, Maria-Esther Vidal, Salvatore Ruggieri, Franco Turini, Symeon Papadopoulos, Emmanouil Krasanakis, Ioannis Kompatsiaris, Katharina Kinder-Kurlanda, Claudia Wagner, Fariba Karimi, Miriam Fernandez, Harith Alani, Bettina Berendt, Tina Kruegel, Christian Heinze, Klaus Broelemann, Gjergji Kasneci, Thanassis Tiropanis, Steffen Staab

Abstract: AI-based systems are widely employed nowadays to make decisions that have far-reaching impacts on individuals and society. Their decisions might affect everyone, everywhere and anytime, entailing concerns about potential human rights issues. Therefore, it is necessary to move beyond traditional AI algorithms optimized for predictive performance and embed ethical and legal principles in their design, training and deployment to ensure social good while still benefiting from the huge potential of the AI technology. The goal of this survey is to provide a broad multi-disciplinary overview of the area of bias in AI systems, focusing on technical challenges and solutions as well as to suggest new research directions towards approaches well-grounded in a legal frame. In this survey, we focus on data-driven AI, as a large part of AI is powered nowadays by (big) data and powerful Machine Learning (ML) algorithms. If otherwise not specified, we use the general term bias to describe problems related to the gathering or processing of data that might result in prejudiced decisions on the basis of demographic features like race, sex, etc.

Submitted 14 January, 2020; originally announced January 2020.

Comments: 19 pages, 1 figure

arXiv:1909.08982 [pdf, other]

doi 10.1145/3357384.3357974

AdaFair: Cumulative Fairness Adaptive Boosting

Authors: Vasileios Iosifidis, Eirini Ntoutsi

Abstract: The widespread use of ML-based decision making in domains with high societal impact such as recidivism, job hiring and loan credit has raised a lot of concerns regarding potential discrimination. In particular, in certain cases it has been observed that ML algorithms can provide different decisions based on sensitive attributes such as gender or race and therefore can lead to discrimination. Although several fairness-aware ML approaches have been proposed, their focus has been largely on preserving the overall classification accuracy while improving fairness in predictions for both protected and non-protected groups (defined based on the sensitive attribute(s)). The overall accuracy, however, is not a good indicator of performance in case of class imbalance, as it is biased towards the majority class. As we will see in our experiments, many of the fairness-related datasets suffer from class imbalance and therefore, tackling fairness requires also tackling the imbalance problem. To this end, we propose AdaFair, a fairness-aware classifier based on AdaBoost that further updates the weights of the instances in each boosting round taking into account a cumulative notion of fairness based upon all current ensemble members, while explicitly tackling class-imbalance by optimizing the number of ensemble members for balanced classification error. Our experiments show that our approach can achieve parity in true positive and true negative rates for both protected and non-protected groups, while it significantly outperforms existing fairness-aware methods by up to 25% in terms of balanced error.

Submitted 17 September, 2019; originally announced September 2019.

Comments: 10 pages, to appear in proceedings of the 28th ACM International Conference on Information and Knowledge Management (CIKM)

arXiv:1907.07237 [pdf, other]

FAHT: An Adaptive Fairness-aware Decision Tree Classifier

Authors: Wenbin Zhang, Eirini Ntoutsi

Abstract: Automated data-driven decision-making systems are ubiquitous across a wide range of online as well as offline services. These systems depend on sophisticated learning algorithms and available data to optimize the service function for decision support assistance. However, there is a growing concern about the accountability and fairness of the employed models, because the available historic data is often intrinsically discriminatory, i.e., the proportion of members sharing one or more sensitive attributes is higher than the proportion in the population as a whole when receiving positive classification, which leads to a lack of fairness in decision support systems. A number of fairness-aware learning methods have been proposed to handle this concern. However, these methods tackle fairness as a static problem and do not take the evolution of the underlying stream population into consideration. In this paper, we introduce a learning mechanism to design a fair classifier for online stream-based decision-making. Our learning model, FAHT (Fairness-Aware Hoeffding Tree), is an extension of the well-known Hoeffding Tree algorithm for decision tree induction over streams that also accounts for fairness. Our experiments show that our algorithm is able to deal with discrimination in streaming environments, while maintaining a moderate predictive performance over the stream.

Submitted 16 July, 2019; originally announced July 2019.

Comments: Accepted to IJCAI 2019

arXiv:1907.07223 [pdf, other]

doi 10.1007/978-3-030-27615-7_20

Fairness-enhancing interventions in stream classification

Authors: Vasileios Iosifidis, Thi Ngoc Han Tran, Eirini Ntoutsi

Abstract: The widespread use of automated data-driven decision support systems has raised a lot of concerns regarding accountability and fairness of the employed models in the absence of human supervision. Existing fairness-aware approaches tackle fairness as a batch learning problem and aim at learning a fair model which can then be applied to future instances of the problem. In many applications, however, the data comes sequentially and its characteristics might evolve with time. In such a setting, it is counter-intuitive to "fix" a (fair) model over the data stream, as changes in the data might incur changes in the underlying model, therefore affecting its fairness. In this work, we propose fairness-enhancing interventions that modify the input data so that the outcome of any stream classifier applied to that data will be fair. Experiments on real and synthetic data show that our approach achieves good predictive performance and low discrimination scores over the course of the stream.

Submitted 16 July, 2019; originally announced July 2019.

Comments: 15 pages, 7 figures. To appear in the proceedings of 30th International Conference on Database and Expert Systems Applications, Linz, Austria August 26 - 29, 2019

arXiv:1810.11017 [pdf, ps, other]

doi 10.1007/s00799-018-0257-7

Tracking the History and Evolution of Entities: Entity-centric Temporal Analysis of Large Social Media Archives

Authors: Pavlos Fafalios, Vasileios Iosifidis, Kostas Stefanidis, Eirini Ntoutsi

Abstract: How did the popularity of the Greek Prime Minister evolve in 2015? How did the predominant sentiment about him vary during that period? Were there any controversial sub-periods? What other entities were related to him during these periods? To answer these questions, one needs to analyze archived documents and data about the query entities, such as old news articles or social media archives. In particular, user-generated content posted in social networks, like Twitter and Facebook, can be seen as a comprehensive documentation of our society, and thus meaningful analysis methods over such archived data are of immense value for sociologists, historians and other interested parties who want to study the history and evolution of entities and events. To this end, in this paper we propose an entity-centric approach to analyze social media archives and we define measures that allow studying how entities were reflected in social media in different time periods and under different aspects, like popularity, attitude, controversiality, and connectedness with other entities. A case study using a large Twitter archive of four years illustrates the insights that can be gained by such an entity-centric and multi-aspect analysis.

Submitted 24 October, 2018; originally announced October 2018.

Comments: This is a preprint of an article accepted for publication in the International Journal on Digital Libraries (2018)

arXiv:1810.10308 [pdf, ps, other]

doi 10.1007/978-3-319-93417-4_12

TweetsKB: A Public and Large-Scale RDF Corpus of Annotated Tweets

Authors: Pavlos Fafalios, Vasileios Iosifidis, Eirini Ntoutsi, Stefan Dietze

Abstract: Publicly available social media archives facilitate research in a variety of fields, such as data science, sociology or the digital humanities, where Twitter has emerged as one of the most prominent sources. However, obtaining, archiving and annotating large amounts of tweets is costly. In this paper, we describe TweetsKB, a publicly available corpus of currently more than 1.5 billion tweets, spanning almost 5 years (Jan'13-Nov'17). Metadata information about the tweets as well as extracted entities, hashtags, user mentions and sentiment information are exposed using established RDF/S vocabularies. Next to a description of the extraction and annotation process, we present use cases to illustrate scenarios for entity-centric information exploration, data integration and knowledge discovery facilitated by TweetsKB.

Submitted 23 October, 2018; originally announced October 2018.

arXiv:1509.01288 [pdf, other]

Incremental Active Opinion Learning Over a Stream of Opinionated Documents

Authors: Max Zimmermann, Eirini Ntoutsi, Myra Spiliopoulou

Abstract: Applications that learn from opinionated documents, like tweets or product reviews, face two challenges. First, the opinionated documents constitute an evolving stream, where both the author's attitude and the vocabulary itself may change. Second, labels of documents are scarce and labels of words are unreliable, because the sentiment of a word depends on the (unknown) context in the author's mind. Most of the research on mining over opinionated streams focuses on the first aspect of the problem, whereas for the second a continuous supply of labels from the stream is assumed. Such an assumption, though, is utopian, as the stream is infinite and the labeling cost is prohibitive. To this end, we investigate the potential of active stream learning algorithms that ask for labels on demand. Our proposed ACOSTREAM approach works with limited labels: it uses an initial seed of labeled documents, occasionally requests additional labels for documents from the human expert and incrementally adapts to the underlying stream while exploiting the available labeled documents. At its core, ACOSTREAM consists of an MNB classifier coupled with "sampling" strategies for requesting class labels for new unlabeled documents. In the experiments, we evaluate the classifier performance over time by varying: (a) the class distribution of the opinionated stream, while assuming that the set of the words in the vocabulary is fixed but their polarities may change with the class distribution; and (b) the number of unknown words arriving at each moment, while the class polarity may also change. Our results show that active learning on a stream of opinionated documents delivers good performance while requiring a small selection of labels.

Submitted 3 September, 2015; originally announced September 2015.

Comments: 10 pages, 14 figures, conference: WISDOM (KDD'15)

Showing 1–39 of 39 results for author: Ntoutsi, E