DOI: 10.5555/3294996.3295155

Optimized pre-processing for discrimination prevention

Published: 04 December 2017

Abstract

Non-discrimination is a recognized objective in algorithmic decision making. In this paper, we introduce a novel probabilistic formulation of data pre-processing for reducing discrimination. We propose a convex optimization for learning a data transformation with three goals: controlling discrimination, limiting distortion in individual data samples, and preserving utility. We characterize the impact of limited sample size in accomplishing this objective. Two instances of the proposed optimization are applied to datasets, including one on real-world criminal recidivism. Results show that discrimination can be greatly reduced at a small cost in classification accuracy.
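The abstract's three-part objective (controlling discrimination, limiting per-sample distortion, preserving utility) can be illustrated with a much smaller stand-in than the paper's formulation. The sketch below is a toy linear program, not the authors' optimization: it learns only a randomized label map p(Ŷ | Y, D) for a binary outcome Y and binary protected attribute D, caps the gap in positive-outcome rates between groups (discrimination control), bounds the label-flip probability in each (Y, D) cell (distortion control), and minimizes the expected number of flips (utility loss). All distributions, thresholds, and variable names here are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import linprog

# Toy joint distribution p(Y=y, D=d): rows y in {0,1}, cols d in {0,1}.
# Group d=1 receives the positive outcome far more often than d=0.
p = np.array([[0.30, 0.10],   # y = 0
              [0.10, 0.50]])  # y = 1
p_d = p.sum(axis=0)           # marginal p(D=d)

eps, delta = 0.05, 0.40       # discrimination cap and per-cell flip cap

# Decision variables t[y, d] = p(Yhat=1 | Y=y, D=d), flattened as
# [t(0,0), t(0,1), t(1,0), t(1,1)].
# Utility loss = expected flips = sum_d p(0,d)*t(0,d) + p(1,d)*(1 - t(1,d));
# dropping the constant term gives the linear objective below.
c = np.array([p[0, 0], p[0, 1], -p[1, 0], -p[1, 1]])

# Discrimination control: |p(Yhat=1|D=0) - p(Yhat=1|D=1)| <= eps,
# where p(Yhat=1|D=d) = sum_y p(y|d) * t(y,d).
w0 = np.array([p[0, 0] / p_d[0], 0.0, p[1, 0] / p_d[0], 0.0])
w1 = np.array([0.0, p[0, 1] / p_d[1], 0.0, p[1, 1] / p_d[1]])
A_ub = np.vstack([w0 - w1, w1 - w0])
b_ub = np.array([eps, eps])

# Distortion control: flip probability in each (y, d) cell at most delta,
# i.e. t(0,d) <= delta and 1 - t(1,d) <= delta.
bounds = [(0.0, delta), (0.0, delta), (1.0 - delta, 1.0), (1.0 - delta, 1.0)]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
t = res.x.reshape(2, 2)                      # learned map t[y, d]
rates = np.array([w0 @ res.x, w1 @ res.x])   # p(Yhat=1|D=d) after mapping
print("flip map t(y,d):\n", t)
print("positive rates per group:", rates, "gap:", abs(rates[0] - rates[1]))
```

Tightening `eps` and `delta` simultaneously can make the program infeasible, which mirrors the trade-off the paper manages between discrimination control and distortion; the actual method transforms the full joint distribution of features and outcomes rather than a single binary label.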



    Published In

    NIPS'17: Proceedings of the 31st International Conference on Neural Information Processing Systems
    December 2017
    7104 pages

    Publisher

    Curran Associates Inc.

    Red Hook, NY, United States



    Article Metrics

    • Downloads (last 12 months): 135
    • Downloads (last 6 weeks): 44
    Reflects downloads up to 28 Dec 2024

    Cited By
    • (2024) MirrorFair: Fixing Fairness Bugs in Machine Learning Software via Counterfactual Predictions. Proceedings of the ACM on Software Engineering 1(FSE), 2121-2143. DOI: 10.1145/3660801
    • (2024) Leveraging Simulation Data to Understand Bias in Predictive Models of Infectious Disease Spread. ACM Transactions on Spatial Algorithms and Systems 10(2), 1-22. DOI: 10.1145/3660631
    • (2024) OTClean: Data Cleaning for Conditional Independence Violations using Optimal Transport. Proceedings of the ACM on Management of Data 2(3), 1-26. DOI: 10.1145/3654963
    • (2024) FairHash: A Fair and Memory/Time-efficient Hashmap. Proceedings of the ACM on Management of Data 2(3), 1-29. DOI: 10.1145/3654939
    • (2024) Integrating Fair Representation Learning with Fairness Regularization for Intersectional Group Fairness. Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, 560-569. DOI: 10.1145/3627673.3679802
    • (2024) Wise Fusion: Group Fairness Enhanced Rank Fusion. Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, 163-174. DOI: 10.1145/3627673.3679649
    • (2024) FairBalance: How to Achieve Equalized Odds With Data Pre-Processing. IEEE Transactions on Software Engineering 50(9), 2294-2312. DOI: 10.1109/TSE.2024.3431445
    • (2023) Loss balancing for fair supervised learning. Proceedings of the 40th International Conference on Machine Learning, 16271-16290. DOI: 10.5555/3618408.3619075
    • (2023) FEAMOE. Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 492-500. DOI: 10.24963/ijcai.2023/55
    • (2023) Through the Fairness Lens: Experimental Analysis and Evaluation of Entity Matching. Proceedings of the VLDB Endowment 16(11), 3279-3292. DOI: 10.14778/3611479.3611525
