research-article
Open access

Responsible data management

Published: 20 May 2022

Abstract

    Perspectives on the role and responsibility of the data-management research community in designing, developing, using, and overseeing automated decision systems.

Published In

    Communications of the ACM, Volume 65, Issue 6
    June 2022
    98 pages
    ISSN: 0001-0782
    EISSN: 1557-7317
    DOI: 10.1145/3538687

Publisher

    Association for Computing Machinery
    New York, NY, United States
