VFDV-IM: An Efficient and Securely Vertical Federated Data Valuation

  • Conference paper
  • Database Systems for Advanced Applications (DASFAA 2024)
  • Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14850)

Abstract

Vertical federated learning (VFL) enables multiple participants to build a joint machine learning model over the distributed features of overlapping samples. The performance of VFL models heavily depends on the quality of the participants’ local data, so it is essential to measure each participant’s contribution for purposes such as participant selection and reward allocation. The Shapley value is widely adopted by previous works for contribution assessment. However, computing the Shapley value in VFL requires repeatedly training models from scratch, incurring expensive computation and communication overheads. Motivated by this challenge, in this paper we ask: can we efficiently and securely perform data valuation for participants via the Shapley value in VFL?
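
To make the cost concrete, the sketch below (ours, not the paper’s implementation) computes exact Shapley values for VFL participants: each participant’s value averages its marginal contribution over all coalitions, and every utility evaluation v(S) stands for a full federated training run. `train_and_evaluate` is a hypothetical placeholder for such a run, which is why the number of retrainings explodes with the number of participants.

```python
import itertools
from math import factorial

# Minimal sketch (not the authors' code): exact Shapley values for VFL
# participants, where v(S) is the utility of a model trained on the feature
# coalition S.  `train_and_evaluate` is a hypothetical stand-in for a full
# VFL training run; the loops below re-run it for every coalition, which is
# the exponential cost the paper sets out to avoid.

def exact_shapley(participants, train_and_evaluate):
    n = len(participants)
    values = {p: 0.0 for p in participants}
    for p in participants:
        others = [q for q in participants if q != p]
        for r in range(len(others) + 1):
            for subset in itertools.combinations(others, r):
                s = len(subset)
                # Shapley weight |S|! (n - |S| - 1)! / n!
                weight = factorial(s) * factorial(n - s - 1) / factorial(n)
                marginal = (train_and_evaluate(set(subset) | {p})
                            - train_and_evaluate(set(subset)))
                values[p] += weight * marginal
    return values

if __name__ == "__main__":
    # Toy utility: accuracy grows with the number of contributed feature blocks.
    utility = lambda coalition: 0.5 + 0.1 * len(coalition)
    print(exact_shapley(["A", "B", "C"], utility))
```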

We call this problem Vertical Federated Data Valuation and introduce VFDV-IM, a method that utilizes an Inheritance Mechanism to expedite Shapley value calculation by leveraging historical training records. We first propose a simple yet effective strategy that directly inherits the model trained over the entire consortium. To further optimize VFDV-IM, we propose a model ensemble approach that measures the similarity between evaluated consortiums and reweights the historical models accordingly. Extensive experiments on various datasets show that VFDV-IM calculates the Shapley value efficiently while maintaining accuracy.
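
The abstract does not spell out the inheritance mechanism, so the following is only our illustrative reading: instead of training each coalition’s model from scratch, initialize it from historical models, weighting each one by how similar its coalition is to the coalition being evaluated (here, Jaccard similarity of participant sets, an assumed choice). `inherited_init`, the similarity measure, and the parameter format are all hypothetical.

```python
# Hypothetical illustration of an inheritance-style warm start: historical
# models trained on earlier coalitions are blended, weighted by how similar
# their coalitions are to the one currently being evaluated.  This is our
# reading of the abstract, not the authors' code.

def jaccard(a, b):
    """Similarity of two participant sets (assumed similarity measure)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def inherited_init(target_coalition, history):
    """Blend historical parameter vectors, reweighted by coalition similarity.

    history: list of (coalition, parameter_list) pairs from earlier rounds,
    e.g. the model trained on the full consortium.
    """
    weights = [jaccard(target_coalition, c) for c, _ in history]
    total = sum(weights) or 1.0
    dim = len(history[0][1])
    init = [0.0] * dim
    for w, (_, params) in zip(weights, history):
        for i in range(dim):
            init[i] += (w / total) * params[i]
    return init  # used as a warm start instead of training from scratch

if __name__ == "__main__":
    history = [({"A", "B", "C"}, [0.2, -0.1, 0.4]),  # full-consortium model
               ({"A", "B"}, [0.1, 0.0, 0.3])]        # earlier sub-coalition
    print(inherited_init({"A", "C"}, history))
```

Fine-tuning from such a warm start rather than retraining from scratch is one plausible way to "leverage historical training records" as the abstract describes.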

X. Zhou and X. Yan—Equal contribution.

Acknowledgment

This work was sponsored by the Key R&D Program of Hubei Province (No. 2023BAB077, No. 2023BAB170) and the Fundamental Research Funds for the Central Universities (No. 2042023kf0219). It was also supported by Ant Group through the CCF-Ant Research Fund (CCF-AFSG RF20220001).

Author information

Correspondence to Zhaohui Cai or Jiawei Jiang.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Zhou, X. et al. (2024). VFDV-IM: An Efficient and Securely Vertical Federated Data Valuation. In: Onizuka, M., et al. Database Systems for Advanced Applications. DASFAA 2024. Lecture Notes in Computer Science, vol 14850. Springer, Singapore. https://doi.org/10.1007/978-981-97-5552-3_28

  • DOI: https://doi.org/10.1007/978-981-97-5552-3_28

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-97-5551-6

  • Online ISBN: 978-981-97-5552-3

  • eBook Packages: Computer Science, Computer Science (R0)
