Abstract
Vertical federated learning (VFL) enables multiple participants to build a joint machine learning model over distributed features of overlapping samples. The performance of VFL models depends heavily on the quality of the participants' local data, so it is essential to measure each participant's contribution for purposes such as participant selection and reward allocation. The Shapley value is widely adopted by previous works for contribution assessment. However, computing the Shapley value in VFL requires repeatedly training models from scratch, incurring expensive computation and communication overheads. Motivated by this challenge, we ask in this paper: can we efficiently and securely perform data valuation for participants via the Shapley value in VFL?
We call this problem Vertical Federated Data Valuation and introduce VFDV-IM, a method that uses an Inheritance Mechanism to expedite Shapley value calculation by leveraging historical training records. We first propose a simple yet effective strategy that directly inherits the model trained over the entire consortium. To further optimize VFDV-IM, we propose a model ensemble approach that measures the similarity between evaluated consortia and reweights the historical models accordingly. Extensive experiments on various datasets show that VFDV-IM calculates the Shapley value efficiently while maintaining accuracy.
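To make the cost the abstract refers to concrete, the sketch below computes exact Shapley values as each player's weighted average marginal contribution over all coalitions. It is a generic illustration, not the paper's VFDV-IM method: the utility function here is a hypothetical lookup table standing in for coalition model accuracy, whereas in real VFL each utility evaluation would require retraining a model from scratch, which is precisely the overhead VFDV-IM's inheritance mechanism avoids.

```python
from itertools import combinations
from math import factorial

def shapley_values(players, utility):
    """Exact Shapley values: each player's marginal contribution to
    utility, averaged over all coalitions with the standard weights."""
    n = len(players)
    values = {p: 0.0 for p in players}
    for p in players:
        others = [q for q in players if q != p]
        for k in range(n):
            for coalition in combinations(others, k):
                s = frozenset(coalition)
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                values[p] += weight * (utility(s | {p}) - utility(s))
    return values

# Toy utility table standing in for VFL model accuracy on each
# coalition's joint features; in real VFL every entry costs a full
# (communication-heavy) training run, and there are 2^n of them.
acc = {frozenset(): 0.0,
       frozenset({'A'}): 0.6, frozenset({'B'}): 0.5, frozenset({'C'}): 0.4,
       frozenset({'A', 'B'}): 0.8, frozenset({'A', 'C'}): 0.7,
       frozenset({'B', 'C'}): 0.6, frozenset({'A', 'B', 'C'}): 0.9}

vals = shapley_values(['A', 'B', 'C'], acc.__getitem__)
print(vals)  # A: 0.4, B: 0.3, C: 0.2
```

Note the efficiency property: the values sum to the grand-coalition utility (0.9), so the full model's performance is fully apportioned among participants, which is what makes the Shapley value attractive for reward allocation.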
X. Zhou and X. Yan—Equal contribution.
Acknowledgment
This work was sponsored by the Key R&D Program of Hubei Province (No. 2023BAB077, No. 2023BAB170) and the Fundamental Research Funds for the Central Universities (No. 2042023kf0219). This work was supported by Ant Group through the CCF-Ant Research Fund (CCF-AFSG RF20220001).
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Zhou, X. et al. (2024). VFDV-IM: An Efficient and Securely Vertical Federated Data Valuation. In: Onizuka, M., et al. Database Systems for Advanced Applications. DASFAA 2024. Lecture Notes in Computer Science, vol 14850. Springer, Singapore. https://doi.org/10.1007/978-981-97-5552-3_28
DOI: https://doi.org/10.1007/978-981-97-5552-3_28
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-5551-6
Online ISBN: 978-981-97-5552-3
eBook Packages: Computer Science, Computer Science (R0)