SecureBoost+: Large Scale and High-Performance Vertical Federated Gradient Boosting Decision Tree

Fan, Tao; Chen, Weijing; Ma, Guoqiang; Kang, Yan; Fan, Lixin; Yang, Qiang

arXiv:2110.10927 (cs)

[Submitted on 21 Oct 2021 (v1), last revised 19 Jun 2024 (this version, v5)]

Abstract: Gradient boosting decision tree (GBDT) is an ensemble machine learning algorithm that is widely used in industry because of its good performance and easy interpretation. Because of data isolation and privacy requirements, many works use vertical federated learning to train machine learning models collaboratively across data owners with privacy guarantees. SecureBoost is one of the most popular vertical federated learning algorithms for GBDT. However, to preserve privacy, SecureBoost relies on complex training procedures and time-consuming cryptographic operations. As a result, SecureBoost is slow to train and does not scale to large datasets.
In this work, we propose SecureBoost+, a large-scale, high-performance vertical federated gradient boosting decision tree framework. Like SecureBoost, SecureBoost+ is secure in the semi-honest model. It achieves high performance through several novel optimizations of SecureBoost, including ciphertext operation optimization, new training mechanisms, and multi-classification training optimization. Experimental results show that SecureBoost+ is 6-35x faster than SecureBoost at the same accuracy and scales to tens of millions of data samples and thousands of feature dimensions.
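To make the cost described in the abstract concrete, the following is a minimal plaintext sketch of the per-feature histogram step that a vertical federated GBDT repeats for every feature and tree node. It is not the authors' implementation: the function names `build_histogram` and `best_split`, the regularization parameter `lam`, and the second-order (XGBoost-style) gain formula are assumptions made for illustration. In a SecureBoost-style protocol, the additions inside `build_histogram` would be carried out on homomorphically encrypted gradients and Hessians by the party that holds the feature values, which is where the expensive ciphertext operations arise.

```python
from typing import List, Tuple


def build_histogram(
    bins: List[int], grads: List[float], hess: List[float], n_bins: int
) -> Tuple[List[float], List[float]]:
    """Sum gradients and Hessians per bin for one feature.

    Assumption: plain floats stand in for ciphertexts. In a federated
    setting each `+=` below would be a homomorphic addition.
    """
    g_hist = [0.0] * n_bins
    h_hist = [0.0] * n_bins
    for b, g, h in zip(bins, grads, hess):
        g_hist[b] += g
        h_hist[b] += h
    return g_hist, h_hist


def best_split(
    g_hist: List[float], h_hist: List[float], lam: float = 1.0
) -> Tuple[int, float]:
    """Return (bin index, gain) of the best split for one feature.

    Samples in bins <= index go left. Uses the second-order gain
    0.5 * (GL^2/(HL+lam) + GR^2/(HR+lam) - G^2/(H+lam)), an assumption
    borrowed from XGBoost-style boosting. Returns (-1, 0.0) if no
    split has positive gain.
    """
    g_total, h_total = sum(g_hist), sum(h_hist)

    def score(g: float, h: float) -> float:
        return g * g / (h + lam)

    best_idx, best_gain = -1, 0.0
    g_left = h_left = 0.0
    # The last bin is skipped: splitting there leaves the right side empty.
    for i in range(len(g_hist) - 1):
        g_left += g_hist[i]
        h_left += h_hist[i]
        gain = 0.5 * (
            score(g_left, h_left)
            + score(g_total - g_left, h_total - h_left)
            - score(g_total, h_total)
        )
        if gain > best_gain:
            best_idx, best_gain = i, gain
    return best_idx, best_gain
```

The histogram loop performs one addition per sample, per feature, for both the gradient and the Hessian, so with tens of millions of samples and thousands of features the number of ciphertext additions dominates training time if each one is a cryptographic operation.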

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2110.10927 [cs.LG]
	(or arXiv:2110.10927v5 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2110.10927

Submission history

From: Weijing Chen
[v1] Thu, 21 Oct 2021 06:49:10 UTC (507 KB)
[v2] Wed, 22 Dec 2021 09:51:09 UTC (896 KB)
[v3] Thu, 23 Dec 2021 03:03:25 UTC (896 KB)
[v4] Fri, 31 May 2024 07:39:50 UTC (1,149 KB)
[v5] Wed, 19 Jun 2024 02:45:59 UTC (1,149 KB)
