DOI: 10.1145/3397271.3401113

Distributed Equivalent Substitution Training for Large-Scale Recommender Systems

Published: 25 July 2020

    Abstract

    We present Distributed Equivalent Substitution (DES) training, a novel distributed training framework for large-scale recommender systems with dynamic sparse features. DES introduces fully synchronous training to large-scale recommendation systems for the first time by reducing communication, thus making the training of commercial recommender systems converge faster and achieve better click-through rate (CTR). DES requires much less communication because it substitutes the weights-rich operators with computationally equivalent sub-operators and aggregates partial results instead of transmitting the huge sparse weights directly through the network. Thanks to synchronous training on large-scale Deep Learning Recommendation Models (DLRMs), DES achieves higher AUC (Area Under the ROC Curve). We successfully apply DES training to multiple popular DLRMs from industrial scenarios. Experiments show that our implementation outperforms the state-of-the-art PS-based (parameter-server-based) training framework, achieving up to 68.7% communication savings and higher throughput compared to other PS-based recommender systems.
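    To make the substitution idea concrete, here is a minimal, hypothetical sketch (not the authors' implementation; the table layout, shard assignment, and all names are assumptions): rather than shipping every looked-up embedding row over the network, each worker locally reduces the rows it owns, and only the per-worker partial sums are aggregated. Communication then scales with the number of workers times the embedding dimension, independent of how many sparse features are active per example.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    dim, vocab, n_workers = 4, 12, 3

    # The sparse embedding table is partitioned across workers by feature id.
    table = rng.normal(size=(vocab, dim))
    shards = [set(range(w, vocab, n_workers)) for w in range(n_workers)]

    sample = [1, 5, 7, 10]  # active sparse feature ids for one example

    # Baseline: gather every looked-up embedding row, then sum.
    # Communication cost: len(sample) * dim floats.
    baseline = np.stack([table[f] for f in sample]).sum(axis=0)

    # DES-style: each worker sums only the rows it owns; the partial
    # sums are then aggregated (an all-reduce in a real deployment).
    # Communication cost: n_workers * dim floats, independent of len(sample).
    partials = [
        sum((table[f] for f in sample if f in shards[w]), np.zeros(dim))
        for w in range(n_workers)
    ]
    des_result = np.sum(partials, axis=0)

    assert np.allclose(baseline, des_result)  # computationally equivalent
    ```

    The savings grow with the number of active features per example, which is why this style of substitution matters for models with very wide dynamic sparse inputs.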

    Supplementary Material

    MP4 File (3397271.3401113.mp4)


    Cited By

    • (2024) Horizontal Federated Recommender System: A Survey. ACM Computing Surveys 56(9), 1-42. https://doi.org/10.1145/3656165
    • (2023) OpenEmbedding: A Distributed Parameter Server for Deep Learning Recommendation Models using Persistent Memory. 2023 IEEE 39th International Conference on Data Engineering (ICDE), 2976-2987. https://doi.org/10.1109/ICDE55515.2023.00228
    • (2023) Improving Accuracy of Recommendation Systems with Deep Learning Models. Advances in Data-Driven Computing and Intelligent Systems, 795-806. https://doi.org/10.1007/978-981-99-3250-4_60
    • (2022) Field-aware Variational Autoencoders for Billion-scale User Representation Learning. 2022 IEEE 38th International Conference on Data Engineering (ICDE), 3413-3425. https://doi.org/10.1109/ICDE53745.2022.00321
    • (2021) Training Recommender Systems at Scale: Communication-Efficient Model and Data Parallelism. Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, 2928-2936. https://doi.org/10.1145/3447548.3467080
    • (2021) ScaleFreeCTR: MixCache-based Distributed Training System for CTR Models with Huge Embedding Table. Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 1269-1278. https://doi.org/10.1145/3404835.3462976
    • (2021) BiPS: Hotness-aware Bi-tier Parameter Synchronization for Recommendation Models. 2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 609-618. https://doi.org/10.1109/IPDPS49936.2021.00069
    • (2021) Deep learning-driven distributed communication systems for cluster online educational platform considering human-computer interaction. International Journal of Communication Systems 35(1). https://doi.org/10.1002/dac.5009

    Published In

    SIGIR '20: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval
    July 2020
    2548 pages
    ISBN:9781450380164
    DOI:10.1145/3397271
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. dynamic sparse features
    2. ranking systems
    3. recommender systems
    4. synchronous training

    Qualifiers

    • Research-article

    Conference

    SIGIR '20

    Acceptance Rates

    Overall Acceptance Rate 792 of 3,983 submissions, 20%
