Abstract
User-Generated Content (UGC) is becoming the predominant type of internet traffic, and content popularity prediction plays a pivotal role in managing traffic at this scale. As a result, popularity prediction is an increasingly important area of research in computer networking. Popularity prediction methods are generally classified into two groups: feature-driven and early-stage. Feature-driven methods predict the popularity of content before it is published, whereas early-stage methods observe the popularity of content shortly after publication and use it to forecast future popularity. Many papers have shown that early-stage methods outperform feature-driven methods. In this paper, we improve the performance of early-stage popularity prediction by first partitioning the data into several clusters using k-means clustering with Pearson correlation distance, and then training a Deep-Belief Network (DBN) for each cluster. We evaluate our method on a dataset of YouTube videos and show that using a generative model such as a DBN for time series prediction significantly improves performance. Numerical results indicate that our proposed method outperforms other state-of-the-art methods, reducing Mean Absolute Percentage Error (MAPE) and mean Relative Square Error (mRSE) by up to 47.86% and 25.18%, respectively.
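As a rough illustration of the clustering step and the two error measures named in the abstract, the following minimal sketch groups popularity time series with k-means under Pearson correlation distance (1 minus the correlation coefficient). The abstract gives no implementation details, so several things here are assumptions. The centroid update (mean of the standardized members, re-standardized) is one common choice. The function names `standardize_rows`, `correlation_kmeans`, `mape` and `mrse` are hypothetical. The mRSE formula is the usual definition from the popularity prediction literature, which may differ from the one used in the paper. The per-cluster DBN training is not shown.

```python
import numpy as np

def standardize_rows(x):
    """Center each row and scale it to unit length, so that the dot
    product of two rows equals their Pearson correlation."""
    centered = x - x.mean(axis=1, keepdims=True)
    norms = np.linalg.norm(centered, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # constant rows stay as zero vectors
    return centered / norms

def correlation_kmeans(series, k, n_iter=50, seed=0):
    """k-means with distance d(x, c) = 1 - pearson(x, c).
    Assumed centroid update: mean of members, re-standardized."""
    rng = np.random.default_rng(seed)
    z = standardize_rows(np.asarray(series, dtype=float))
    centroids = z[rng.choice(len(z), size=k, replace=False)]
    labels = np.full(len(z), -1)
    for _ in range(n_iter):
        dist = 1.0 - z @ centroids.T          # correlation distance matrix
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break                             # assignments are stable
        labels = new_labels
        for j in range(k):
            members = z[labels == j]
            if len(members):                  # keep old centroid if empty
                mean = members.mean(axis=0, keepdims=True)
                centroids[j] = standardize_rows(mean)[0]
    return labels, centroids

def mape(actual, predicted):
    """Mean Absolute Percentage Error, returned as a fraction."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return np.mean(np.abs((actual - predicted) / actual))

def mrse(actual, predicted):
    """mean Relative Square Error (assumed definition):
    mean of (predicted / actual - 1)^2."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return np.mean((predicted / actual - 1.0) ** 2)

if __name__ == "__main__":
    # Toy view-count curves: three rising, three falling, on different scales.
    views = [[1, 2, 3, 4], [10, 20, 30, 40], [2, 4, 6, 9],
             [4, 3, 2, 1], [40, 30, 20, 10], [9, 6, 4, 2]]
    labels, _ = correlation_kmeans(views, k=2)
    print(labels)
```

Because correlation distance ignores the offset and scale of each series, videos with very different view counts but similarly shaped popularity curves fall into the same cluster. This is presumably the motivation for training a separate predictor per cluster.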
Cite this article
Nia, Z.M., Khayyambashi, M.R. Improving content popularity prediction with k-means clustering and deep-belief networks. Multimed Tools Appl 80, 15745–15764 (2021). https://doi.org/10.1007/s11042-020-10463-x
DOI: https://doi.org/10.1007/s11042-020-10463-x