
DRL-HIFA: a dynamic recommendation system with deep reinforcement learning based Hidden Markov Weight Updation and factor analysis

Published in: Multimedia Tools and Applications

Abstract

Recommendation systems have attracted considerable attention for their ability to help users identify their interests by predicting their ratings of, or preferences for, specific items. At the same time, the ability of a reinforcement learning (RL) agent to learn from its environment through rewards, without requiring training data, makes RL particularly well suited to such systems, and traditional works have therefore applied deep RL (DRL) to recommendation. However, existing studies face several challenges, including scalability issues, the possibility of numerous values overlapping, information loss when inputs are passed into a neural network (NN), and improper model training, all of which lead to incorrect recommendations. This study aims to resolve these shortcomings. To do so, it proposes a DRL-based recommendation (DRR) framework built on actor-critic learning. In the actor network, DWL-FA (Deep Weighted Likelihood-Factor Analysis) is proposed to adapt the existing deep neural network (DNN) to a new environment, compensating its output vector by removing unwanted regions from the network results. The attention mechanism used in this process supplies the decoder with relevant information from every hidden state of the encoder; together with the DWL-FA model, it selectively concentrates on informative input sequences and learns the associations among them, which helps the model train more effectively. Subsequently, in the critic network, HMP-WU (Hidden Markov Probability-Weight Updation) is proposed to optimize the interactions among users, their preferences for the recommended items (the environment), and the recommender system (the agent). Here, the weight-updation process helps capture related sequences, thereby reducing incorrect predictions. These proposed processes enable the system to achieve better results, with an increase of 5.74% in the average p-value.
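
To make the actor-critic flow described above concrete, the following minimal PyTorch-style sketch runs one recommendation step: a state is built from recent item embeddings, the actor produces an action vector, candidate items are ranked by the score \(q_v \cdot a^T\), and the critic scores the state-action pair. The mean-pooling state function, layer sizes, and class names are illustrative assumptions rather than the authors' implementation, and the DWL-FA and HMP-WU components are omitted.

# Minimal actor-critic recommendation step (illustrative sketch only; names,
# sizes, and the mean-pooling state function are assumptions, not the paper's code).
import torch
import torch.nn as nn

d = 32          # item-embedding dimension (assumed)
n_history = 5   # number of recent positive interactions forming h_t (assumed)

class Actor(nn.Module):
    """pi_theta: maps the state s to an action vector a of dimension d."""
    def __init__(self, state_dim, d):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, d))

    def forward(self, s):
        return self.net(s)

class Critic(nn.Module):
    """Q_w(s, a): approximates the state-action value."""
    def __init__(self, state_dim, d):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim + d, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, s, a):
        return self.net(torch.cat([s, a], dim=-1))

h_t = torch.randn(n_history, d)     # embeddings q_1 ... q_n of recent positive items
s = h_t.mean(dim=0)                 # state s = f(h_t); f assumed here to be mean-pooling

actor, critic = Actor(d, d), Critic(d, d)
a = actor(s)                        # action a = pi_theta(s)

candidates = torch.randn(100, d)    # embeddings q_v of candidate items
scores = candidates @ a             # score_v = q_v * a^T for every candidate
recommended = int(scores.argmax())  # item with the highest ranking score

q_value = critic(s, a)              # Q_w(s, a), used to train the actor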


Abbreviations

R:

Reward function, determining the reward obtained by the agent for taking an action in a particular state

γ:

Discount rate, determining the importance of future rewards

π:

Policy, a function that maps states to actions

\(A{V}^{\pi }\) :

Action-value function, computing the expected return for taking an action in a particular state

Q-learning:

An approach to determining a good policy by computing the action-value function and selecting the action with the highest value

S:

State space, representing the set of possible states

A:

Action space, representing the set of possible actions

\(s_{t}\):

State at time step t, represented as \(f(h_{t})\)

\(f(.)\):

State representation framework

\(h_{t}\):

Embedding corresponding to the recent history of positive interactions, \(\{q_{1}, \ldots, q_{n}\}\)

\(q_{i}\):

Embedding vector of the ith item, \(q_{i} \in \mathbb{R}^{1 \times d}\)

\(a\):

Action generated by the actor network, \(a = \pi_{\theta}(s)\)

\(\pi_{\theta}\):

Actor network with parameters \(\theta\)

\(score_{v}\):

Ranking score of item v, calculated as \(q_{v} \cdot a^{T}\)

\({q}_{v}\) :

Embedding vector of item v

Q-value:

Approximated state-action value, denoted as \(Q_{w}(s, a)\)

\(Q_{w}(s, a)\):

Critic network (DQN) approximating the actual state-action value \(Q^{\pi}(s, a)\)

\([{p}_{u}, {q}_{v}]\) :

Concatenation of user and item embeddings

\(\grave{a}_{uv}\):

Updated attention weight for item v, computed with DWL-FA (see the attention sketch after this list)

\(ReLU(.)\) :

Rectified linear unit activation function

\(W_{1}\):

Weight matrix of the attention network, with dimensions \(\mathbb{R}^{d_{1} \times 1}\)

\(W_{2}\):

Weight matrix of the attention network, with dimensions \(\mathbb{R}^{2d \times d_{1}}\)

\(b_{1}\):

Bias term of the attention network, with dimensions \(\mathbb{R}^{1}\)

\(b_{2}\):

Bias vector of the attention network, with dimensions \(\mathbb{R}^{1 \times d_{1}}\)

\(s\):

State vector; its dimensionality is 3d

\(r\):

Output vector before the softmax activation, \(r = z(v^{L})\)

\(x\) :

Input feature

\(y\) :

Biased input feature

\(R(.)\) :

Complete non-linear function of DNN

\(r = R(y)\):

Output of applying the DNN's complete non-linear function to the biased input feature y

\({v}_{n}\) :

Basis of nth acoustic factor

\({u}_{n}\) :

Loading matrix

\(\grave{r}\):

Modified vector compensating the network outcome, \(\grave{r} = R(y) + \sum_{n} u_{n} v_{n}\) (see the compensation sketch after this list)

\(a_{uv}\):

Softmax activation of the attention weight \(\grave{a}_{uv}\)

\(T_{kad}^{(GP)}\):

Individual basis transitional function indexed by the kth latent parameter, dimension d, and action a; a Gaussian process (GP) that uses only s as input, with a linear interaction through the instance-oriented weights \(w_{bk}\).

\(\grave{s}_{d}\):

dth dimension of the state s.

\(w_{bk}\):

kth latent parameter.

\(\sigma_{w}^{2}, \sigma_{n}^{2}\):

Variances of \(w_{bk}\) and \(w_{b}\), respectively.

\(\epsilon\):

Random noise term.

\(T^{BNN}(s, a, w_{b})\):

Transitional function that takes the instance-oriented weights \(w_{b}\) as an input to model the output dimensions collaboratively.

P(W):

Distribution over the latent embedding
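
The attention-related symbols above (\(W_1\), \(W_2\), \(b_1\), \(b_2\), ReLU, \(\grave{a}_{uv}\), \(a_{uv}\)) describe a small two-layer scoring network over the concatenated user-item embedding \([p_u, q_v]\). The NumPy sketch below wires them together in a form consistent with the listed dimensions; the exact wiring and the toy sizes are assumptions made for illustration, not the authors' code.

# Attention weights over a user's items, assuming the form
#   a'_uv = W1 . ReLU([p_u, q_v] W2 + b2) + b1,   a_uv = softmax(a'_uv)
# (consistent with W2 in R^{2d x d1} and W1 in R^{d1 x 1} above).
import numpy as np

def attention_weights(p_u, items, W1, W2, b1, b2):
    """Return softmax-normalised attention weights a_uv, one per item v."""
    raw = []
    for q_v in items:
        x = np.concatenate([p_u, q_v])          # [p_u, q_v] in R^{2d}
        hidden = np.maximum(x @ W2 + b2, 0.0)   # ReLU(...), in R^{d1}
        raw.append(hidden @ W1 + b1)            # unnormalised weight a'_uv
    raw = np.array(raw)
    e = np.exp(raw - raw.max())                 # numerically stable softmax
    return e / e.sum()

d, d1, n_items = 8, 16, 4                       # assumed toy sizes
rng = np.random.default_rng(0)
p_u = rng.normal(size=d)                        # user embedding
items = rng.normal(size=(n_items, d))           # item embeddings q_v
W2 = rng.normal(size=(2 * d, d1))
W1 = rng.normal(size=d1)                        # W1 (R^{d1 x 1}) stored as a vector
b2, b1 = rng.normal(size=d1), rng.normal()
print(attention_weights(p_u, items, W1, W2, b1, b2))  # non-negative, sums to 1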
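
Similarly, the compensation entry \(\grave{r} = R(y) + \sum_{n} u_n v_n\) above can be read as adding a factor-analysis correction term to the DNN output. The sketch below illustrates that idea with a toy non-linear function and scalar loadings (the glossary defines \(u_n\) as a loading matrix); all names and sizes here are placeholder assumptions.

# Compensated DNN output r' = R(y) + sum_n u_n * v_n (illustrative only;
# R, the bases v_n and the loadings u_n are toy placeholders).
import numpy as np

def compensated_output(R, y, loadings, bases):
    """Add the factor-analysis compensation term to the DNN output R(y)."""
    r = R(y)                        # complete non-linear function applied to the biased input y
    for u_n, v_n in zip(loadings, bases):
        r = r + u_n * v_n           # u_n: loading (scalar here), v_n: factor basis vector
    return r

rng = np.random.default_rng(1)
dim = 6
R = lambda y: np.tanh(y)            # stand-in for the DNN's full non-linear mapping
y = rng.normal(size=dim)            # biased input feature
bases = rng.normal(size=(3, dim))   # factor bases v_1 ... v_3
loadings = rng.normal(size=3)       # loadings u_1 ... u_3
print(compensated_output(R, y, loadings, bases))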

References

  1. Gupta S, Dave M (2020) An overview of recommendation system: methods and techniques. In: Sharma H, Govindan K, Poonia R, Kumar S, El-Medany W (eds) Advances in computing and intelligent systems. Algorithms for intelligent systems. Springer, Singapore. https://doi.org/10.1007/978-981-15-0222-4_2

  2. Visuwasam LMM, Geetha M, Gayathri G, Divya K, Elakkiya D (2021) Smart personalised recommendation system for wanderer using prediction analysis. Int J Intell Sustain Comput 1(3):223–232

    Google Scholar 

  3. Malik S, Rana A, Bansal M (2020) A survey of recommendation systems. Inform Resour Manage J (IRMJ) 33(4):53–73

    Article  Google Scholar 

  4. Naeem MZ, Rustam F, Mehmood A, Ashraf I, Choi GS (2022) Classification of movie reviews using term frequency-inverse document frequency and optimized machine learning algorithms. PeerJ Comput Sci 8:e914

    Article  Google Scholar 

  5. Khan A, Gul MA, Zareei M, Biswal RR, Zeb A, Naeem M, Saeed Y, Salim N (2020) Movie review summarization using supervised learning and graph-based ranking algorithm. Comput Intell Neurosci 2020:7526580. https://doi.org/10.1155/2020/7526580

  6. Cintia Ganesha Putri D, Leu J-S, Seda P (2020) Design of an unsupervised machine learning-based movie recommender system. Symmetry 12(2):185

    Article  Google Scholar 

  7. Datta D, Navamani T, Deshmukh R (2020) Products and movie recommendation system for social networking sites. Int J Sci Technol Res 9(10):262–270

    Google Scholar 

  8. Tan C, Han R, Ye R, Chen K (2020) Adaptive learning recommendation strategy based on deep Q-learning. Appl Psychol Meas 44(4):251–266

    Article  Google Scholar 

  9. Madani Y, Ezzikouri H, Erritali M, Hssina B (2020) Finding optimal pedagogical content in an adaptive e-learning platform using a new recommendation approach and reinforcement learning. J Ambient Intell Humaniz Comput 11(10):3921–3936

    Article  Google Scholar 

  10. Zhang J, Wang Y, Yuan Z, Jin Q (2019) Personalized real-time movie recommendation system: practical prototype and evaluation. Tsinghua Sci Technol 25(2):180–191

    Article  Google Scholar 

  11. Yassine A, Mohamed L, Al Achhab M (2021) Intelligent recommender system based on unsupervised machine learning and demographic attributes. Simul Model Pract Theory 107:102198

    Article  Google Scholar 

  12. Reddy SRS, Nalluri S, Kunisetti S, Ashok S, Venkatesh B (2019) Content-based movie recommendation system using genre correlation. In: Smart intelligent computing and applications. Proceedings of the second international conference on SCI 2018, vol 2. Springer, Singapore, pp 391–397. https://doi.org/10.1007/978-981-13-1927-3_42

  13. Zhao W et al (2019) Leveraging long and short-term information in content-aware movie recommendation via adversarial training. IEEE Trans Cybern 50(11):4680–4693

    Article  Google Scholar 

  14. Aghdam MH (2019) Context-aware recommender systems using hierarchical hidden Markov model. Physica A 518:89–98

    Article  Google Scholar 

  15. Yang Q (2018) A novel recommendation system based on semantics and context awareness. Computing 100(8):809–823

    Article  Google Scholar 

  16. Da’u A, Salim N, Rabiu I, Osman A (2020) Recommendation system exploiting aspect-based opinion mining with deep learning method. Inf Sci 512:1279–1292

    Article  Google Scholar 

  17. Ibrahim M, Bajwa IS, Ul-Amin R, Kasi B (2019) A neural network-inspired approach for improved and true movie recommendations. Computat Intell Neurosci 2019:4589060. https://doi.org/10.1155/2019/4589060

  18. Zhou Q (2020) A novel movies recommendation algorithm based on reinforcement learning with DDPG policy. Int J Intell Comput Cybern 13(1):67–79

  19. Tao S, Qiu R, Ping Y, Ma H (2021) Multi-modal knowledge-aware reinforcement Learning Network for Explainable recommendation. Knowl Based Syst 227:107217

    Article  Google Scholar 

  20. Lei Y, Li W (2019) Interactive recommendation with user-specific deep reinforcement learning. ACM Trans Knowl Discovery Data (TKDD) 13(6):1–15

    Article  Google Scholar 

  21. Zhao Z, Chen X, Xu Z, Cao L (2021) Tag-aware recommender system based on deep reinforcement learning. Math Problems Eng 2021:5564234. https://doi.org/10.1155/2021/5564234

  22. Li R, Kahou SE, Schulz H, Michalski V, Charlin L, Pal C (2018) Towards deep conversational recommendations. In: Advances in neural information processing systems, 31st, 32nd conference on neural information processing systems (NeurIPS 2018). NeurIPS, Montréal, Canada.

  23. Fu M, Agrawal A, Irissappane AA, Zhang J, Huang L, Qu H (2022) Deep reinforcement learning framework for category-based item recommendation. IEEE Trans Cybern 52(11):12028–12041. https://doi.org/10.1109/TCYB.2021.3089941

  24. Huang L, Fu M, Li F, Qu H, Liu Y, Chen W (2021) A deep reinforcement learning based long-term recommender system. Knowl Based Syst 213:106706

    Article  Google Scholar 

  25. Gao M, Zhang J, Yu J, Li J, Wen J, Xiong Q (2021) Recommender systems based on generative adversarial networks: a problem-driven perspective. Inf Sci 546:1166–1185

    Article  MathSciNet  Google Scholar 

Download references

Funding

The authors declare that they received no funding for this study.

Author information

Corresponding author

Correspondence to Krishnamoorthi S.

Ethics declarations

Conflict of interest

The authors declare that they have no conflicts of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

S, K., Shyam, G.K. DRL-HIFA: a dynamic recommendation system with deep reinforcement learning based Hidden Markov Weight Updation and factor analysis. Multimed Tools Appl 83, 72819–72843 (2024). https://doi.org/10.1007/s11042-024-18296-8
