Abstract
A honeynet is a promising active cyber defense mechanism. It reveals fundamental Indicators of Compromise (IoCs) by luring attackers to conduct adversarial behaviors in a controlled and monitored environment. The active interaction in the honeynet brings high rewards but also introduces high implementation costs and the risk of adversarial honeynet exploitation. In this work, we apply an infinite-horizon Semi-Markov Decision Process (SMDP) to characterize the stochastic transitions and sojourn times of attackers in the honeynet and to quantify the reward-risk trade-off. In particular, we design adaptive long-term engagement policies that are shown to be risk-averse, cost-effective, and time-efficient. Numerical results demonstrate that our adaptive engagement policies can quickly attract attackers to the target honeypot and engage them for a sufficiently long period to obtain valuable threat information, while keeping the penetration probability at a low level. The results also show that the expected utility is robust against attackers with a wide range of persistence and intelligence. Finally, we apply reinforcement learning to the SMDP to address the curse of modeling. Under a prudent choice of learning rate and exploration policy, we achieve quick and robust convergence to the optimal policy and value.
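To make the SMDP learning idea concrete, the sketch below illustrates one possible Q-learning update for a semi-Markov decision process, in which the bootstrap term is discounted by the random sojourn time spent in the current state. This is a minimal sketch under stated assumptions: the state and action names, parameter values, and helper functions are illustrative and are not the exact honeynet model or algorithm of the paper.

```python
import math
import random
from collections import defaultdict

# Illustrative state/action sets for an attacker-engagement SMDP.
# These labels are hypothetical, not the paper's exact network model.
STATES = ["normal_node", "honeypot_entry", "target_honeypot", "ejected"]
ACTIONS = ["low_interaction", "high_interaction", "eject"]

BETA = 0.1      # continuous-time discount rate (assumed value)
ALPHA = 0.2     # learning rate; a decaying schedule is typically preferable
EPSILON = 0.1   # epsilon-greedy exploration probability

Q = defaultdict(float)  # Q[(state, action)] -> estimated long-term utility

def choose_action(state):
    """Epsilon-greedy exploration over the engagement actions."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def q_update(state, action, reward, sojourn_time, next_state):
    """One SMDP Q-learning step.

    The future value is discounted by exp(-BETA * sojourn_time), so
    transitions with longer sojourn times contribute a more heavily
    discounted bootstrap term; `reward` is the return accumulated
    during the sojourn in `state`.
    """
    discount = math.exp(-BETA * sojourn_time)
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    td_target = reward + discount * best_next
    Q[(state, action)] += ALPHA * (td_target - Q[(state, action)])
```

Repeating this update along simulated attacker trajectories, with a suitably decaying learning rate and exploration schedule, would drive the Q estimates toward the optimal engagement values without requiring an explicit transition or sojourn-time model.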
Q. Zhu—This research is supported in part by NSF grants ECCS-1847056, CNS-1544782, and SES-1541164, and in part by ARO grant W911NF1910041.
Notes
1. See the demo at the following URL: https://bit.ly/2QUz3Ok.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Huang, L., Zhu, Q. (2019). Adaptive Honeypot Engagement Through Reinforcement Learning of Semi-Markov Decision Processes. In: Alpcan, T., Vorobeychik, Y., Baras, J., Dán, G. (eds.) Decision and Game Theory for Security. GameSec 2019. Lecture Notes in Computer Science, vol. 11836. Springer, Cham. https://doi.org/10.1007/978-3-030-32430-8_13
DOI: https://doi.org/10.1007/978-3-030-32430-8_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-32429-2
Online ISBN: 978-3-030-32430-8