Anomaly detection of policies in distributed firewalls using data log analysis

The Journal of Supercomputing

Abstract

A distributed firewall is a security application that monitors and controls traffic on an organization’s network. Whereas centralized firewalls defend against attacks originating outside a network, distributed firewalls target attacks from inside it, for example over wireless access or VPN tunnels. Distributed firewalls use policies, expressed as rules, to identify anomalous packets. However, such static rules may be incomplete, in which case anomalies can be detected by monitoring the firewall logs. These logs grow large when network traffic is high, but they contain valuable hidden information about existing anomalies. In this paper, we detect anomalies by extracting patterns from the big-data logs of distributed firewalls using data mining and machine learning. The proposed method is applied to large logs from distributed firewalls in a real security environment, and the results are analyzed.
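As a rough illustration of what extracting patterns from firewall logs can mean, the sketch below counts how often each traffic pattern occurs and flags records whose pattern is rare. It is not the method proposed in the paper: the record fields (source zone, destination port, action), the sample values, and the `min_support` threshold are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical log records: (source zone, destination port, action).
# Field names and values are illustrative assumptions, not the paper's log schema.
LOGS = [
    ("vpn", 443, "allow"),
    ("vpn", 443, "allow"),
    ("vpn", 443, "allow"),
    ("wifi", 80, "allow"),
    ("wifi", 80, "allow"),
    ("wifi", 80, "allow"),
    ("wifi", 23, "allow"),  # rare pattern: telnet from the wireless segment
]


def rare_patterns(records, min_support=0.2):
    """Return the patterns whose relative frequency is below min_support."""
    counts = Counter(records)
    total = len(records)
    return {pattern for pattern, n in counts.items() if n / total < min_support}


def flag_anomalies(records, min_support=0.2):
    """Return the indices of records that match a rare pattern."""
    rare = rare_patterns(records, min_support)
    return [i for i, record in enumerate(records) if record in rare]


if __name__ == "__main__":
    print(flag_anomalies(LOGS))
```

A frequency threshold like this only surfaces candidates for review; a rare pattern is not necessarily a policy violation, and a frequent one is not necessarily legitimate.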

Availability of data and materials

The log dataset used in the case study of this research is about one terabyte in size. A sample of the log (500 records) is available at: https://github.com/babamir/Firewall-data.

Acknowledgements

We thank the University of Kashan for supporting this research under grant no. 234340.

Funding

This research was fully supported and funded by the University of Kashan.

Author information

Contributions

AA obtained the real case study and produced the results by applying the proposed method to it. In addition, she prepared the review of related work. SMB proposed the method and supervised the research.

Corresponding author

Correspondence to Seyed Morteza Babamir.

Ethics declarations

Ethical Approval

The authors declare that this research involved no ethical issues.

Conflict of interest

The authors declare that they have no competing interests. Andalib is a PhD candidate whose research focuses on firewall anomalies and on firewall logs as big data. Babamir is a faculty member at the University of Kashan with interests in distributed systems and cloud computing. This research was supported by the University of Kashan, Kashan, Iran, and the ICT Research Institute, Tehran, Iran.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Andalib, A., Babamir, S.M. Anomaly detection of policies in distributed firewalls using data log analysis. J Supercomput 79, 19473–19514 (2023). https://doi.org/10.1007/s11227-023-05417-7

  • DOI: https://doi.org/10.1007/s11227-023-05417-7
