Abstract
The growing demand for web applications and social networking websites generates a large volume of data that is accessed online. Because web data stored across web servers and online repositories has grown rapidly, understanding users' patterns and content usage trends is essential for service providers. Web mining is an emerging technique in the field of computational intelligence. It is used to discover useful knowledge and insights from web data for a variety of applications, such as target marketing, intrusion detection, web monitoring and recommendation, and fake news analysis. Web data is heterogeneous and includes online documents, web structure data, web logs, and user profiles. Web mining falls into three broad categories, web content mining, web structure mining, and web usage mining, according to the type of data used in pattern extraction. This chapter describes the basic functionalities of web mining and explores state-of-the-art web mining techniques.
Copyright information
© 2022 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Singh, D.K., Srivastava, R., Choudhury, T., Yadav, A.K. (2022). Computational Intelligence in Web Mining. In: Tomar, R., Hina, M.D., Zitouni, R., Ramdane-Cherif, A. (eds) Innovative Trends in Computational Intelligence. EAI/Springer Innovations in Communication and Computing. Springer, Cham. https://doi.org/10.1007/978-3-030-78284-9_9
DOI: https://doi.org/10.1007/978-3-030-78284-9_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-78283-2
Online ISBN: 978-3-030-78284-9
eBook Packages: Intelligent Technologies and Robotics (R0)