Abstract
With the explosive growth of data available on the World Wide Web, discovering and analyzing useful information from the Web has become a practical necessity. A Web access pattern, that is, a sequence of accesses frequently pursued by users, is an interesting and useful kind of knowledge in practice.
In this paper, we study the problem of mining access patterns from Web logs efficiently. A novel data structure, called the Web access pattern tree, or WAP-tree for short, is developed for efficient mining of access patterns from pieces of logs. The WAP-tree stores highly compressed, critical information for access pattern mining and facilitates the development of novel algorithms for mining access patterns in large sets of log pieces. Our algorithm finds access patterns from Web logs efficiently: experimental and performance studies show that our method is in general an order of magnitude faster than conventional methods.
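The abstract does not spell out how the WAP-tree is built, so the following is only a minimal sketch of the general idea of compressing log pieces into a count-annotated prefix tree. It assumes that events below a support threshold are dropped before insertion and that nodes carrying the same label are tracked in a per-label list. The names `Node`, `build_prefix_tree`, `min_support`, and the toy log pieces are illustrative assumptions, not the authors' definitions.

```python
from collections import Counter


class Node:
    """One tree node: an event label plus the number of sequences passing through it."""

    def __init__(self, label, parent=None):
        self.label = label
        self.count = 0
        self.parent = parent
        self.children = {}  # event label -> child Node


def build_prefix_tree(sequences, min_support):
    """Build a count-annotated prefix tree over the frequent events of each sequence.

    Assumption: an event is frequent if it occurs in at least `min_support`
    sequences; infrequent events are skipped during insertion.
    """
    # Support counts each event at most once per sequence.
    support = Counter()
    for seq in sequences:
        support.update(set(seq))
    frequent = {event for event, n in support.items() if n >= min_support}

    root = Node(None)
    header = {event: [] for event in frequent}  # label -> all nodes with that label

    for seq in sequences:
        node = root
        for event in seq:
            if event not in frequent:
                continue
            child = node.children.get(event)
            if child is None:
                child = Node(event, node)
                node.children[event] = child
                header[event].append(child)
            child.count += 1  # shared prefixes are stored once, with a count
            node = child
    return root, header


# Toy log pieces; each character stands for one page access.
logs = ["abdac", "eaebcac", "babfaec", "afbacfc"]
root, header = build_prefix_tree(logs, min_support=3)
```

With these toy log pieces, only `a`, `b`, and `c` reach the threshold, and three of the four filtered sequences share the prefix `a`, `b`, so that prefix is stored once with a count of 3 rather than three times. This sharing is the kind of compression the abstract refers to; the mining step itself is not sketched here.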
The work was supported in part by the Natural Sciences and Engineering Research Council of Canada (grant NSERC-A3723), the Networks of Centres of Excellence of Canada (grant NCE/IRIS-3), and the Hewlett-Packard Lab.
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Pei, J., Han, J., Mortazavi-asl, B., Zhu, H. (2000). Mining Access Patterns Efficiently from Web Logs. In: Terano, T., Liu, H., Chen, A.L.P. (eds) Knowledge Discovery and Data Mining. Current Issues and New Applications. PAKDD 2000. Lecture Notes in Computer Science, vol 1805. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45571-X_47
DOI: https://doi.org/10.1007/3-540-45571-X_47
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67382-8
Online ISBN: 978-3-540-45571-4
eBook Packages: Springer Book Archive