Discovering critical edge sequences in E-commerce catalogs

Published: 14 October 2001
DOI: 10.1145/501158.501166

Abstract

Web sites allow the collection of vast amounts of navigational data -- clickstreams of user traversals through the site. These massive data stores offer the tantalizing possibility of uncovering interesting patterns within the dataset. For e-businesses, always looking for an edge in the hyper-competitive online marketplace, this possibility is of particular interest. Especially significant to e-businesses is the discovery of Critical Edge Sequences (CESs), which denote frequently traversed subpaths in the catalog. CESs can be used to improve site performance and site management, increase the effectiveness of advertising on the site, and gather additional knowledge of customer interest patterns on the site.

Using traditional graph-based and web mining strategies to find CESs could turn out to be expensive in both space and time. In this paper, we propose a method to compute the most popular paths between node pairs in a catalog, which are then used to discover CESs. Our method is both space-efficient and accurate, providing a vast reduction in the storage requirement with a minimal impact on accuracy. This algorithm, executed off-line in batch mode, is also practical with respect to running time. As a variant of single-source shortest-path, it runs in log-linear time.
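The abstract does not spell out the algorithm, so the following is only an illustrative sketch of how a "most popular path" computation can be cast as a single-source shortest-path problem. It assumes, hypothetically, that each catalog edge carries a traversal probability estimated from clickstream data and that the popularity of a path is the product of its edge probabilities; taking negative logarithms turns that product into a sum that Dijkstra's algorithm can minimize. The names `most_popular_paths` and `path_to` and the graph layout are inventions for this example, not the paper's notation.

```python
import heapq
import math


def most_popular_paths(graph, source):
    """Dijkstra variant over -log(probability) edge costs.

    graph: {node: {neighbor: traversal_probability}}, probabilities in (0, 1].
    Returns (cost, parent), where exp(-cost[v]) is the popularity of the
    best path found from source to v.
    """
    cost = {source: 0.0}
    parent = {source: None}
    heap = [(0.0, source)]
    done = set()
    while heap:
        c, u = heapq.heappop(heap)
        if u in done:
            continue  # stale heap entry
        done.add(u)
        for v, p in graph.get(u, {}).items():
            if p <= 0.0:
                continue  # edge never traversed; skip it
            nc = c - math.log(p)  # maximizing a product == minimizing sum of -log
            if nc < cost.get(v, math.inf):
                cost[v] = nc
                parent[v] = u
                heapq.heappush(heap, (nc, v))
    return cost, parent


def path_to(parent, target):
    """Rebuild the path from the source to target, or None if unreachable."""
    if target not in parent:
        return None
    path = []
    node = target
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]
```

With a binary heap this runs in O((V + E) log V) time for V nodes and E edges, which is in the spirit of the log-linear bound the abstract mentions. How the paper itself stores paths compactly and derives CESs from them is not described here and is not modeled by this sketch.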

Cited By

  • (2018) Performance tuning and cost discovery of mobile web-based applications. International Journal of Web Engineering and Technology 3:3 (254-270). DOI: 10.1504/IJWET.2007.012056. Online publication date: 20-Dec-2018
  • (2007) Mining Nonambiguous Temporal Patterns for Interval-Based Events. IEEE Transactions on Knowledge and Data Engineering 19:6 (742-758). DOI: 10.1109/TKDE.2007.190613. Online publication date: 1-Jun-2007
  • (2005) Cost and Response Time Simulation for Web-based Applications on Mobile Channels. Proceedings of the Fifth International Conference on Quality Software (83-90). DOI: 10.1109/QSIC.2005.21. Online publication date: 19-Sep-2005

Published In

EC '01: Proceedings of the 3rd ACM conference on Electronic Commerce
October 2001
277 pages
ISBN:1581133871
DOI:10.1145/501158
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. accuracy
  2. approximation
  3. critical edge sequence
  4. web site performance
  5. web usage analysis

Qualifiers

  • Article

Conference

EC01: Third ACM Conference on Electronic Commerce
October 14 - 17, 2001
Tampa, Florida, USA

Acceptance Rates

EC '01 Paper Acceptance Rate 35 of 100 submissions, 35%;
Overall Acceptance Rate 664 of 2,389 submissions, 28%
