DOI: 10.1109/WI.2006.201

WISE: Hierarchical Soft Clustering of Web Page Search Results Based on Web Content Mining Techniques

Published: 18 December 2006

Abstract

Search engines typically offer low precision in response to a query: they retrieve many useless web pages and miss other important ones. In this paper, we study the problem of hierarchically clustering web page search results. In particular, we propose an architecture called WISE [1], a meta-search engine that automatically builds clusters of related web pages, each embodying one meaning of the query. These clusters are then organized hierarchically, and each is labeled with a phrase representing the key concept of the cluster and its web documents. The system, which has a web-based interface (soon available at wise.di.ubi.pt), introduces several new ideas: the pre-selection of the retrieved web pages, the capacity to statistically detect phrases within documents, and the representation of documents by their most relevant key concepts using web content mining techniques. The final step of the system is supported by a graph-based overlapping clustering algorithm, which groups the selected documents into a hierarchy of clusters.
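The abstract does not spell out the clustering algorithm, so the following is only a rough illustration of what graph-based overlapping (soft) clustering can look like, not the WISE method itself. It assumes each document is already reduced to a set of key concepts, links documents whose Jaccard similarity reaches a threshold, and lets every document seed a cluster made of itself and its neighbours. The function names, the threshold value, the similarity measure, and the sample data are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two sets of key concepts."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def soft_clusters(docs, threshold=0.25):
    """Group documents into overlapping clusters (illustrative only).

    docs maps a document id to its set of key concepts (assumed input).
    Each document seeds a cluster made of itself and its graph
    neighbours, so one document may belong to several clusters.
    """
    # Build an undirected similarity graph as an adjacency map.
    neighbours = {doc_id: set() for doc_id in docs}
    for u, v in combinations(docs, 2):
        if jaccard(docs[u], docs[v]) >= threshold:
            neighbours[u].add(v)
            neighbours[v].add(u)

    # One candidate cluster per seed; drop duplicates and singletons.
    seen = set()
    clusters = []
    for seed in docs:
        members = frozenset({seed} | neighbours[seed])
        if len(members) > 1 and members not in seen:
            seen.add(members)
            clusters.append(members)
    return clusters


def label(cluster, docs):
    """Label a cluster with the concept that most of its members share."""
    counts = Counter(c for doc_id in cluster for c in docs[doc_id])
    return counts.most_common(1)[0][0]


# Hypothetical results for the ambiguous query "jaguar".
docs = {
    "d1": {"jaguar car", "luxury sedan", "dealer"},
    "d2": {"jaguar car", "dealer", "price"},
    "d3": {"big cat", "rainforest", "predator"},
    "d4": {"big cat", "predator", "zoo"},
    "d5": {"jaguar car", "big cat"},  # page touching both meanings
}
clusters = soft_clusters(docs)
for cluster in clusters:
    print(sorted(cluster), "->", label(cluster, docs))
```

In this toy data the page d5 ends up in both the car cluster and the animal cluster, which is the "soft" property the title refers to. The cluster seeded by d5 spans all five pages and could serve as a parent node, although building a real hierarchy would need more than this sketch provides.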


Published In

WI '06: Proceedings of the 2006 IEEE/WIC/ACM International Conference on Web Intelligence
December 2006
1023 pages
ISBN: 0769527477

Publisher

IEEE Computer Society

United States

Qualifiers

  • Article

Acceptance Rates

Overall Acceptance Rate 118 of 178 submissions, 66%


Article Metrics

  • Downloads (last 12 months): 16
  • Downloads (last 6 weeks): 0
Download counts reflect activity up to 13 Jan 2025.


Cited By

  • (2013) A segment-based approach to clustering multi-topic documents. Knowledge and Information Systems 34:3 (563-595). DOI: 10.1007/s10115-012-0556-z. Online publication date: 1-Mar-2013
  • (2012) Web data mining trends and techniques. Proceedings of the International Conference on Advances in Computing, Communications and Informatics (961-965). DOI: 10.1145/2345396.2345551. Online publication date: 3-Aug-2012
  • (2011) Statistical approach for improving the quality of search results. Proceedings of the 10th WSEAS international conference on Applied computer and applied computational science (89-93). DOI: 10.5555/1965610.1965624. Online publication date: 8-Mar-2011
  • (2007) Digital libraries and engines of search. Proceedings of the 2007 Euro American conference on Telematics and information systems (1-9). DOI: 10.1145/1352694.1352703. Online publication date: 14-May-2007
