Volume 6, Issue 2, December 2004
Publisher:
  • Association for Computing Machinery
  • New York
  • NY
  • United States
ISSN:1931-0145
EISSN:1931-0153
article
Editorial: special issue on web content mining

With the phenomenal growth of the Web, there is an ever-increasing volume of data and information published in numerous Web pages. The research in Web mining aims to develop new techniques to effectively extract and mine useful knowledge or information ...

article
Extracting relational data from HTML repositories

There is a vast amount of valuable information in HTML documents, widely distributed across the World Wide Web and across corporate intranets. Unfortunately, HTML is mainly presentation oriented and hard to query. In this paper, we develop a system to ...

article
Learning important models for web page blocks based on layout and content analysis

Previous work shows that a web page can be partitioned into multiple segments or blocks, and often the importance of those blocks in a page is not equivalent. It has also been proven that differentiating noisy and unimportant blocks from pages can ...

article
Learning by googling

The goal of giving a well-defined meaning to information is currently shared by endeavors such as the Semantic Web as well as by current trends within Knowledge Management. They all depend on the large-scale formalization of knowledge and on the ...

article
Correlating summarization of multi-source news with k-way graph bi-clustering

With the emergence of an enormous amount of online news, it is desirable to construct text mining methods that can extract, compare and highlight similarities among news sources. In this paper, we explore the research issue and methodology of correlated summarization ...

article
Information diffusion through blogspace

We study the dynamics of information propagation in environments of low-overhead personal publishing, using a large collection of WebLogs over time as our example domain. We characterize and model this collection at two levels. First, we present a ...

article
Mining structures for semantics

Online data is available in two flavors: unstructured data that resides as free text in HTML pages, and structured data that resides in databases and knowledge bases. Unstructured data is easily accessed as human-readable text on a browser, while ...

article
Learning to extract information from large domain-specific websites using sequential models

In this article we describe a novel information extraction task on the web and show how it can be solved effectively using the emerging conditional exponential models. The task involves learning to find specific goal pages on large domain-specific ...

article
Mining semantics for large scale integration on the web: evidences, insights, and challenges

The Web has been rapidly "deepened" -- with myriad searchable databases online, where data are hidden behind query interfaces. Toward large scale integration over this "deep Web," we are facing a new challenge: With its dynamic and ad-hoc nature, such ...
