The 2005 International Workshop on Web Information and Data Management (WIDM 2005) is the seventh in a series of workshops on Web Information and Data Management held in conjunction with the International Conference on Information and Knowledge Management (CIKM). The objective of the workshop is to bring together researchers, industrial practitioners, and developers to study how Web information can be extracted, stored, analyzed, and processed to provide useful knowledge to end users for various advanced database applications. WIDM 2005 has received sponsorship from ACM SIGIR and the cooperation of ACM SIGMOD.
The call for papers resulted in the submission of 44 papers from 15 countries around the world. Starting this year, a one-day workshop schedule lets us accommodate regular papers (up to 8 pages long) along with a few short papers (up to 6 pages long). All papers were thoroughly reviewed by the program committee and external reviewers. The program committee accepted 12 papers (8 full and 4 short papers) for this year's novel one-day program, resulting in a competitive 27% acceptance rate. The authors of these papers are from 7 countries. The 12 accepted papers were divided into 3 sessions: "Web Ranking and Retrieval," "XML Data Management and Web Discovery," and "Web Clustering, Filtering and Applications." In addition, the WIDM 2005 program includes an invited talk on "A Web of Data: New Architectures for New Technology?" by Prof. Donald Kossmann of ETH Zurich (Switzerland).
The workshop would not be possible without the support of the NIKE (Nittany Information, Knowledge and wEb) Research Group of The Pennsylvania State University. The group provided both the manpower and the computing resources to host the workshop Web site and to run the ConfMan paper submission and review system.
Proceeding Downloads
A web of data: new architectures for new technology?
The last decade has seen a wave of new technology to publish, access, and integrate data on the Web. Furthermore, many new applications have emerged and Web technologies have penetrated almost all systems from small mobile applications to large-scale ...
Web path recommendations based on page ranking and Markov models
Markov models have been widely used for modelling users' navigational behaviour in the Web graph, using the transitional probabilities between web pages, as recorded in the web logs. The recorded users' navigation is used to extract popular web paths ...
Semantic similarity methods in WordNet and their application to information retrieval on the web
- Giannis Varelas,
- Epimenidis Voutsakis,
- Paraskevi Raftopoulou,
- Euripides G.M. Petrakis,
- Evangelos E. Milios
Semantic Similarity relates to computing the similarity between concepts which are not lexicographically similar. We investigate approaches to computing semantic similarity by mapping terms (concepts) to an ontology and by examining their relationships ...
DirectoryRank: ordering pages in web directories
Web Directories are repositories of Web pages organized in a hierarchy of topics and sub-topics. In this paper, we present DirectoryRank, a ranking framework that orders the pages within a given topic according to how informative they are about the ...
Exploiting native XML indexing techniques for XML retrieval in relational database systems
In XML retrieval, two distinct approaches have been established and pursued without much cross-fertilization taking place so far. On the one hand, native XML databases tailored to the semistructured data model have received considerable attention, and a ...
Query translation scheme for heterogeneous XML data sources
In order to formulate a meaningful XML query, a user must have some knowledge of the schema of the XML documents to be queried. The query will succeed only if the schema of the actual documents is consistent with the user's information. When a user ...
Impact of XML schema evolution on valid documents
In this paper we investigate the problem of XML Schema evolution. We first discuss the different kinds of changes that may be needed on an XML Schema. Then, we investigate how to minimize document revalidation, that is, detecting the document parts ...
A framework for semantic web services discovery
This paper describes a framework for ontology-based flexible discovery of Semantic Web services. The proposed approach relies on user-supplied, context-specific mappings from a user ontology to relevant domain ontologies used to specify Web services. ...
Narrative text classification for automatic key phrase extraction in web document corpora
Automatic key phrase extraction is a useful tool in many text related applications such as clustering and summarization. State-of-the-art methods are aimed towards extracting key phrases from traditional text such as technical papers. Application of ...
On improving local website search using web server traffic logs: a preliminary report
In this paper we give a preliminary report on our study of the use of web server traffic logs to improve local search. Web server traffic logs are typically private to individual websites and, as such, are unavailable to traditional web search ...
Preventing shilling attacks in online recommender systems
Collaborative filtering techniques have been successfully employed in recommender systems in order to help users deal with information overload by making high quality personalized recommendations. However, such systems have been shown to be vulnerable ...
Looking at both the present and the past to efficiently update replicas of web content
Since Web sites are autonomous and independently updated, applications that keep replicas of Web data, such as Web warehouses and search engines, must periodically poll the sites and check for changes. Since this is a resource-intensive task, in order to ...
A search result clustering method using informatively named entities
Clustering the results of a search helps the user to overview the information returned. In this paper, we regard the clustering task as indexing the search results. Here, an index means a structured label list that makes it easier for the user to ...
- Proceedings of the 7th annual ACM international workshop on Web information and data management