DOI: 10.1145/882082
DMKD '03: Proceedings of the 8th ACM SIGMOD workshop on Research issues in data mining and knowledge discovery
ACM 2003 Proceeding
Publisher:
  • Association for Computing Machinery
  • New York
  • NY
  • United States
Conference:
DMKD03: 8th ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery (held in conjunction with MOD/PODS 2003 conference / co-located with FCRC 2003 Conference), San Diego, California, 13 June 2003
ISBN:
978-1-4503-7422-4
Published:
13 June 2003

SESSION: Invited talk
Article
Analyzing massive data streams: past, present, and future

Continuous data streams arise naturally, for example, in the installations of large telecom and Internet service providers where detailed usage information (Call-Detail-Records, SNMP-/RMON packet-flow data, etc.) from different parts of the underlying ...

SESSION: Data streams I
Article
A symbolic representation of time series, with implications for streaming algorithms

The parallel explosions of interest in streaming data, and data mining of time series have had surprisingly little intersection. This is in spite of the fact that time series data are typically streaming data. The main reason for this apparent paradox ...

Article
Clustering binary data streams with K-means

Clustering data streams is an interesting Data Mining problem. This article presents three variants of the K-means algorithm to cluster binary data streams. The variants include On-line K-means, Scalable K-means, and Incremental K-means, a proposed ...

SESSION: DB integration
Article
Processing frequent itemset discovery queries by division and set containment join operators

SQL-based data mining algorithms are rarely used in practice today. Most performance experiments have shown that SQL-based approaches are inferior to main-memory algorithms. Nevertheless, database vendors try to integrate analysis functionalities to ...

Article
Efficient OLAP operations for spatial data using peano trees

Online Analytical Processing (OLAP) is an important application of data warehouses. With more and more spatial data being collected, such as remotely sensed images, geographical information, digital sky survey data, efficient OLAP for spatial data is in ...

Article
Clustering gene expression data in SQL using locally adaptive metrics

The clustering problem concerns the discovery of homogeneous groups of data according to a certain similarity measure. Clustering suffers from the curse of dimensionality. It is not meaningful to look for clusters in high dimensional spaces as the ...

SESSION: WWW mining
Article
Graph-based ranking algorithms for e-mail expertise analysis

In this paper we study graph-based ranking measures for the purpose of using them to rank email correspondents according to their degree of expertise on subjects of interest. While this complete expertise analysis consists of several steps, in this ...

Article
Deriving link-context from HTML tag tree

HTML anchors are often surrounded by text that seems to describe the destination page appropriately. The text surrounding a link or the link-context is used for a variety of tasks associated with Web information retrieval. These tasks can benefit by ...

SESSION: Data streams II
Article
Clustering of streaming time series is meaningless

Time series data is perhaps the most frequently encountered type of data examined by the data mining community. Clustering is perhaps the most frequently used data mining algorithm, being useful in its own right as an exploratory technique, and also as ...

Article
A learning-based approach to estimate statistics of operators in continuous queries: a case study

Statistic estimation such as output size estimation of operators is a well-studied subject in the database research community, mainly for the purpose of query optimization. The assumption, however, is that queries are ad-hoc and therefore the emphasis ...

SESSION: Bioinformatics
Article
Using transposition for pattern discovery from microarray data

We analyze expression matrices to identify a priori interesting sets of genes, e.g., genes that are frequently co-regulated. Such matrices provide expression values for given biological situations (the lines) and given genes (columns). The frequent ...

Article
Weave amino acid sequences for protein secondary structure prediction

Given a known protein sequence, predicting its secondary structure can help understand its three-dimensional (tertiary) structure, i.e., the folding. In this paper, we present an approach for predicting protein secondary structures. Different from the ...

SESSION: Privacy & security
Article
Assuring privacy when big brother is watching

Homeland security measures are increasing the amount of data collected, processed and mined. At the same time, owners of the data raised legitimate concern about their privacy and potential abuses of the data. Privacy-preserving data mining techniques ...

Article
Dynamic inference control

An inference problem exists in a multilevel database if knowledge of some objects in the database allows information with a higher security level to be inferred. Many such inferences may be prevented prior to any query processing by raising the security ...

Contributors
  • Rensselaer Polytechnic Institute
  • IBM Thomas J. Watson Research Center
