SIGMOD: Vol 26, No 2

Volume 26, Issue 2, June 1997

Editor:

Joan M. Peckham
Univ. of Rhode Island, Kingston

Publisher:

Association for Computing Machinery
New York, NY, United States

ISSN: 0163-5808

article

Free

Fast parallel similarity search in multimedia databases

Pages 1–12, https://doi.org/10.1145/253262.253263

Most similarity search techniques map the data objects into some high-dimensional feature space. The similarity search then corresponds to a nearest-neighbor search in the feature space which is computationally very intensive. In this paper, we present ...

article

Free

Similarity-based queries for time series data

Pages 13–25, https://doi.org/10.1145/253262.253264

We study a set of linear transformations on the Fourier series representation of a sequence that can be used as the basis for similarity queries on time-series data. We show that our set of transformations is rich enough to formulate operations such as ...

article

Free

Meaningful change detection in structured data

Pages 26–37, https://doi.org/10.1145/253262.253266

Detecting changes by comparing data snapshots is an important requirement for difference queries, active databases, and version and configuration management. In this paper we focus on detecting meaningful changes in hierarchically structured data, such ...

article

Free

Improved query performance with variant indexes

Pages 38–49, https://doi.org/10.1145/253262.253268

The read-mostly environment of data warehousing makes it possible to use more complex indexes to speed up queries than in situations where concurrent updates are present. The current paper presents a short review of current indexing technology, ...

article

Free

Highly concurrent cache consistency for indices in client-server database systems

Pages 50–61, https://doi.org/10.1145/253262.253269

In this paper, we present four approaches to providing highly concurrent B⁺-tree indices in the context of a data-shipping, client-server OODBMS architecture. The first performs all index operations at the server, while the other approaches support ...

article

Free

Concurrency and recovery in generalized search trees

Pages 62–72, https://doi.org/10.1145/253262.253272

This paper presents general algorithms for concurrency control in tree-based access methods as well as a recovery protocol and a mechanism for ensuring repeatable read. The algorithms are developed in the context of the Generalized Search Tree (GiST) ...

article

Free

Range queries in OLAP data cubes

Pages 73–88, https://doi.org/10.1145/253262.253274

A range query applies an aggregation operation over all selected cells of an OLAP data cube where the selection is specified by providing ranges of values for numeric dimensions. We present fast algorithms for range queries for two types of aggregation ...

article

Free

Cubetree: organization of and bulk incremental updates on the data cube

Pages 89–99, https://doi.org/10.1145/253262.253276

The data cube is an aggregate operator which has been shown to be very powerful for On Line Analytical Processing (OLAP) in the context of data warehousing. It is, however, very expensive to compute, access, and maintain. In this paper we define the “...

article

Free

Maintenance of data cubes and summary tables in a warehouse

Pages 100–111, https://doi.org/10.1145/253262.253277

Data warehouses contain large amounts of information, often collected from a variety of independent sources. Decision-support functions in a warehouse, such as on-line analytical processing (OLAP), involve hundreds of complex aggregate queries over ...

article

Free

Database buffer size investigation for OLTP workloads

Pages 112–122, https://doi.org/10.1145/253262.253279

It is generally accepted that On-Line Transaction Processing (OLTP) systems benefit from large database memory buffers. As enterprise database systems become larger and more complex, hardware vendors are building increasingly large systems capable of ...

article

Free

Database performance in the real world: TPC-D and SAP R/3

Pages 123–134, https://doi.org/10.1145/253262.253280

Traditionally, database systems have been evaluated in isolation on the basis of standardized benchmarks (e.g., Wisconsin, TPC-C, TPC-D). We argue that very often such a performance analysis does not reflect the actual use of the DBMSs in the “real ...

article

Free

The BUCKY object-relational benchmark

Pages 135–146, https://doi.org/10.1145/253262.253283

According to various trade journals and corporate marketing machines, we are now on the verge of a revolution—the object-relational database revolution. Since we believe that no one should face a revolution without appropriate armaments, this paper ...

article

Free

The STRIP rule system for efficiently maintaining derived data

Pages 147–158, https://doi.org/10.1145/253262.253287

Derived data is maintained in a database system to correlate and summarize base data which records real world facts. As base data changes, derived data needs to be recomputed. This is often implemented by writing active rules that are triggered by ...

article

Free

An array-based algorithm for simultaneous multidimensional aggregates

Pages 159–170, https://doi.org/10.1145/253262.253288

Computing multiple related group-bys and aggregates is one of the core operations of On-Line Analytical Processing (OLAP) applications. Recently, Gray et al. [GBLP95] proposed the “Cube” operator, which computes group-by aggregations over all possible ...

article

Free

Online aggregation

Pages 171–182, https://doi.org/10.1145/253262.253291

Aggregation in traditional database systems is performed in batch mode: a query is submitted, the system processes a large volume of data over a long period of time, and, eventually, the final answer is returned. This archaic approach is frustrating to ...

article

Free

Balancing push and pull for data broadcast

Pages 183–194, https://doi.org/10.1145/253262.253293

The increasing ability to interconnect computers through internetworking, wireless networks, high-bandwidth satellite, and cable networks has spawned a new class of information-centered applications based on data dissemination. These applications ...

article

Free

InfoSleuth: agent-based semantic integration of information in open and dynamic environments

Pages 195–206, https://doi.org/10.1145/253262.253294

The goal of the InfoSleuth project at MCC is to exploit and synthesize new technologies into a unified system that retrieves and processes information in an ever-changing network of information sources. InfoSleuth has its roots in the Carnot project at ...

article

Free

STARTS: Stanford proposal for Internet meta-searching

Pages 207–218, https://doi.org/10.1145/253262.253299

Document sources are available everywhere, both within the internal networks of organizations and on the Internet. Even individual organizations use search engines from different vendors to index their internal document collections. These search engines ...

article

Free

On saying “Enough already!” in SQL

Pages 219–230, https://doi.org/10.1145/253262.253302

In this paper, we study a simple SQL extension that enables query writers to explicitly limit the cardinality of a query result. We examine its impact on the query optimization and run-time execution components of a relational DBMS, presenting two ...

article

Free

A framework for implementing hypothetical queries

Pages 231–242, https://doi.org/10.1145/253262.253304

Previous approaches to supporting hypothetical queries have been “eager”: some representation of the hypothetical state (or the corresponding delta) is materialized, and query evaluation is filtered through that representation. This paper develops a ...

article

Free

High-performance sorting on networks of workstations

Pages 243–254, https://doi.org/10.1145/253262.253322

We report the performance of NOW-Sort, a collection of sorting implementations on a Network of Workstations (NOW). We find that parallel sorting on a NOW is competitive to sorting on the large-scale SMPs that have traditionally held the performance ...

article

Free

Dynamic itemset counting and implication rules for market basket data

Pages 255–264, https://doi.org/10.1145/253262.253325

We consider the problem of analyzing market-basket data and present several important contributions. First, we present a new algorithm for finding large itemsets which uses fewer passes over the data than classic algorithms, and yet uses fewer candidate ...

article

Free

Beyond market baskets: generalizing association rules to correlations

Pages 265–276, https://doi.org/10.1145/253262.253327

One of the most well-studied problems in data mining is mining for association rules in market basket data. Association rules, whose significance is measured via support and confidence, are intended to identify rules of the type, “A customer purchasing ...

article

Free

Scalable parallel data mining for association rules

Pages 277–288, https://doi.org/10.1145/253262.253330

One of the important problems in data mining is discovering association rules from databases of transactions where each transaction consists of a set of items. The most time consuming operation in this discovery process is the computation of the ...

article

Free

Efficiently supporting ad hoc queries in large datasets of time sequences

Pages 289–300, https://doi.org/10.1145/253262.253332

Ad hoc querying is difficult on very large datasets, since it is usually not possible to have the entire dataset on disk. While compression can be used to decrease the size of the dataset, compressed data is notoriously difficult to index or access. In ...

article

Free

DEVise: integrated querying and visual exploration of large datasets

Pages 301–312, https://doi.org/10.1145/253262.253335

DEVise is a data exploration system that allows users to easily develop, browse, and share visual presentation of large tabular datasets (possibly containing or referencing multimedia objects) from several sources. The DEVise framework is being ...

article

Free

Partitioned garbage collection of a large object store

Pages 313–323, https://doi.org/10.1145/253262.253338

We present new techniques for efficient garbage collection in a large persistent object store. The store is divided into partitions that are collected independently using information about inter-partition references. This information is maintained on ...

article

Free

Size separation spatial join

Pages 324–335, https://doi.org/10.1145/253262.253340

We introduce a new algorithm to compute the spatial join of two or more spatial data sets, when indexes are not available on them. Size Separation Spatial Join (S³J) imposes a hierarchical decomposition of the data space and, in contrast with previous ...

article

Free

Building a scaleable geo-spatial DBMS: technology, implementation, and evaluation

Pages 336–347, https://doi.org/10.1145/253262.253342

This paper presents a number of new techniques for parallelizing geo-spatial database systems and discusses their implementation in the Paradise object-relational database system. The effectiveness of these techniques is demonstrated using a variety of ...

article

Free

A toolkit for negotiation support interfaces to multi-dimensional data

Pages 348–356, https://doi.org/10.1145/253262.253344

CoDecide is an experimental user interface toolkit that offers an extension to spreadsheet concepts specifically geared towards support for cooperative analysis of the kinds of multi-dimensional data encountered in data warehousing. It is distinguished ...
