Keyword: parallel databases : Search

panel

Future of Database System Architectures

SIGMOD '23: Companion of the 2023 International Conference on Management of Data, Pages 261–262, https://doi.org/10.1145/3555041.3589360

Over the past two decades, we have experienced major technology disruptions on multiple fronts, none bigger than the emergence of cloud computing, which has led to fundamental changes in how database software is architected. We are seeing several new ...

research-article

Public Access

Breaking BAD: a data serving vision for big active data

DEBS '16: Proceedings of the 10th ACM International Conference on Distributed and Event-based Systems, Pages 181–186, https://doi.org/10.1145/2933267.2933313

Virtually all of today's Big Data systems are passive in nature. Here we describe a project to shift Big Data platforms from passive to active. We detail a vision for a scalable system that can continuously and reliably capture Big Data to enable timely ...

article

Horizontal partitioning method for test verification in parallel database systems

Feras Ahmad Hanandeh

International Journal of Advanced Intelligence Paradigms (IJAIP), Volume 9, Issue 1, Pages 96–106, https://doi.org/10.1504/IJAIP.2017.081182

In parallel database systems, the partitioning methods considered in current research are static. This research paper presents a partitioning method to divide the database relations into dynamic horizontal partitions. Every partition contains some ...

research-article

Thrifty: Offering Parallel Database as a Service using the Shared-Process Approach

SIGMOD '15: Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, Pages 1063–1068, https://doi.org/10.1145/2723372.2735352

Recently, Amazon has announced Redshift, a Parallel-Database-as-a-Service (PDaaS). Redshift adopts the "virtual cluster" approach to implement multitenancy, which has the merit of hard isolation among tenants (i.e., tenants do not interfere even when ...

Article

Implementing the Palomar Transient Factory Real-Time Detection Pipeline in GLADE: Results and Observations

DNIS 2014: Proceedings of the 9th International Workshop on Databases in Networked Information Systems - Volume 8381, Pages 53–66, https://doi.org/10.1007/978-3-319-05693-7_4

Palomar Transient Factory is a comprehensive detection system for the identification and classification of transient astrophysical objects. The central piece in the identification pipeline is represented by an automated classifier that distinguishes ...

Article

Sampling estimators for parallel online aggregation

BNCOD'13: Proceedings of the 29th British National conference on Big Data, Pages 204–217, https://doi.org/10.1007/978-3-642-39467-6_19

Online aggregation provides estimates to the final result of a computation during the actual processing. The user can stop the computation as soon as the estimate is accurate enough, typically early in the execution. When coupled with parallel ...

research-article

Parallel analytics as a service

SIGMOD '13: Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data, Pages 25–36, https://doi.org/10.1145/2463676.2463714

Recently, massively parallel processing relational database systems (MPPDBs) have gained much momentum in the big data analytic market. With the advent of hosted cloud computing, we envision that the offering of MPPDB-as-a-Service (MPPDBaaS) will become ...

demonstration

Iterative parallel data processing with stratosphere: an inside look

SIGMOD '13: Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data, Pages 1053–1056, https://doi.org/10.1145/2463676.2463693

Iterative algorithms occur in many domains of data analysis, such as machine learning or graph analysis. With increasing interest to run those algorithms on very large data sets, we see a need for new techniques to execute iterations in a massively ...

research-article

SciDB: A Database Management System for Applications with Complex Analytics

Computing in Science and Engineering (IEEECS_CISE-NEW), Volume 15, Issue 3, Pages 54–62, https://doi.org/10.1109/MCSE.2013.19

A description and discussion of the SciDB database management system focuses on lessons learned, application areas, performance comparisons against other solutions, and additional approaches to managing data and complex analytics.

article

Cogset: a high performance MapReduce engine

Concurrency and Computation: Practice & Experience (CCOMP), Volume 25, Issue 1, Pages 2–23, https://doi.org/10.1002/cpe.2827

Cogset is a generic and efficient engine for reliable storage and parallel processing of distributed data sets. It supports a number of high-level programming interfaces, including a MapReduce interface compatible with Hadoop. In this paper, we present ...

research-article

Parallel pipelined filter ordering with precedence constraints

ACM Transactions on Algorithms (TALG), Volume 8, Issue 4, Article No.: 41, Pages 1–38, https://doi.org/10.1145/2344422.2344431

In the parallel pipelined filter ordering problem, we are given a set of n filters that run in parallel. The filters need to be applied to a stream of elements, to determine which elements pass all filters. Each filter has a rate limit r_i on the number ...

Article

High Performance Database Processing

David Taniar

AINA '12: Proceedings of the 2012 IEEE 26th International Conference on Advanced Information Networking and Applications, Pages 5–6, https://doi.org/10.1109/AINA.2012.140

The sizes of databases have seen exponential growth in the past, and such growth is expected to accelerate in the future, with the steady drop in storage cost accompanied by a rapid increase in storage capacity. Many years ago, a terabyte database was ...

research-article

Massively parallel in-database predictions using PMML

PMML '11: Proceedings of the 2011 workshop on Predictive markup language modeling, Pages 22–27, https://doi.org/10.1145/2023598.2023601

Like all open standards, the Predictive Model Markup Language (PMML) enables interoperability and portability in the world of data mining and predictive analytics. This means that models developed in any environment and tool set can be deployed and used ...

research-article

ArrayStore: a storage manager for complex parallel array processing

SIGMOD '11: Proceedings of the 2011 ACM SIGMOD International Conference on Management of data, Pages 253–264, https://doi.org/10.1145/1989323.1989351

We present the design, implementation, and evaluation of ArrayStore, a new storage manager for complex, parallel array processing. ArrayStore builds on prior work in the area of multidimensional data storage, but considers the new problem of supporting ...

Article

A Hybrid Shared-Nothing/Shared-Data Storage Scheme for Large-Scale Data Processing

ISPA '11: Proceedings of the 2011 IEEE Ninth International Symposium on Parallel and Distributed Processing with Applications, Pages 161–166, https://doi.org/10.1109/ISPA.2011.43

Shared-nothing and shared-disk have been the two most common storage architectures of parallel databases in the past two decades. Both types of systems have their own merits for different applications. However, there has not been much effort in investigating ...

Article

Tradeoffs between parallel database systems, Hadoop, and HadoopDB as platforms for petabyte-scale analysis

Daniel J. Abadi

SSDBM'10: Proceedings of the 22nd international conference on Scientific and statistical database management, Pages 1–3

As the market demand for analyzing data sets of increasing variety and scale continues to explode, the software options for performing this analysis are beginning to proliferate. No fewer than a dozen companies have launched in the past few years that ...

research-article

Toward visual analysis of ensemble data sets

UltraVis '09: Proceedings of the 2009 Workshop on Ultrascale Visualization, Pages 48–53, https://doi.org/10.1145/1838544.1838551

The rapid and continuing increase in available high-performance computing resources has driven simulation-based science in two directions. First, the simulations themselves are growing more complex, whether in the fidelity of the models, spatiotemporal ...

research-article

Dependency-aware reordering for parallelizing query optimization in multi-core CPUs

SIGMOD '09: Proceedings of the 2009 ACM SIGMOD International Conference on Management of data, Pages 45–58, https://doi.org/10.1145/1559845.1559853

State-of-the-art commercial query optimizers employ cost-based optimization and exploit dynamic programming (DP) to find the optimal query execution plan (QEP) without evaluating redundant sub-plans. The number of alternative QEPs enumerated by the ...

Article

Efficient, Chunk-Replicated Node Partitioned Data Warehouses

Pedro Furtado

ISPA '08: Proceedings of the 2008 IEEE International Symposium on Parallel and Distributed Processing with Applications, Pages 578–583, https://doi.org/10.1109/ISPA.2008.86

Much has been said about efficiently processing data in parallel database servers, and some data warehouse applications must process on the order of tens to hundreds of gigabytes efficiently. Yet, there is no effective approach targeted at using non-...

Article

Progressive optimization in a shared-nothing parallel database

SIGMOD '07: Proceedings of the 2007 ACM SIGMOD international conference on Management of data, Pages 809–820, https://doi.org/10.1145/1247480.1247569

Commercial enterprise data warehouses are typically implemented on parallel databases due to the inherent scalability and performance limitations of a serial architecture. Queries used in such large data warehouses can contain complex predicates as well ...
