Volume 14, Issue 10, June 2021
research-article
How to design robust algorithms using noisy comparison Oracle

Metric based comparison operations such as finding maximum, nearest and farthest neighbor are fundamental to studying various clustering techniques such as k-center clustering and agglomerative hierarchical clustering. These techniques crucially rely on ...

research-article
SAND: streaming subsequence anomaly detection

With the increasing demand for real-time analytics and decision making, anomaly detection methods need to operate over streams of values and handle drifts in data distribution. Unfortunately, existing approaches have severe limitations: they either ...

research-article
Optimizing fitness-for-use of differentially private linear queries

In practice, differentially private data releases are designed to support a variety of applications. A data release is fit for use if it meets target accuracy requirements for each application. In this paper, we consider the problem of answering linear ...

research-article
Cryptanalysis of an encrypted database in SIGMOD '14

Encrypted database is an innovative technology proposed to solve the data confidentiality issue in cloud-based DB systems. It allows a data owner to encrypt its database before uploading it to the service provider; and it allows the service provider to ...

research-article
Unconstrained submodular maximization with modular costs: tight approximation and application to profit maximization

Given a set V, the problem of unconstrained submodular maximization with modular costs (USM-MC) asks for a subset S ⊆ V that maximizes f(S) - c(S), where f is a non-negative, monotone, and submodular function that gauges the utility of S, and c is a non-...

research-article
Distributed deep learning on data systems: a comparative analysis of approaches

Deep learning (DL) is growing in popularity for many data analytics applications, including among enterprises. Large business-critical datasets in such settings typically reside in RDBMSs or other data systems. The DB community has long aimed to bring ...

research-article
PR-sketch: monitoring per-key aggregation of streaming data with nearly full accuracy

Computing per-key aggregation is indispensable in streaming data analysis formulated as two phases, an update phase and a recovery phase. As the size and speed of data streams rise, accurate per-key information is useful in many applications like ...

research-article
Tensors: an abstraction for general data processing

Deep Learning (DL) has created a growing demand for simpler ways to develop complex models and efficient ways to execute them. Thus, a significant effort has gone into frameworks like PyTorch or TensorFlow to support a variety of DL models and run ...

research-article
Budget sharing for multi-analyst differential privacy

Large organizations that collect data about populations (like the US Census Bureau) release summary statistics that are used by multiple stakeholders for resource allocation and policy making problems. These organizations are also legally required to ...

research-article
In the land of data streams where synopses are missing, one framework to bring them all

In pursuit of real-time data analysis, approximate summarization structures, i.e., synopses, have gained importance over the years. However, existing stream processing systems, such as Flink, Spark, and Storm, do not support synopses as first class ...

research-article
Data acquisition for improving machine learning models

The vast advances in Machine Learning (ML) over the last ten years have been powered by the availability of suitably prepared data for training purposes. The future of ML-enabled enterprise hinges on data. As such, there is already a vibrant market ...

research-article
Efficiently answering reachability and path queries on temporal bipartite graphs

Bipartite graphs are naturally used to model relationships between two different types of entities, such as people-location, author-paper, and customer-product. When modeling real-world applications like disease outbreaks, edges are often enriched with ...

research-article
Preference queries over taxonomic domains

When composing multiple preferences characterizing the most suitable results for a user, several issues may arise. Indeed, preferences can be partially contradictory, suffer from a mismatch with the level of detail of the actual data, and even lack ...

research-article
Revisiting the design of LSM-tree Based OLTP storage engine with persistent memory

The recent byte-addressable and large-capacity commercialized persistent memory (PM) is promising to drive database as a service (DBaaS) into uncharted territories. This paper investigates how to leverage PMs to revisit the conventional LSM-tree based ...

research-article
Kamino: constraint-aware differentially private data synthesis

Organizations are increasingly relying on data to support decisions. When data contains private and sensitive information, the data owner often desires to publish a synthetic database instance that is similarly useful as the true data, while ensuring ...

research-article
Towards cost-effective and elastic cloud database deployment via memory disaggregation

It is challenging for cloud-native relational databases to meet the ever-increasing needs of scaling compute and memory resources independently and elastically. The recent emergence of memory disaggregation architecture, relying on high-speed RDMA ...

research-article
Dual-objective fine-tuning of BERT for entity matching

An increasing number of data providers have adopted shared numbering schemes such as GTIN, ISBN, DUNS, or ORCID numbers for identifying entities in the respective domain. This means for data integration that shared identifiers are often available for a ...
