- research-article, December 2024
A graph pattern mining framework for large graphs on GPU: A Graph Pattern Mining...
The VLDB Journal — The International Journal on Very Large Data Bases (VLDB), Volume 34, Issue 1, https://doi.org/10.1007/s00778-024-00883-8 Abstract: Graph pattern mining (GPM) is an important problem in graph processing. There are many parallel frameworks for GPM, many of which suffer from low performance. GPU is a powerful option for accelerating graph processing, but parallel GPM algorithms ...
- research-article, June 2024
Performant almost-latch-free data structures using epoch protection in more depth
The VLDB Journal — The International Journal on Very Large Data Bases (VLDB), Volume 33, Issue 6, Pages 1793–1812, https://doi.org/10.1007/s00778-024-00859-8 Abstract: Multi-core scalability presents a major implementation challenge for data system designers today. Traditional methods such as latching no longer scale in today’s highly parallel architectures. While the designer can make use of techniques such as ...
- research-article, December 2023
RCBench: an RDMA-enabled transaction framework for analyzing concurrency control algorithms
- Hongyao Zhao,
- Jingyao Li,
- Wei Lu,
- Qian Zhang,
- Wanqing Yang,
- Jiajia Zhong,
- Meihui Zhang,
- Haixiang Li,
- Xiaoyong Du,
- Anqun Pan
The VLDB Journal — The International Journal on Very Large Data Bases (VLDB), Volume 33, Issue 2, Pages 543–567, https://doi.org/10.1007/s00778-023-00821-0 Abstract: Distributed transaction processing over the TCP/IP network suffers from the weak transaction scalability problem, i.e., its performance drops significantly when the number of involved data nodes per transaction increases. Although quite a few of ...
- research-article, September 2023
A survey on transactional stream processing
The VLDB Journal — The International Journal on Very Large Data Bases (VLDB), Volume 33, Issue 2, Pages 451–479, https://doi.org/10.1007/s00778-023-00814-z Abstract: Transactional stream processing (TSP) strives to create a cohesive model that merges the advantages of both transactional and stream-oriented guarantees. Over the past decade, numerous endeavors have contributed to the evolution of TSP solutions, ...
- correction, June 2022
- research-article, January 2022
Data distribution debugging in machine learning pipelines
The VLDB Journal — The International Journal on Very Large Data Bases (VLDB), Volume 31, Issue 5, Pages 1103–1126, https://doi.org/10.1007/s00778-021-00726-w Abstract: Machine learning (ML) is increasingly used to automate impactful decisions, and the risks arising from this widespread use are garnering attention from policy makers, scientists, and the media. ML applications are often brittle with respect to ...
- research-article, August 2021
RDFFrames: knowledge graph access for machine learning tools
The VLDB Journal — The International Journal on Very Large Data Bases (VLDB), Volume 31, Issue 2, Pages 321–346, https://doi.org/10.1007/s00778-021-00690-5 Abstract: Knowledge graphs represented as RDF datasets are integral to many machine learning applications. RDF is supported by a rich ecosystem of data management systems and tools, most notably RDF database systems that provide a SPARQL query interface. ...
- research-article, August 2021
PrefixFPM: a parallel framework for general-purpose mining of frequent and closed patterns
The VLDB Journal — The International Journal on Very Large Data Bases (VLDB), Volume 31, Issue 2, Pages 253–286, https://doi.org/10.1007/s00778-021-00687-0 Abstract: A frequent pattern is a substructure that appears in a database with frequency (a.k.a. support) no less than a user-specified threshold, while a closed pattern is one that has no super-pattern that has the same support. Here, a substructure can ...
- research-article, August 2021
G-thinker: a general distributed framework for finding qualified subgraphs in a big graph with load balancing
The VLDB Journal — The International Journal on Very Large Data Bases (VLDB), Volume 31, Issue 2, Pages 287–320, https://doi.org/10.1007/s00778-021-00688-z Abstract: Finding from a big graph those subgraphs that satisfy certain conditions is useful in many applications such as community detection and subgraph matching. These problems have a high time complexity, but existing systems that attempt to scale them ...
- research-article, June 2021
Tidy Tuples and Flying Start: fast compilation and fast execution of relational queries in Umbra
The VLDB Journal — The International Journal on Very Large Data Bases (VLDB), Volume 30, Issue 5, Pages 883–905, https://doi.org/10.1007/s00778-020-00643-4 Abstract: Although compiling queries to efficient machine code has become a common approach for query execution, a number of newly created database system projects still refrain from using compilation. It is sometimes claimed that the intricacies of code ...
- research-article, May 2021
Formal semantics and high performance in declarative machine learning using Datalog
The VLDB Journal — The International Journal on Very Large Data Bases (VLDB), Volume 30, Issue 5, Pages 859–881, https://doi.org/10.1007/s00778-021-00665-6 Abstract: With an escalating arms race to adopt machine learning (ML) in diverse application domains, there is an urgent need to support declarative machine learning over distributed data platforms. Toward this goal, a new framework is needed where users ...
- research-article, December 2019
DB: bolt-on versioning for relational databases (extended version)
- research-article, March 2022
Interleaving with coroutines: a systematic and practical approach to hide memory latency in index joins
The VLDB Journal — The International Journal on Very Large Data Bases (VLDB), Volume 28, Issue 4, Pages 451–471, https://doi.org/10.1007/s00778-018-0533-6 Abstract: Index joins present a case of pointer-chasing code that causes data cache misses. In principle, we can hide these cache misses by overlapping them with computation: The lookups involved in an index join are parallel tasks whose execution can be ...
- article, December 2018
Generating custom code for efficient query execution on heterogeneous processors
The VLDB Journal — The International Journal on Very Large Data Bases (VLDB), Volume 27, Issue 6, Pages 797–822, https://doi.org/10.1007/s00778-018-0512-y Abstract: Processor manufacturers build increasingly specialized processors to mitigate the effects of the power wall in order to deliver improved performance. Currently, database engines have to be manually optimized for each processor which is a costly and ...
- article, October 2012
SCOPE: parallel databases meet MapReduce
The VLDB Journal — The International Journal on Very Large Data Bases (VLDB), Volume 21, Issue 5, Pages 611–636, https://doi.org/10.1007/s00778-012-0280-z Abstract: Companies providing cloud-scale data services have increasing needs to store and analyze massive data sets, such as search logs, click streams, and web graph data. For cost and performance reasons, processing is typically done on large clusters of tens ...
- article, October 2012
On the optimization of schedules for MapReduce workloads in the presence of shared scans
- Joel Wolf,
- Andrey Balmin,
- Deepak Rajan,
- Kirsten Hildrum,
- Rohit Khandekar,
- Sujay Parekh,
- Kun-Lung Wu,
- Rares Vernica
The VLDB Journal — The International Journal on Very Large Data Bases (VLDB), Volume 21, Issue 5, Pages 589–609, https://doi.org/10.1007/s00778-012-0279-5 Abstract: We consider MapReduce clusters designed to support multiple concurrent jobs, concentrating on environments in which the number of distinct datasets is modest relative to the number of jobs. In such scenarios, many individual datasets are likely to be ...
- article, April 2010
A framework for testing DBMS features
The VLDB Journal — The International Journal on Very Large Data Bases (VLDB), Volume 19, Issue 2, Pages 203–230, https://doi.org/10.1007/s00778-009-0157-y Abstract: Testing a specific feature of a DBMS requires controlling the inputs and outputs of the operators in the query execution plan. However, that is practically difficult to achieve because the inputs/outputs of a query depend on the content of the test ...