
Factorized Databases

Published: 28 September 2016

Abstract

This paper overviews factorized databases and their application to machine learning. The key observation underlying this work is that state-of-the-art relational query processing entails a high degree of redundancy in the computation and representation of query results. This redundancy can be avoided and is not necessary for subsequent analytics such as learning regression models.
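The redundancy described in the abstract can be made concrete with a small sketch. It is an illustration under stated assumptions, not the paper's algorithm: the relations R(A, B) and S(A, C), the function names, and the choice of SUM(B * C) as a stand-in for the aggregates a regression learner needs are all hypothetical. The flat join repeats every B-value once per matching C-value, while the factorized form stores each value once per join key and still supports the aggregate.

```python
from collections import defaultdict
from itertools import product


def group_by_key(rows):
    """Map each join key to the list of non-key values paired with it."""
    groups = defaultdict(list)
    for key, value in rows:
        groups[key].append(value)
    return groups


def flat_join(r, s):
    """Materialise the join of R(A, B) and S(A, C) on A as a list of tuples."""
    rg, sg = group_by_key(r), group_by_key(s)
    return [(a, b, c) for a in rg if a in sg for b, c in product(rg[a], sg[a])]


def factorized_join(r, s):
    """Represent the same join as a union over A of {a} x B-values x C-values."""
    rg, sg = group_by_key(r), group_by_key(s)
    return {a: (rg[a], sg[a]) for a in rg if a in sg}


def flat_size(tuples):
    """Number of values stored by the flat (tabular) result."""
    return sum(len(t) for t in tuples)


def factorized_size(fact):
    """Number of values stored by the factorized result: key + B-list + C-list."""
    return sum(1 + len(bs) + len(cs) for bs, cs in fact.values())


def sum_of_products(fact):
    """SUM(B * C) over the join, computed without enumerating its tuples."""
    return sum(sum(bs) * sum(cs) for bs, cs in fact.values())


# Hypothetical toy data: key 1 has 2 B-values and 3 C-values, key 2 has 1 of each.
R = [(1, 10), (1, 20), (2, 30)]
S = [(1, 1), (1, 2), (1, 3), (2, 4)]

flat = flat_join(R, S)
fact = factorized_join(R, S)
print(len(flat), flat_size(flat), factorized_size(fact), sum_of_products(fact))
```

On this toy input the flat result holds 7 tuples (21 values) while the factorized form holds 9 values, and both give 300 for SUM(B * C). The gap widens as the number of values per join key grows, since the flat size is a product per key and the factorized size is a sum.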

Published In

ACM SIGMOD Record, Volume 45, Issue 2
June 2016
66 pages
ISSN: 0163-5808
DOI: 10.1145/3003665

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article
