DOI: 10.1145/3410566.3410608
research-article

Benchmarking a distributed database design that supports patient cohort identification

Published: 25 August 2020 Publication History

Abstract

In this article we present the implementation and benchmarking of a medical information system on top of a distributed relational database system. We enhanced a distributed database system with the implementation of a clustering (based on similarity of disease terms) that induces a primary horizontal fragmentation of a data table and derived fragmentations of secondary tables. With our clustering-based fragmentation, data locality for similarity-based query answering is ensured so that data do not have to be sent unnecessarily over the network. In our benchmark we show that we achieve a significant efficiency gain when retrieving all relevant related answers.
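The fragmentation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the greedy clustering procedure, the similarity values, the threshold `alpha`, and the table and column names (`diagnoses`, `patients`, `patient_id`, `disease`) are all assumptions made for illustration. The sketch clusters disease terms by similarity, uses the clusters to split a primary table horizontally, and then derives the fragmentation of a secondary table by following the join key.

```python
from collections import defaultdict

# Hypothetical pairwise similarities between disease terms (symmetric, in [0, 1]).
SIM = {
    frozenset({"asthma", "bronchitis"}): 0.8,
    frozenset({"asthma", "fracture"}): 0.1,
    frozenset({"bronchitis", "fracture"}): 0.1,
}


def similarity(a, b):
    """Look up the assumed similarity of two terms; identical terms score 1.0."""
    if a == b:
        return 1.0
    return SIM.get(frozenset({a, b}), 0.0)


def cluster_terms(terms, alpha):
    """Greedy clustering: a term joins the first cluster whose head term is
    at least alpha-similar; otherwise it becomes the head of a new cluster."""
    heads = []
    assignment = {}
    for term in terms:
        for cluster_id, head in enumerate(heads):
            if similarity(term, head) >= alpha:
                assignment[term] = cluster_id
                break
        else:
            heads.append(term)
            assignment[term] = len(heads) - 1
    return assignment


def primary_fragmentation(rows, assignment, column="disease"):
    """Primary horizontal fragmentation: each row goes to the fragment
    of the cluster that contains its disease term."""
    fragments = defaultdict(list)
    for row in rows:
        fragments[assignment[row[column]]].append(row)
    return dict(fragments)


def derived_fragmentation(secondary_rows, primary_fragments, key):
    """Derived fragmentation: a secondary row is placed in every fragment
    that holds a primary row with the same join key."""
    derived = {}
    for fragment_id, rows in primary_fragments.items():
        keys = {row[key] for row in rows}
        derived[fragment_id] = [s for s in secondary_rows if s[key] in keys]
    return derived


# Toy data; table contents are invented for the example.
diagnoses = [
    {"patient_id": 1, "disease": "asthma"},
    {"patient_id": 2, "disease": "bronchitis"},
    {"patient_id": 3, "disease": "fracture"},
]
patients = [
    {"patient_id": 1, "name": "A"},
    {"patient_id": 2, "name": "B"},
    {"patient_id": 3, "name": "C"},
]

assignment = cluster_terms(["asthma", "bronchitis", "fracture"], alpha=0.5)
diagnosis_fragments = primary_fragmentation(diagnoses, assignment)
patient_fragments = derived_fragmentation(patients, diagnosis_fragments, "patient_id")
```

With this layout, a query for asthma that is relaxed to similar terms such as bronchitis touches only one fragment, and the matching patient rows sit in the co-located derived fragment, so no rows need to be shipped between servers to answer it.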

Cited By

  • (2023) Security and Privacy Challenges in Distributed Database Management Systems. 2023 Global Conference on Information Technologies and Communications (GCITC), pages 1-6. DOI: 10.1109/GCITC60406.2023.10426303. Online publication date: 1-Dec-2023
  • (2021) Load Balanced Semantic Aware Distributed RDF Graph. Proceedings of the 25th International Database Engineering & Applications Symposium, pages 127-133. DOI: 10.1145/3472163.3472167. Online publication date: 14-Jul-2021

Published In

IDEAS '20: Proceedings of the 24th Symposium on International Database Engineering & Applications
August 2020
252 pages
ISBN: 9781450375030
DOI: 10.1145/3410566

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. distributed database system
  2. relational databases
  3. similarity-based query answering

Qualifiers

  • Research-article

Funding Sources

  • Fraunhofer

Conference

IDEAS 2020

Acceptance Rates

IDEAS '20 Paper Acceptance Rate 27 of 57 submissions, 47%;
Overall Acceptance Rate 74 of 210 submissions, 35%

Article Metrics

  • Downloads (last 12 months): 37
  • Downloads (last 6 weeks): 0
Reflects downloads up to 30 Aug 2024.
