DOI: 10.1145/1142351.1142352

Principles of dataspace systems

Published: 26 June 2006

Abstract

The most acute information management challenges today stem from organizations relying on a large number of diverse, interrelated data sources, but having no means of managing them in a convenient, integrated, or principled fashion. These challenges arise in enterprise and government data management, digital libraries, "smart" homes and personal information management. We have proposed dataspaces as a data management abstraction for these diverse applications and DataSpace Support Platforms (DSSPs) as systems that should be built to provide the required services over dataspaces. Unlike data integration systems, DSSPs do not require full semantic integration of the sources in order to provide useful services. This paper lays out specific technical challenges to realizing DSSPs and ties them to existing work in our field. We focus on query answering in DSSPs, the DSSP's ability to introspect on its content, and the use of human attention to enhance the semantic relationships in a dataspace.

Published In

PODS '06: Proceedings of the twenty-fifth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
June 2006
382 pages
ISBN:1595933182
DOI:10.1145/1142351

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. data integration
  2. dataspaces
  3. information retrieval and databases
  4. personal information management

Conference

SIGMOD/PODS06

Acceptance Rates

PODS '06 Paper Acceptance Rate 35 of 185 submissions, 19%;
Overall Acceptance Rate 642 of 2,707 submissions, 24%
