- research-article, June 2016
Automatic Entity Recognition and Typing in Massive Text Data
SIGMOD '16: Proceedings of the 2016 International Conference on Management of Data, June 2016, Pages 2235–2239
https://doi.org/10.1145/2882903.2912567
In today's computerized and information-based society, individuals are constantly presented with vast amounts of text data, ranging from news articles, scientific publications, product reviews, to a wide range of textual information from social media. ...
- research-article, June 2016
Interactive and Deterministic Data Cleaning
SIGMOD '16: Proceedings of the 2016 International Conference on Management of Data, June 2016, Pages 893–907
https://doi.org/10.1145/2882903.2915242
We present Falcon, an interactive, deterministic, and declarative data cleaning system, which uses SQL update queries as the language to repair data. Falcon does not rely on the existence of a set of pre-defined data quality rules. On the contrary, it ...
- research-article, June 2016
Extracting Databases from Dark Data with DeepDive
SIGMOD '16: Proceedings of the 2016 International Conference on Management of Data, June 2016, Pages 847–859
https://doi.org/10.1145/2882903.2904442
DeepDive is a system for extracting relational databases from dark data: the mass of text, tables, and images that are widely collected and stored but which cannot be exploited by standard relational tools. If the information in dark data --- scientific ...
- research-article, June 2016
SQLShare: Results from a Multi-Year SQL-as-a-Service Experiment
SIGMOD '16: Proceedings of the 2016 International Conference on Management of Data, June 2016, Pages 281–293
https://doi.org/10.1145/2882903.2882957
We analyze the workload from a multi-year deployment of a database-as-a-service platform targeting scientists and data scientists with minimal database experience. Our hypothesis was that relatively minor changes to the way databases are delivered can ...
- research-article, June 2016
To Join or Not to Join?: Thinking Twice about Joins before Feature Selection
SIGMOD '16: Proceedings of the 2016 International Conference on Management of Data, June 2016, Pages 19–34
https://doi.org/10.1145/2882903.2882952
Closer integration of machine learning (ML) with data processing is a booming area in both the data management industry and academia. Almost all ML toolkits assume that the input is a single table, but many datasets are not stored as single tables due ...
- research-article, June 2016
Automatic Generation of Normalized Relational Schemas from Nested Key-Value Data
SIGMOD '16: Proceedings of the 2016 International Conference on Management of Data, June 2016, Pages 295–310
https://doi.org/10.1145/2882903.2882924
Self-describing key-value data formats such as JSON are becoming increasingly popular as application developers choose to avoid the rigidity imposed by the relational model. Database systems designed for these self-describing formats, such as MongoDB, ...