- research-article, August 2009
Learning string transformations from examples
Proceedings of the VLDB Endowment (PVLDB), Volume 2, Issue 1, Pages 514–525, https://doi.org/10.14778/1687627.1687686
"Robert" and "Bob" refer to the same first name but are textually far apart. Traditional string similarity functions do not allow a flexible way to account for such synonyms, abbreviations and aliases. Recently, string transformations have been proposed ...
- research-article, June 2009
Extending autocompletion to tolerate errors
SIGMOD '09: Proceedings of the 2009 ACM SIGMOD International Conference on Management of data, Pages 707–718, https://doi.org/10.1145/1559845.1559919
Autocompletion is a useful feature when a user is doing a look up from a table of records. With every letter being typed, autocompletion displays strings that are present in the table containing as their prefix the search string typed so far. Just as ...
- demonstration, June 2008
Incorporating string transformations in record matching
SIGMOD '08: Proceedings of the 2008 ACM SIGMOD international conference on Management of data, Pages 1231–1234, https://doi.org/10.1145/1376616.1376742
Today's record matching infrastructure does not allow a flexible way to account for synonyms such as "Robert" and "Bob" which refer to the same name, and more general forms of string transformations such as abbreviations. We expand the problem of record ...
- Article, April 2008
Transformation-based Framework for Record Matching
ICDE '08: Proceedings of the 2008 IEEE 24th International Conference on Data Engineering, Pages 40–49, https://doi.org/10.1109/ICDE.2008.4497412
Today's record matching infrastructure does not allow a flexible way to account for synonyms such as "Robert" and "Bob" which refer to the same name, and more general forms of string transformations such as abbreviations. We propose a programmatic ...
- research-article, September 2007
Stop-and-restart style execution for long running decision support queries
Long running decision support queries can be resource intensive and often lead to resource contention in data warehousing systems. Today, the only real option available to the DBAs when faced with such contention is to carefully select one or more ...
- research-article, September 2007
Example-driven design of efficient record matching queries
Record matching is the task of identifying records that match the same real world entity. This is a problem of great significance for a variety of business intelligence applications. Implementations of record matching rely on exact as well as ...
- Article, June 2007
Leveraging aggregate constraints for deduplication
SIGMOD '07: Proceedings of the 2007 ACM SIGMOD international conference on Management of data, Pages 437–448, https://doi.org/10.1145/1247480.1247530
We show that aggregate constraints (as opposed to pairwise constraints) that often arise when integrating multiple sources of data, can be leveraged to enhance the quality of deduplication. However, despite its appeal, we show that the problem is ...
- Article, April 2006
A Primitive Operator for Similarity Joins in Data Cleaning
ICDE '06: Proceedings of the 22nd International Conference on Data Engineering, Page 5, https://doi.org/10.1109/ICDE.2006.9
Data cleaning based on similarities involves identification of "close" tuples, where closeness is evaluated using a variety of similarity functions chosen to suit the domain and application. Current approaches for efficiently implementing such ...
- Article, April 2006
Robust Cardinality and Cost Estimation for Skyline Operator
ICDE '06: Proceedings of the 22nd International Conference on Data Engineering, Page 64, https://doi.org/10.1109/ICDE.2006.131
Incorporating the skyline operator inside the relational engine requires solving the cardinality estimation and the cost estimation problem, hitherto unaddressed. We propose robust techniques to estimate the cardinality and the computational cost of ...
- Article, June 2005
When can we trust progress estimators for SQL queries?
SIGMOD '05: Proceedings of the 2005 ACM SIGMOD international conference on Management of data, Pages 575–586, https://doi.org/10.1145/1066157.1066223
The problem of estimating progress for long-running queries has recently been introduced. We analyze the characteristics of the progress estimation problem, from the perspective of providing robust, worst-case guarantees. Our first result is that in the ...
- Article, June 2003
On relational support for XML publishing: beyond sorting and tagging
SIGMOD '03: Proceedings of the 2003 ACM SIGMOD international conference on Management of data, Pages 611–622, https://doi.org/10.1145/872757.872831
In this paper, we study whether the need for efficient XML publishing brings any new requirements for relational query engines, or if sorting query results in the relational engine and tagging them in middleware is sufficient. We observe that the ...