- demonstration, June 2010
MapDupReducer: detecting near duplicates over massive datasets
SIGMOD '10: Proceedings of the 2010 ACM SIGMOD International Conference on Management of data, June 2010, Pages 1119–1122
https://doi.org/10.1145/1807167.1807296
Near duplicate detection benefits many applications, e.g., on-line news selection over the Web by keyword search. The purpose of this demo is to show the design and implementation of MapDupReducer, a MapReduce based system capable of detecting near ...
- research-article, June 2010
Google fusion tables: web-centered data management and collaboration
- Hector Gonzalez,
- Alon Y. Halevy,
- Christian S. Jensen,
- Anno Langen,
- Jayant Madhavan,
- Rebecca Shapley,
- Warren Shen,
- Jonathan Goldberg-Kidon
SIGMOD '10: Proceedings of the 2010 ACM SIGMOD International Conference on Management of data, June 2010, Pages 1061–1066
https://doi.org/10.1145/1807167.1807286
It has long been observed that database management systems focus on traditional business applications, and that few people use a database management system outside their workplace. Many have wondered what it will take to enable the use of data ...
- research-article, June 2010
A comparison of join algorithms for log processing in MapReduce
SIGMOD '10: Proceedings of the 2010 ACM SIGMOD International Conference on Management of data, June 2010, Pages 975–986
https://doi.org/10.1145/1807167.1807273
The MapReduce framework is increasingly being used to analyze large volumes of data. One important type of data analysis done with MapReduce is log processing, in which a click-stream or an event log is filtered, aggregated, or mined for patterns. As ...