- short-paper, October 2024
Bubble Sketch: A High-performance and Memory-efficient Sketch for Finding Top-k Items in Data Streams
- Lu Cao,
- Qilong Shi,
- Yuxi Liu,
- Hanyue Zheng,
- Yao Xin,
- Wenjun Li,
- Tong Yang,
- Yangyang Wang,
- Yang Xu,
- Weizhe Zhang,
- Mingwei Xu
CIKM '24: Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, Pages 3653–3657, https://doi.org/10.1145/3627673.3679882
Sketch algorithms are crucial for identifying top-k items in large-scale data streams. Existing methods often compromise between performance and accuracy, unable to efficiently handle increasing data volumes with limited memory. We present Bubble Sketch, ...
- research-article, September 2024
Discovering Top-k Relevant and Diversified Rules
Proceedings of the ACM on Management of Data (PACMMOD), Volume 2, Issue 4, Article No.: 195, Pages 1–28, https://doi.org/10.1145/3677131
This paper studies the problem of discovering top-k relevant and diversified rules. Given a real-life dataset, it is to mine a set of k rules that are as close to users' interest as possible, and meanwhile, as diverse to each other as possible. It aims ...
- research-article, May 2024
Query Refinement for Diverse Top-k Selection
Proceedings of the ACM on Management of Data (PACMMOD), Volume 2, Issue 3, Article No.: 166, Pages 1–27, https://doi.org/10.1145/3654969
Database queries are often used to select and rank items as decision support for many applications. As automated decision-making tools become more prevalent, there is a growing recognition of the need to diversify their outcomes. In this paper, we define ...
- research-article, June 2023
rkHit: Representative Query with Uncertain Preference
Proceedings of the ACM on Management of Data (PACMMOD), Volume 1, Issue 2, Article No.: 126, Pages 1–26, https://doi.org/10.1145/3589271
A top-k query retrieves the k tuples with highest scores according to a user preference, defined as a scoring function. It is difficult for a user to precisely specify the scoring function. Instead, obtaining the distribution on scoring functions, i.e., ...
- research-article, May 2023
Discovering Top-k Rules using Subjective and Objective Criteria
Proceedings of the ACM on Management of Data (PACMMOD), Volume 1, Issue 1, Article No.: 70, Pages 1–29, https://doi.org/10.1145/3588924
This paper studies two questions about rule discovery. Can we characterize the usefulness of rules using quantitative criteria? How can we discover rules using those criteria? As a testbed, we consider entity enhancing rules (REEs), which subsume common ...
- research-article, August 2022
Searching Top-K Similar Moving Videos
HP3C '22: Proceedings of the 6th International Conference on High Performance Compilation, Computing and Communications, Pages 168–174, https://doi.org/10.1145/3546000.3546026
The application of sensors enables mobile devices to generate amounts of content-aware data, such as trajectory, gyro, and video data. Moving video is an emerging new type of moving object that can provide a potential source for geo-referenced ...
- research-article, June 2022
Adaptive Threshold Sampling
SIGMOD '22: Proceedings of the 2022 International Conference on Management of Data, Pages 1612–1625, https://doi.org/10.1145/3514221.3526122
Sampling is a fundamental problem in computer science and statistics. However, for a given task and stream, it is often not possible to choose good sampling probabilities in advance. We derive a general framework for adaptively changing the sampling ...
- short-paper, June 2022
Everest: A Top-K Deep Video Analytics System
SIGMOD '22: Proceedings of the 2022 International Conference on Management of Data, Pages 2357–2360, https://doi.org/10.1145/3514221.3520151
The impressive accuracy of deep neural networks (DNNs) has created great demands on practical analytics over video data. Although efficient and accurate, the latest video analytic systems have not supported analytics beyond selection and aggregation ...
- research-article, January 2022
Adaptive query relaxation and top‐k result sorting of fuzzy spatiotemporal data based on XML
International Journal of Intelligent Systems (IJIS), Volume 37, Issue 3, Pages 2502–2520, https://doi.org/10.1002/int.22781
With the increasing popularity of Extensible Markup Language (XML) for data representation, there is a lot of interest in searching XML data. Due to the structural heterogeneity of XML, it is daunting for users to formulate exact queries and ...
- research-article, November 2021
ParTBC: Faster Estimation of Top-k Betweenness Centrality Vertices on GPU
ACM Transactions on Design Automation of Electronic Systems (TODAES), Volume 27, Issue 2, Article No.: 12, Pages 1–25, https://doi.org/10.1145/3486613
Betweenness centrality (BC) is a popular centrality measure, based on shortest paths, used to quantify the importance of vertices in networks. It is used in a wide array of applications including social network analysis, community detection, clustering, ...
- research-article, August 2020
On Sampling Top-K Recommendation Evaluation
KDD '20: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Pages 2114–2124, https://doi.org/10.1145/3394486.3403262
Recently, Rendle has warned that the use of sampling-based top-k metrics might not suffice. This throws a number of recent studies on deep learning-based recommendation algorithms, and classic non-deep-learning algorithms using such a metric, into ...
- research-article, June 2020
Efficient Indexes for Diverse Top-k Range Queries
PODS '20: Proceedings of the 39th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, Pages 213–227, https://doi.org/10.1145/3375395.3387667
Let P be a set of n (non-negatively) weighted points in R^d. We consider the problem of computing a subset of (at most) k diverse and high-valued points of P that lie inside a query range, a problem relevant to many areas such as search engines, ...
- research-article, May 2020
External Merge Sort for Top-K Queries: Eager input filtering guided by histograms
SIGMOD '20: Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data, Pages 2423–2437, https://doi.org/10.1145/3318464.3389729
Business intelligence and web log analysis workloads often use queries with top-k clauses to produce the most relevant results. Values of k range from small to rather large and sometimes the requested output exceeds the capacity of the available main ...
- short-paper, May 2020
Optimal Join Algorithms Meet Top-k
SIGMOD '20: Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data, Pages 2659–2665, https://doi.org/10.1145/3318464.3383132
Top-k queries have been studied intensively in the database community and they are an important means to reduce query cost when only the "best" or "most interesting" results are needed instead of the full output. While some optimality results exist, ...
- research-article, June 2019
Top-k Queries over Digital Traces
SIGMOD '19: Proceedings of the 2019 International Conference on Management of Data, Pages 954–971, https://doi.org/10.1145/3299869.3319857
Recent advances in social and mobile technology have enabled an abundance of digital traces (in the form of mobile check-ins, association of mobile devices to specific WiFi hotspots, etc.) revealing the physical presence history of diverse sets of ...
- research-article, June 2019
RRR: Rank-Regret Representative
SIGMOD '19: Proceedings of the 2019 International Conference on Management of Data, Pages 263–280, https://doi.org/10.1145/3299869.3300080
Selecting the best items in a dataset is a common task in data exploration. However, the concept of "best" lies in the eyes of the beholder: different users may consider different attributes more important, and hence arrive at different rankings. ...
- research-article, June 2019
Designing Fair Ranking Schemes
SIGMOD '19: Proceedings of the 2019 International Conference on Management of Data, Pages 1259–1276, https://doi.org/10.1145/3299869.3300079
Items from a database are often ranked based on a combination of criteria. The weight given to each criterion in the combination can greatly affect the fairness of the produced ranking, for example, preferring men over women. A user may have the ...
- research-article, February 2020
Fast top-k search with relaxed graph simulation
Graph pattern matching has been widely used in large spectrum of real applications. In this context, different models along with their appropriate algorithms have been proposed. However, a major drawback on existing models is their limitation to find ...
- research-article, April 2018
Any-k: Anytime Top-k Tree Pattern Retrieval in Labeled Graphs
- Xiaofeng Yang,
- Deepak Ajwani,
- Wolfgang Gatterbauer,
- Patrick K. Nicholson,
- Mirek Riedewald,
- Alessandra Sala
WWW '18: Proceedings of the 2018 World Wide Web Conference, Pages 489–498, https://doi.org/10.1145/3178876.3186115
Many problems in areas as diverse as recommendation systems, social network analysis, semantic search, and distributed root cause analysis can be modeled as pattern search on labeled graphs (also called "heterogeneous information networks" or HINs). ...
- research-article, May 2017
Efficient Computation of Top-k Frequent Terms over Spatio-temporal Ranges
SIGMOD '17: Proceedings of the 2017 ACM International Conference on Management of Data, Pages 1227–1241, https://doi.org/10.1145/3035918.3064032
The wide availability of tracking devices has drastically increased the role of geolocation in social networks, resulting in new commercial applications; for example, marketers can identify current trending topics within a region of interest and focus ...