- research-article, October 2024
Indexing for Fine Arts Monographs: A Comparison Between User Queries and Controlled Vocabularies
Proceedings of the Association for Information Science and Technology (PRA2), Volume 61, Issue 1, Pages 1144–1146, https://doi.org/10.1002/pra2.1212
Abstract: Prior research suggested the empirical approach to indexing for fine arts based on users' search queries. This study examined the characteristics of user queries submitted to the library catalog of a fine arts academy, and compared users' search ...
- research-article, September 2024
Review of the Research on Russian Academic Journals
Scientific and Technical Information Processing (SPSTIP), Volume 51, Issue 3, Pages 226–238, https://doi.org/10.3103/S0147688224700151
Abstract: This paper presents the results of academic papers that studied Russian academic journals between 2014 and 2024. Problems of the effective development of journals have been discussed intensively for the last 2 years. This is caused mainly by ...
- research-article, August 2024
Scalable Range Search over Temporal and Numerical Expressions
ICTIR '24: Proceedings of the 2024 ACM SIGIR International Conference on Theory of Information Retrieval, Pages 91–100, https://doi.org/10.1145/3664190.3672509
Natural language expressions of time and numbers can be ambiguous (e.g., 2020s can refer to either 2021 or 2025), can be present at different granularities, or can be unbounded (e.g., more than ten percent). To match and retrieve such ambiguous temporal ...
- research-article, June 2024
Predicate Caching: Query-Driven Secondary Indexing for Cloud Data Warehouses
SIGMOD/PODS '24: Companion of the 2024 International Conference on Management of Data, Pages 347–359, https://doi.org/10.1145/3626246.3653395
Cloud data warehouses are today's standard for analytical query processing. Multiple cloud vendors offer state-of-the-art systems, such as Amazon Redshift. We have observed that customer workloads experience highly repetitive query patterns, i.e., users ...
- research-article, May 2024, Best Paper
Implementation Strategies for Views over Property Graphs
Proceedings of the ACM on Management of Data (PACMMOD), Volume 2, Issue 3, Article No.: 146, Pages 1–26, https://doi.org/10.1145/3654949
The need to query complex interactions and relationships has motivated interest in property graph database platforms. For some graph applications, graph views are required to abstract the data, e.g., to capture individual-level vs. organization-level ...
- research-article, March 2024
LIT: Lightning-fast In-memory Temporal Indexing
Proceedings of the ACM on Management of Data (PACMMOD), Volume 2, Issue 1, Article No.: 20, Pages 1–27, https://doi.org/10.1145/3639275
We study the problem of temporal database indexing, i.e., indexing versions of a database table in an evolving database. With the larger and cheaper memory chips nowadays, we can afford to keep track of all versions of an evolving table in memory. This ...
- research-article, November 2023
SH2O: Efficient Data Access for Work-Sharing Databases
Proceedings of the ACM on Management of Data (PACMMOD), Volume 1, Issue 3, Article No.: 220, Pages 1–26, https://doi.org/10.1145/3617340
Interactive applications require processing tens to hundreds of concurrent analytical queries within tight time constraints. In such setups, where high concurrency causes contention, work-sharing databases are critical for improving scalability and for ...
- research-article, November 2023
AirIndex: Versatile Index Tuning Through Data and Storage
Proceedings of the ACM on Management of Data (PACMMOD), Volume 1, Issue 3, Article No.: 204, Pages 1–26, https://doi.org/10.1145/3617308
The end-to-end lookup latency of a hierarchical index---such as a B-tree or a learned index---is determined by its structure such as the number of layers, the kinds of branching functions appearing in each layer, the amount of data we must fetch from ...
- short-paper, November 2023
Exploring early vocal music and its lute arrangements: Using F-TEMPO as a musicological tool
DLfM '23: Proceedings of the 10th International Conference on Digital Libraries for Musicology, Pages 77–81, https://doi.org/10.1145/3625135.3625142
In its earliest state, F-TEMPO (Full-Text searching of Early Music Prints Online) enabled searching in the musical content of about 30,000 page-images of early printed music from the British Library’s Early Music Online collection (GB-Lbl). The images ...
- keynote, October 2023
AI for Youth Sports: Democratizing Professional Sport Analytics Tools
MMSports '23: Proceedings of the 6th International Workshop on Multimedia Content Analysis in Sports, Page 1, https://doi.org/10.1145/3606038.3616170
Sports analytics is about observing, understanding and describing the game in an intelligent manner. In practice, this requires a fully automated, robust end-to-end pipeline: from visual input, to player and group activities, to player and team ...
- short-paper, October 2023
Learning Sparse Lexical Representations Over Specified Vocabularies for Retrieval
CIKM '23: Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, Pages 3865–3869, https://doi.org/10.1145/3583780.3615207
A recent line of work in first-stage Neural Information Retrieval has focused on learning sparse lexical representations instead of dense embeddings. One such work is SPLADE, which has been shown to lead to state-of-the-art results in both the in-domain ...
- research-article, October 2023
Boosting Big Brother: Attacking Search Engines with Encodings
RAID '23: Proceedings of the 26th International Symposium on Research in Attacks, Intrusions and Defenses, Pages 700–713, https://doi.org/10.1145/3607199.3607220
Search engines are vulnerable to attacks against indexing and searching via text encoding manipulation. By imperceptibly perturbing text using uncommon encoded representations, adversaries can control results across search engines for specific search ...
- Article, October 2023
CRANBERRY: Memory-Effective Search in 100M High-Dimensional CLIP Vectors
Abstract: Recent advances in cross-modal multimedia data analysis necessarily require efficient similarity search on the scales of hundreds of millions of high-dimensional vectors. We address this task by proposing the CRANBERRY algorithm that specifically ...
- research-article, October 2023
On the Maximal Independent Sets of k-mers with the Edit Distance
BCB '23: Proceedings of the 14th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics, Article No.: 42, Pages 1–6, https://doi.org/10.1145/3584371.3612982
In computational biology, k-mers and edit distance are fundamental concepts. However, little is known about the metric space of all k-mers equipped with the edit distance. In this work, we explore the structure of the k-mer space by studying its ...
- research-article, June 2023
Indexing for Keyword Search with Structured Constraints
PODS '23: Proceedings of the 42nd ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, Pages 263–275, https://doi.org/10.1145/3584372.3588663
Keyword search, which finds the documents containing all the keywords supplied by a user, has proved to be an effective approach for querying non-structured information that does not conform to any pre-set schemas. In the last two decades, a vast amount ...
- short-paper, June 2023
Video Retrieval for Everyday Scenes With Common Objects
ICMR '23: Proceedings of the 2023 ACM International Conference on Multimedia Retrieval, Pages 565–570, https://doi.org/10.1145/3591106.3592239
We propose a video retrieval system for everyday scenes with common objects. Our system exploits the predictions made by deep neural networks for image understanding tasks using natural language processing (NLP). It aims to capture the relationships ...
- research-article, May 2023
EAR-Oracle: On Efficient Indexing for Distance Queries between Arbitrary Points on Terrain Surface
Proceedings of the ACM on Management of Data (PACMMOD), Volume 1, Issue 1, Article No.: 14, Pages 1–26, https://doi.org/10.1145/3588694
Due to the advancement of geo-positioning technology, the terrain data has become increasingly popular and has drawn a lot of research effort from both academia and industry. The distance computation on the terrain surface is a fundamental and important ...
- research-article, March 2023
Efficient Document-at-a-time and Score-at-a-time Query Evaluation for Learned Sparse Representations
ACM Transactions on Information Systems (TOIS), Volume 41, Issue 4, Article No.: 96, Pages 1–28, https://doi.org/10.1145/3576922
Researchers have had much recent success with ranking models based on so-called learned sparse representations generated by transformers. One crucial advantage of this approach is that such models can exploit inverted indexes for top-k retrieval, thereby ...
- research-article, January 2023
Enriching blockchain with spatial keyword query processing
International Journal of Information and Computer Security (IJICS), Volume 22, Issue 1, Pages 91–116, https://doi.org/10.1504/ijics.2023.133369
Recently, after successfully revolutionising financial services, blockchain is now transforming a variety of other domains. However, current working abstraction requires technology to have more maturity from several key perspectives, and linear data ...
- research-article, November 2022
P-massive: a real-time search engine for a multi-terabyte mass spectrometry database
SC '22: Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, Article No.: 9, Pages 1–15
Queries of multi-TB Mass Spectrometry (MS) repositories provide deep insights into biological processes and pose challenging data processing problems. The key bottleneck for running these queries is the number of small random reads. Byte-addressable ...