Semantic embedding for regions of interest
The available spatial data are rapidly growing and also diversifying. One may obtain in large quantities information such as annotated points/places of interest (POIs), check-in comments on those POIs, geo-tagged microblog comments, and demarked ...
Comparison and evaluation of state-of-the-art LSM merge policies
Modern NoSQL database systems use log-structured merge (LSM) storage architectures to support high write throughput. LSM architectures aggregate writes in a mutable MemTable (stored in memory), which is regularly flushed to disk, creating a new ...
Cache-efficient sweeping-based interval joins for extended Allen relation predicates
We develop a family of efficient plane-sweeping interval join algorithms for evaluating a wide range of interval predicates such as Allen’s relationships and parameterized relationships. Our technique is based on a framework, components of which ...
Better database cost/performance via batched I/O on programmable SSD
Data should be placed at the most cost- and performance-effective tier in the storage hierarchy. While performance and cost decrease with distance from the CPU, the cost/performance trade-off depends on how efficiently data can be moved across ...
Cleaning timestamps with temporal constraints
Timestamps are often found to be dirty in various scenarios, e.g., in distributed systems with clock synchronization problems or unreliable RFID readers. Without cleaning the imprecise timestamps, temporal-related applications such as provenance ...