Partitioning techniques for fine-grained indexing

E Wu, S Madden - 2011 IEEE 27th International Conference on …, 2011 - ieeexplore.ieee.org
2011 IEEE 27th International Conference on Data Engineering, 2011 - ieeexplore.ieee.org
Many data-intensive websites use databases that grow much faster than the rate at which users access the data. Such growing datasets lead to ever-increasing space and performance overheads for maintaining and accessing indexes. Furthermore, access is often considerably skewed, with popular users and recent data accessed much more frequently than the rest. These observations led us to design Shinobi, a system that uses horizontal partitioning in two ways: it clusters the physical data to improve query performance, and it indexes only frequently accessed data to improve insert performance. We present database design algorithms that optimally partition tables, drop indexes from partitions that are infrequently queried, and maintain these partitions as workloads change. We show a 60× performance improvement over traditionally indexed tables using a real-world query workload derived from a traffic monitoring application.
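The core idea in the abstract, indexing only the partitions that are queried often, can be illustrated with a small in-memory sketch. This is not Shinobi's implementation or its partitioning algorithm: the class names (`Partition`, `PartitionedTable`), the fixed split keys, and the hand-picked set of "hot" partitions are all assumptions made for illustration, and nothing here chooses partitions optimally or adapts to workload changes.

```python
from bisect import bisect_right
from collections import defaultdict


class Partition:
    """One horizontal partition covering keys in [lo, hi). Hypothetical helper."""

    def __init__(self, lo, hi, indexed):
        self.lo, self.hi = lo, hi
        self.rows = []  # unordered list of (key, value) rows
        # Only "hot" partitions carry an index; cold ones skip that cost.
        self.index = defaultdict(list) if indexed else None

    def insert(self, key, value):
        self.rows.append((key, value))
        if self.index is not None:
            # Index maintenance is paid only on indexed partitions.
            self.index[key].append(len(self.rows) - 1)

    def lookup(self, key):
        if self.index is not None:
            return [self.rows[i][1] for i in self.index.get(key, [])]
        # Cold partition: fall back to scanning just this partition.
        return [v for k, v in self.rows if k == key]


class PartitionedTable:
    """Range-partitioned table; `hot` is an assumed, hand-picked set of
    partition numbers to index (the paper derives this from the workload)."""

    def __init__(self, boundaries, hot):
        self.boundaries = sorted(boundaries)
        edges = [float("-inf")] + self.boundaries + [float("inf")]
        self.partitions = [
            Partition(edges[i], edges[i + 1], indexed=(i in hot))
            for i in range(len(edges) - 1)
        ]

    def _route(self, key):
        # bisect_right sends a key equal to a boundary to the partition
        # that starts at that boundary, matching the [lo, hi) convention.
        return self.partitions[bisect_right(self.boundaries, key)]

    def insert(self, key, value):
        self._route(key).insert(key, value)

    def lookup(self, key):
        return self._route(key).lookup(key)
```

For example, `PartitionedTable([100, 200], hot={2})` indexes only the partition holding keys of 200 and above (say, the most recent data). Inserts into the two older partitions skip index maintenance, and lookups there scan a single partition rather than the whole table.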