Logbase: A scalable log-structured database system in the cloud

HT Vo, S Wang, D Agrawal, G Chen, BC Ooi - arXiv preprint arXiv …, 2012 - arxiv.org
arXiv preprint arXiv:1207.0140, 2012
Numerous applications such as financial transactions (e.g., stock trading) are write-heavy in nature. The shift from reads to writes in web applications has also been accelerating in recent years. Write-ahead logging is a common approach for providing recovery capability while improving performance in most storage systems. However, the separation of log and application data incurs write overheads in write-heavy environments and hence adversely affects the write throughput and recovery time of the system. In this paper, we introduce LogBase, a scalable log-structured database system that adopts log-only storage to remove the write bottleneck and support fast system recovery. LogBase is designed to be dynamically deployed on commodity clusters to take advantage of the elastic scaling property of cloud environments. LogBase provides in-memory multiversion indexes to support efficient access to data maintained in the log. LogBase also supports transactions that bundle read and write operations spanning multiple records. We implemented the proposed system and compared it with HBase and a disk-based log-structured record-oriented system modeled after RAMCloud. The experimental results show that LogBase is able to provide sustained write throughput, efficient data access out of the cache, and effective system recovery.
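
The idea of log-only storage with an in-memory multiversion index can be illustrated with a minimal single-process sketch. This is not LogBase's implementation: the class name `LogOnlyStore`, the record layout, the integer version counter, and the `rebuild_index` scan are all assumptions made for illustration, and an in-memory buffer stands in for the log file. The sketch only shows that records are written once, to the log, and that reads go through an index mapping each key to the log offsets of its versions.

```python
import io
import struct
from bisect import bisect_right
from collections import defaultdict


class LogOnlyStore:
    """Toy log-only store (hypothetical, for illustration only).

    Records live only in an append-only log; an in-memory index maps
    each key to the (version, log offset) pairs of its versions.
    """

    _HEADER = struct.Struct(">QII")  # version, key length, value length

    def __init__(self):
        self._log = io.BytesIO()         # stand-in for a log file on disk
        self._index = defaultdict(list)  # key -> [(version, offset)], ascending
        self._next_version = 1

    def put(self, key, value):
        """Append one record to the log and index it; return its version."""
        version = self._next_version
        self._next_version += 1
        raw_key = key.encode("utf-8")
        offset = self._log.seek(0, io.SEEK_END)
        header = self._HEADER.pack(version, len(raw_key), len(value))
        self._log.write(header + raw_key + value)
        self._index[key].append((version, offset))
        return version

    def get(self, key, as_of=None):
        """Return the latest value, or the newest one with version <= as_of."""
        versions = self._index.get(key)
        if not versions:
            return None
        if as_of is None:
            _, offset = versions[-1]
        else:
            pos = bisect_right(versions, (as_of, float("inf")))
            if pos == 0:
                return None
            _, offset = versions[pos - 1]
        return self._read_value(offset)

    def _read_value(self, offset):
        self._log.seek(offset)
        _, key_len, value_len = self._HEADER.unpack(
            self._log.read(self._HEADER.size))
        self._log.seek(key_len, io.SEEK_CUR)  # skip over the key bytes
        return self._log.read(value_len)

    def rebuild_index(self):
        """Recreate the index by scanning the log from the start."""
        self._index.clear()
        end = self._log.seek(0, io.SEEK_END)
        offset, max_version = 0, 0
        while offset < end:
            self._log.seek(offset)
            version, key_len, value_len = self._HEADER.unpack(
                self._log.read(self._HEADER.size))
            key = self._log.read(key_len).decode("utf-8")
            self._index[key].append((version, offset))
            max_version = max(max_version, version)
            offset += self._HEADER.size + key_len + value_len
        self._next_version = max_version + 1
```

In this sketch a write touches the data exactly once, because the log is the only copy, and recovery amounts to repopulating the index from the log. Whether LogBase rebuilds its indexes this way is not stated in the abstract.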