Abstract
The poor performance of random writes has been a cause of major concern which needs to be addressed to better utilize the potential of flash in enterprise-scale environments. We examine one of the important causes of this poor performance: the design of the flash translation layer (FTL) which performs the virtual-to-physical address translations and hides the erase-before-write characteristics of flash. We propose a complete paradigm shift in the design of the core FTL engine from the existing techniques with our Demand-Based Flash Translation Layer (DFTL) which selectively caches page- level address mappings. Our experimental evaluation using FlashSim with realistic enterprise-scale workloads endorses the utility of DFTL in enterprise-scale storage systems by demonstrating: 1) improved performance, 2) reduced garbage collection overhead and 3) better overload behavior compared with hybrid FTL schemes which are the most popular implementation methods. For example, a predominantly random-write dominant I/O trace from an OLTP application running at a large financial institution shows a 78% improvement in average response time (due to a 3-fold reduction in operations of the garbage collector), compared with the hybrid FTL scheme. Even for the well-known read-dominant TPC-H benchmark, for which DFTL introduces additional overheads, we improve system response time by 56%. Moreover, interestingly, when write-back cache on DFTL-based SSD is enabled, DFTL even outperforms the page-based FTL scheme, improving their response time by 72% in Financial trace.
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.References
Gurumurthi S, Sivasubramaniam A, Natarajan V. Disk drive roadmap from the thermal perspective: A case for dynamic thermal management. In Proc. the 32nd International Symposium on Computer Architecture, June 2005, pp.38-49.
Kim Y, Gurumurthi S, Sivasubramaniam A. Understanding the performance-temperature interactions in disk I/O of server workloads. In Proc. the 12th International Symposium on High-Performance Computer Architecture (HPCA), February 2006, pp.176-186.
Mallary M, Torabi A, Benakli M. One terabit per square inch perpendicular recording conceptual design. IEEE Transactions on Magnetics, 2002, 38(4): 1719–1724.
Chen J, Moon J. Detection signal-to-noise ratio versus bit cell aspect ratio at high areal densities. IEEE Transactions on Magnetics, 2001, 37(3): 1157–1167.
Gulati A, Merchant A, Varman P J. pClock: An arrival curve based approach for QoS guarantees in shared storage systems. In Proc. the ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems, June 2007, pp.13-24.
Tehrani S, Slaughter J M, Chen E, Durlam M, Shi J, De-Herren M. Progress and outlook for MRAM technology. IEEE Transactions on Magnetics, 1999, 35(5): 2814–2819.
Shimada Y. FeRAM: Next generation challenges and future directions. In Proc. the IEEE International Symposium on Applications of Ferroelectric, May 2007.
Leventhal A. Flash storage memory. Communications of the ACM, 2008, 51(7): 47–51.
Lee S, Moon B. Design of flash-based DBMS: An in-page logging approach. In Proc. the International Conference on Management of Data (SIGMOD), August 2007, pp.55-66.
Kim H, Ahn S. BPLRU: A buffer management scheme for improving random writes in flash storage. In Proc. the 6th USENIX Conference on File and Storage Technologies (FAST), February 2008, pp.239-252.
Small-block vs. large-block NAND flash devices. Technical 1044 Report, TN-29-07, Micron. http://www.micron.com/products/nand/technotes, Jan. 2013.
Lee S, Park D, Chung T, Lee D, Park S, Song H. A log buffer based flash translation layer using fully associative sector translation. ACM Transactions on Embedded Computing Systems, 2007, 6(3): Article No.18.
Kim Y, Taurus B, Gupta A, Urgaonkar B. FlashSim: A Simulator for NAND flash-based solid-state drives. In Proc. the International Conference on Advances in System Simulation (SIMUL), September 2009, pp.125-131.
Gupta A, Kim Y, Urgaonkar B. DFTL: A flash translation layer employing demand-based selective caching of page-level address mappings. In Proc. the 14th International Conference on Architectural Support for Programming Languages and Operating System (ASPLOS), March 2009, pp.229-240.
Hennessy J, Patterson D. Computer Architecture: A Quantitative Approach. San Francisco, USA: Morgan Kaufmann Publishers Inc., 2006.
Kim J, Kim J M, Noh S H, Min S, Cho Y. A space-efficient flash translation layer for compactflash systems. IEEE Transactions on Consumer Electronics, 2002, 48(2): 366–375.
Chung T, Park D, Park S, Lee D, Lee S, Song H. System software for flash memory: A survey. In Proc. the International Conference on Embedded and Ubiquitous Computing, August 2006, pp.394-404.
Kang J, Jo H, Kim J, Lee J. A superblock-based flash translation layer for NAND flash memory. In Proc. the 6th International Conference on Embedded Software (EMSOFT), October 2006, pp.161-170.
Lee S, Shin D, Kim Y, Kim J. LAST: Locality-aware sector translation for NAND flash memory-based storage systems. ACM SIGOPS Operating Systems Review, 2008, 42(6): 36-42.
Karedla R, Love J S, Wherry B G. Caching strategies to improve disk system performance. IEEE Transactions on Computer (TC), 1994, 27(3): 38–46.
Kawaguchi A, Nishioka S, Motoda H. A flash-memory based file system. In Proc. the Winter 1995 USENIX Technical Conference, Jan. 1995, pp.155-164.
Bucy J S, Ganger G R. The DiskSim simulation environment version 3.0 reference manual. CMU, January 2003.
Ban A. Flash file system. United States Patent 5404485, April 4, 1995.
Zhang J, Sivasubramaniam A, Franke H, Gautam N, Zhang Y, Nagar S. Synthesizing representative I/O workloads for TPC-H. In Proc. the 10th International Symposium on High Performance Computer Architecture (HPCA), Feb. 2004, pp.142-151.
Park D, Debnath B, Du D H C. A workload-aware adaptive hybrid flash translation layer with an efficient caching strategy. In Proc. the 19th IEEE Annual International Symposium on Modelling, Analysis, and Simulation of Computer and Telecommunication Systems, July 2011, pp.248-255.
Budilovsky E, Toledo S, Zuck A. Prototyping a high-performance low-cost solid-state disk. In Proc. the 4th Annual International Conference on Systems and Storage, May 30-June 1, 2011, Article No. 13.
Choudhuri S, Givargis T. Performance improvement of block based NAND flash translation layer. In Proc. the 15th IEEE/ACM International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS), September 30-October 3, 2007, pp.257-262.
Wu C H, Kuo T W, Chang L P. An efficient B-tree layer implementation for flash-memory storage systems. ACM Trans. Embed. Comput. Syst., 2007, 6(3): Article No. 19.
Kwon H, Kim E, Choi J, Lee D, Noh S H. Janus-FTL: Finding the optimal point on the spectrum between page and block mapping schemes. In Proc. the 10th ACM International Conference on Embedded Software, Oct. 2010, pp.169-178.
Author information
Authors and Affiliations
Additional information
This research was funded in part by the Natural Science Foundation of U.S. under Grant Nos. CCF-0811670, CNS-0720456, a gift from Cisco System, Inc. and partially through the O±ce of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725.
The preliminary version of the paper was published in the Proceedings of the 14th ASPLOS.
Electronic supplementary material
Below is the link to the electronic supplementary material.
Rights and permissions
About this article
Cite this article
Kim, Y., Gupta, A. & Urgaonkar, B. A Temporal Locality-Aware Page-Mapped Flash Translation Layer. J. Comput. Sci. Technol. 28, 1025–1044 (2013). https://doi.org/10.1007/s11390-013-1395-4
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11390-013-1395-4