
Multi-Level Cell Flash Memory Storage Systems: Amarnath Gaini Sathish Mothe K Vijayalaxmi


Multi-Level Cell Flash Memory Storage Systems

Amarnath Gaini#1

Sathish Mothe*2
Associate Professor Department of Electronics JITS(27), Andhra Pradesh

K Vijayalaxmi#3
Assistant Professor Department of Electronics VITS(N9), Andhra Pradesh

amarnathgaini@gmail.com

Assistant Professor Department of Electronics VITS(N9), Andhra Pradesh

sathish.mothe@gmail.com

kchitti.401@gmail.com

Abstract: The flash memory management functions of write coalescing, space management, logical-to-physical mapping, wear leveling, and garbage collection require significant ongoing computation and data movement. MLC flash memory also introduces new challenges: (1) Pages in a block must be written sequentially. (2) Information indicating that a page has become obsolete cannot be recorded in its spare area. This paper designs an MLC Flash Translation Layer (MFTL) for flash-memory storage systems which takes the new constraints of MLC flash memory and the access behaviors of the file system into consideration. A series of trace-driven simulations is conducted to evaluate the performance of the proposed scheme. Our experimental results show that the proposed MFTL outperforms other related works in terms of the number of extra page writes, the number of total block erasures, and the memory requirement for the management.

I. INTRODUCTION
Flash memory chips are constructed from different types of cells (NOR and NAND), and with different numbers of bits per memory cell (single-level cell or SLC; and multi-level cell or MLC). These variations result in very different performance, cost, and reliability characteristics. NOR flash memory chips have much lower density, much lower bandwidth, much longer write and erase latencies, and much higher cost than NAND flash memory chips. For these reasons, NOR flash has minimal penetration in enterprise deployments; it is primarily used in consumer devices. Leading enterprise solid state drives (SSDs) are all designed with NAND flash. Another distinction in flash memory is SLC versus MLC. MLC increases density by storing more than a single bit per memory cell. With their increased density, the cost of MLC flash chips is roughly half that of SLC, but MLC write bandwidth is about half that of SLC, and MLC supports from 3 to 30 times fewer erase cycles than SLC. A new generation of SSDs incorporates special firmware that closes the performance and durability gap between SLC and MLC.

Flash-memory storage systems are normally organized in layers, as shown in Fig. 1.

Figure 1. Architecture of Flash-Memory Storage Systems

The Memory Technology Device (MTD) layer provides lower-level functionalities of flash memory, such as read, write, and erase. Based on these services, higher-level management algorithms, such as wear leveling, garbage collection, and logical/physical address translation, are implemented in the Flash Translation Layer (FTL). The objective of the FTL is to provide transparent services for file systems such that flash memory can be accessed as a block-oriented device. As an alternative approach, a file system can be made flash-memory-aware by taking the characteristics of flash memory into consideration. JFFS and YAFFS take this approach. Although a flash-memory-aware file system is more efficient, the FTL scheme has the advantage of making flash memory directly available to existing file systems. Since most applications, especially those for portable devices, access their files under FAT file systems, this paper focuses on FTL.

Depending on the granularity with which the mapping information is managed, FTL can be classified into page-level, block-level, and hybrid mapping schemes. For a page-level mapping scheme, each entry in the mapping table includes a physical page number (PPN), which is indexed by the logical page number (LPN). When a request arrives, the mapping scheme translates the associated logical address into an LPN and uses this LPN to locate the corresponding PPN from the mapping table. With the mapping table, a logical address can be directly mapped to a page in any location of a flash memory. This makes storage management more flexible and results in better performance for random accesses. Since the mapping table is large and usually resides in SRAM, a page-level mapping scheme is considered impractical due to cost considerations.

To reduce the SRAM requirement, the block-level mapping scheme was proposed. In a block-level mapping scheme, a logical address has two components: a logical block address (LBA) and a logical page offset. The LBA can be translated into a physical block address (PBA), and the logical page offset is used to locate the target page in the physical block. The size of the mapping table is now significantly reduced and proportional to the total number of blocks in flash memory. However, update requests usually incur a block-level copy overhead. Block-level mapping is shown in Fig. 2.

Figure 2: Block level mapping

A hybrid mapping scheme is a block-level mapping scheme with a limited number of records adopting page-level mapping. In a hybrid scheme, a logical address consists of an LBA and a logical page offset, as in a block-level scheme. However, unlike the block-level scheme, it allows data to be updated to the next available page in another physical block, i.e., a log block. A page mapping table is required for recording the actual locations of the updated data. Hybrid mapping schemes consume an extra amount of SRAM space for the records with page-level mapping but reduce the copying overhead of a block-level mapping scheme.

Block Associative Sector Translation (BAST) is a hybrid mapping scheme proposed by Kim et al. [4]. BAST maintains a limited number of log blocks to handle update requests. Each data block may be associated with at most one log block. When processing an update request, the associated data is appended sequentially to free pages of the corresponding log block, as long as such free pages are available. A merge operation is required when no free log block is available. Fully Associative Sector Translation (FAST) is another hybrid mapping scheme, proposed by Lee et al. [6]. FAST shares the log blocks among all of the data blocks. It allows an update request to be written to the current log block. As a result, FAST improves the utilization of log blocks and reduces the number of merge operations. Since a log block may be associated with several data blocks at the same time, merging a log block may require the merging of several data blocks. Both BAST and FAST are designed for single-level cell (SLC) flash memory.

Currently, multi-level cell (MLC) flash memory occupies the largest part of the flash memory market, due to an increasing demand for a larger-capacity and smaller-size storage medium. In MLC flash memory, each cell can store two or more bits of data. Although MLC flash memory offers a lower-cost and higher-density solution, it imposes additional stringent constraints [8]. First, the number of partial program cycles in the same page is limited to one. This constraint implies that it is no longer permitted to mark a page as dead by simply clearing some specific bit in the spare area, a technique widely adopted in previous research. Another constraint is related to the order in which write operations are performed. In MLC, free pages of a block must be written consecutively from the first free page to the target page in the block; writing in random order is prohibited. Such a constraint makes most existing block-level mapping schemes (e.g., [7]) and hybrid mapping schemes (e.g., [4], [6]) inapplicable to modern flash memory chips. Further, direct modification of these mapping schemes is not appropriate, as it will inevitably lead to a significant increase in overhead.

Motivated by the above concerns, we propose a hybrid mapping scheme for the management of modern MLC flash memory. Since the FAT file system is most commonly used for accessing flash memory, the proposed scheme also focuses on the file access behavior under the FAT file system. In the proposed scheme, the mapping granularity adjusts adaptively according to the access pattern. Our scheme is distinguished from existing solutions in that it consumes less SRAM while providing better performance. In addition, existing solutions do not take the stringent constraints of MLC flash memory into account, while our scheme applies to SLC flash memory as well.

The rest of the paper is organized as follows. Section II presents the proposed scheme in detail. Section III evaluates the performance of the proposed scheme and compares it with existing works. Section IV draws the conclusion.
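The difference between the page-level and block-level translations described in this section can be illustrated with a minimal Python sketch. The function names, the toy table contents, and the flat physical page numbering (PBA × pages per block + offset) are assumptions made for illustration; they are not taken from the paper.

```python
PAGES_PER_BLOCK = 64  # assumption: same block geometry as in the experiments


def page_level_lookup(page_table, lpn):
    """Page-level mapping: the table holds one PPN per logical page."""
    return page_table[lpn]


def block_level_lookup(block_table, lpn):
    """Block-level mapping: translate only the LBA, keep the page offset."""
    lba, offset = divmod(lpn, PAGES_PER_BLOCK)
    pba = block_table[lba]
    # The offset inside the physical block equals the logical page offset.
    return pba * PAGES_PER_BLOCK + offset


# Hypothetical toy tables.
page_table = {0: 130, 1: 7, 2: 65}  # LPN -> PPN, a page may live anywhere
block_table = {0: 5, 1: 2}          # LBA -> PBA, one entry per block

ppn_from_page_table = page_level_lookup(page_table, 1)
ppn_from_block_table = block_level_lookup(block_table, 70)  # LBA 1, offset 6
```

The block table needs one entry per block instead of one per page, which is the SRAM saving the text refers to; the price is that a page can no longer be placed at an arbitrary offset.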

II. MFTL: MLC FLASH TRANSLATION LAYER


A. Overview
When a write request arrives, MFTL calculates the corresponding virtual block address (VBA) via the logical address of the request, allocates a free (physical) flash-memory block for the virtual block, and writes data to the allocated block. The PBA of the allocated block is recorded in BlockTable[VBA]. Note that VBA is similar to LBA in block-level mapping schemes; we use a different notation to avoid confusion. Since out-of-place update is a widely adopted solution for handling data-update requests, an extra replacement block (a physical flash-memory block) is assigned to store the newly updated data. To keep track of the latest version of data, an UpdateRec is created whenever a replacement block is allocated for a virtual block. Note that we do not maintain an UpdateRec for every virtual block all the time as we do for BlockTable. In fact, as shown later in the experiments (please refer to Section III), maintaining five UpdateRecs is enough to deal with sustained read/write requests, and MFTL always keeps the latest UpdateRecs in SRAM. Since data updates might be random and the amount of data in each update might be small, UpdateRec alone is insufficient if fast data access is required.

B. Data Structure
Each UpdateRec contains the following fields:
VBA records the corresponding virtual block address.
Primary records the physical block address of the associated primary block.
Replace records the physical block address of the associated replacement block.
LWP is the index of the last written page in the associated replacement block.
Count maintains the number of backward updates that have occurred in this record. The initial value of Count is 0.
Mode is a one-bit flag indicating whether the mapping is in block mapping mode (0) or page mapping mode (1).
Priority is the priority for UpdateRec replacement. When a write/update request arrives, the Priority of the corresponding UpdateRec (if any) is set to 0xFF, while the Priority of every other UpdateRec is decreased by 1.

PageMapStatus is introduced to maintain page-mapping information for virtual blocks. When an UpdateRec switches from block mapping mode to page mapping mode, MFTL has to merge the corresponding primary block into the replacement block and then treat the resulting replacement block as the primary block. We can eliminate such overhead by switching directly to page mapping mode without a merge operation when LWP is less than half of the total pages in a block. The most significant bit (MSB) is used to indicate whether the mapped page is located in the primary block (set to 0) or in the replacement block (set to 1). Each PageMapStatus contains two elements:
VBA records the corresponding virtual block address.
MapArray[N] is an integer array keeping the page mapping information of the block, where N is the number of pages per block. A value smaller than 0x80 in MapArray[i] means the i-th logical page is located at the MapArray[i]-th page of the primary block. Otherwise, the i-th logical page is located at the (MapArray[i] AND 0x7F)-th page of the replacement block.
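The two records above can be sketched in Python as follows. This is only an illustration under stated assumptions: the field defaults (an LWP of -1 for "nothing written yet", an identity MapArray, an initial Priority of 0xFF), the clamping of Priority at 0, and the helper names `locate` and `touch` are hypothetical and not specified in the paper.

```python
from dataclasses import dataclass, field
from typing import List

PAGES_PER_BLOCK = 64   # N in MapArray[N]
IN_REPLACEMENT = 0x80  # MSB set: page lives in the replacement block
PAGE_MASK = 0x7F       # low seven bits: page index inside that block


@dataclass
class UpdateRec:
    vba: int              # virtual block address
    primary: int          # PBA of the primary block
    replace: int          # PBA of the replacement block
    lwp: int = -1         # assumption: -1 means no page written yet
    count: int = 0        # number of backward updates seen so far
    mode: int = 0         # 0 = block mapping mode, 1 = page mapping mode
    priority: int = 0xFF  # assumption: a new record starts at the top


@dataclass
class PageMapStatus:
    vba: int
    # Assumption: entry i starts as i, i.e. "page i of the primary block".
    map_array: List[int] = field(
        default_factory=lambda: list(range(PAGES_PER_BLOCK)))


def locate(pms: PageMapStatus, i: int):
    """Decode MapArray[i] into (block kind, physical page index)."""
    entry = pms.map_array[i]
    if entry < IN_REPLACEMENT:
        return ("primary", entry)
    return ("replacement", entry & PAGE_MASK)


def touch(recs: List[UpdateRec], vba: int) -> None:
    """Apply the Priority rule when a write/update request hits `vba`."""
    for rec in recs:
        if rec.vba == vba:
            rec.priority = 0xFF
        else:
            # Assumption: Priority is clamped at 0 rather than wrapping.
            rec.priority = max(0, rec.priority - 1)
```

With 64 pages per block, seven bits are enough for the page index, so one byte per MapArray entry holds both the index and the primary/replacement flag.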

C. Write Flow
When the FAT file system creates a new file, it first issues a write to the root directory to update the file information. It then updates the FAT tables to allocate the required storage space for the file. Finally, the file body is written to the allocated area. Conceptually, the storage space can be treated as containing two parts: a system area and a data area. The system area consists of a boot record, FAT tables, and a root directory, while the data area is used to store file bodies. Any write/update to a file (data area) is accompanied by several small updates to the system area to ensure correct file information. Observing the very different behavior over the system area and the data area, we adopt various mechanisms to deal with write operations in an adaptive manner. Since writing data is trivial in flash-memory management, we focus on the operation of data updating.

Upon arrival of an update request, MFTL first determines whether the corresponding UpdateRec exists by checking the VBA field of each UpdateRec. If no such UpdateRec exists, MFTL allocates a new one. If all the UpdateRecs have been assigned and the new update request does not match any of them, MFTL selects the UpdateRec with the lowest priority as a victim for replacement (ties are broken arbitrarily). Live pages in the primary block and the replacement block of the victim UpdateRec are merged into a newly allocated block, and the victim UpdateRec can then be released and reassigned to the new request. After the UpdateRec is located, MFTL determines the mapping mode of the UpdateRec. The mapping mode affects how MFTL writes data to flash memory.
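The lookup and victim-selection step just described can be sketched as below. This is a simplified illustration, not the authors' implementation: the record carries only the two fields needed here, `merge` is a hypothetical callback standing in for the merge of live pages into a newly allocated block, and the limit of five records follows the figure quoted in Section II-A.

```python
from dataclasses import dataclass
from typing import Callable, List

MAX_RECS = 5  # the text reports that five UpdateRecs are enough


@dataclass
class UpdateRec:
    vba: int
    priority: int = 0xFF  # assumption: a fresh record gets top priority


def find_or_allocate(recs: List[UpdateRec], vba: int,
                     merge: Callable[[UpdateRec], None],
                     max_recs: int = MAX_RECS) -> UpdateRec:
    """Return the UpdateRec for `vba`, evicting a victim if none is free."""
    for rec in recs:
        if rec.vba == vba:
            return rec
    if len(recs) >= max_recs:
        # Lowest priority loses; min() keeps the first one found on a tie,
        # which is one arbitrary way to break ties.
        victim = min(recs, key=lambda r: r.priority)
        merge(victim)  # hypothetical: merge primary + replacement blocks
        recs.remove(victim)
    rec = UpdateRec(vba)
    recs.append(rec)
    return rec
```

A caller would then inspect the mapping mode of the returned record to decide how the data is written.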

If the UpdateRec is in block mapping mode, some further checks are required. The update request is treated differently according to the index of its starting target page (i.e., STP) relative to the index of the last written page in the replacement block (i.e., LWP). If STP is larger than LWP, the request is treated as a forward update. Otherwise, some previously updated data is to be updated again by the request, and the request is treated as a backward update. Since backward updates incur higher page-copying overheads, page mapping mode is introduced to deal with small, random, and frequently updated data.

The Count field in UpdateRec is used to maintain the number of backward updates that have occurred in this record. When it exceeds a threshold, the mapping is switched to page mapping mode. If a backward update occurs but does not result in a switch to page mapping mode, MFTL examines whether page-copying after updating data is required. The primary reason behind page-copying after updating data is the assurance of data integrity, since the primary block or the replacement block would be reassigned in this case.

D. Read Flow
For a read request, MFTL locates the target data through either BlockTable or UpdateRecs (and PageMapStatus records, if required). When a read request arrives, MFTL first derives the corresponding VBA and STP via the logical address of the request. Then, MFTL determines whether the derived VBA matches the VBA field of any UpdateRec. If no such UpdateRec is found, the PBA of the target data can be obtained from BlockTable[VBA]. With the PBA, MFTL accesses the corresponding flash-memory block and reads the required data from page STP to the last page of the block.

Suppose that the UpdateRec is successfully found, and it is in block mapping mode. In this case, MFTL has to identify which block and how many pages shall be read. It compares the derived STP with the LWP of the UpdateRec. If STP is smaller than or equal to LWP, a portion of the required data shall be read from the corresponding replacement block. MFTL reads the required data from page STP to page LWP in the replacement block and then reads from page LWP+1 to the last page in the primary block. Otherwise, MFTL simply reads the required data from STP to the last page in the primary block.

For the case in which the corresponding UpdateRec is found and in page mapping mode, MFTL may not be able to read the required data sequentially because the live pages in the replacement block are not programmed in such a manner. MFTL must look up MapArray to determine the physical locations of the needed data. For each target page i, if the value of MapArray[i] is 0xFF, MFTL reads data from page i in the primary block. Otherwise, MFTL reads from page MapArray[i] in the replacement block.

III. PERFORMANCE EVALUATION
A. Experiment Setup
This section is meant to evaluate the performance of the proposed MFTL in terms of the number of extra page writes, the number of total block erasures, and the memory requirement for the management. We compared the proposed MFTL with two well-known hybrid mapping schemes, BAST [4] and FAST [6], under different numbers of extra blocks. Suppose the number of extra blocks is set to n; MFTL could then have up to n UpdateRecs. The maximum number of blocks reserved for random accesses in BAST could be n, while FAST would contain one sequential write (SW) log block and (n - 1) random write (RW) log blocks. The experiments were conducted by a trace-driven simulation.

Traces with various access behaviors were collected under Windows XP with the FAT16 file system over a 2GB flash memory. The flash memory chip adopted in the experiment was a Samsung K9WAG08U1B 16G-bit SLC flash memory; each block contains 64 pages, and each page is 2KB. All of the traces were captured by the SD/MMC card protocol analyzer VTE2100 [9]. Note that the FAT file system would update the root directory and FAT tables frequently to maintain the correct file information. The cluster size was defined as 16KB, where the cluster size was the minimum allocation unit for a file. TABLE I summarizes the characteristics of the seven evaluation patterns.

B. Experiment Results
Fig. 3 compares the number of extra page writes and the number of total block erasures of BAST, FAST, and the proposed MFTL under different numbers of extra blocks over a 2GB flash memory. Apart from the extreme case (i.e., the pattern 4KB-R), FAST performed well when the number of extra blocks was very limited; three extra blocks were enough to make its performance stable. This was because FAST shares its extra blocks among all logical blocks. However, the proposed MFTL outperformed both BAST and FAST for all patterns when four extra blocks could be provided. Based on our experimentation, the performance improvement was significant when the number of extra blocks was between four and five. Such results were caused by two reasons: (1) Since the FAT file system would update two FAT tables, the root directory, and the data area while writing a file to storage, four extra blocks could significantly improve the performance.

(2) Since Windows XP writes files larger than 64KB with two threads, five extra blocks might be required for updating a large file. As a result, providing more than six extra blocks would not bring an obvious improvement in extra page writes and block erasures.

TABLE I
CHARACTERISTICS OF EVALUATION PATTERNS

Figure 3: Performance Comparison of BAST/FAST/MFTL under Different Number of Extra Blocks.

For the pattern 4KB-R, as illustrated in Fig. 3(a) and Fig. 3(b), the number of extra blocks seemed to have no effect on the performance improvement. This was because a tremendous amount of random 4KB file updates would rapidly use up log blocks. For BAST, each subsequent update would trigger a merge operation, and each merge operation would incur 64 extra page writes and 2 block erasures. For FAST, since a log block was shared by all data blocks, merge operations would not be triggered as frequently as in BAST. Recall that each flash-memory block contains 64 2KB pages and the cluster size was 16KB; each merge operation would incur 8 × 64 = 512 extra page writes and 9 block erasures (compared with 8 merge operations in BAST with 512 extra page writes and 16 block erasures). For MFTL, random 4KB file updates would not switch UpdateRecs into page mapping mode, and each 4KB file update would be treated as a forward update in block mapping mode. In addition, updates to the root directory and FAT tables could always reside in page mapping mode thanks to Count and Priority in UpdateRec. As a result, FAST and MFTL could have a better performance in extra page writes and outperform BAST in block erasures.

For the pattern 64KB-S, as illustrated in Fig. 3(c) and Fig. 3(d), there was no internal fragmentation issue, and readers might expect similar performance for the three hybrid mapping schemes. However, the experiment results told a different story. For BAST and FAST, when no log block was assigned to the corresponding data block and the page offset of the write request was not aligned to the block boundary (i.e., page offset 0), a log block would be assigned and the updated data would be written to page 0 of that block. Even if the following write requests were sequential writes, BAST and FAST would still incur extra page writes and block erasures during merge operations. On the other hand, as mentioned in Section II-C, MFTL uses Count in UpdateRec to avoid switching to page mapping mode immediately. This mechanism helped UpdateRec stay in block mapping mode for sequential write requests. Thus merge overheads could be reduced, and a better performance was achieved.

For patterns SUB and MP3, since large files led to many more sequential write operations, the number of merge operations was largely reduced. As shown in Fig. 3(f) and Fig. 3(h), the reduction in block erasures was not as significant as for patterns 4KB-R and 64KB-S; MFTL achieved a better performance on block erasures and extra page writes compared with the other two schemes for large file patterns when at least four extra blocks were provided. Note that in the pattern MP3 with at least four extra blocks, MFTL outperformed BAST and FAST in terms of extra page writes. This was because MFTL uses Count in UpdateRec to avoid switching to page mapping mode immediately. With a small overhead of page copying, this mechanism helped UpdateRec stay in block mapping mode. As a result, merge overhead could be reduced when an update request does not begin from page 0 of some block.

For the memory requirements of the three hybrid mapping schemes, since block mapping tables were the same for each scheme, we focused on the memory requirement for managing extra blocks. TABLE II lists the memory requirements for each extra block and each MapArray under the three different schemes. BAST reserved one more byte to store information for the selection of a victim log block when a merge operation is triggered. MFTL reserved three extra bytes for the fields LWP, Count, Mode, and Priority. For page mapping, BAST required an array for each log block to keep track of the page mapping information. Since each block contained 64 pages, 64 entries were required for the array, and each entry was one byte. FAST required an array for each RW block. Since each RW block was shared by all data blocks, each entry of the array required three bytes to maintain complete information. MFTL did not reserve an array for each UpdateRec. Thus MFTL required two more bytes to record the corresponding VBA.

TABLE II SRAM REQUIREMENTS FOR DIFFERENT SCHEMES (UNIT: BYTES).
TABLE III SRAM REQUIREMENTS FOR MANAGING UPDATE REQUESTS WITH DIFFERENT NUMBER OF EXTRA BLOCKS. (UNIT: BYTES)
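The arithmetic quoted in this section for pattern 4KB-R and for the per-block page-mapping arrays can be checked with a short Python sketch. One reading is assumed here and is not stated explicitly in the text: the factor 8 is taken to be the number of 16KB clusters that fit in a 64-page log block, so that a full FAST log block is associated with 8 data blocks. The variable names are hypothetical.

```python
PAGES_PER_BLOCK = 64
PAGE_KB = 2
CLUSTER_KB = 16

# Assumption: one cluster-sized update per data block, so a full log block
# ends up associated with this many data blocks.
pages_per_cluster = CLUSTER_KB // PAGE_KB
data_blocks_per_log = PAGES_PER_BLOCK // pages_per_cluster

# FAST: one merge rewrites every associated data block, then erases the
# old data blocks plus the shared log block.
fast_page_writes = data_blocks_per_log * PAGES_PER_BLOCK
fast_erasures = data_blocks_per_log + 1

# BAST: the same work takes one merge per data block, each costing
# 64 page writes and 2 erasures (data block + its own log block).
bast_page_writes = data_blocks_per_log * PAGES_PER_BLOCK
bast_erasures = data_blocks_per_log * 2

# Page-mapping arrays: 64 entries per block, 1 byte each for BAST and
# 3 bytes each for FAST, as stated in the text.
bast_array_bytes = PAGES_PER_BLOCK * 1
fast_array_bytes = PAGES_PER_BLOCK * 3

print(fast_page_writes, fast_erasures, bast_page_writes, bast_erasures)
print(bast_array_bytes, fast_array_bytes)
```

Under this reading the two schemes write the same number of extra pages, and the difference lies in the erasure count, which matches the comparison made for Fig. 3(b).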

TABLE III lists the memory requirements for these schemes with respect to different numbers of extra blocks. As shown in the table, when the number of extra blocks is five, MFTL can save about 31.55% memory space compared with BAST and up to 69.55% compared with FAST. These savings can be as high as 59.44% and 83.89% when ten extra blocks are provided by the system.

IV. CONCLUSION
Flash technology is constantly changing, providing faster program and erase cycles, a larger number of guaranteed erase and re-program cycles, and longer data retention. This paper proposes a management scheme for MLC flash memory storage systems. Observing that most existing user applications access NAND flash memory under the FAT file system, this paper takes the new constraints of MLC flash memory and the access behaviors of the FAT file system into consideration. The proposed MFTL scheme treats different access behaviors in an adaptive manner. It adopts a hybrid mapping approach to balance performance and SRAM requirement. An extensive set of experiments has been conducted to evaluate the performance of the proposed MFTL. Our experiment results show that the proposed scheme outperforms BAST and FAST in terms of extra page writes and block erasures, while the SRAM requirement is largely reduced. For future work, we shall take crash recovery and wear leveling issues into our design and survey other newly announced hybrid mapping schemes, e.g., LAST and KAST, for comparison.

REFERENCES
[1] Understanding the Flash Translation Layer (FTL) Specification. Technical report, Intel Corporation, Dec 1998.
[2] Aleph One Company. Yet Another Flash Filing System.
[3] H. Cho, D. Shin, and Y. I. Eom. KAST: K-Associative Sector Translation for NAND Flash Memory in Real-Time Systems. In Design, Automation and Test in Europe (DATE), pages 507-512, April 2009.
[4] J. Kim, J. M. Kim, S. H. Noh, S. L. Min, and Y. Cho. A Space-Efficient Flash Translation Layer for CompactFlash Systems. In IEEE Transactions on Consumer Electronics, Vol. 48, No. 2, pages 366-375,
