DOI: 10.1145/2901318.2901324
research-article

A high performance file system for non-volatile main memory

Published: 18 April 2016

    Abstract

    Emerging non-volatile main memories (NVMMs) provide data persistence at the main memory level. To avoid the double-copy overheads among the user buffer, the OS page cache, and the storage layer, state-of-the-art NVMM-aware file systems bypass the OS page cache and copy data directly between the user buffer and the NVMM storage. However, one major drawback of existing NVMM technologies is their slow writes. As a result, such direct access for all file operations can lead to suboptimal system performance.
    In this paper, we propose HiNFS, a high performance file system for non-volatile main memory. Specifically, HiNFS uses an NVMM-aware Write Buffer policy to buffer lazy-persistent file writes in DRAM and persist them to NVMM lazily, hiding the long write latency of NVMM. For eager-persistent file writes, however, HiNFS accesses NVMM directly, and it reads file data directly from both DRAM and NVMM, since the two have similar read performance, in order to eliminate the double-copy overheads from the critical path. To ensure read consistency, HiNFS uses a combination of the DRAM Block Index and Cacheline Bitmap to track the latest data between DRAM and NVMM. Finally, HiNFS employs a Buffer Benefit Model to identify the eager-persistent file writes before issuing the write operations. Using software NVMM emulators, we evaluate HiNFS's performance with various workloads. Compared with the state-of-the-art NVMM-aware file systems PMFS and EXT4-DAX, HiNFS improves system throughput by up to 184% for filebench microbenchmarks and reduces execution time by up to 64% for data-intensive traces and macro-benchmarks, demonstrating the benefits of hiding the long write latency of NVMM.
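
The abstract does not give the layout of the DRAM Block Index or the Cacheline Bitmap, so the following is only an illustrative sketch of the general idea, not HiNFS's implementation. It assumes 4 KiB blocks, 64-byte cachelines, a dictionary as the block index, and one bitmap bit per cacheline; the class and method names (`HybridStore`, `write_lazy`, `write_eager`, `read_block`, `flush`) are hypothetical. Lazy writes land in a DRAM buffer and set bitmap bits, eager writes go straight to the NVMM stand-in, and a read merges the two by consulting the bitmap.

```python
BLOCK_SIZE = 4096                      # assumed block size in bytes
CACHELINE = 64                         # assumed cacheline size in bytes
LINES_PER_BLOCK = BLOCK_SIZE // CACHELINE


class BufferedBlock:
    """DRAM copy of one block plus a bitmap of cachelines that are newer in DRAM."""

    def __init__(self):
        self.data = bytearray(BLOCK_SIZE)
        self.bitmap = 0                # bit i set => cacheline i is latest in DRAM


def _check(offset, payload):
    # This sketch only handles cacheline-aligned writes within one block.
    assert offset % CACHELINE == 0 and len(payload) % CACHELINE == 0
    assert offset + len(payload) <= BLOCK_SIZE


def _lines(offset, payload):
    return range(offset // CACHELINE, (offset + len(payload)) // CACHELINE)


class HybridStore:
    def __init__(self, num_blocks):
        self.nvmm = bytearray(num_blocks * BLOCK_SIZE)   # stand-in for NVMM
        self.dram_index = {}           # block number -> BufferedBlock (the "index")

    def write_lazy(self, block_no, offset, payload):
        """Buffer the write in DRAM and mark the cachelines it covers."""
        _check(offset, payload)
        blk = self.dram_index.setdefault(block_no, BufferedBlock())
        blk.data[offset:offset + len(payload)] = payload
        for line in _lines(offset, payload):
            blk.bitmap |= 1 << line

    def write_eager(self, block_no, offset, payload):
        """Write directly to NVMM and clear bits so stale DRAM lines are ignored."""
        _check(offset, payload)
        base = block_no * BLOCK_SIZE + offset
        self.nvmm[base:base + len(payload)] = payload
        blk = self.dram_index.get(block_no)
        if blk is not None:
            for line in _lines(offset, payload):
                blk.bitmap &= ~(1 << line)

    def read_block(self, block_no):
        """Return the latest contents: NVMM data overlaid with DRAM-fresh lines."""
        base = block_no * BLOCK_SIZE
        out = bytearray(self.nvmm[base:base + BLOCK_SIZE])
        blk = self.dram_index.get(block_no)
        if blk is not None:
            for line in range(LINES_PER_BLOCK):
                if (blk.bitmap >> line) & 1:
                    lo = line * CACHELINE
                    out[lo:lo + CACHELINE] = blk.data[lo:lo + CACHELINE]
        return bytes(out)

    def flush(self, block_no):
        """Persist buffered cachelines to NVMM and drop the DRAM copy."""
        blk = self.dram_index.pop(block_no, None)
        if blk is None:
            return
        base = block_no * BLOCK_SIZE
        for line in range(LINES_PER_BLOCK):
            if (blk.bitmap >> line) & 1:
                lo = line * CACHELINE
                self.nvmm[base + lo:base + lo + CACHELINE] = blk.data[lo:lo + CACHELINE]
```

Tracking freshness per cacheline rather than per block means a read never has to copy a whole block into DRAM first; only the lines whose bits are set come from the buffer. How HiNFS decides which writes are eager-persistent (the Buffer Benefit Model) is not described in the abstract and is left out of this sketch.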


    Published In

    EuroSys '16: Proceedings of the Eleventh European Conference on Computer Systems
    April 2016
    605 pages
    ISBN: 9781450342407
    DOI: 10.1145/2901318
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • Research-article

    Conference

    EuroSys '16
    EuroSys '16: Eleventh EuroSys Conference 2016
    April 18 - 21, 2016
    London, United Kingdom

    Acceptance Rates

    EuroSys '16 Paper Acceptance Rate 38 of 180 submissions, 21%;
    Overall Acceptance Rate 241 of 1,308 submissions, 18%

    Article Metrics

    • Downloads (last 12 months): 86
    • Downloads (last 6 weeks): 10
    Reflects downloads up to 11 Aug 2024

    Cited By

    • (2024) Exploiting Flat Namespace to Improve File System Metadata Performance on Ultra-Fast, Byte-Addressable NVMs. ACM Transactions on Storage, 20:1 (1-47). DOI: 10.1145/3620673. Online publication date: 30-Jan-2024
    • (2023) CJFS. Proceedings of the 21st USENIX Conference on File and Storage Technologies (167-181). DOI: 10.5555/3585938.3585949. Online publication date: 21-Feb-2023
    • (2023) NICFS: a file system based on persistent memory and SmartNIC. Frontiers of Information Technology & Electronic Engineering, 24:5 (675-687). DOI: 10.1631/FITEE.2200469. Online publication date: 2-Jun-2023
    • (2023) An efficient wear-leveling-aware multi-grained allocator for persistent memory file systems. Frontiers of Information Technology & Electronic Engineering, 24:5 (688-702). DOI: 10.1631/FITEE.2200468. Online publication date: 2-Jun-2023
    • (2023) Progress on storage systems for disaggregated data centers. SCIENTIA SINICA Informationis, 53:8 (1503). DOI: 10.1360/SSI-2023-0034. Online publication date: 17-Aug-2023
    • (2023) Cache or Direct Access? Revitalizing Cache in Heterogeneous Memory File System. Proceedings of the 1st Workshop on Disruptive Memory Systems (38-44). DOI: 10.1145/3609308.3625272. Online publication date: 23-Oct-2023
    • (2023) Adaptive Management With Request Granularity for DRAM Cache Inside NAND-Based SSDs. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 42:8 (2475-2487). DOI: 10.1109/TCAD.2022.3229293. Online publication date: Aug-2023
    • (2023) A Survey of Non-Volatile Main Memory File Systems. Journal of Computer Science and Technology, 38:2 (348-372). DOI: 10.1007/s11390-023-1054-3. Online publication date: 30-Mar-2023
    • (2022) Implementation of Remote-Sensing Data Processing Platform Based on Computable Storage. Mobile Information Systems, 2022 (1-11). DOI: 10.1155/2022/6227894. Online publication date: 9-Sep-2022
    • (2022) Zallocator: A High Throughput Write-Optimized Persistent Allocator for Non-Volatile Memory. ACM Journal on Emerging Technologies in Computing Systems, 18:4 (1-20). DOI: 10.1145/3549528. Online publication date: 13-Oct-2022
