Abstract
Owing to its low cost and high performance, 3D NAND flash memory is a popular choice of storage device in edge computing platforms. One critical challenge in 3D NAND flash is the frequent data refresh needed to eliminate transient errors such as retention errors. The default raw bit error rate (DRBER) is usually used as the reference for deciding when to refresh. This method is often overly conservative and results in many unnecessary refresh operations, especially at the late-life stage. Because each refresh adds wear, flash blocks then need even more frequent refreshes to cope with exponentially increasing errors. To avoid this snowball effect on flash wear, this paper proposes to minimize the number of refresh operations by exploiting the optimal RBER (ORBER). We first develop an ORBER model by evaluating and analyzing a set of real 3D NAND flash chips. Based on this model, we propose a new refresh scheme that extends the lifetime of 3D flash memory. Experiments show that, over the lifetime of 3D NAND flash memory, the proposed method reduces the P/E cycles consumed by refresh operations by 75% on average and improves lifetime by 2.5X with marginal overhead, compared with the traditional refresh scheme.
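The threshold-based refresh decision described above can be illustrated with a minimal sketch. It is not the ORBER model or the refresh scheme of the paper, neither of which is specified in this abstract. It only shows why a conservative RBER reference triggers more refreshes than a higher one. All names (`needs_refresh`, `count_refreshes`) and all numeric values are hypothetical and chosen purely for illustration.

```python
# Hypothetical RBER references; these values are assumptions, not measurements.
CONSERVATIVE_THRESHOLD = 1e-3  # stands in for a default (conservative) reference
RELAXED_THRESHOLD = 4e-3       # stands in for a higher, less conservative reference


def needs_refresh(measured_rber: float, threshold: float) -> bool:
    """Return True when the measured RBER has reached the refresh reference."""
    return measured_rber >= threshold


def count_refreshes(rber_samples, threshold: float) -> int:
    """Count how many independent RBER checks would trigger a refresh."""
    return sum(1 for rber in rber_samples if needs_refresh(rber, threshold))


# Hypothetical RBER readings taken at successive checks of different blocks.
samples = [5e-4, 1.2e-3, 2.5e-3, 4.5e-3]

conservative_count = count_refreshes(samples, CONSERVATIVE_THRESHOLD)
relaxed_count = count_refreshes(samples, RELAXED_THRESHOLD)
print(conservative_count, relaxed_count)
```

Under these assumed values, the conservative reference triggers a refresh at three of the four checks and the relaxed one at a single check. The sketch treats each check independently and does not model how a refresh resets accumulated errors or adds wear.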
Acknowledgements
The work was supported by a grant from the Research Grants Council of the Hong Kong Special Administrative Region, China (CityU 11218720 and 11217020). On behalf of all authors, the corresponding author states that there is no conflict of interest.
Cite this article
Ye, M., Li, Q., Gao, C. et al. Stop unnecessary refreshing: extending 3D NAND flash lifetime with ORBER. CCF Trans. HPC 4, 281–301 (2022). https://doi.org/10.1007/s42514-022-00107-x
DOI: https://doi.org/10.1007/s42514-022-00107-x