Design and Implementation of Various File Deduplication Schemes on Storage Devices

Su, Kuan-Wu; Leu, Jenq-Shiou; Yu, Min-Chieh; Wu, Yong-Ting; Lee, Eau-Chung; Song, Tian

doi:10.1007/s11036-016-0677-9

Published: 15 January 2016

Volume 22, pages 40–50, (2017)
Mobile Networks and Applications

Kuan-Wu Su¹, Jenq-Shiou Leu¹, Min-Chieh Yu¹, Yong-Ting Wu¹, Eau-Chung Lee² & Tian Song³

386 Accesses
6 Citations

Abstract

As smart devices have advanced rapidly in recent years, people generate enormous amounts of data of various sizes in their daily lives and store them in local or remote file systems. Cheaper, easy-to-use private cloud storage appliances help meet the growing demand for storing and sharing large volumes of data, and effective file deduplication schemes can greatly increase the space efficiency of such private cloud storage systems while also conserving network bandwidth. In this paper, we design and implement several file deduplication schemes built into a private cloud storage appliance, based on different duplicate-checking rules: file name, file size, and partial or full content hash value. Experimental results show that the file deduplication scheme based on partial content hashing achieves a reasonably balanced performance without overutilizing the limited local computational resources.
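The four duplicate-checking rules named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of SHA-256, the choice of hashing a file's leading bytes for the partial rule, and the 4096-byte prefix length (`PARTIAL_BYTES`) are all assumptions made here for illustration, since the abstract does not specify them.

```python
import hashlib
import os

PARTIAL_BYTES = 4096  # hypothetical prefix length; the abstract gives no value


def content_hash(path, limit=None):
    """Hash the first `limit` bytes of a file, or the whole file if limit is None."""
    digest = hashlib.sha256()  # assumed hash function, not taken from the paper
    remaining = limit
    with open(path, "rb") as f:
        while remaining is None or remaining > 0:
            size = 65536 if remaining is None else min(65536, remaining)
            chunk = f.read(size)
            if not chunk:
                break
            digest.update(chunk)
            if remaining is not None:
                remaining -= len(chunk)
    return digest.hexdigest()


def dedup_key(path, rule):
    """Return the value two files must share to be treated as duplicates."""
    if rule == "name":
        return os.path.basename(path)
    if rule == "size":
        return os.path.getsize(path)
    if rule == "partial_hash":
        return content_hash(path, PARTIAL_BYTES)
    if rule == "full_hash":
        return content_hash(path)
    raise ValueError(f"unknown rule: {rule}")


def find_duplicates(paths, rule):
    """Group paths by their dedup key and keep groups with more than one file."""
    groups = {}
    for p in paths:
        groups.setdefault(dedup_key(p, rule), []).append(p)
    return [g for g in groups.values() if len(g) > 1]
```

In this sketch, the name and size rules need only file metadata, the full-hash rule reads every byte, and the partial-hash rule reads at most `PARTIAL_BYTES` per file. Two files that differ only after the hashed prefix would be reported as duplicates under the partial rule, which is the kind of accuracy-versus-cost trade-off the abstract alludes to.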

Acknowledgments

The authors gratefully acknowledge the financial support from the “Aiming For the Top University Program” funded by the Ministry of Education, Taiwan.

Author information

Authors and Affiliations

Department of Electronic and Computer Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan
Kuan-Wu Su, Jenq-Shiou Leu, Min-Chieh Yu & Yong-Ting Wu
QNAP Inc, Taipei, Taiwan
Eau-Chung Lee
Department of Electrical and Electronic Engineering, School of Engineering, Tokushima University, Tokushima, Japan
Tian Song

Corresponding author

Correspondence to Jenq-Shiou Leu.

About this article

Cite this article

Su, KW., Leu, JS., Yu, MC. et al. Design and Implementation of Various File Deduplication Schemes on Storage Devices. Mobile Netw Appl 22, 40–50 (2017). https://doi.org/10.1007/s11036-016-0677-9

Published: 15 January 2016
Issue Date: February 2017
DOI: https://doi.org/10.1007/s11036-016-0677-9
