HEDS: Hybrid Email Deduplication System

Kim, Daehee; Song, Sejun; Choi, Baek-Young

doi:10.1007/978-3-319-42280-0_3

Abstract

In this chapter, we present HEDS (Hybrid Email Deduplication System), a server-side deduplication component of the proposed deduplication framework. HEDS removes redundancy in email systems by balancing file-level and block-level deduplication, achieving good storage space savings with low processing overhead.
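
To make the file-level versus block-level trade-off concrete, here is a minimal sketch of one generic way the two granularities can be combined. It is not the HEDS algorithm, whose details are not given in this abstract. The class name `HybridDedupStore`, the fixed-size blocks, the tiny block size, and the use of SHA-1 digests as identifiers are all assumptions made for illustration. The idea shown is only that a cheap whole-message check can skip block processing for exact duplicates, while block-level indexing still saves space for messages that are merely similar.

```python
import hashlib


class HybridDedupStore:
    """Toy store (hypothetical): whole-file check first, block check on a miss."""

    def __init__(self, block_size=4096):
        self.block_size = block_size  # assumed fixed-size chunking
        self.files = {}   # file digest -> ordered list of block digests
        self.blocks = {}  # block digest -> block bytes (stored once)

    def put(self, data: bytes) -> str:
        file_id = hashlib.sha1(data).hexdigest()
        if file_id in self.files:
            # File-level hit: exact duplicate, no per-block work needed.
            return file_id
        recipe = []
        for start in range(0, len(data), self.block_size):
            block = data[start:start + self.block_size]
            block_id = hashlib.sha1(block).hexdigest()
            # Block-level dedup: keep only the first copy of each block.
            self.blocks.setdefault(block_id, block)
            recipe.append(block_id)
        self.files[file_id] = recipe
        return file_id

    def get(self, file_id: str) -> bytes:
        # Rebuild the original bytes from the stored block recipe.
        return b"".join(self.blocks[b] for b in self.files[file_id])

    def stored_bytes(self) -> int:
        return sum(len(b) for b in self.blocks.values())


store = HybridDedupStore(block_size=4)
a = store.put(b"AAAABBBBCCCC")
b = store.put(b"AAAABBBBCCCC")  # exact duplicate: file-level hit
c = store.put(b"AAAABBBBDDDD")  # similar message: shares two blocks
```

In this toy run, 36 bytes of input occupy 16 bytes of block storage, and the second `put` returns without hashing any blocks. A real system would also need persistent indexes and a chunking policy suited to email bodies and attachments, which this sketch leaves out.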


Author information

Authors and Affiliations

Department of Computing and New Media Technologies, University of Wisconsin-Stevens Point, Stevens Point, Wisconsin, USA
Daehee Kim
Department of Computer Science and Electrical Engineering, University of Missouri-Kansas City, Kansas City, Missouri, USA
Sejun Song & Baek-Young Choi

About this chapter

Cite this chapter

Kim, D., Song, S., Choi, BY. (2017). HEDS: Hybrid Email Deduplication System. In: Data Deduplication for Data Optimization for Storage and Network Systems. Springer, Cham. https://doi.org/10.1007/978-3-319-42280-0_3

DOI: https://doi.org/10.1007/978-3-319-42280-0_3
Published: 09 September 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-42278-7
Online ISBN: 978-3-319-42280-0
eBook Packages: Engineering, Engineering (R0)
