DOI: 10.5555/1267102.1267104

Single instance storage in Windows® 2000

Published: 03 August 2000

Abstract

Certain applications, such as Windows 2000's Remote Install service, can result in a set of files in which many different files have the same content. Using a traditional file system to store these files separately results in excessive use of disk and main memory file cache space. Using hard or symbolic links would eliminate the excess resource requirements, but would change the semantics of having separate files, in that updates to one "copy" of a file would be visible to users of another "copy." We describe the Single Instance Store (SIS), a component within Windows® 2000 that implements links with the semantics of copies for files stored on a Windows 2000 NTFS volume. SIS uses copy-on-close to implement the copy semantics of its links. SIS is structured as a file system filter driver that implements links and a user-level service that detects duplicate files and reports them to the filter for conversion into links. Because SIS links are semantically identical to separate files, SIS creates them automatically when it detects files with duplicate contents. This paper describes the design and implementation of SIS in detail, briefly presents measurements of a remote install server showing a 58% disk space savings by using SIS, and discusses other possible uses of SIS.
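
The two ideas in the abstract, merging duplicate files into links and giving those links copy semantics through copy-on-close, can be illustrated with a small in-memory toy. This is only a sketch under stated assumptions, not the SIS filter driver or its user-level service: the class `ToySingleInstanceStore`, its method names, the use of SHA-256 to find candidate duplicates, and the way pending writes are buffered until close are all hypothetical choices made for illustration.

```python
import hashlib


class ToySingleInstanceStore:
    """Hypothetical in-memory model of links with copy semantics (not real SIS)."""

    def __init__(self):
        self.files = {}    # name -> private bytes (ordinary, unshared files)
        self.links = {}    # name -> digest of the shared blob the link points at
        self.blobs = {}    # digest -> shared bytes (stand-in for a common store)
        self.pending = {}  # name -> list of (offset, chunk) written since open

    def create(self, name, data):
        self.files[name] = bytes(data)

    def merge_duplicates(self):
        """Replace files that have identical contents with links to one blob."""
        groups = {}
        for name, data in self.files.items():
            groups.setdefault(hashlib.sha256(data).hexdigest(), []).append(name)
        for digest, names in groups.items():
            if len(names) < 2 and digest not in self.blobs:
                continue  # unique content: leave it as an ordinary file
            blob = self.blobs.setdefault(digest, self.files[names[0]])
            for name in names:
                # Assumption: treat the digest as a hint and confirm byte-for-byte.
                if self.files[name] == blob:
                    del self.files[name]
                    self.links[name] = digest

    def _base(self, name):
        if name in self.links:
            return self.blobs[self.links[name]]
        return self.files[name]

    def read(self, name):
        """Return the contents as this file's user sees them, pending writes included."""
        data = bytearray(self._base(name))
        for offset, chunk in self.pending.get(name, []):
            data[offset:offset + len(chunk)] = chunk
        return bytes(data)

    def write(self, name, offset, chunk):
        # Writes never touch the shared blob, so other links are unaffected.
        self.pending.setdefault(name, []).append((offset, bytes(chunk)))

    def close(self, name):
        """Copy-on-close: a modified link becomes a private file when it is closed."""
        if name not in self.pending:
            return  # never written: the link stays a link
        data = self.read(name)
        del self.pending[name]
        digest = self.links.pop(name, None)
        self.files[name] = data
        if digest is not None and digest not in self.links.values():
            del self.blobs[digest]  # no remaining links reference this blob

    def stored_bytes(self):
        return (sum(len(b) for b in self.files.values())
                + sum(len(b) for b in self.blobs.values()))


store = ToySingleInstanceStore()
store.create("a.txt", b"hello world")
store.create("b.txt", b"hello world")
store.create("c.txt", b"different")
before = store.stored_bytes()
store.merge_duplicates()
after = store.stored_bytes()

store.write("a.txt", 0, b"J")  # modify one "copy"
store.close("a.txt")           # a.txt is split off into its own private file
```

In this toy, merging reduces the stored bytes because `a.txt` and `b.txt` share one blob, and the later write to `a.txt` is never visible through `b.txt`, which is the property that distinguishes these links from hard or symbolic links.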

Published In

WSS'00: Proceedings of the 4th conference on USENIX Windows Systems Symposium - Volume 4
August 2000
135 pages

Sponsors

  • USENIX Association

Publisher

USENIX Association

United States

Qualifiers

  • Article
