Cache (Computing)

In computing, a cache (/kæʃ/ KASH)[1] is a
hardware or software component that stores data so
that future requests for that data can be served
faster; the data stored in a cache might be the result
of an earlier computation or a copy of data stored
elsewhere. A cache hit occurs when the requested
data can be found in a cache, while a cache miss
occurs when it cannot. Cache hits are served by
reading data from the cache, which is faster than
recomputing a result or reading from a slower data
store; thus, the more requests that can be served
from the cache, the faster the system performs.[2]
[Figure: Diagram of a CPU memory cache operation]
To be cost-effective and to enable efficient use of data, caches must be relatively small. Nevertheless,
caches have proven themselves in many areas of computing, because typical computer applications access
data with a high degree of locality of reference. Such access patterns exhibit temporal locality, where data is
requested that has been recently requested already, and spatial locality, where data is requested that is stored
physically close to data that has already been requested.
Contents
Motivation
Latency
Throughput
Operation
Writing policies
Prefetch
Examples of hardware caches
CPU cache
GPU cache
DSPs
Translation lookaside buffer
In-network cache
Information-centric networking
Policies
Time aware least recently used (TLRU)
Least frequent recently used (LFRU)
Weather forecast
Software caches
Disk cache
Web cache
Memoization
Content delivery network
Cloud storage gateway
Other caches
Buffer vs. cache
See also
References
Further reading
Motivation
There is an inherent trade-off between size and speed (given that a larger resource implies greater physical
distances) but also a trade-off between expensive, premium technologies (such as SRAM) and cheaper, easily
mass-produced commodities (such as DRAM or hard disks).
The buffering provided by a cache benefits one or both of latency and throughput (bandwidth):
Latency
A larger resource incurs a significant latency for access – e.g. it can take hundreds of clock cycles for a
modern 4 GHz processor to reach DRAM. This is mitigated by reading in large chunks, in the hope that
subsequent reads will be from nearby locations. Prediction or explicit prefetching might also guess where
future reads will come from and make requests ahead of time; if done correctly the latency is bypassed
altogether.
Throughput
The use of a cache also allows for higher throughput from the underlying resource, by assembling multiple
fine grain transfers into larger, more efficient requests. In the case of DRAM circuits, this might be served
by having a wider data bus. For example, consider a program accessing bytes in a 32-bit address space, but
being served by a 128-bit off-chip data bus; individual uncached byte accesses would allow only 1/16th of
the total bandwidth to be used, and 80% of the data movement would be memory addresses instead of data
itself. Reading larger chunks reduces the fraction of bandwidth required for transmitting address
information.
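The figures in this example can be checked with a short calculation. The sketch below assumes that each uncached byte access occupies one full 128-bit bus transfer and is accompanied by one 32-bit address; the variable names are illustrative only.

```python
from fractions import Fraction

ADDRESS_BITS = 32   # width of one address in the example
BUS_BITS = 128      # width of the off-chip data bus
DATA_BITS = 8       # one byte requested per uncached access

# Each byte access occupies a whole bus transfer but uses only 8 of its 128 bits.
bus_utilization = Fraction(DATA_BITS, BUS_BITS)

# Per byte access, 32 address bits accompany 8 data bits.
address_share = Fraction(ADDRESS_BITS, ADDRESS_BITS + DATA_BITS)

# Fetching a full bus-width chunk per address shrinks the address overhead.
chunk_address_share = Fraction(ADDRESS_BITS, ADDRESS_BITS + BUS_BITS)

print(bus_utilization)       # 1/16
print(address_share)         # 4/5, i.e. 80%
print(chunk_address_share)   # 1/5, i.e. 20%
```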
Operation
Hardware implements cache as a block of memory for temporary storage of data likely to be used again.
Central processing units (CPUs), solid-state drives (SSDs) and hard disk drives (HDDs) frequently include
hardware-based cache, while web browsers and web servers commonly rely on software caching.
A cache is made up of a pool of entries. Each entry has associated data, which is a copy of the same data in
some backing store. Each entry also has a tag, which specifies the identity of the data in the backing store
of which the entry is a copy. Tagging allows simultaneous cache-oriented algorithms to function in
multilayered fashion without differential relay interference.
When the cache client (a CPU, web browser, operating system) needs to access data presumed to exist in
the backing store, it first checks the cache. If an entry can be found with a tag matching that of the desired
data, the data in the entry is used instead. This situation is known as a cache hit. For example, a web
browser program might check its local cache on disk to see if it has a local copy of the contents of a web
page at a particular URL. In this example, the URL is the tag, and the content of the web page is the data.
The percentage of accesses that result in cache hits is known as the hit rate or hit ratio of the cache.
The alternative situation, when the cache is checked and found not to contain any entry with the desired
tag, is known as a cache miss. This requires a more expensive access of data from the backing store. Once
the requested data is retrieved, it is typically copied into the cache, ready for the next access.
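The lookup sequence described above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the class name CountingCache, the dictionary standing in for the backing store, and the example URL are all assumptions made for the example.

```python
class CountingCache:
    """Cache in front of a slower backing store, tracking hits and misses."""

    def __init__(self, backing_store):
        self.backing_store = backing_store  # maps tag -> data
        self.entries = {}                   # cached copies, keyed by tag
        self.hits = 0
        self.misses = 0

    def get(self, tag):
        if tag in self.entries:             # cache hit: use the cached copy
            self.hits += 1
            return self.entries[tag]
        self.misses += 1                    # cache miss: go to the backing store
        data = self.backing_store[tag]
        self.entries[tag] = data            # keep a copy for the next access
        return data

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


# Hypothetical backing store: the URL is the tag, the page content is the data.
web = {"https://example.org/": "<html>home</html>"}
cache = CountingCache(web)
cache.get("https://example.org/")   # first access is a miss
cache.get("https://example.org/")   # second access is a hit
print(cache.hit_rate())             # 0.5
```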
During a cache miss, some other previously existing cache entry is typically removed in order to make room for the
newly retrieved data. The heuristic used to select the entry to replace is known as the replacement policy.
One popular replacement policy, "least recently used" (LRU), replaces the entry that was
accessed less recently than any other entry (see cache algorithm). More sophisticated caching algorithms
also weigh the hit frequency against the size of the stored contents, as well as the latencies and
throughputs of both the cache and the backing store. This works well for larger amounts of data, longer
latencies, and slower throughputs, such as those experienced with hard drives and networks, but is not
efficient for use within a CPU cache.
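An LRU replacement policy might be sketched as follows, using Python's collections.OrderedDict to keep entries in order of use. The class name LRUCache, the fixed capacity, and the dictionary used as a backing store are assumptions for illustration.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity, backing_store):
        self.capacity = capacity
        self.backing_store = backing_store
        self.entries = OrderedDict()          # least recently used entry first

    def get(self, tag):
        if tag in self.entries:
            self.entries.move_to_end(tag)     # mark as most recently used
            return self.entries[tag]
        data = self.backing_store[tag]        # miss: fetch from the backing store
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used entry
        self.entries[tag] = data
        return data


store = {"a": 1, "b": 2, "c": 3}
lru = LRUCache(2, store)
for tag in ["a", "b", "a", "c"]:
    lru.get(tag)
print(list(lru.entries))   # ['a', 'c']
```

In the usage above, "b" is evicted rather than "a" because "a" was accessed again after "b" was loaded.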
Writing policies
When a system writes data to cache, it must at some point write that
data to the backing store as well. The timing of this write is
controlled by what is known as the write policy. There are two
basic writing approaches:[3]
Write-through: write is done synchronously both to the cache and to the backing store.
Write-back: initially, writing is done only to the cache. The write to the backing store is postponed
until the modified content is about to be replaced by another cache block.
A write-back cache is more complex to implement, since it needs to track which of its locations have been
written over, and mark them as dirty for later writing to the backing store. The data in these locations
are written back to the backing store only when they are evicted from the cache, an effect referred to as
a lazy write. For this reason, a read miss in a write-back cache (which requires a block to be replaced by
another) will often require two memory accesses to service: one to write the replaced data from the cache
back to the store, and then one to retrieve the needed data.
[Figure: A write-through cache with no-write allocation]
Other policies may also trigger data write-back. The client may make many changes to data in the cache,
and then explicitly notify the cache to write back the data.
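The difference between the two write policies can be illustrated with a small sketch. The class names, the dictionaries standing in for the backing store, and the flush method (modelling an explicit write-back request from the client) are assumptions made for the example.

```python
class WriteThroughCache:
    def __init__(self, store):
        self.store = store
        self.entries = {}

    def write(self, tag, data):
        self.entries[tag] = data
        self.store[tag] = data       # synchronous write to the backing store


class WriteBackCache:
    def __init__(self, store):
        self.store = store
        self.entries = {}
        self.dirty = set()           # tags modified in the cache but not yet in the store

    def write(self, tag, data):
        self.entries[tag] = data
        self.dirty.add(tag)          # the backing store is now out of date for this tag

    def evict(self, tag):
        if tag in self.dirty:        # lazy write: only dirty data is written back
            self.store[tag] = self.entries[tag]
            self.dirty.discard(tag)
        self.entries.pop(tag, None)

    def flush(self):
        """Write back all dirty entries on explicit request."""
        for tag in self.dirty:
            self.store[tag] = self.entries[tag]
        self.dirty.clear()


store_wt, store_wb = {}, {}
wt = WriteThroughCache(store_wt)
wb = WriteBackCache(store_wb)
wt.write("x", 1)
wb.write("x", 1)
print(store_wt, store_wb)   # {'x': 1} {}
wb.flush()
print(store_wb)             # {'x': 1}
```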
Since no data is returned to the requester on write operations, a decision needs to be made on write misses,
whether or not data would be loaded into the cache. This is defined by these two approaches:
Write allocate (also called fetch on write): data at the missed-write location is loaded to
cache, followed by a write-hit operation. In this approach, write misses are similar to read
misses.
No-write allocate (also called write-no-allocate or write
around): data at the missed-write location is not loaded
to cache, and is written directly to the backing store. In
this approach, data is loaded into the cache on read
misses only.
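The two write-miss approaches can be contrasted in a short sketch. To keep it small, the function below assumes a write-through cache, so the backing store is always updated; the function name, its parameters and the dictionaries are illustrative only.

```python
def write(cache, store, tag, data, write_allocate):
    """Write-through write with a selectable write-miss policy."""
    if tag in cache:
        cache[tag] = data                # write hit
    elif write_allocate:
        cache[tag] = store.get(tag)      # fetch on write: load the location into the cache
        cache[tag] = data                # then proceed as a write hit
    # With no-write allocate, a write miss leaves the cache untouched.
    store[tag] = data                    # write-through: the store is always updated


store_a, store_n = {"x": 0}, {"x": 0}
cache_a, cache_n = {}, {}
write(cache_a, store_a, "x", 1, write_allocate=True)
write(cache_n, store_n, "x", 1, write_allocate=False)
print(cache_a, cache_n)   # {'x': 1} {}
print(store_a, store_n)   # {'x': 1} {'x': 1}
```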
Both write-through and write-back policies can use either of these write-miss policies, but usually they
are paired in this way:[4]
A write-back cache uses write allocate, hoping for subsequent writes (or even reads) to the same
location, which is now cached.
A write-through cache uses no-write allocate. Here, subsequent writes have no advantage, since they
still need to be written directly to the backing store.
[Figure: A write-back cache with write allocation]
Entities other than the cache may change the data in the backing store, in which case the copy in the
cache may become out-of-date or stale. Alternatively, when the client updates the data in the cache,
copies of those data in other caches will become stale. Communication protocols between the cache
managers which keep the data consistent are known as coherency protocols.
Prefetch
On a cache read miss, caches with a demand paging policy read the minimum amount from the backing
store. For example, demand-paging virtual memory reads one page of virtual memory (often 4 KB)
from disk into the disk cache in RAM. Similarly, a typical CPU reads a single L2 cache line of 128
bytes from DRAM into the L2 cache, and a single L1 cache line of 64 bytes from the L2 cache into the L1
cache.
Caches with a prefetch input queue or more general anticipatory paging policy go further—they not only
read the chunk requested, but guess that the next chunk or two will soon be required, and so prefetch that
data into the cache ahead of time. Anticipatory paging is especially helpful when the backing store has a
long latency to read the first chunk and much shorter times to sequentially read the next few chunks, such
as disk storage and DRAM.
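A simple sequential prefetcher might look like the sketch below. It assumes the backing store is a list of chunks and that one request to the store can return the missing chunk together with the next few; the class name and the prefetch_depth parameter are hypothetical.

```python
class PrefetchingCache:
    """On a miss, fetch the requested chunk plus the next few chunks."""

    def __init__(self, backing_store, prefetch_depth=2):
        self.backing_store = backing_store    # list of chunks, indexed by chunk number
        self.prefetch_depth = prefetch_depth
        self.entries = {}
        self.store_reads = 0                  # number of requests sent to the store

    def read(self, chunk_no):
        if chunk_no not in self.entries:
            self.store_reads += 1             # one request covers the whole run
            last = min(chunk_no + self.prefetch_depth, len(self.backing_store) - 1)
            for n in range(chunk_no, last + 1):
                self.entries[n] = self.backing_store[n]
        return self.entries[chunk_no]


chunks = [f"chunk-{n}" for n in range(6)]
cache = PrefetchingCache(chunks, prefetch_depth=2)
for n in range(6):
    cache.read(n)
print(cache.store_reads)   # 2
```

Reading six chunks in sequence costs only two requests to the store, since each miss also brings in the two following chunks.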
A few operating systems go further with a loader that always pre-loads the entire executable into RAM.
A few caches go even further, not only pre-loading an entire file, but also starting to load other related files
that may soon be requested, such as the page cache associated with a prefetcher or the web cache
associated with link prefetching.
Examples of hardware caches
CPU cache
Small memories on or close to the CPU can operate faster than the much larger main memory.[5] Most
CPUs since the 1980s have used one or more caches, sometimes in cascaded levels; modern high-end
embedded, desktop and server microprocessors may have as many as six types of cache (between levels
and functions).[6] Examples of caches with a specific function are the D-cache and I-cache and the
translation lookaside buffer for the MMU.
GPU cache
Earlier graphics processing units (GPUs) often had limited read-only texture caches, and introduced Morton
order swizzled textures to improve 2D cache coherency. Cache misses would drastically affect
performance, e.g. if mipmapping was not used. Caching was important to leverage 32-bit (and wider)
transfers for texture data that was often as little as 4 bits per pixel, indexed in complex patterns by arbitrary
UV coordinates and perspective transformations in inverse texture mapping.
As GPUs advanced (especially with GPGPU compute shaders) they have developed progressively larger
and increasingly general caches, including instruction caches for shaders, exhibiting increasingly common
functionality with CPU caches. For example, GT200 architecture GPUs did not feature an L2 cache, while
the Fermi GPU has 768 KB of last-level cache, the Kepler GPU has 1536 KB of last-level cache, and the
Maxwell GPU has 2048 KB of last-level cache. These caches have grown to handle synchronisation
primitives between threads and atomic operations, and interface with a CPU-style MMU.
DSPs
Digital signal processors have similarly generalised over the years. Earlier designs used scratchpad memory
fed by DMA, but modern DSPs such as Qualcomm Hexagon often include a very similar set of caches to a
CPU (e.g. Modified Harvard architecture with shared L2, split L1 I-cache and D-cache).[7]
Translation lookaside buffer
A memory management unit (MMU) that fetches page table entries from main memory has a specialized
cache, used for recording the results of virtual address to physical address translations. This specialized
cache is called a translation lookaside buffer (TLB).[8]
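The role of a TLB can be sketched in software, although real TLBs are hardware structures. The class name MMU, the 4096-byte page size and the page table contents below are assumptions for illustration.

```python
class MMU:
    """Translates virtual addresses, caching translations in a TLB."""

    def __init__(self, page_table, page_size=4096):
        self.page_table = page_table   # virtual page number -> physical frame number
        self.page_size = page_size
        self.tlb = {}                  # cached translations
        self.page_table_walks = 0      # slow lookups that went to the page table

    def translate(self, virtual_address):
        page, offset = divmod(virtual_address, self.page_size)
        if page not in self.tlb:                   # TLB miss: consult the page table
            self.page_table_walks += 1
            self.tlb[page] = self.page_table[page]
        return self.tlb[page] * self.page_size + offset


mmu = MMU({0: 7, 1: 3, 2: 9})          # hypothetical page table
print(mmu.translate(4100))             # 12292 (page 1, offset 4, frame 3)
print(mmu.translate(4200))             # 12392 (same page, served from the TLB)
print(mmu.page_table_walks)            # 1
```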
In-network cache
Information-centric networking
Information-centric networking (ICN) is an approach to evolve the Internet infrastructure away from a host-
centric paradigm, based on perpetual connectivity and the end-to-end principle, to a network architecture in
which the focal point is identified information (or content or data). Due to the inherent caching capability of
the nodes in an ICN, it can be viewed as a loosely connected network of caches, which has unique
requirements of caching policies. However, ubiquitous content caching makes it harder to protect content
against unauthorized access, which requires extra care and solutions.[9] Unlike proxy servers, in
ICN the cache is a network-level solution. Therefore, it has rapidly changing cache states and higher
request arrival rates; moreover, smaller cache sizes impose different requirements on the
content eviction policies. In particular, eviction policies for ICN should be fast and lightweight. Various
cache replication and eviction schemes for different ICN architectures and applications have been
proposed.
Policies
Time aware least recently used (TLRU)
Time aware Least Recently Used (TLRU)[10] is a variant of LRU designed for the situation where the
stored contents in a cache have a valid lifetime. The algorithm is suitable for network cache applications, such
as information-centric networking (ICN), content delivery networks (CDNs) and distributed networks in
general. TLRU introduces a new term: TTU (Time to Use). TTU is a time stamp of a content/page which
stipulates the usability time for the content based on the locality of the content and the content publisher
announcement. Owing to this locality-based time stamp, TTU gives the local administrator more control
over in-network storage. In the TLRU algorithm, when a piece of content arrives, a
cache node calculates the local TTU value based on the TTU value assigned by the content publisher. The
local TTU value is calculated by using a locally defined function. Once the local TTU value is calculated,
the replacement of content is performed on a subset of the total content stored in the cache node. TLRU
ensures that less popular and short-lived content is replaced with the incoming content.
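A rough sketch of the idea follows. It is not the algorithm from the cited paper: the eviction rule used here (prefer an entry whose local TTU has expired, otherwise fall back to LRU), the class name TLRUCache, the explicit now argument and the example local TTU function are all assumptions made to keep the illustration short.

```python
from collections import OrderedDict


class TLRUCache:
    def __init__(self, capacity, local_ttu):
        self.capacity = capacity
        self.local_ttu = local_ttu        # locally defined function: publisher TTU -> local TTU
        self.entries = OrderedDict()      # key -> (data, expiry time), least recently used first

    def put(self, key, data, publisher_ttu, now):
        expiry = now + self.local_ttu(publisher_ttu)
        if key not in self.entries and len(self.entries) >= self.capacity:
            self._evict(now)
        self.entries[key] = (data, expiry)
        self.entries.move_to_end(key)

    def get(self, key, now):
        item = self.entries.get(key)
        if item is None or item[1] <= now:    # missing, or past its time to use
            self.entries.pop(key, None)
            return None
        self.entries.move_to_end(key)
        return item[0]

    def _evict(self, now):
        # Assumed rule: drop an expired entry if there is one, else the LRU entry.
        for key, (_, expiry) in self.entries.items():
            if expiry <= now:
                del self.entries[key]
                return
        self.entries.popitem(last=False)


# Hypothetical local policy: never keep content longer than 60 time units.
cache = TLRUCache(capacity=2, local_ttu=lambda ttu: min(ttu, 60))
cache.put("a", "A", publisher_ttu=10, now=0)     # expires at time 10
cache.put("b", "B", publisher_ttu=300, now=0)    # capped locally, expires at time 60
cache.get("a", now=5)                            # "a" becomes most recently used
cache.put("c", "C", publisher_ttu=300, now=20)   # "a" has expired and is evicted
print(list(cache.entries))                       # ['b', 'c']
```

Under plain LRU the last insertion would have evicted "b"; here the expired entry "a" is replaced instead.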
Least frequent recently used (LFRU)
The Least Frequent Recently Used (LFRU)[11] cache replacement scheme combines the benefits of LFU
and LRU schemes. LFRU is suitable for 'in network' cache applications, such as Information-centric
networking (ICN), Content Delivery Networks (CDNs) and distributed networks in general. In LFRU, the
cache is divided into two partitions called privileged and unprivileged partitions. The privileged partition
can be defined as a protected partition. If content is highly popular, it is pushed into the privileged partition.
Replacement of the privileged partition is done as follows: LFRU evicts content from the unprivileged
partition, pushes content from privileged partition to unprivileged partition, and finally inserts new content
into the privileged partition. In the above procedure the LRU is used for the privileged partition and an
approximated LFU (ALFU) scheme is used for the unprivileged partition, hence the abbreviation LFRU.
The basic idea is to filter out the locally popular contents with the ALFU scheme and push the popular contents
to the privileged partition.
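The three-step replacement procedure described above can be sketched as follows. This is a simplified illustration only: plain access counts stand in for the approximated LFU scheme, promotion of popular content out of the unprivileged partition is omitted, insert is assumed to be called only for content that is not yet cached, and the class and method names are hypothetical.

```python
from collections import OrderedDict


class LFRUCache:
    def __init__(self, privileged_size, unprivileged_size):
        self.privileged_size = privileged_size
        self.unprivileged_size = unprivileged_size   # assumed to be at least 1
        self.privileged = OrderedDict()   # key -> data, least recently used first
        self.unprivileged = {}            # key -> data
        self.frequency = {}               # access counts for unprivileged keys

    def get(self, key):
        if key in self.privileged:
            self.privileged.move_to_end(key)      # LRU bookkeeping
            return self.privileged[key]
        if key in self.unprivileged:
            self.frequency[key] += 1              # frequency bookkeeping
            return self.unprivileged[key]
        return None

    def insert(self, key, data):
        if len(self.privileged) >= self.privileged_size:
            if len(self.unprivileged) >= self.unprivileged_size:
                # Step 1: evict the least frequently used unprivileged content.
                victim = min(self.unprivileged, key=self.frequency.__getitem__)
                del self.unprivileged[victim]
                del self.frequency[victim]
            # Step 2: demote the least recently used privileged content.
            demoted, demoted_data = self.privileged.popitem(last=False)
            self.unprivileged[demoted] = demoted_data
            self.frequency[demoted] = 0
        # Step 3: insert the new content into the privileged partition.
        self.privileged[key] = data


cache = LFRUCache(privileged_size=1, unprivileged_size=1)
cache.insert("a", "A")
cache.insert("b", "B")   # "a" is demoted to the unprivileged partition
cache.insert("c", "C")   # "a" is evicted and "b" is demoted
print(list(cache.privileged), list(cache.unprivileged))   # ['c'] ['b']
```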
Weather forecast
Back in 2010 The New York Times suggested "Type 'weather' followed by your zip code."[12] By 2011, the
use of smartphones with weather forecasting options was overly taxing AccuWeather servers; two requests
from within the same park would generate separate server lookups. An optimization by edge servers to truncate the
GPS coordinates to fewer decimal places meant that the cached results from an earlier nearby query would be
used. The number of to-the-server lookups per day dropped by half.[13]
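The coordinate-truncation idea can be illustrated with a short sketch. It is not AccuWeather's code: the choice of two decimal places, the function and class names, and the stand-in forecast function are assumptions.

```python
import math


def cache_key(latitude, longitude, decimals=2):
    """Truncate coordinates so that nearby locations share one cache key."""
    scale = 10 ** decimals
    return (math.trunc(latitude * scale), math.trunc(longitude * scale))


class ForecastCache:
    def __init__(self, fetch_forecast, decimals=2):
        self.fetch_forecast = fetch_forecast   # slow lookup on the origin server
        self.decimals = decimals
        self.entries = {}
        self.server_lookups = 0

    def get(self, latitude, longitude):
        key = cache_key(latitude, longitude, self.decimals)
        if key not in self.entries:
            self.server_lookups += 1
            self.entries[key] = self.fetch_forecast(latitude, longitude)
        return self.entries[key]


# Stand-in for a real forecast server.
cache = ForecastCache(lambda lat, lon: "sunny")
cache.get(40.7812, -73.9665)   # first request goes to the server
cache.get(40.7829, -73.9681)   # nearby request is served from the cache
print(cache.server_lookups)    # 1
```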
Software caches
Disk cache
While CPU caches are generally managed entirely by hardware, a variety of software manages other
caches. The page cache in main memory, which is an example of disk cache, is managed by the operating
system kernel.
While the disk buffer, which is an integrated part of the hard disk drive or solid state drive, is sometimes
misleadingly referred to as "disk cache", its main functions are write sequencing and read prefetching.
Repeated cache hits are relatively rare, due to the small size of the buffer in comparison to the drive's
capacity. However, high-end disk controllers often have their own on-board cache of the hard disk drive's
data blocks.
Finally, a fast local hard disk drive can also cache information held on even slower data storage devices,
such as remote servers (web cache) or local tape drives or optical jukeboxes; such a scheme is the main
concept of hierarchical storage management. Also, fast flash-based solid-state drives (SSDs) can be used as
caches for slower rotational-media hard disk drives, working together as hybrid drives or solid-state hybrid
drives (SSHDs).
Web cache
Web browsers and web proxy servers employ web caches to store previous responses from web servers,
such as web pages and images. Web caches reduce the amount of information that needs to be transmitted
across the network, as information previously stored in the cache can often be re-used. This reduces
bandwidth and processing requirements of the web server, and helps to improve responsiveness for users of
the web.[14]
Web browsers employ a built-in web cache, but some Internet service providers (ISPs) or organizations also
use a caching proxy server, which is a web cache that is shared among all users of that network.
Another form of cache is P2P caching, where the files most sought after by peer-to-peer applications are
stored in an ISP cache to accelerate P2P transfers. Similarly, decentralised equivalents exist, which allow
communities to perform the same task for P2P traffic, for example, Corelli.[15]
Memoization
A cache can store data that is computed on demand rather than retrieved from a backing store. Memoization
is an optimization technique that stores the results of resource-consuming function calls within a lookup
table, allowing subsequent calls to reuse the stored results and avoid repeated computation. It is related to
the dynamic programming algorithm design methodology, which can also be thought of as a means of
caching.
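In Python, memoization is available through the standard library decorator functools.lru_cache. The Fibonacci function below is only an illustrative choice of an expensive recursive computation.

```python
from functools import lru_cache


@lru_cache(maxsize=None)   # unbounded lookup table keyed by the function's arguments
def fib(n):
    """Naive recursive Fibonacci; memoization avoids repeated subcomputations."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


print(fib(30))                   # 832040
print(fib.cache_info().misses)   # 31: each n from 0 to 30 is computed once
```

Without the decorator, the same call would recompute the smaller values over a million times in total.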
Content delivery network
A content delivery network (CDN) is a network of distributed servers that deliver pages and other Web
content to a user, based on the geographic locations of the user, the origin of the web page and the content
delivery server.
CDNs began in the late 1990s as a way to speed up the delivery of static content, such as HTML pages,
images and videos. By replicating content on multiple servers around the world and delivering it to users
based on their location, CDNs can significantly improve the speed and availability of a website or
application. When a user requests a piece of content, the CDN will check to see if it has a copy of the
content in its cache. If it does, the CDN will deliver the content to the user from the cache.[16]
Cloud storage gateway
A cloud storage gateway, also known as an edge filer, is a hybrid cloud storage device that connects a local
network to one or more cloud storage services, typically an object storage service such as Amazon S3. It
caches frequently accessed data held in the cloud storage service, providing high-speed local access to
that data. Cloud storage gateways also provide additional benefits such as accessing
cloud object storage through traditional file serving protocols as well as continued access to cached data
during connectivity outages.[17]
Other caches
The BIND DNS daemon caches a mapping of domain names to IP addresses, as does a resolver library.
Write-through operation is common when operating over unreliable networks (like an Ethernet LAN),
because of the enormous complexity of the coherency protocol required between multiple write-back
caches when communication is unreliable. For instance, web page caches and client-side network file
system caches (like those in NFS or SMB) are typically read-only or write-through specifically to keep the
network protocol simple and reliable.
Search engines also frequently make web pages they have indexed available from their cache. For example,
Google provides a "Cached" link next to each search result. This can prove useful when web pages from a
web server are temporarily or permanently inaccessible.
Database caching can substantially improve the throughput of database applications, for example in the
processing of indexes, data dictionaries, and frequently used subsets of data.
A distributed cache[18] uses networked hosts to provide scalability, reliability and performance to the
application.[19] The hosts can be co-located or spread over different geographical regions.
Buffer vs. cache

The semantics of a "buffer" and a "cache" are not totally different; even so, there are fundamental
differences in intent between the process of caching and the process of buffering.
Fundamentally, caching realizes a performance increase for transfers of data that is being repeatedly
transferred. While a caching system may realize a performance increase upon the initial (typically write)
transfer of a data item, this performance increase is due to buffering occurring within the caching system.
With read caches, a data item must have been fetched from its residing location at least once in order for
subsequent reads of the data item to realize a performance increase by virtue of being able to be fetched
from the cache's (faster) intermediate storage rather than the data's residing location. With write caches, a
performance increase of writing a data item may be realized upon the first write of the data item by virtue of
the data item immediately being stored in the cache's intermediate storage, deferring the transfer of the data
item to its residing storage at a later stage or else occurring as a background process. Contrary to strict
buffering, a caching process must adhere to a (potentially distributed) cache coherency protocol in order to
maintain consistency between the cache's intermediate storage and the location where the data resides.
Buffering, on the other hand,
reduces the number of transfers for otherwise novel data amongst communicating
processes, which amortizes overhead involved for several small transfers over fewer, larger
transfers,
provides an intermediary for communicating processes which are incapable of direct
transfers amongst each other, or
ensures a minimum data size or representation required by at least one of the
communicating processes involved in a transfer.
With typical caching implementations, a data item that is read or written for the first time is effectively being
buffered; and in the case of a write, mostly realizing a performance increase for the application from where
the write originated. Additionally, the portion of a caching protocol where individual writes are deferred to
a batch of writes is a form of buffering. The portion of a caching protocol where individual reads are
deferred to a batch of reads is also a form of buffering, although this form may negatively impact the
performance of at least the initial reads (even though it may positively impact the performance of the sum of
the individual reads). In practice, caching almost always involves some form of buffering, while strict
buffering does not involve caching.
A buffer is a temporary memory location that is traditionally used because CPU instructions cannot directly
address data stored in peripheral devices. Thus, addressable memory is used as an intermediate stage.
Additionally, such a buffer may be feasible when a large block of data is assembled or disassembled (as
required by a storage device), or when data may be delivered in a different order than that in which it is
produced. Also, a whole buffer of data is usually transferred sequentially (for example to hard disk), so
buffering itself sometimes increases transfer performance or reduces the variation or jitter of the transfer's
latency as opposed to caching where the intent is to reduce the latency. These benefits are present even if
the buffered data are written to the buffer once and read from the buffer once.
A cache also increases transfer performance. A part of the increase similarly comes from the possibility that
multiple small transfers will combine into one large block. But the main performance-gain occurs because
there is a good chance that the same data will be read from cache multiple times, or that written data will
soon be read. A cache's sole purpose is to reduce accesses to the underlying slower storage. Cache is also
usually an abstraction layer that is designed to be invisible from the perspective of neighboring layers.
See also
Cache coloring
Cache hierarchy
Cache-oblivious algorithm
Cache stampede
Cache language model
Cache manifest in HTML5
Dirty bit
Five-minute rule
Materialized view
Memory hierarchy
Pipeline burst cache
Temporary file
References
1. "Cache" (https://web.archive.org/web/20120818122040/http://oxforddictionaries.com/definiti
on/english/cache). Oxford Dictionaries. Oxford Dictionaries. Archived from the original (http://
www.oxforddictionaries.com/definition/english/cache) on 18 August 2012. Retrieved
2 August 2016.
2. Zhong, Liang; Zheng, Xueqian; Liu, Yong; Wang, Mengting; Cao, Yang (February 2020).
"Cache hit ratio maximization in device-to-device communications overlaying cellular
networks" (https://dx.doi.org/10.23919/jcc.2020.02.018). China Communications. 17 (2):
232–238. doi:10.23919/jcc.2020.02.018 (https://doi.org/10.23919%2Fjcc.2020.02.018).
ISSN 1673-5447 (https://www.worldcat.org/issn/1673-5447). S2CID 212649328 (https://api.s
emanticscholar.org/CorpusID:212649328).
3. Bottomley, James (1 January 2004). "Understanding Caching" (https://www.linuxjournal.co
m/article/7105). Linux Journal. Retrieved 1 October 2019.
4. John L. Hennessy; David A. Patterson (2011). Computer Architecture: A Quantitative
Approach (https://books.google.com/books?id=v3-1hVwHnHwC&pg=SL2-PA12). Elsevier.
pp. B–12. ISBN 978-0-12-383872-8.
5. Su, Chao; Zeng, Qingkai (10 June 2021). Nicopolitidis, Petros (ed.). "Survey of CPU Cache-
Based Side-Channel Attacks: Systematic Analysis, Security Models, and Countermeasures"
(https://doi.org/10.1155%2F2021%2F5559552). Security and Communication Networks.
2021: 1–15. doi:10.1155/2021/5559552 (https://doi.org/10.1155%2F2021%2F5559552).
ISSN 1939-0122 (https://www.worldcat.org/issn/1939-0122).
6. "Intel Broadwell Core i7 5775C '128MB L4 Cache' Gaming Behemoth and Skylake Core i7
6700K Flagship Processors Finally Available In Retail" (https://wccftech.com/intel-broadwell-core-i7-5775c-128mb-l4-cache-and-skylake-core-i7-6700k-flagship-processors-available-retail/).
25 September 2015. Mentions L4 cache. Combined with separate I-Cache and TLB,
this brings the total number of caches (levels + functions) to 6.
7. "qualcom Hexagon DSP SDK overview" (https://developer.qualcomm.com/software/hexago
n-dsp-sdk/dsp-processor).
8. Frank Uyeda (2009). "Lecture 7: Memory Management" (http://cseweb.ucsd.edu/classes/su0
9/cse120/lectures/Lecture7.pdf) (PDF). CSE 120: Principles of Operating Systems. UC San
Diego. Retrieved 4 December 2013.
9. Bilal, Muhammad; et al. (2019). "Secure Distribution of Protected Content in Information-
Centric Networking". IEEE Systems Journal. 14 (2): 1–12. arXiv:1907.11717 (https://arxiv.or
g/abs/1907.11717). Bibcode:2020ISysJ..14.1921B (https://ui.adsabs.harvard.edu/abs/2020I
SysJ..14.1921B). doi:10.1109/JSYST.2019.2931813 (https://doi.org/10.1109%2FJSYST.201
9.2931813). S2CID 198967720 (https://api.semanticscholar.org/CorpusID:198967720).
10. Bilal, Muhammad; et al. (2017). "Time Aware Least Recent Used (TLRU) Cache
Management Policy in ICN". IEEE 16th International Conference on Advanced
Communication Technology (ICACT): 528–532. arXiv:1801.00390 (https://arxiv.org/abs/180
1.00390). Bibcode:2018arXiv180100390B (https://ui.adsabs.harvard.edu/abs/2018arXiv180
100390B). doi:10.1109/ICACT.2014.6779016 (https://doi.org/10.1109%2FICACT.2014.6779
016). ISBN 978-89-968650-3-2. S2CID 830503 (https://api.semanticscholar.org/CorpusID:83
0503).
11. Bilal, Muhammad; et al. (2017). "A Cache Management Scheme for Efficient Content
Eviction and Replication in Cache Networks". IEEE Access. 5: 1692–1701.
arXiv:1702.04078 (https://arxiv.org/abs/1702.04078). Bibcode:2017arXiv170204078B (http
s://ui.adsabs.harvard.edu/abs/2017arXiv170204078B).
doi:10.1109/ACCESS.2017.2669344 (https://doi.org/10.1109%2FACCESS.2017.2669344).
S2CID 14517299 (https://api.semanticscholar.org/CorpusID:14517299).
12. Simon Mackie (3 May 2010). "9 More Simple Google Search Tricks" (https://archive.nytimes.
com/www.nytimes.com/external/gigaom/2010/05/03/03gigaom-9-more-simple-google-searc
h-tricks-86578.html). New York Times.
13. Chris Murphy (30 May 2011). "5 Lines Of Code In The Cloud". InformationWeek. p. 28. "300
million to 500 million fewer requests a day handled by AccuWeather servers"
14. Multiple (wiki). "Web application caching" (https://web.archive.org/web/20191212152625/htt
p://www.docforge.com/wiki/Web_application/Caching). Docforge. Archived from the original
(http://docforge.com/wiki/Web_application/Caching) on 12 December 2019. Retrieved
24 July 2013.
15. Gareth Tyson; Andreas Mauthe; Sebastian Kaune; Mu Mu; Thomas Plagemann. Corelli: A
Dynamic Replication Service for Supporting Latency-Dependent Content in Community
Networks (https://web.archive.org/web/20150618193018/http://comp.eprints.lancs.ac.uk/204
4/1/MMCN09.pdf) (PDF). MMCN'09. Archived from the original (http://comp.eprints.lancs.ac.
uk/2044/1/MMCN09.pdf) (PDF) on 18 June 2015.
16. "Globally Distributed Content Delivery, by J. Dilley, B. Maggs, J. Parikh, H. Prokop, R.
Sitaraman and B. Weihl, IEEE Internet Computing, Volume 6, Issue 5, November 2002" (http
s://people.cs.umass.edu/~ramesh/Site/PUBLICATIONS_files/DMPPSW02.pdf) (PDF).
Archived (https://web.archive.org/web/20170809231307/http://people.cs.umass.edu/~rames
h/Site/PUBLICATIONS_files/DMPPSW02.pdf) (PDF) from the original on 9 August 2017.
Retrieved 25 October 2019.
17. "Definition: cloud storage gateway" (https://www.techtarget.com/searchstorage/definition/clo
ud-storage-gateway). SearchStorage. July 2014.
18. Paul, S; Z Fei (1 February 2001). "Distributed caching with centralized control". Computer
Communications. 24 (2): 256–268. CiteSeerX 10.1.1.38.1094 (https://citeseerx.ist.psu.edu/vi
ewdoc/summary?doi=10.1.1.38.1094). doi:10.1016/S0140-3664(00)00322-4 (https://doi.org/
10.1016%2FS0140-3664%2800%2900322-4).
19. Khan, Iqbal (July 2009). "Distributed Caching on the Path To Scalability" (https://msdn.micro
soft.com/magazine/dd942840.aspx). MSDN. 24 (7).
Further reading
"What Every Programmer Should Know About Memory" (https://people.freebsd.org/~lstewar
t/articles/cpumemory.pdf)
"Caching in the Distributed Environment" (http://msdn.microsoft.com/en-us/library/dd129907.
aspx)
Retrieved from "https://en.wikipedia.org/w/index.php?title=Cache_(computing)&oldid=1128796745"
This page was last edited on 22 December 2022, at 01:36 (UTC).
Text is available under the Creative Commons Attribution-ShareAlike License 3.0; additional terms may apply. By
using this site, you agree to the Terms of Use and Privacy Policy. Wikipedia® is a registered trademark of the
Wikimedia Foundation, Inc., a non-profit organization.
