Examples of coherency protocols for cache memory are listed here. For simplicity, "miss" Read and Write state transitions, which obviously start from state "I" (or from a Tag miss), are not shown in the diagrams; they are shown directly on the new state. Many of the following protocols have only historical value. At present the main protocols in use are the R-MESI type / MESIF protocols and the HRT-ST-MESI (MOESI type) protocol, or a subset or an extension of these.
Cache coherency problem
In systems such as multiprocessor, multi-core and NUMA systems, where a dedicated cache is used for each processor, core or node, a consistency problem may occur when the same data is stored in more than one cache. The problem arises when the data is modified in one cache. It can be solved in two ways:
- Invalidate all the copies on other caches (broadcast-invalidate)
- Update all the copies on other caches (write-broadcasting), while the memory may be updated (write through) or not updated (write-back).
Note: Coherency generally applies only to data (as operands) and not to instructions (see Self-Modifying Code).
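The two ways of handling a write listed above can be illustrated with a minimal Python sketch. It is only an illustration under simplifying assumptions: each cache is modelled as a plain dictionary, and the function name `write` and its `policy` parameter are hypothetical, not part of any real protocol definition.

```python
def write(caches, writer, addr, value, policy):
    """Apply a write in cache `writer` and keep the other copies coherent."""
    caches[writer][addr] = value
    for i, cache in enumerate(caches):
        if i == writer or addr not in cache:
            continue
        if policy == "invalidate":
            del cache[addr]        # broadcast-invalidate: drop the stale copy
        elif policy == "update":
            cache[addr] = value    # write-broadcasting: refresh the copy
        else:
            raise ValueError(policy)
    return caches


inv = write([{0x10: 1}, {0x10: 1}], writer=0, addr=0x10, value=2, policy="invalidate")
upd = write([{0x10: 1}, {0x10: 1}], writer=0, addr=0x10, value=2, policy="update")
print(inv)  # [{16: 2}, {}]
print(upd)  # [{16: 2}, {16: 2}]
```

With invalidation the second cache must miss on its next access; with updating it keeps a valid copy at the cost of a broadcast on every write.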
The schemes can be classified based on:
- Snoopy scheme vs Directory scheme and vs Shared caches
- Write through vs Write-back (ownership-based) protocol
- Update vs Invalidation protocol
- Intervention vs no intervention
- Dirty-sharing vs no dirty-sharing protocol (MOESI vs MESI)
Three approaches are adopted to maintain the coherency of data.
- Bus watching or Snooping – generally used for bus-based SMP – Symmetric Multiprocessor System/multi-core systems
- Directory-based – Message-passing – may be used in all systems but typically in NUMA system and in large multi-core systems
- Shared cache – generally used in multi-core systems
Snoopy coherency protocol
Protocol used in bus-based systems such as SMP systems.
SMP – symmetric multiprocessor systems
Systems operating under a single OS (Operating System) with two or more homogeneous processors and with a centralized shared Main Memory
Each processor has its own cache that acts as a bridge between the processor and Main Memory. The connection is made using a System Bus, a Crossbar ("xbar"), or a mix of the two approaches: a bus for addresses and a crossbar for data (Data crossbar).123
The bottleneck of these systems is the traffic and the memory bandwidth. Bandwidth can be increased by using a wide data bus path, a data crossbar, memory interleaving (multi-bank parallel access) and out-of-order data transactions. The traffic can be reduced by using a cache that acts as a "filter" in front of the shared memory; the cache is therefore an essential element for shared memory in SMP systems.
In multiprocessor systems with separate caches that share a common memory, the same datum can be stored in more than one cache. A data consistency problem may occur when the data is modified in one cache only. The protocols that maintain coherency across multiple processors are called cache-coherency protocols.
Usually in SMP the coherency is based on the "Bus watching" or "Snoopy" approach (named after the Peanuts character Snoopy). In a snooping system, all the caches monitor (or snoop) the bus transactions to intercept the data and determine whether they hold a copy in their own cache.
Various cache-coherency protocols are used to maintain data coherency between caches.4
These protocols are generally classified only by their cache states (from 3 to 5, and 7 or more) and the transitions between them, which can create some confusion.
Such a definition is incomplete because it omits essential information: the actions that the transitions produce. These actions can be invoked by the processor or by the bus controller (e.g. intervention, invalidation, broadcasting, etc.), and the type of action is implementation dependent. The states and transition rules do not capture everything about a protocol. For instance, MESI with shared intervention on unmodified data and MESI without intervention are different protocols (see below). Conversely, some protocols with different states can be practically the same: the 4-state MESI Illinois and the 5-state MERSI (R-MESI) IBM / MESIF Intel protocols are only different implementations of the same functionality (see below).
The most common protocols are the 4-state MESI and the 5-state MOESI, each letter standing for one of the possible states of the cache. Other protocols use some proper subset of these but with different implementations along with their different but equivalent terminology. The terms MESI, MOESI or any subset of them generally refer to a class of protocols instead of a specific one.
Cache states
The MESI and MOESI states are often called by different names.
- M=Modified or D=Dirty or DE=Dirty-Exclusive or EM=Exclusive Modified
- modified in one cache only – write-back required at replacement.
- data is stored only in one cache but the data in memory is not updated (invalid, not clean).
- O=Owner or SD=Shared Dirty or SM=Shared Modified or T=Tagged
- modified, potentially shared, owned, write-back required at replacement.
- data may be stored in more than one cache but the data in memory is not updated (invalid, not clean). Only one cache is the "owner"; the other caches are set "Valid" (S/V/SC). On a bus read request, the data is supplied by the "owner" instead of the memory.
- E=Exclusive or R=Reserved or VE=Valid-Exclusive or EC=Exclusive Clean or Me=Exclusive
- clean, in one cache only.
- Data is stored only in one cache and clean in memory.
- S=Shared or V=Valid or SC=Shared Clean
- shared or valid
- Data potentially shared with other caches. The data can be clean or dirty. The term "clean" in SC is misleading because the data can also be dirty (see Dragon protocol).
- I=Invalid.
- Cache line invalid. If the cache line is not present (no tag matching) it is considered equivalent to invalid, therefore invalid data means data present but invalid or not present in cache.
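As a compact restatement of the list above, the following Python sketch records the alternative names and basic properties of each state in a lookup table. The names `StateInfo`, `STATES` and `canonical` are hypothetical and chosen only for this illustration; `memory_clean` is `None` where the state alone does not determine whether memory is up to date.

```python
from typing import NamedTuple, Optional


class StateInfo(NamedTuple):
    aliases: tuple
    memory_clean: Optional[bool]   # None: not determined by the state alone
    may_be_shared: bool
    writeback_on_replace: bool


STATES = {
    "M": StateInfo(("D", "DE", "EM"), False, False, True),
    "O": StateInfo(("SD", "SM", "T"), False, True, True),
    "E": StateInfo(("R", "VE", "EC", "Me"), True, False, False),  # R = Reserved here
    "S": StateInfo(("V", "SC"), None, True, False),               # clean or dirty
    "I": StateInfo((), None, False, False),
}


def canonical(name):
    """Map any of the alternative names to its MOESI letter."""
    for letter, info in STATES.items():
        if name == letter or name in info.aliases:
            return letter
    raise KeyError(name)


print(canonical("SD"), canonical("VE"), canonical("V"))  # O E S
```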
Special states:
- F=Forward or R=Recent
- Additional states of MESI protocol
- Last data read. It is a special "Valid" state that is the "Owner" for non-modified shared data, used in some extended MESI protocols (MERSI or R-MESI IBM,56 MESIF – Intel78). The R/F state is used to allow "intervention" when the value is clean but shared among many caches. The cache in this state is responsible for the intervention (shared intervention). On a bus read request, the data is supplied by this cache instead of the memory. MERSI and MESIF are the same protocol with different terminology (F instead of R). Sometimes R is referred to as "shared last" (SL).910
- The state R = Recent is used not only in the MERSI = R-MESI protocol but in several other protocols, and it can be combined with other states, for instance in RT-MESI, HR-MESI, HRT-MESI and HRT-ST-MESI.111213 All protocols that use this state will be referred to as R-MESI type.
- H=Hover – H-MESI (additional state of MESI protocol)14
- The Hover (H) state allows a cache line to maintain an address Tag in the directory even though the corresponding value in the cache entry is an invalid copy. If the corresponding value appears on the bus (address Tag matching) due to a valid "Read" or "Write" operation, the entry is updated with a valid copy and its state is changed to S.
- This state can be used in combination with other states. For instance HR-MESI, HT-MESI, HRT-MESI, HRT-ST-MESI.151617
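The behaviour of the Hover state described above can be sketched as follows. This is a minimal illustration, assuming a single cache line and a hypothetical `Line` class with a `snoop` method; real implementations differ in many details.

```python
class Line:
    """One cache line that starts in the Hover state: valid tag, invalid data."""

    def __init__(self, tag):
        self.tag = tag
        self.state = "H"
        self.data = None

    def snoop(self, bus_tag, bus_data):
        """Observe a valid Read or Write transaction on the bus."""
        if self.state == "H" and bus_tag == self.tag:
            self.data = bus_data   # capture the value seen on the bus
            self.state = "S"       # the entry now holds a valid shared copy


line = Line(tag=0x40)
line.snoop(0x80, 7)    # different tag: nothing happens
line.snoop(0x40, 9)    # matching tag: line is refreshed and becomes S
print(line.state, line.data)  # S 9
```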
Various coherency protocols
Protocols | |
---|---|
SI protocol | Write Through |
MSI protocol | Synapse protocol18 |
MEI protocol | IBM PowerPC 750,19 MPC740020 |
MES protocol | Firefly protocol21 |
MESI protocol | Pentium II,22 PowerPC, Intel Harpertown (Xeon 5400) |
MOSI protocol | Berkeley protocol23 |
MOESI protocol | AMD64,24 MOESI,25 T-MESI IBM26 |
Terminology used | |
---|---|
Illinois protocol | D-VE-S-I (= extended MESI)2728 |
Write-once or Write-first | D-R-V-I (= MESI) 293031 |
Berkeley protocol | D-SD-V-I (= MOSI) 32 |
Synapse protocol | D-V-I (= MSI) 33 |
Firefly protocol | D-VE-S (= MES) DEC34 |
Dragon protocol | D-SD (SM ?)-SC-VE (= MOES) Xerox35 |
Bull HN ISI protocol | D-SD-R-V-I (= MOESI)36 |
MERSI (IBM) / MESIF (Intel) protocol | |
HRT-ST-MESI protocol | H=Hover, R=Recent, T=Tagged, ST=Shared-Tagged – IBM4344 – Note: The main terminologies are SD-D-R-V-I and MOESI and so they will be used both. |
POWER4 IBM protocol | Mu-T-Me-M-S-SL-I (L2 seven states)45 (*) Special state – Asking for a reservation for load and store doubleword (for 64-bit implementations). |
Snoopy coherency operations
- Bus Transactions
- Data Characteristics
- Cache Operations
Bus transactions
The main operations are:
- Write Through
- Write-Back
- Write Allocate
- Write-no-Allocate
- Cache Intervention
- Shared Intervention
- Dirty Intervention
- Invalidation
- Write-broadcast
- Intervention-broadcasting
Write Through
- The cache line is updated both in cache and in MM or only in MM (write no-allocate).
- Simple to implement, but consumes high bandwidth. It is better suited to single writes.
Write-Back
- Data is written only in the cache. Data is written back to MM only when it is replaced in the cache or when required by other caches (see Write policy).
- It is better suited to multiple writes on the same cache line.
- Intermediate solution: Write Through for the first write, Write-Back for the next (Write-once and Bull HN ISI46 protocols).
Write Allocate
- On a miss the data is read from the "owner" or from MM, then the data is written in the cache (updating, partial write) (see Write policy).
Write-no-Allocate
- On a miss the data is written only in MM without involving the cache or, as in the Bull HN ISI protocol, in the "owner", that is the cache in D or SD state (owner updating) if there is one, else in MM.
- Write-no-Allocate usually is associated with Write Through.
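The bandwidth trade-off between the write policies above can be made concrete with a small count of main-memory writes. The sketch below is only an illustration: the function name `memory_writes` is hypothetical, and it assumes a sequence of write hits on one cache line, no requests from other caches in between, and a final replacement of the line.

```python
def memory_writes(policy, n_writes):
    """MM writes caused by n_writes write hits on one line, replacement included."""
    if n_writes == 0:
        return 0
    if policy == "write-through":
        return n_writes                      # every write goes to MM
    if policy == "write-back":
        return 1                             # one copy-back at replacement
    if policy == "write-once":
        return 1 if n_writes == 1 else 2     # first write through, then one copy-back
    raise ValueError(policy)


for policy in ("write-through", "write-back", "write-once"):
    print(policy, [memory_writes(policy, n) for n in (1, 2, 10)])
# write-through [1, 2, 10]
# write-back [1, 1, 1]
# write-once [1, 2, 2]
```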
- Cache Intervention
- Invalidation
- Write-broadcast (Write-update)
- Intervention-broadcasting
- Write invalidate vs broadcast
Data characteristics
There are three characteristics of cached data:
- Validity
- Exclusiveness
- Ownership
(*) – Implementation dependent.
Note: Do not confuse the more restrictive "owner" definition of the MOESI protocol with this more general definition.
Cache operations
The cache operations are:
- Read Hit
- Read Miss
- Write Hit
- Write Miss
- Read Hit
- Read Miss
- Data stored only in MM
- Data stored in MM and in one or more caches in S (V) state or in R/F in R-MESI type / MESIF protocols.
- – Illinois protocol – a network priority is used to temporarily and arbitrarily assign the ownership to an S copy. Data is supplied by the selected cache. The requesting cache is set S (shared intervention with MM clean).
- – R-MESI type / MESIF protocols – a copy is in R/F state (shared owner). The data is supplied by the R/F cache. The sending cache is changed to S and the requesting cache is set R/F (on a read miss the "ownership" is always taken by the last requesting cache) – shared intervention.
- – In all the other cases the data is supplied by the memory and the requesting cache is set S (V).
- Data stored in MM and only in one cache in E (R) state.
- – Data is supplied by the E (R) cache or by the MM, depending on the protocol. It is supplied from E (R) in extended MESI (e.g. Illinois, Pentium (R) II 52), in R-MESI type / MESIF and in some MOESI implementations (e.g. AMD64). The requesting cache is set S (V), or R/F in R-MESI type / MESIF protocols, and the E (R) cache is changed to S (V), or to I in the MEI protocol.
- – In all the other cases the data is supplied by the MM.
- Data modified in one or more caches with MM not clean
- Protocol MOESI type – Data stored in M (D) or in O (SD) and the other caches in S (V)
- Protocols MESI type and MEI – Data stored in M (D) and the other caches in S (V) state
- – Data is sent from the M (D) cache to the requesting cache and also to MM (e.g. Illinois, Pentium (R) II 53)
- – The operation is made in two steps: the requesting transaction is stopped and the data is sent from the M (D) cache to MM; then the waiting transaction can proceed and the data is read from MM (e.g. MESI and MSI Synapse protocol).
- Write Hit
- Cache in S (V) or R/F or O (SD) (sharing)
- Cache in E (R) or M (D) state (exclusiveness)
- Write Miss
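The Read Miss cases described above differ mainly in who supplies the data. The following Python sketch summarises them for three protocol variants. It is a simplified reading of the text, not a definitive specification: the function name `read_miss_supplier`, the string labels, and the choice to give the requester the F state on every shared MESIF read miss are assumptions made for this illustration.

```python
def read_miss_supplier(protocol, others):
    """Return (data supplier, new state of the requesting cache).

    protocol: "MESI" (no intervention), "Illinois" or "MESIF".
    others:   states held by the other caches for this line.
    """
    if protocol not in ("MESI", "Illinois", "MESIF"):
        raise ValueError(protocol)
    others = set(others) - {"I"}
    intervention = protocol in ("Illinois", "MESIF")
    shared_state = "F" if protocol == "MESIF" else "S"
    if not others:
        return "MM", "E"                       # data stored only in MM
    if "M" in others:                          # MM not clean
        return ("M cache" if intervention else "MM after copy-back"), shared_state
    if "E" in others:
        return ("E cache" if intervention else "MM"), shared_state
    if protocol == "MESIF" and "F" in others:
        return "F cache", "F"                  # shared intervention from the F copy
    if protocol == "Illinois":
        return "selected S cache", "S"         # chosen by network priority
    return "MM", shared_state


print(read_miss_supplier("MESI", ["S", "S"]))      # ('MM', 'S')
print(read_miss_supplier("Illinois", ["S", "S"]))  # ('selected S cache', 'S')
print(read_miss_supplier("MESIF", ["S", "F"]))     # ('F cache', 'F')
```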
Coherency protocols
– Warning – For simplicity, Read and Write "miss" state transitions, which obviously start from the I state (or from a Tag miss), are not depicted in the diagrams. They are depicted directly on the new state.
– Note – Many of the following protocols have only historical value. At present the main protocols in use are R-MESI type / MESIF and HRT-ST-MESI (MOESI type), or a subset of these.
MESI protocol
States MESI = D-R-V-I
– Use of a bus "shared line" to detect "shared" copies in the other caches
- Processor operations
- Read Miss
- Write Hit
- Write Miss
- Bus transactions
- Bus Read
- Bus Read – (RWITM)
- Bus Invalidate Transaction
- Operations
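The processor operations and bus transactions listed for MESI can be tied together in a small state-transition function. This is a minimal sketch of the textbook MESI transitions for a single cache line, not of any specific implementation; the function name `mesi_next` and the event labels are hypothetical, and data movement is not modelled.

```python
def mesi_next(state, event, shared=False):
    """Next state of one cache line.

    event:  "PrRd" / "PrWr" from the local processor, or "BusRd" / "RWITM" /
            "BusInv" snooped from another cache.
    shared: sampled bus "shared line"; it matters only on a read miss.
    """
    if event == "PrRd":
        if state == "I":
            return "S" if shared else "E"   # read miss
        return state                        # read hit
    if event == "PrWr":
        return "M"    # from I via RWITM, from S via Bus Invalidate, E/M silently
    if event == "BusRd":
        return "I" if state == "I" else "S"
    if event in ("RWITM", "BusInv"):
        return "I"
    raise ValueError(event)


state = "I"
for event in ("PrRd", "PrWr", "BusRd", "BusInv"):
    state = mesi_next(state, event)
    print(event, "->", state)
# PrRd -> E
# PrWr -> M
# BusRd -> S
# BusInv -> I
```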
MOESI protocol
States MOESI = D-R-SD-V-I = T-MESI IBM56
– Use of a bus "shared line" to detect "shared" copies in the other caches
- Processor operations
- Read Miss
- Write Hit
- Write Miss
- Bus transactions
- Bus Read
- Bus Read – (RWITM)
- Bus Invalidate Transaction
- Operations
- Write Allocate
- Intervention: from M-O-E (*)
- Write Invalidate
- Copy-Back: M-O replacement
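The key difference from MESI is what happens when a modified line is read by another cache: the line becomes O and keeps supplying data, and MM is not updated until replacement. A minimal sketch follows, assuming a hypothetical function `moesi_snoop_read`; intervention from E is marked implementation dependent in the list above, so it is exposed as a parameter.

```python
def moesi_snoop_read(state, e_intervenes=True):
    """Return (new_state, supplies_data, mm_updated) when another cache reads."""
    if state in ("M", "O"):
        return "O", True, False          # dirty intervention, no MM update
    if state == "E":
        return "S", e_intervenes, False  # (*) implementation dependent
    return state, False, False           # S and I do not intervene


for s in ("M", "O", "E", "S", "I"):
    print(s, moesi_snoop_read(s))
# M ('O', True, False)
# O ('O', True, False)
# E ('S', True, False)
# S ('S', False, False)
# I ('I', False, False)
```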
Illinois protocol
States MESI = D-R-V-I57
- Characteristics:
- It is an extension of the MESI protocol
- Use of a network priority for shared intervention (intervention on shared data)
- Differences from MESI: in addition to E and M, intervention also from S (see Read Miss – point 1)
- Operations
Write-once (or write-first) protocol
- Characteristics:
- No use of "shared line" (protocol for standard or unmodifiable bus)
- Write Through on first Write Hit in state V, then Copy Back
- Processor operations
- Read Miss
- Write Hit
- Write Miss
- Bus transactions
- Bus Read
- Bus Read – (RWITM)
- Bus Invalidate Transaction
- Operations
- Write Allocate
- Intervention: from D
- Write Through: first write hit in V state
- Write Invalidate
- Copy-Back: D replacement
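The write-hit behaviour that gives the protocol its name can be sketched in a few lines, using the D-R-V-I terminology. The function name `write_once_hit` and the action labels are hypothetical; the sketch covers only write hits on one line and ignores the effect on other caches.

```python
def write_once_hit(state):
    """Return (new_state, bus_action) for a processor write hit."""
    if state == "V":
        return "R", "write-through"   # first write: MM is updated
    if state in ("R", "D"):
        return "D", None              # later writes stay local (copy-back)
    raise ValueError("a write hit requires a valid line")


state, actions = "V", []
for _ in range(3):
    state, action = write_once_hit(state)
    actions.append(action)
print(state, actions)  # D ['write-through', None, None]
```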
Bull HN ISI protocol
(Bull-Honeywell Italia)
States D-SD-R-V-I (MOESI) Patented protocol (F. Zulian)61
- Characteristics:
- MOESI extension of the Write-Once protocol
- Write-no-allocate on miss with D or SD updating
- No use of RWITM
- No use of "shared line"
- Processor operations
- Read Miss
- Write Hit
- Bus transactions
- Bus Read
- Bus Read (Write Update / Write Invalidate)
- Operations
- Write-no-allocate: on miss
- Write update: on miss
- Write Through: for the first write, then copy back
- Write Update / Write Invalidate
- Intervention: from SD-D
- Copy-Back: D replacement or SD replacement with invalidate
Synapse protocol
States D-V-I (MSI)62
- Characteristics:
- The distinguishing feature of this protocol is a single-bit tag associated with each cache line in MM, indicating that a cache has the line in D state.
- This bit prevents a possible race condition if the D cache does not respond quickly enough to inhibit the MM from responding before it has been updated.
- The data always comes from MM
- No use of "shared line"
- Processor operations
- Read Miss
- Write Hit
- Write Miss (RWITM)
- Bus transactions
- Bus Read
- Bus Read (RWITM)
- Operations
- Write Allocate
- Intervention: no intervention
- Write Invalidate: (RWITM)
- No Invalidate transaction
- Copy-Back: D replacement
Berkeley protocol
States D-SD-V-I (MOSI)63
- Characteristics:
- Same as MOESI without the E state
- No use of "shared line"
- Processor operations
- Read Miss
- Write Hit
- Write Miss
- Bus transactions
- Bus Read
- Bus Read – (RWITM)
- Bus Invalidate Transaction
- Operations
- Write Allocate
- Intervention: from D-SD
- Write Invalidate
- Copy-Back: D-SD replacement
Firefly (DEC) protocol
States D-VE-S (MES)64
- Characteristics:
- No "Invalid" state
- "Write-broadcasting" + "Write Through"
- Use of "shared line"
- "Write-broadcasting" avoids the need for an "Invalid" state
- Simultaneous intervention from all caches (shared and dirty intervention, on both unmodified and modified data)
- This protocol requires a synchronous bus
- Processor operations
- Read Miss
- Write Hit
- Write Miss
- Bus transactions
- Bus Read
- Bus Read
- Write Broadcasting
- Operations
- Write Allocate
- Intervention: from D-VE-S (from all "valid" caches)
- Write-broadcasting – Write through
- Copy-Back: D replacement and on any transaction with a cache D
Dragon (Xerox) protocol
States D-SD-VE-SC (MOES)65
Note – the state SC, despite the term "clean", can be "clean" or "dirty" like the S state of the other protocols. SC and S are equivalent.
- Characteristics:
- No "Invalid" state
- "Write-broadcasting" (no "Write Through")
- Use of "shared line"
- "Write-broadcasting" avoids the need for an "Invalid" state
- Processor operations
- Read Miss
- Write Hit
- Write Miss
- Bus transactions
- Bus Read
- Write Broadcasting
- Operations
- Write Allocate
- Intervention: from D-SD (but not from VE)
- Write-broadcasting
- Copy-Back: D-SD replacement
MERSI (IBM) / MESIF (Intel) protocol
States MERSI or R-MESI
States MESIF
Patented protocols – IBM (1997)66 – Intel (2002)67
- MERSI and MESIF are the same protocol (only the name of one state differs: F instead of R)
- Characteristics:
- The same functionality as the Illinois protocol
- A new state R (Recent) / F (Forward) is the "owner" for "shared-clean" data (with MM updated).
- The "shared ownership" (on clean data) is not assigned by a network priority as in Illinois; it is always assigned to the last cache with a Read Miss, by setting its state to R/F
- The "ownership" is temporarily lost in case of R/F replacement. It is reassigned at the next Read Miss while other caches hold "shared clean" copies
- Use of the "shared line"
- Operations
- Write Allocate
- Intervention: from M-E-R/F
- Write Invalidate
- Copy-Back: M replacement
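The migration of the R/F "shared ownership" to the last reader can be sketched as follows, using the F terminology. This is an illustration under simplifying assumptions: the function name `mesif_read_miss` and the cache identifiers are hypothetical, only the states of one line are tracked, and the MM update required when an M copy intervenes is not modelled.

```python
def mesif_read_miss(states, requester):
    """Update the per-cache states of one line after a read miss by `requester`."""
    holders = [c for c, s in states.items() if c != requester and s != "I"]
    for c in holders:
        states[c] = "S"                       # M, E and F copies fall back to S
    states[requester] = "F" if holders else "E"
    return states


states = {"c0": "I", "c1": "I", "c2": "I"}
mesif_read_miss(states, "c0")
print(states)  # {'c0': 'E', 'c1': 'I', 'c2': 'I'}
mesif_read_miss(states, "c1")
print(states)  # {'c0': 'S', 'c1': 'F', 'c2': 'I'}
mesif_read_miss(states, "c2")
print(states)  # {'c0': 'S', 'c1': 'S', 'c2': 'F'}
```

At most one cache holds the line in F at any time, so at most one cache responds to the next read request.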
MESI vs MOESI
MESI and MOESI are the most popular protocols.
It is commonly held that MOESI is an extension of the MESI protocol and is therefore more sophisticated and performs better. This is true only in comparison with standard MESI, that is, MESI without shared intervention. MESI with shared intervention, such as the Illinois-like MESI or the equivalent 5-state protocols MERSI / MESIF, performs much better than the MOESI protocol.
In MOESI, cache-to-cache operations are made only on modified data. In MESI Illinois type and MERSI / MESIF protocols, by contrast, cache-to-cache operations are always performed, with both clean and modified data. In the case of modified data, the intervention is made by the "owner" M, but the ownership is not lost because it migrates to another cache (the R/F cache in MERSI / MESIF, or a selected cache in the Illinois type). The only difference is that the MM must be updated. In MOESI, however, this transaction would also have to be done later, at replacement, if no other modification occurs in the meantime. This is a smaller limitation than the memory transactions caused by the lack of intervention on clean data in the MOESI protocol (see e.g. "Performance evaluation between MOESI (Shanghai) and MESIF Nehalem-EP"68).
The most advanced systems use only the R-MESI / MESIF protocol or the more complete RT-MESI, HRT-ST-MESI and POWER4 IBM protocols, which are an enhanced merging of the MESI and MOESI protocols.
Note: Cache-to-cache transfer is an efficient approach in multiprocessor/multi-core systems that are directly connected to each other, but less so with remote caches as in NUMA systems, where a standard MESI is preferable. For example, in the POWER4 IBM protocol "shared intervention" is made only locally and not between remote modules.
RT-MESI protocol
States RT-MESI IBM patented protocol6970
- Characteristics:
Processor operations
- Read Miss
- Write Hit
- Write Miss
- Bus transactions
- Bus Read
- Bus Read – (RWITM)
- Bus Invalidate Transaction
- Operations
- Write Allocate
- Intervention: from T-M-R-E
- Write Invalidate
- Copy-Back: T-M replacement
RT-ST-MESI protocol
It is an improvement of the RT-MESI protocol71 and a subset of the HRT-ST-MESI protocol72
ST = Shared-Tagged
- Use of the "Shared-Tagged" state allows intervention to be maintained after deallocation of a Tagged cache line
- In case of T replacement (cache line deallocation), the data needs to be written back to MM and so the "ownership" is lost. To avoid this, a new state ST can be used. On a Read Miss the previous T is set to ST instead of S. ST is the candidate to take over the ownership in case of T deallocation. The T "Copy Back" transaction is stopped (no MM updating) by the ST cache, which changes its state to T. In case of a new read from another cache, the latter is set to T, the previous T is changed to ST and the previous ST is changed to S.
An additional improvement can be obtained using more than one ST state: ST1, ST2, … STn.
- On a Read Miss, T is changed to ST1 and the indices of all the other STi are increased by "1".
- In case of T deallocation, ST1 stops the "Copy Back" transaction and changes its state to T, and the indices of all the other STi are decreased by "1".
- In case of a deallocation of, for instance, STk, the chain is interrupted: all the STi with index greater than "k" automatically lose their ST role and are considered de facto as simple S states even if they are still set as ST, because only ST1 intervenes to block the copy-back and to replace T. For instance, given T, ST1, ST3, ST4 with ST2 replaced, if T is replaced the new situation will be T, ST2, ST3 without any ST1.
HRT-ST-MESI protocol
IBM patented full HRT-ST-MESI protocol7374
- I state = Invalid Tag (*) – Invalid Data
- H state = Valid Tag – Invalid Data
- The I state is set at cache initialization and changes only after a processor Read or Write miss. After that the line never returns to this state.
- H has the same functionality as the I state, with the additional ability to capture any bus transaction that matches the Tag in the directory and to update the data cache.
- After the first use, I is replaced by H in its functions
- The main features are:
- Write Back
- Intervention on both shared-clean and dirty data – from T-M-R-E
- Reserve states of the Tagged state (Shared-Tagged)
- Invalid H state (Hover) auto-updating
(*) – Note: The Tag is by definition always valid, but until the first update of the cache line it is considered invalid, in order to avoid updating the cache when the line has not yet been requested and used.
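The reserve (Shared-Tagged) states among the main features follow the chain rules given for the RT-ST-MESI subset. The Python sketch below models that chain under simplifying assumptions: the class name `TaggedChain` is hypothetical, the chain length is unbounded, caches past a break in the chain are simply dropped from the model because they act de facto as plain S copies, and only copy-backs to MM are counted.

```python
class TaggedChain:
    """chain[0] holds the line in T, chain[i] holds it in STi."""

    def __init__(self, owner):
        self.chain = [owner]
        self.mm_writes = 0

    def read_miss(self, requester):
        # The last reader becomes T; the previous T becomes ST1, STi becomes STi+1.
        self.chain.insert(0, requester)

    def deallocate(self, cache):
        k = self.chain.index(cache)
        if k == 0:
            self.chain.pop(0)          # ST1, if any, stops the copy-back and becomes T
            if not self.chain:
                self.mm_writes += 1    # no reserve copy left: data goes back to MM
        else:
            del self.chain[k:]         # chain interrupted at STk


c = TaggedChain("A")
c.read_miss("B")
c.read_miss("C")
print(c.chain)                 # ['C', 'B', 'A']
c.deallocate("C")
print(c.chain, c.mm_writes)    # ['B', 'A'] 0
c.deallocate("A")
c.deallocate("B")
print(c.chain, c.mm_writes)    # [] 1
```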
POWER4 IBM protocol
States M-T-Me-S-I-Mu-SL = RT-MESI+Mu75
- Use of the "shared line"
- Used in multi-core/module systems – multi L2 cache 76
- This protocol is equivalent to the RT-MESI protocol for systems with multiple L2 caches on multi-module systems
- SL - "Shared Last" equivalent to R on RT-MESI
- Me - "Valid Exclusive" = E
- Mu – unsolicited modified state
- "Shared intervention" from SL is done only between L2 caches of the same module
- "Dirty intervention" from T is done only between L2 caches of the same module
- Operations
General considerations on the protocols
Under some conditions the most efficient and complete protocol turns out to be the HRT-ST-MESI protocol.
- Write Back
- Intervention on both dirty and shared-clean data
- Reserve states of the Tagged state (Shared-Tagged)
- Invalid H (Hover) state auto-updating
References
US patent 5701413, "Multi-processor system with shared memory" https://worldwide.espacenet.com/textdoc?DB=EPODOC&IDX=US5701413 ↩
EP patent 0923032A1, "Method for transferring data in a multiprocessor computer system with crossbar interconnecting unit" https://worldwide.espacenet.com/textdoc?DB=EPODOC&IDX=EP0923032A1 ↩
"Specification and Verification of the PowerScale Bus Arbitration Protocol: An Industrial Experiment with LOTOS, Chap. 2, Pag. 4" (PDF). ftp://ftp.inrialpes.fr/pub/vasy/publications/cadp/Chehaibar-Garavel-et-al-96.pdf ↩
Archibald, James; Baer, Jean-Loup (1986). "Cache coherence protocols: evaluation using a multiprocessor simulation model" (PDF). ACM Transactions on Computer Systems. 4 (4). Association for Computing Machinery (ACM): 273–298. doi:10.1145/6513.6514. ISSN 0734-2071. S2CID 713808. http://ctho.org/toread/forclass/18-742/3/p273-archibald.pdf ↩
"MPC7400 RISC Microprocessor User's Manual" (PDF). http://pccomponents.com/datasheets/MOT-MPC7400.PDF ↩
US Patent 5996049, "Cache-coherency protocol with recently read state for data and instructions" https://worldwide.espacenet.com/textdoc?DB=EPODOC&IDX=US5996049 ↩
"An Introduction to the Intel® QuickPath Interconnect" (PDF). http://www.intel.ie/content/dam/doc/white-paper/quick-path-interconnect-introduction-paper.pdf ↩
US Patent 6922756, "Forward state for use in cache coherency in a multiprocessor system" https://worldwide.espacenet.com/textdoc?DB=EPODOC&IDX=US6922756 ↩
"POWER4 System Microarchitecture" (PDF). cc.gatech.edu. 2008-10-08. Archived from the original (PDF) on 2013-11-07. https://web.archive.org/web/20131107140531/http://www.cc.gatech.edu/~bader/COURSES/UNM/ece637-Fall2003/papers/TDF02.pdf ↩
"IBM PowerPC 476FP L2 Cache Core Databook" https://www-01.ibm.com/chips/techlib/techlib.nsf/techdocs/8D5342097498C81A852575C50078D867/$file/L2CacheController_v1.5_ext_Pub.pdf ↩
US Patent 5996049, "Cache-coherency protocol with recently read state for data and instructions" https://worldwide.espacenet.com/textdoc?DB=EPODOC&IDX=US5996049 ↩
US Patent 6275908, "Cache Coherency Protocol Including an HR State" https://worldwide.espacenet.com/textdoc?DB=EPODOC&IDX=US6275908 ↩
US Patent 6334172, "Cache Coherency Protocol with Tagged State for Modified Values" https://worldwide.espacenet.com/textdoc?DB=EPODOC&IDX=US6334172 ↩
US Patent 6275908, "Cache Coherency Protocol Including an HR State" https://worldwide.espacenet.com/textdoc?DB=EPODOC&IDX=US6275908 ↩
US Patent 5996049, "Cache-coherency protocol with recently read state for data and instructions" https://worldwide.espacenet.com/textdoc?DB=EPODOC&IDX=US5996049 ↩
US Patent 6275908, "Cache Coherency Protocol Including an HR State" https://worldwide.espacenet.com/textdoc?DB=EPODOC&IDX=US6275908 ↩
US Patent 6334172, "Cache Coherency Protocol with Tagged State for Modified Values" https://worldwide.espacenet.com/textdoc?DB=EPODOC&IDX=US6334172 ↩
Archibald, James; Baer, Jean-Loup (1986). "Cache coherence protocols: evaluation using a multiprocessor simulation model" (PDF). ACM Transactions on Computer Systems. 4 (4). Association for Computing Machinery (ACM): 273–298. doi:10.1145/6513.6514. ISSN 0734-2071. S2CID 713808. http://ctho.org/toread/forclass/18-742/3/p273-archibald.pdf ↩
"MPC750UM/D 12/2001 Rev. 1 MPC750 RISC Microprocessor Family User's Manual" (PDF). http://www.freescale.com/files/32bit/doc/ref_manual/MPC750UM.pdf ↩
US Patent 5996049, "Cache-coherency protocol with recently read state for data and instructions" https://worldwide.espacenet.com/textdoc?DB=EPODOC&IDX=US5996049 ↩
Archibald, James; Baer, Jean-Loup (1986). "Cache coherence protocols: evaluation using a multiprocessor simulation model" (PDF). ACM Transactions on Computer Systems. 4 (4). Association for Computing Machinery (ACM): 273–298. doi:10.1145/6513.6514. ISSN 0734-2071. S2CID 713808. http://ctho.org/toread/forclass/18-742/3/p273-archibald.pdf ↩
Shanley, T. (1998). Pentium Pro and Pentium II System Architecture. Mindshare PC System Architecture Series. Addison-Wesley. p. 160. ISBN 978-0-201-30973-7. ↩
Archibald, James; Baer, Jean-Loup (1986). "Cache coherence protocols: evaluation using a multiprocessor simulation model" (PDF). ACM Transactions on Computer Systems. 4 (4). Association for Computing Machinery (ACM): 273–298. doi:10.1145/6513.6514. ISSN 0734-2071. S2CID 713808. http://ctho.org/toread/forclass/18-742/3/p273-archibald.pdf ↩
AMD64 Architecture Programmer's Manual. Vol. 2: System Programming. AMD. May 2013 – via Internet Archive. https://archive.org/details/24593APMV21 ↩
Sweazey, P.; Smith, A. J. (1986). "A class of compatible cache consistency protocols and their support by the IEEE futurebus" (PDF). ACM SIGARCH Computer Architecture News. 14 (2). Association for Computing Machinery (ACM): 414–423. doi:10.1145/17356.17404. ISSN 0163-5964. S2CID 9713683. http://pdf.aminer.org/000/419/524/a_class_of_compatible_cache_consistency_protocols_and_their_support.pdf ↩
US Patent 6334172, "Cache Coherency Protocol with Tagged State for Modified Values" https://worldwide.espacenet.com/textdoc?DB=EPODOC&IDX=US6334172 ↩
Archibald, James; Baer, Jean-Loup (1986). "Cache coherence protocols: evaluation using a multiprocessor simulation model" (PDF). ACM Transactions on Computer Systems. 4 (4). Association for Computing Machinery (ACM): 273–298. doi:10.1145/6513.6514. ISSN 0734-2071. S2CID 713808. http://ctho.org/toread/forclass/18-742/3/p273-archibald.pdf ↩
Papamarcos, Mark S.; Patel, Janak H. (1984). "A low-overhead coherence solution for multiprocessors with private cache memories". Proceedings of the 11th annual international symposium on Computer architecture - ISCA '84. New York, New York, USA: ACM Press. pp. 348–354. doi:10.1145/800015.808204. ISBN 0818605383. ↩
Archibald, James; Baer, Jean-Loup (1986). "Cache coherence protocols: evaluation using a multiprocessor simulation model" (PDF). ACM Transactions on Computer Systems. 4 (4). Association for Computing Machinery (ACM): 273–298. doi:10.1145/6513.6514. ISSN 0734-2071. S2CID 713808. http://ctho.org/toread/forclass/18-742/3/p273-archibald.pdf ↩
Goodman, James R. (1983). "Using cache memory to reduce processor-memory traffic" (PDF). Proceedings of the 10th annual international symposium on Computer architecture - ISCA '83. New York, New York, USA: ACM Press. pp. 124–131. doi:10.1145/800046.801647. ISBN 0897911016. ↩
Hwang, K. (2011). Advanced Computer Architecture, 2E. McGraw-Hill computer science series. McGraw Hill Education. p. 301. ISBN 978-0-07-070210-3. ↩
Archibald, James; Baer, Jean-Loup (1986). "Cache coherence protocols: evaluation using a multiprocessor simulation model" (PDF). ACM Transactions on Computer Systems. 4 (4). Association for Computing Machinery (ACM): 273–298. doi:10.1145/6513.6514. ISSN 0734-2071. S2CID 713808. http://ctho.org/toread/forclass/18-742/3/p273-archibald.pdf ↩
Archibald, James; Baer, Jean-Loup (1986). "Cache coherence protocols: evaluation using a multiprocessor simulation model" (PDF). ACM Transactions on Computer Systems. 4 (4). Association for Computing Machinery (ACM): 273–298. doi:10.1145/6513.6514. ISSN 0734-2071. S2CID 713808. http://ctho.org/toread/forclass/18-742/3/p273-archibald.pdf ↩
Archibald, James; Baer, Jean-Loup (1986). "Cache coherence protocols: evaluation using a multiprocessor simulation model" (PDF). ACM Transactions on Computer Systems. 4 (4). Association for Computing Machinery (ACM): 273–298. doi:10.1145/6513.6514. ISSN 0734-2071. S2CID 713808. http://ctho.org/toread/forclass/18-742/3/p273-archibald.pdf ↩
Archibald, James; Baer, Jean-Loup (1986). "Cache coherence protocols: evaluation using a multiprocessor simulation model" (PDF). ACM Transactions on Computer Systems. 4 (4). Association for Computing Machinery (ACM): 273–298. doi:10.1145/6513.6514. ISSN 0734-2071. S2CID 713808. http://ctho.org/toread/forclass/18-742/3/p273-archibald.pdf ↩
EP patent 0396940B1, Ferruccio Zulian, "Cache memory and related consistency protocol" https://worldwide.espacenet.com/textdoc?DB=EPODOC&IDX=EP0396940B1 ↩
"MPC7400 RISC Microprocessor User's Manual" (PDF). http://pccomponents.com/datasheets/MOT-MPC7400.PDF ↩
US Patent 5996049, "Cache-coherency protocol with recently read state for data and instructions" https://worldwide.espacenet.com/textdoc?DB=EPODOC&IDX=US5996049 ↩
"An Introduction to the Intel® QuickPath Interconnect" (PDF). http://www.intel.ie/content/dam/doc/white-paper/quick-path-interconnect-introduction-paper.pdf ↩
Hackenberg, Daniel; Molka, Daniel; Nagel, Wolfgang E. (2009-12-12). "Comparing cache architectures and coherency protocols on x86-64 multicore SMP systems" (PDF). Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture. New York, NY, USA: ACM. pp. 413–422. doi:10.1145/1669112.1669165. ISBN 9781605587981. ↩
Rolf, Trent (2009), Cache Organization and Memory Management of the Intel Nehalem Computer Architecture (PDF) http://gec.di.uminho.pt/Discip/MInf/cpd1011/PAC/material/nehalemPaper.pdf ↩
David Kanter (2007-08-28), "The Common System Interface: Intel's Future Interconnect", Real World Tech: 5, retrieved 2012-08-12 http://www.realworldtech.com/common-system-interface/5/ ↩
US Patent 6275908, "Cache Coherency Protocol Including an HR State" https://worldwide.espacenet.com/textdoc?DB=EPODOC&IDX=US6275908 ↩
US Patent 6334172, "Cache Coherency Protocol with Tagged State for Modified Values" https://worldwide.espacenet.com/textdoc?DB=EPODOC&IDX=US6334172 ↩
"POWER4 System Microarchitecture" (PDF). cc.gatech.edu. 2008-10-08. Archived from the original (PDF) on 2013-11-07. https://web.archive.org/web/20131107140531/http://www.cc.gatech.edu/~bader/COURSES/UNM/ece637-Fall2003/papers/TDF02.pdf ↩
Sweazey, P.; Smith, A. J. (1986). "A class of compatible cache consistency protocols and their support by the IEEE futurebus" (PDF). ACM SIGARCH Computer Architecture News. 14 (2). Association for Computing Machinery (ACM): 414–423. doi:10.1145/17356.17404. ISSN 0163-5964. S2CID 9713683. http://pdf.aminer.org/000/419/524/a_class_of_compatible_cache_consistency_protocols_and_their_support.pdf
Shanley, T. (1998). Pentium Pro and Pentium II System Architecture. Mindshare PC System Architecture Series. Addison-Wesley. p. 160. ISBN 978-0-201-30973-7.
"Optimizing the MESI Cache Coherence Protocol for Multithreaded Applications on Small Symmetric Multiprocessor Systems". Neal Tibrewala's Resume. 2003-12-12. Archived from the original on 2016-10-22. http://tibrewala.net/papers/mesi98/
Goodman, James R. (1983). "Using cache memory to reduce processor-memory traffic" (PDF). Proceedings of the 10th annual international symposium on Computer architecture - ISCA '83. New York, New York, USA: ACM Press. pp. 124–131. doi:10.1145/800046.801647. ISBN 0897911016.
Hwang, K. (2011). Advanced Computer Architecture, 2E. McGraw-Hill computer science series. McGraw Hill Education. p. 301. ISBN 978-0-07-070210-3.
US Patent 6922756, "Forward state for use in cache coherency in a multiprocessor system" https://worldwide.espacenet.com/textdoc?DB=EPODOC&IDX=US6922756