A primer on memory consistency and cache coherence
Author / Creator: | Sorin, Daniel J. |
---|---|
Imprint: | San Rafael, Calif. (1537 Fourth Street, San Rafael, CA 94901 USA) : Morgan & Claypool, c2011. |
Description: | 1 electronic text (xiii, 197 p.) : ill., digital file. |
Language: | English |
Series: | Synthesis lectures on computer architecture, 1935-3243 ; # 16. Synthesis digital library of engineering and computer science. |
Format: | E-Resource Book |
URL for this record: | http://pi.lib.uchicago.edu/1001/cat/bib/10511002 |
Table of Contents:
- Preface
- 1. Introduction to consistency and coherence
- Consistency (a.k.a., memory consistency, memory consistency model or memory model)
- Coherence (a.k.a., cache coherence)
- A consistency and coherence quiz
- What this primer does not do
- 2. Coherence basics
- Baseline system model
- The problem: how incoherence could possibly occur
- Defining coherence
- Maintaining the coherence invariants
- The granularity of coherence
- The scope of coherence
- References
- 3. Memory consistency motivation and sequential consistency
- Problems with shared memory behavior
- What is a memory consistency model?
- Consistency vs. coherence
- Basic idea of sequential consistency (SC)
- A little SC formalism
- Naive SC implementations
- A basic SC implementation with cache coherence
- Optimized SC implementations with cache coherence
- Atomic operations with SC
- Putting it all together: MIPS R10000
- Further reading regarding SC
- References
- 4. Total store order and the x86 memory model
- Motivation for TSO/x86
- Basic idea of TSO/x86
- A little TSO formalism and an x86 conjecture
- Implementing TSO/x86
- Atomic instructions and fences with TSO
- Atomic instructions
- Fences
- Further reading regarding TSO
- Comparing SC and TSO
- References
- 5. Relaxed memory consistency
- Motivation
- Opportunities to reorder memory operations
- Opportunities to exploit reordering
- An example relaxed consistency model (XC)
- The basic idea of the XC model
- Examples using fences under XC
- Formalizing XC
- Examples showing XC operating correctly
- Implementing XC
- Atomic instructions with XC
- Fences with XC
- A caveat
- Sequential consistency for data-race-free programs
- Some relaxed model concepts
- Release consistency
- Causality and write atomicity
- A relaxed memory model case study: IBM Power
- Further reading and commercial relaxed memory models
- Academic literature
- Commercial models
- Comparing memory models
- How do relaxed memory models relate to each other and TSO and SC?
- How good are relaxed models?
- High-level language models
- References
- 6. Coherence protocols
- The big picture
- Specifying coherence protocols
- Example of a simple coherence protocol
- Overview of coherence protocol design space
- States
- Transactions
- Major protocol design options
- References
- 7. Snooping coherence protocols
- Introduction to snooping
- Baseline snooping protocol
- High-level protocol specification
- Simple snooping system model: atomic requests
- Atomic transactions
- Baseline snooping system model: non-atomic requests, atomic transactions
- Running example
- Protocol simplifications
- Adding the exclusive state
- Motivation
- Getting to the exclusive state
- High-level specification of protocol
- Detailed specification
- Running example
- Adding the owned state
- Motivation
- High-level protocol specification
- Detailed protocol specification
- Running example
- Non-atomic bus
- Motivation
- In-order vs. out-of-order responses
- Non-atomic system model
- An MSI protocol with a split-transaction bus
- An optimized, non-stalling MSI protocol with a split-transaction bus
- Optimizations to the bus interconnection network
- Separate non-bus network for data responses
- Logical bus for coherence requests
- Case studies
- Sun Starfire E10000
- IBM Power5
- Discussion and the future of snooping
- References
- 8. Directory coherence protocols
- Introduction to directory protocols
- Baseline directory system
- Directory system model
- High-level protocol specification
- Avoiding deadlock
- Detailed protocol specification
- Protocol operation
- Protocol simplifications
- Adding the exclusive state
- High-level protocol specification
- Detailed protocol specification
- Adding the owned state
- High-level protocol specification
- Detailed protocol specification
- Representing directory state
- Coarse directory
- Limited pointer directory
- Directory organization
- Directory cache backed by DRAM
- Inclusive directory caches
- Null directory cache (with no backing store)
- Performance and scalability optimizations
- Distributed directories
- Non-stalling directory protocols
- Interconnection networks without point-to-point ordering
- Silent vs. non-silent evictions of blocks in state S
- Case studies
- SGI Origin 2000
- Coherent HyperTransport
- HyperTransport Assist
- Intel QPI
- Discussion and the future of directory protocols
- References
- 9. Advanced topics in coherence
- System models
- Instruction caches
- Translation lookaside buffers (TLBs)
- Virtual caches
- Write-through caches
- Coherent direct memory access (DMA)
- Multi-level caches and hierarchical coherence protocols
- Performance optimizations
- Migratory sharing optimization
- False sharing optimizations
- Maintaining liveness
- Deadlock
- Livelock
- Starvation
- Token coherence
- The future of coherence
- References
- Author biographies.