Chip Multicore Processors - Tutorial 8: Task 8.1: Performance of Snooping-Based Cache Coherency

This document contains a tutorial on cache coherency for a multicore processor system. It provides sample operations and asks the student to trace the state of the caches using the MSI protocol and an extended MOSI protocol. It also asks the student to summarize a research paper on cache coherency effects on the Intel Nehalem multicore architecture.

Technische Universität München, Lehrstuhl für Integrierte Systeme

Chip Multicore Processors Tutorial 8

June 19, 2013

Task 8.1: Performance of Snooping-based Cache Coherency

In this task, the performance of snooping-based cache coherency is evaluated. The starting state of a system with three processor cores and their caches is depicted. Each cache entry consists of the state of the coherency protocol, the tag, and two data words. All addresses are hexadecimal, and the tag is simplified to the base address of the cache line. The data is also simplified. The start state of the memory is given as well.

As discussed in the previous tutorial, the timing behavior and the performance depend on the coherency implementation. The given implementation for mobile systems is optimized for power efficiency, so accesses to the external memory should be minimized. The system has the following properties:

- In case of a hit, no extra stall cycles are required.
- In case of a miss, Nmem = 40 cycles are required when the block is loaded from memory.
- If another cache currently holds the cache block, it can provide the data to other caches within Ncache = 16 cycles.
- An invalidation delays the execution by Ninv = 4 cycles.
- A write-back delays the execution by Nwb = 40 cycles.

The following operation sequences are given:

Sequence 1: 1: (P1) read 410; 2: (P2) read 410; 3: (P0) read 430
Sequence 2: 1: (P0) write 420, 42; 2: (P2) read 424; 3: (P2) write 424, 23
Sequence 3: 1: (P0) write 408, 7; 2: (P2) read 408; 3: (P0) write 408, 9
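The timing properties above can be turned into a small bookkeeping helper. The following is a minimal sketch, assuming that the individual delays of a sequence simply add up; the event names are hypothetical labels chosen for illustration and are not defined by the task. The example list is made up and is not the solution to any of the given sequences.

```python
# Stall cycles taken from the task description.
N_MEM = 40    # miss served from external memory
N_CACHE = 16  # miss served by another cache
N_INV = 4     # invalidation
N_WB = 40     # write-back

# Hypothetical event names (assumption); the task does not define them.
COST = {
    "hit": 0,
    "miss_from_memory": N_MEM,
    "miss_from_cache": N_CACHE,
    "invalidate": N_INV,
    "write_back": N_WB,
}

def sequence_delay(events):
    """Sum the stall cycles of all events caused by one operation sequence.

    Assumes that delays are purely additive (no overlap between events).
    """
    return sum(COST[event] for event in events)

# Made-up event list, not the solution to a sequence of the task.
example = ["miss_from_memory", "miss_from_cache", "invalidate", "hit"]
print(sequence_delay(example))  # 40 + 16 + 4 + 0 = 60
```

Which events an operation actually triggers depends on the protocol and on the starting state of the caches, so the event list has to be derived by tracing each sequence first.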

Nomenclature: (CPU) read address and (CPU) write address, value.

a) Give the changes of the cache entries for each sequence (separately) according to the MSI protocol. Use the following tables to record the changes after each operation. Furthermore, give the delay that the whole sequence adds to the execution.

Sequence 1
Op | CPU | Index | State | Tag | Data

Sequence 2
Op | CPU | Index | State | Tag | Data

Sequence 3
Op | CPU | Index | State | Tag | Data
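For tracing part a), it may help to see the MSI state changes of a single cache line written out as code. This is a minimal sketch of one common textbook formulation of snooping MSI, not necessarily the exact variant intended by the tutorial. The bus event names (BusRd, BusRdX, BusUpgr) and the function `msi_step` are assumptions introduced for illustration, and the starting state in the usage example is invented because the depicted start state is not part of the text.

```python
from enum import Enum

class State(Enum):
    I = "Invalid"
    S = "Shared"
    M = "Modified"

def msi_step(states, cpu, op):
    """Apply one read or write by `cpu` to a single cache line.

    `states` holds one State per cache. Returns the new state list and
    the list of bus events and side effects (hypothetical labels).
    """
    new = list(states)
    events = []
    mine = states[cpu]
    if op == "read":
        if mine is State.I:  # read miss
            events.append("BusRd")
            for other, st in enumerate(states):
                if other != cpu and st is State.M:
                    # The dirty copy is written back and downgraded.
                    events.append(f"P{other} write-back")
                    new[other] = State.S
            new[cpu] = State.S
    elif op == "write":
        if mine is not State.M:  # write miss or upgrade from Shared
            events.append("BusRdX" if mine is State.I else "BusUpgr")
            for other, st in enumerate(states):
                if other != cpu and st is not State.I:
                    if st is State.M:
                        events.append(f"P{other} write-back")
                    events.append(f"P{other} invalidate")
                    new[other] = State.I
            new[cpu] = State.M
    else:
        raise ValueError(f"unknown operation: {op}")
    return new, events

# Invented start state: the line is invalid in all three caches.
states = [State.I, State.I, State.I]
for cpu, op in [(0, "write"), (2, "read"), (0, "write")]:
    states, events = msi_step(states, cpu, op)
    print(f"P{cpu} {op}: {[s.name for s in states]} {events}")
```

Each returned event can then be mapped to one of the delays Nmem, Ncache, Ninv, or Nwb to obtain the delay of a sequence.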

b) To reduce the number of external memory accesses, an owner state (O) is added to the cache coherency protocol. On a write, all other cache entries are invalidated (write-invalidate). On a read access by another cache, the current owner supplies the data instead of the memory. Sketch the modified state diagram of the MOSI protocol.

[Diagram template with the states Invalid, Shared, Modified, and Owner]
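The edges of such a diagram can also be written down as a transition table. The following is a sketch of one common formulation of MOSI, assuming the textbook event names PrRd and PrWr for processor requests and BusRd, BusRdX, and BusUpgr for snooped bus requests; none of these names come from the task, and the diagram expected as the answer may differ in detail. Evictions, which would require a write-back from M or O, are not modeled.

```python
# (state, event) -> (next_state, action); the names are assumptions.
MOSI = {
    ("I", "PrRd"):    ("S", "BusRd"),
    ("I", "PrWr"):    ("M", "BusRdX"),
    ("S", "PrRd"):    ("S", None),
    ("S", "PrWr"):    ("M", "BusUpgr"),
    ("S", "BusRd"):   ("S", None),
    ("S", "BusRdX"):  ("I", None),
    ("S", "BusUpgr"): ("I", None),
    ("M", "PrRd"):    ("M", None),
    ("M", "PrWr"):    ("M", None),
    ("M", "BusRd"):   ("O", "supply data"),   # no write-back to memory
    ("M", "BusRdX"):  ("I", "supply data"),
    ("O", "PrRd"):    ("O", None),
    ("O", "PrWr"):    ("M", "BusUpgr"),       # invalidates the sharers
    ("O", "BusRd"):   ("O", "supply data"),
    ("O", "BusRdX"):  ("I", "supply data"),
    ("O", "BusUpgr"): ("I", None),
}

def next_state(state, event):
    """Look up a transition; unlisted pairs leave the state unchanged."""
    return MOSI.get((state, event), (state, None))

print(next_state("M", "BusRd"))  # ('O', 'supply data')
```

The key difference from MSI in this sketch is the transition from M to O on a snooped read: the dirty line stays in the cache and is forwarded to the requester, so the write-back to external memory is deferred.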

c) Perform the same procedure as in part a) for the MOSI protocol, using the following tables.

Sequence 1
Op | CPU | Index | State | Tag | Data

Sequence 2
Op | CPU | Index | State | Tag | Data

Sequence 3
Op | CPU | Index | State | Tag | Data

Task 8.2: Cache Coherency Example: Intel Nehalem

Read the article "Memory Performance and Cache Coherency Effects on an Intel Nehalem Multiprocessor System" by Daniel Molka et al., PACT 2009. Briefly describe the investigated architecture. What does the term ccNUMA describe? How does the information in the level 3 cache relate to the other cache levels, and how precise is it? Briefly describe the executed benchmarks and the central findings of the article.
