inst.eecs.berkeley.edu/~csc
UCB CSC : Machine Structures
Guest Lecturer Alan Christopher
Lecture Caches II

MEMRISTOR MEMORY ON ITS WAY (HOPEFULLY)
HP has begun testing research prototypes of a novel non-volatile memory element, the memristor. Memristors have double the storage density of flash and x more read-write cycles than flash ( vs ). Memristors are (in principle) also capable of being both memory and logic. How cool is that? Originally slated to be ready by, HP later pushed that date to some time.
www.technologyreview.com/computing/8
http://www.technologyreview.com/view//can-hp-save-itself/
Review: New-School Machine Structures
Parallel Requests: assigned to computer, e.g., Search "Katz"
Parallel Threads: assigned to core, e.g., Lookup, Ads
Parallel Instructions: > instruction @ one time, e.g., pipelined instructions
Parallel Data: > data item @ one time, e.g., Add of pairs of words
Hardware descriptions: all gates @ one time
Programming Languages
[Figure labels: Core, Memory (Cache), Input/Output, Instruction Unit(s), Functional Unit(s), Cache Memory]
Review: Direct-Mapped Cache
All fields are read as unsigned integers.
Index: specifies the cache index (or row/block)
Tag: distinguishes between the addresses that map to the same location
Offset: specifies which byte within the block we want

tttttttttttttttttt iiiiiiiiii oooo
tag: to check if we have the correct block
index: to select the block
byte offset: within the block
CSC L Caches II ()
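The field split can be sketched in a few lines of Python. This is a minimal illustration, assuming the widths shown in the t/i/o picture (18 tag bits, 10 index bits, 4 offset bits); the function name `split_address` and the sample address are made up for the example.

```python
# Field widths counted from the slide's t/i/o picture (18 + 10 + 4 = 32 bits).
TAG_BITS, INDEX_BITS, OFFSET_BITS = 18, 10, 4

def split_address(addr):
    """Split an address into (tag, index, offset), each read as an unsigned integer."""
    offset = addr & ((1 << OFFSET_BITS) - 1)                  # lowest 4 bits
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)   # next 10 bits
    tag = addr >> (OFFSET_BITS + INDEX_BITS)                  # remaining upper bits
    return tag, index, offset

# 0x00008014 is an arbitrary illustrative address (an assumption, not lecture data).
print(split_address(0x00008014))
```

Shifting and masking is all that is needed because each field is a contiguous run of bits.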
TIO: Dan's great cache mnemonic
AREA (cache size, B) = HEIGHT (# of blocks) * WIDTH (size of one block, B/block)
2^(H+W) = 2^H * 2^W
where H is the number of Index bits and W is the number of Offset bits
[Figure: address split into Tag, Index, Offset, addr size (often bits); rectangle with HEIGHT (# of blocks), WIDTH (size of one block, B/block), and AREA (cache size, B)]
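The mnemonic can be checked numerically. The sketch below assumes power-of-two sizes and a 32-bit address (the slide's address size is not legible here, so `addr_bits=32` is an assumption); `cache_geometry` is a hypothetical helper name.

```python
def cache_geometry(index_bits, offset_bits, addr_bits=32):
    """Return height, width, area, and tag width for a direct-mapped cache.

    addr_bits=32 is an assumption made for this example.
    """
    height = 1 << index_bits      # HEIGHT: number of blocks = 2^H
    width = 1 << offset_bits      # WIDTH: bytes per block = 2^W
    area = height * width         # AREA: data bytes = 2^(H+W)
    tag_bits = addr_bits - index_bits - offset_bits
    return {"height": height, "width": width, "area": area, "tag_bits": tag_bits}

# Example widths: 10 index bits and 4 offset bits.
print(cache_geometry(index_bits=10, offset_bits=4))
```

With 10 index bits and 4 offset bits this gives 1024 blocks of 16 bytes, so 16384 bytes of data, and leaves 18 bits for the tag.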
Memory Access without Cache
Load word instruction: lw $t, ($t)
$t contains ten, Memory[] = 99
1. Processor issues address ten to Memory
2. Memory reads word at address ten (99)
3. Memory sends 99 to Processor
4. Processor loads 99 into register $t
Memory Access with Cache
Load word instruction: lw $t, ($t)
$t contains ten, Memory[] = 99
With cache (similar to a hash):
1. Processor issues address ten to Cache
2. Cache checks to see if it has a copy of the data at address ten
   a. If it finds a match (Hit): cache reads 99, sends it to processor
   b. No match (Miss): cache sends address to Memory
      I. Memory reads 99 at address ten
      II. Memory sends 99 to Cache
      III. Cache replaces word with new 99
      IV. Cache sends 99 to processor
3. Processor loads 99 into register $t
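The "similar to a hash" remark suggests a very small model of the steps above. The sketch below is only an illustration: it uses a Python dict as the cache, which is unbounded and has no blocks, tags, or replacement, and the address 1000 is a made-up stand-in.

```python
def make_cached_reader(memory):
    """Return a read(addr) function that consults a cache before memory.

    The dict-based cache is a simplification: a real cache has a fixed size.
    """
    cache = {}  # address -> word

    def read(addr):
        if addr in cache:              # 2a. match (hit): serve from the cache
            return cache[addr], "hit"
        word = memory[addr]            # 2b. miss: memory reads the word
        cache[addr] = word             #     the cache stores the new word
        return word, "miss"            #     and passes it to the processor

    return read

memory = {1000: 99}                    # hypothetical address holding 99
read = make_cached_reader(memory)
print(read(1000))                      # first access goes to memory
print(read(1000))                      # second access is served by the cache
```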
Caching Terminology
When reading memory, three things can happen:
- cache hit: the cache block is valid and contains the proper address, so read the desired word
- cache miss: nothing is in the cache in the appropriate block, so fetch from memory
- cache miss, block replacement: the wrong data is in the cache at the appropriate block, so discard it and fetch the desired data from memory (the cache is always a copy)
Cache Terms
Hit rate: fraction of accesses that hit in the cache
Miss rate: 1 - Hit rate
Miss penalty: time to replace a block from a lower level in the memory hierarchy to the cache
Hit time: time to access cache memory (including tag comparison)
Abbreviation: $ = cache (A Berkeley innovation!)
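As a quick illustration of the first two terms, the sketch below computes hit rate and miss rate from a list of access outcomes. The trace and the helper name `hit_and_miss_rate` are made up for the example.

```python
def hit_and_miss_rate(outcomes):
    """Return (hit_rate, miss_rate) for a non-empty list of 'hit'/'miss' outcomes."""
    hits = sum(1 for outcome in outcomes if outcome == "hit")
    hit_rate = hits / len(outcomes)    # fraction of accesses that hit
    return hit_rate, 1 - hit_rate      # miss rate = 1 - hit rate

# Hypothetical trace: one hit out of four accesses.
print(hit_and_miss_rate(["miss", "hit", "miss", "miss"]))
```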
Accessing data in a direct mapped cache
Ex.: KB of data, direct-mapped, word blocks
Can you work out height, width, area?
Read addresses: x, xc, x, x8
Memory vals here:
[Table: Memory Address (hex) | Value of Word]
Accessing data in a direct mapped cache
Addresses: x, xc, x, x8
Addresses divided (for convenience) into Tag, Index, Byte Offset fields
[Table columns: Tag | Index | Offset]
KB Direct Mapped Cache, B blocks
Valid bit: determines whether anything is stored in that row (when the computer is initially turned on, all entries are invalid)
[Cache table; data columns: xc-f, x8-b, x-, x-]

1. Read x
   - So we read block ()
   - No valid data
   - So load that data into cache, setting tag, valid
   - Read from cache at offset, return word b
2. Read xc
   - Index is valid
   - Index valid, tag matches
   - Index valid, tag matches, return d
3. Read x
   - So read block
   - No valid data
   - Load that cache block, return word f
4. Read x8
   - So read cache block, data is valid
   - Cache block tag does not match (!= )
   - Miss, so replace block with new data & tag
   - And return word j
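The walkthrough above can be replayed with a small simulator. This is a sketch under stated assumptions, not the lecture's own code: it assumes 10 index bits, 4 offset bits (16-byte blocks of four 4-byte words), and a made-up memory image whose addresses and letter values were chosen to trigger a miss, a hit, another miss, and a miss with replacement. The class name `DirectMappedCache` is hypothetical.

```python
class DirectMappedCache:
    """Toy direct-mapped cache with a valid bit and a tag per row (read-only)."""

    def __init__(self, memory, index_bits=10, offset_bits=4):
        self.memory = memory                  # dict: word-aligned byte address -> value
        self.index_bits = index_bits
        self.offset_bits = offset_bits
        rows = 1 << index_bits
        self.valid = [False] * rows           # all entries invalid at power-on
        self.tags = [0] * rows
        self.blocks = [None] * rows           # each block is a list of words

    def read(self, addr):
        offset = addr & ((1 << self.offset_bits) - 1)
        index = (addr >> self.offset_bits) & ((1 << self.index_bits) - 1)
        tag = addr >> (self.offset_bits + self.index_bits)
        if self.valid[index] and self.tags[index] == tag:
            outcome = "hit"
        else:
            outcome = "miss w. replace" if self.valid[index] else "miss"
            base = addr - offset              # address of the first byte in the block
            block_bytes = 1 << self.offset_bits
            # Load the whole block from memory, one 4-byte word at a time.
            self.blocks[index] = [self.memory.get(base + i)
                                  for i in range(0, block_bytes, 4)]
            self.tags[index] = tag
            self.valid[index] = True
        return outcome, self.blocks[index][offset // 4]

# Made-up memory image (assumption): three 16-byte blocks holding letters a..l.
memory = {0x10: "a", 0x14: "b", 0x18: "c", 0x1C: "d",
          0x30: "e", 0x34: "f", 0x38: "g", 0x3C: "h",
          0x8010: "i", 0x8014: "j", 0x8018: "k", 0x801C: "l"}

cache = DirectMappedCache(memory)
for addr in (0x14, 0x1C, 0x34, 0x8014):
    print(hex(addr), cache.read(addr))
```

In this made-up trace, the second read hits because it falls in the block loaded by the first, and the fourth read maps to the same row as the first but with a different tag, so the row is replaced.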
Do an example yourself. What happens?
Choose from:
Cache: Hit, Miss, Miss w. replace
Values returned: a, b, c, d, e, ..., k, l
Read address x?
Read address xc?
Answers
x: a hit. Index =, Tag matches, Offset =, value = e
xc: a miss. Index =, Tag mismatch, so replace from memory, Offset = xc, value = d
Since these are reads, the values must equal the memory values whether or not they are cached:
x = e
xc = d
[Table: Memory Address (hex) | Value of Word]
Administrivia
Proj - due Sunday
Multiword-Block Direct-Mapped Cache Four words/block, cache size = K words
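With four words per block, the offset field itself splits into a word-within-block part and a byte-within-word part. The sketch below illustrates this, assuming 4-byte words, power-of-two sizes, and a hypothetical cache of 1024 words (the slide's actual size is not legible here); `multiword_fields` is a made-up helper name.

```python
WORD_BYTES = 4  # assumption: 32-bit (4-byte) words

def multiword_fields(addr, words_per_block=4, cache_words=1024):
    """Split addr into (tag, index, word offset, byte offset).

    cache_words=1024 is a hypothetical size; all sizes must be powers of two.
    """
    byte_bits = WORD_BYTES.bit_length() - 1          # 2 bits select the byte in a word
    word_bits = words_per_block.bit_length() - 1     # 2 bits select the word in a block
    num_blocks = cache_words // words_per_block      # rows in the cache
    index_bits = num_blocks.bit_length() - 1
    byte_off = addr & (WORD_BYTES - 1)
    word_off = (addr >> byte_bits) & (words_per_block - 1)
    index = (addr >> (byte_bits + word_bits)) & (num_blocks - 1)
    tag = addr >> (byte_bits + word_bits + index_bits)
    return tag, index, word_off, byte_off

# 0x0000123C is an arbitrary illustrative address.
print(multiword_fields(0x0000123C))
```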
Peer Instruction
1) Mem hierarchies were invented before 9. (UNIVAC I wasn't delivered 'til 9)
2) All caches take advantage of spatial locality.
3) All caches take advantage of temporal locality.
a) FFF a) FFT b) FTF b) FTT c) TFF d) TFT e) TTF e) TTT
And in Conclusion
Mechanism for transparent movement of data among levels of a storage hierarchy
- set of address/value bindings
- address -> index to set of candidates
- compare desired address with tag
- service hit or miss
- load new block and binding on miss
address: tag | index | offset