Full rewrite/update for clarity (and your sanity, it's a bit too long) ... (Old Post)
For an assignment, I need to find the levels (L1, L2, ...) and the size of each cache. Based on the hints and what I've found so far, I think the idea is to create arrays of different sizes, access them repeatedly, and time these operations (pseudocode below, with a runnable C sketch after it):
sizes = [1K, 4K, 256K, ...]
foreach size in sizes
    create array of `size`
    start timer
    for i = 0 to n                    // just keep accessing the array
        arr[(i * 16) % arr.length]++  // stride of 16 should touch a new cache line each access ... see link
    stop timer
    record/print time
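Here is a minimal C sketch of that loop. It is my own, not the assignment code: the constants (`STRIDE`, `ACCESSES`, the size range) and the use of `clock_gettime` are assumptions, and it assumes 4-byte `int`s and 64-byte cache lines so that a stride of 16 ints hits a new line on every access.

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define STRIDE   16                    /* 16 ints * 4 bytes = one 64-byte cache line */
#define ACCESSES (64 * 1024 * 1024L)   /* same number of accesses for every size */

int main(void)
{
    /* Array sizes to test, in bytes: 1 KB up to 8 MB, doubling each time. */
    for (size_t size = 1024; size <= 8 * 1024 * 1024; size *= 2) {
        size_t len = size / sizeof(int);
        int *arr = calloc(len, sizeof(int));
        if (!arr)
            return 1;

        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);

        /* Keep hitting the array; the modulo wraps accesses back to the start. */
        for (long i = 0; i < ACCESSES; i++)
            arr[(i * STRIDE) % len]++;

        clock_gettime(CLOCK_MONOTONIC, &t1);
        double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;

        /* Printing arr[0] discourages the compiler from optimising the loop away. */
        printf("%8zu bytes: %.3f s   (arr[0]=%d)\n", size, secs, arr[0]);

        free(arr);
    }
    return 0;
}
```

The idea is that once `size` no longer fits in a given cache level, the time per access jumps, so the total time should step up near each cache boundary.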
UPDATED (28 Sept 6:57PM UTC+8)
See also the full source.
OK, following @mah's advice, I may have fixed the signal-to-noise problem ... and I also found a way to time my code (`wall_clock_time` from a lab example).
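The lab's `wall_clock_time` isn't reproduced here, so this is only a guess at what such a helper typically looks like (a `gettimeofday`-based wall clock returning seconds), not a copy of the lab code:

```c
#include <stddef.h>
#include <sys/time.h>

/* Returns the current wall-clock time in seconds (microsecond resolution). */
double wall_clock_time(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return tv.tv_sec + tv.tv_usec / 1e6;
}
```

Usage would be along the lines of `double t0 = wall_clock_time(); /* work */ double elapsed = wall_clock_time() - t0;`.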
However, I seem to be getting incorrect results. I am on an Intel Core i3-2100: [SPECS]
- L1: 2 x 32 KB
- L2: 2 x 256 KB
- L3: 3 MB
The results I got, as graphs:
[Graph 1: lengthMod from 1 KB to 512 KB]
The base of the 1st peak is 32 KB ... reasonable ... but the 2nd is 384 KB ... why? I was expecting 256 KB.
[Graph 2: lengthMod from 512 KB to 4 MB]
And why is this whole range such a mess?
I also read about prefetching and interference from other applications, so I closed as many programs as possible while the script was running. Even so, across multiple runs, the data for 1 MB and above is consistently messy. Why?
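One way to check whether the hardware prefetcher is smoothing out the large sizes (this is my own suggestion, not something from the assignment) is to replace the strided loop with pointer chasing over a randomly shuffled cycle, which the prefetcher can't predict. Each element stores the index of the next element to visit:

```c
#include <stdlib.h>

/* Build a single random cycle over chain[0..len-1]:
 * Fisher-Yates shuffle a permutation, then link consecutive entries. */
void build_random_chain(size_t *chain, size_t len)
{
    size_t *perm = malloc(len * sizeof *perm);
    for (size_t i = 0; i < len; i++)
        perm[i] = i;
    for (size_t i = len - 1; i > 0; i--) {
        size_t j = (size_t)rand() % (i + 1);
        size_t tmp = perm[i]; perm[i] = perm[j]; perm[j] = tmp;
    }
    for (size_t i = 0; i < len; i++)
        chain[perm[i]] = perm[(i + 1) % len];
    free(perm);
}

/* Walk the chain: every load depends on the previous one, so the latency
 * can't be hidden by prefetching or out-of-order execution. */
size_t chase(const size_t *chain, size_t steps)
{
    size_t idx = 0;
    for (size_t s = 0; s < steps; s++)
        idx = chain[idx];
    return idx;   /* returning idx keeps the compiler from removing the loop */
}
```

You would time `chase()` over chains of increasing total size, the same way as the strided loop; for cleaner steps, each chain element is usually also padded out to a full 64-byte cache line so every hop lands on a new line (the sketch keeps the elements small for brevity).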