Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
534 views
in Technique[技术] by (71.8m points)

caching - Cache Addressing Methods Confusion

I have been reading about the four ways a cache can be addressed:

  1. Physically Indexed Physically Tagged (PIPT)
  2. Physically Indexed Virtually Tagged (PIVT)
  3. Virtually Indexed Physically Tagged (VIPT)
  4. Virtually Indexed Virtually Tagged (VIVT)

Which of the following caches would suffer from the synonym and homonym issues? I know that the VIVT would suffer from these issues and PIPT won't. But what about PIVT and VIPT?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

Since synonyms occur when different virtual addresses map to the same physical address (where one wants to avoid false misses), in a VIPT cache synonyms might be (virtually) indexed to different cache sets (in which case data inconsistency is possible, e.g., by a write to a synonym in one set followed by a read of a synonym [same physical address, different virtual address] in another set) while in a PIVT cache synonyms always map to the same set but a difference in the tag portion of the virtual address could result in a miss being indicated.

(A direct-mapped PIVT cache that does the writeback of the victimized block before the miss is services would avoid the synonym issue since the actual memory accessed [physical address] would necessarily force an eviction of any synonym since the physical address index would be the same and there is only one block at that index. A write-through direct-mapped PIVT cache would behave similarly for the same reasons; the most current data values would be in the backing store [L2 or memory]. This assumes that any write buffer or L2 cache is physically tagged. If the backing store is not updated before the miss is serviced, then a false miss [virtual address not matching the tag but having the same physical address] can read stale data from the backing store.)

(Note PIVT typically only makes sense when the virtual index is the same as the physical index, i.e., when virtual bits within the page offset are used. If one already knows the full physical address early enough to index the cache, there is little reason not to use the physical address for tags.)

Using write-through would not remove the synonym issue as long as synonyms could map to different blocks in the cache. If any index bits could differ (with virtual indexing) or more than one way was provided, then a stale value could remain in that alternate place and not be found when the cache is probed with a different virtual address. A sequence of read A, write B, read A (where A and B are synonyms) could have the second read A not see the write B result when that second read A is a cache hit. (Even with a direct-mapped write-through cache, any write buffer would need to be physically tagged [physical indexing is not an issue since write buffers are relatively small].)

While the probability of two synonyms being simultaneously present in L1 cache with a write to one followed by a read of the other might be extremely low, there is still an expectation that such cases will be handled correctly.

Since homonyms occur when the same virtual address maps to the different physical addresses (where one wants to avoid false hits), in a VIPT cache homonyms would map to the same cache set but the tags would be different (so there is no false hits) while in a PIVT cache homonyms could map to the same set (if the indexing physical bits happened to match) and would falsely match in the virtual tags.

In summary, the unlikely PIVT design is subject to synonym and homonym issues and the VIPT design is only subject to synonym issues. A VIVT design has all the issues of the unrealistic PIVT and more (even the special direct-mapped case would not work since synonyms could map to different blocks when the virtual address bits used for indexing are different).

(With multiple cores/processors, coherence is typically handled by physical addresses. While it would be possible to provide a TLB-analogue that translates physical addresses to virtual addresses [at least one PA-RISC processor might have done this], an eviction from that cache of translations would force any cache blocks tagged with that virtual address to be evicted somewhat similar to evictions caused by running out of ASIDs.)

Occurrences of synonyms and homonyms

Writable synonyms are probably not common in general, but one way they can occur is if a file is memory mapped by multiple processes. Obviously if one process has already mapped (e.g., for heap memory) the address range used by another process to map a file, then that one process cannot map the file to the same virtual address range that the other process is using.

Read-only synonyms might be more common. Some OSes use a single zero-filled page across the system and map many virtual pages to this same physical zero page (using copy on write when a program attempts to write to that page). If address space layout randomization (a security feature) is applied per process, different processes might use different virtual addresses for the same physical pages of code/text.

Perhaps the most common form of homonyms is associated with having multiple address spaces. In common OSes, each process is given its own address space (though the OS typically reserves part of that address space for itself and uses the same map for that section in different processes). This type of homonym can be made less problematic by appending an Address Space IDentifier to the virtual address. By this means, special handling of such homonyms is only necessary when an ASID is reused for a particular virtually tagged cache. (ASIDs reduce the frequency of special cache management to avoid homonym issues, but they do not eliminate the problem in general. However, even a reduction in frequency can made the software less complex by reducing performance requirements; highly optimized code is often both more difficult to produce and more difficult to maintain.)

Another form of homonym is when a page is swapped out and then swapped back into memory at a different address. If I/O is done from memory (not cache as in some processors), then the OS must at least writeback any cache contents so flushing the appropriate contents is less of an issue. While the probability that a page will have some contents in cache (especially L1 cache where using virtual addresses is most attractive because of the latency advantage) when the OS considers it a good candidate for eviction to disk is low and the probability that any such content will remain in cache until the page is swapped back into memory is low, even the product of these improbabilities is not zero.

In any case, it can be desirable not to require special handling of such cases even if the hardware designer cannot think of any worthwhile use for synonyms and homonyms.

With a Single Address Space OS, homonyms are impossible since all processes use the same mapping of virtual addresses to physical addresses and if synonyms are allowed they are for read-only memory. Under these conditions, VIVT caches can be used without the homonym and synonym issues. (SASOSes can simplify interprocess communication. However, UNIX-like OSes and some other OSes are designed for multiple address spaces.)


As a side note, synonyms of read-only memory do not introduce a correctness issue (merely potentially wasting bandwidth from false misses and cache capacity from duplicate caching of the same physical memory). This makes VIVT less unattractive for instruction caches. (x86 is somewhat unusual in requiring instruction caches to be cache coherent, though providing coherent instruction caches can simplify some software.)

In addition, synonym issues in VIPT caches can be handled by using the initial virtual index as a form of way prediction (probing alternate sets on a miss--this was done by the AMD Athlon's 64 KiB, 2-way cache with 4 KiB pages--or using a physically indexed tag-inclusive L2 cache with the excess virtual address bits used for indexing L1 included, invalidating the block at the previously cached L1 virtual index) or by requiring any synonyms to index the same set of cache blocks (most simply by page coloring where the corresponding physical address bits are artificially the same as the virtual address bits used for indexing).


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...