Quantifying and reducing the effects of wrong-path memory references in cache-coherent multiprocessor systems
High-performance multiprocessor systems built around out-of-order processors with aggressive branch predictors execute many memory references that turn out to be on a mispredicted branch path. Previous work that focused on uniprocessors showed that these wrong-path memory references may pollute the caches by bringing in data that are not needed on the correct execution path and by evicting useful data or instructions. They may also increase the amount of cache and memory traffic. On the positive side, however, they may have a prefetching effect for memory references on the correct path. While computer architects have thoroughly studied the impact of wrong-path effects in uniprocessor systems, there is no previous work on their effects in multiprocessor systems. In this paper, we explore the effects of wrong-path memory references on the memory system behavior of shared-memory multiprocessor (SMP) systems for both broadcast and directory-based cache coherence. Our results show that these wrong-path memory references can increase the number of cache-to-cache transfers by 32%, invalidations by 8% and 20% for broadcast and directory-based SMPs, respectively, and the number of writebacks by up to 67% for both systems. In addition to the extra coherence traffic, wrong-path memory references also increase the number of cache line state transitions by 21% and 32% for broadcast and directory-based SMPs, respectively. To reduce the performance impact of these wrong-path memory references, we introduce two simple mechanisms, filtering wrong-path blocks that are not likely to be used and wrong-path-aware cache replacement, that yield speedups of up to 37%. ©2006 IEEE.
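The abstract names wrong-path aware cache replacement but does not describe how it works. The sketch below is one possible reading, not the paper's design. It assumes that each block carries a flag recording that it was filled by a wrong-path reference and has not yet been touched on the correct path, and that such blocks are preferred as eviction victims over plain LRU. The class `WrongPathAwareSet`, its methods, and the `wrong_path` parameter are hypothetical names introduced for illustration.

```python
from collections import OrderedDict


class WrongPathAwareSet:
    """One cache set with LRU ordering (hypothetical sketch, not the paper's design)."""

    def __init__(self, ways):
        self.ways = ways
        # Maps tag -> True while the block has only been touched by
        # wrong-path references. Oldest (LRU) entry comes first.
        self.blocks = OrderedDict()

    def access(self, tag, wrong_path=False):
        """Return True on a hit, False on a miss (the block is then filled)."""
        if tag in self.blocks:
            if not wrong_path:
                # A correct-path hit shows the block is useful: clear the flag.
                self.blocks[tag] = False
            self.blocks.move_to_end(tag)  # mark as most recently used
            return True
        if len(self.blocks) >= self.ways:
            self._evict()
        self.blocks[tag] = wrong_path
        return False

    def _evict(self):
        # Assumed policy: evict the oldest block still flagged wrong-path-only.
        victim = next((t for t, flagged in self.blocks.items() if flagged), None)
        if victim is None:
            self.blocks.popitem(last=False)  # fall back to plain LRU
        else:
            del self.blocks[victim]
```

In a two-way set that holds a correct-path block `A` (older) and a wrong-path block `W` (newer), a miss on `B` evicts `W` under this policy, whereas plain LRU would evict `A`.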
20th International Parallel and Distributed Processing Symposium, IPDPS 2006
Sendag, Resit, Ayse Yilmazer, Joshua J. Yi, and Augustus K. Uht. "Quantifying and reducing the effects of wrong-path memory references in cache-coherent multiprocessor systems." 20th International Parallel and Distributed Processing Symposium, IPDPS 2006 (2006). doi:10.1109/IPDPS.2006.1639260.