Compiler-Directed Page Coloring for Multiprocessors

Edouard Bugnion, Jennifer M. Anderson, Todd C. Mowry,
Mendel Rosenblum, and Monica S. Lam.

Proceedings of the Seventh International Symposium on Architectural Support for Programming Languages and Operating Systems (ASPLOS VII)

Abstract

This paper presents a new technique, compiler-directed page coloring, that eliminates conflict misses in multiprocessor applications. It enables applications to make better use of the increased aggregate cache size available in a multiprocessor. This technique uses the compiler's knowledge of the access patterns of the parallelized applications to direct the operating system's virtual memory page mapping strategy. We demonstrate that this technique can lead to significant performance improvements over two commonly used page mapping strategies for machines with either direct-mapped or two-way set-associative caches. We also show that it is complementary to latency-hiding techniques such as prefetching.

We implemented compiler-directed page coloring in the SUIF parallelizing compiler and on two commercial operating systems. We applied the technique to the SPEC95fp benchmark suite, a representative set of numeric programs. We used the SimOS machine simulator to analyze the applications and isolate their performance bottlenecks. We also validated these results on a real machine, an eight-processor 350MHz Digital AlphaServer. Compiler-directed page coloring leads to significant performance improvements for several applications. Overall, our technique improves the SPEC95fp rating for eight processors by 8% over Digital UNIX's page mapping policy and by 20% over page coloring, a standard page mapping policy. The SUIF compiler achieves a SPEC95fp ratio of 57.4, the highest ratio to date.
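As background for the abstract, the notion of a page "color" in a physically indexed cache can be sketched as follows. This is an illustrative sketch only, not the paper's algorithm: the function names are hypothetical, and the 1 MB cache and 8 KB page sizes are assumed values chosen for the example. Physical pages whose page numbers are congruent modulo the number of colors compete for the same cache sets, so a mapping that places more frequently used pages on one color than the cache has ways can cause conflict misses.

```python
from collections import Counter


def num_colors(cache_bytes, associativity, page_bytes):
    """Number of distinct page colors in a physically indexed cache."""
    return cache_bytes // (associativity * page_bytes)


def page_color(physical_page, colors):
    """Color of a physical page: its page number modulo the color count."""
    return physical_page % colors


def oversubscribed_colors(physical_pages, colors, associativity):
    """Colors that hold more pages than the cache has ways (conflict risk)."""
    counts = Counter(page_color(p, colors) for p in physical_pages)
    return sorted(c for c, n in counts.items() if n > associativity)


# Assumed example parameters: 1 MB direct-mapped cache, 8 KB pages.
COLORS = num_colors(1 << 20, 1, 8 << 10)

# Hypothetical physical page numbers backing three hot virtual pages.
clashing = [0, COLORS, 2 * COLORS]  # all three pages share one color
spread = [0, 1, 2]                  # each page gets its own color

print(COLORS)
print(oversubscribed_colors(clashing, COLORS, 1))
print(oversubscribed_colors(spread, COLORS, 1))
```

In this sketch the `clashing` mapping puts all three pages on one color while the `spread` mapping gives each page its own. Choosing a mapping like the second one, guided by the access patterns the compiler extracts, is the kind of decision the abstract describes.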


The results in the paper were obtained on a 350MHz AlphaServer. They have since been updated on the 440MHz AlphaServer, which currently holds the highest reported SPEC95fp ratio. The SUIF SPEC ratio using CDPC on this machine is 63.8.
The results on the 350MHz AlphaServer 8400 are available here.
The results on the 440MHz AlphaServer 8400 are available here.
Paper available as: postscript (641kB) and compressed postscript (235kB).
Slides from the ASPLOS talk available as: postscript (739kB).


Links

  • SimOS
  • SUIF
  • Stanford FLASH multiprocessor
  • The official SPEC95fp results

Edouard Bugnion
Last modified: Mon Mar 3 02:07:33 PST