Andrew Erlichson, Basem A. Nayfeh, Jaswinder P. Singh, Kunle Olukotun
Key Words and Phrases: Clustering, Applications, Shared Memory.
Computer Systems Laboratory
Stanford University
Stanford, CA 94305
aje@pinnacle.stanford.edu
(415) 725-1777
Computer Systems Laboratory
Stanford University
Stanford, CA 94305
bnayfeh@ogun.stanford.edu
(415) 725-3646
Department of Computer Science
Princeton University
Princeton, NJ 08544
jps@cs.princeton.edu
(609) 258-5329
Computer Systems Laboratory
Stanford University
Stanford, CA 94305
kunle@ogun.stanford.edu
(415) 725-3713
Abstract:
Clustering processors together at a level of the memory hierarchy in
shared-address-space multiprocessors appears to be an attractive
technique from several standpoints: resources are shared, packaging
technologies are exploited, and processors within a cluster can
share data more effectively. We investigate the performance benefits
that can be obtained by clustering on a range of important scientific
and engineering applications. We find that, in general, clustering is
not very effective in reducing the inherent communication-to-computation
ratios. Clustering is more useful in reducing working set
requirements in unstructured applications, and can improve performance
substantially when small first-level caches are clustered in these
cases. This suggests that clustering at the first-level cache might be
useful in highly integrated, relatively fine-grained environments.
For less integrated machines such as current distributed shared-memory
multiprocessors, our results suggest that clustering is not very
useful in improving application performance, and the decision about
whether or not to cluster should be made on the basis of engineering
and packaging constraints.