


This report presents highlights of these efforts. Section 2 describes work that has been done to compare the performance of proxy applications on AMD MI60 vs. So far only a small set of ECP proxies are running on AMD GPUs, but we will continue to expand this analysis as additional proxies become available. We find that although the MI60 and V100 have nearly the same measured memory bandwidth, memory-bound proxy app kernels perform 20-30% worse on the MI60. Further work is needed to refine these comparisons and determine whether the root cause lies in the hardware, the software stack, platform-specific optimization, or some combination of the three. Section 3 describes our continuing effort to find methods that accurately assess the similarity of proxies and parents. We have recently seen very encouraging results using a cosine similarity metric. This technique uses the angle between two vectors of hardware performance counters to characterize the similarity (or difference) between two applications or proxies. We show not only that several widely used proxies are highly similar to their parents, but also that they differ from unrelated codes.
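The cosine similarity metric mentioned above can be illustrated with a minimal sketch. It assumes each application is summarized by a vector of hardware performance counter values collected for the same counters in the same order. The counter values and the names `parent`, `proxy`, and `unrelated` are hypothetical and are not taken from the report, and the report's actual counter selection and any normalization it applies may differ.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two counter vectors."""
    if len(a) != len(b):
        raise ValueError("counter vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("counter vectors must be non-zero")
    # Clamp to [-1, 1] so floating-point rounding cannot break acos.
    return max(-1.0, min(1.0, dot / (norm_a * norm_b)))


def angle_degrees(a, b):
    """Return the angle between two counter vectors in degrees."""
    return math.degrees(math.acos(cosine_similarity(a, b)))


# Hypothetical counter vectors (illustrative values only), for example
# fractions of cycles attributed to four counter categories.
parent = [0.62, 0.21, 0.05, 0.12]
proxy = [0.60, 0.23, 0.04, 0.13]
unrelated = [0.10, 0.05, 0.70, 0.15]

print(f"parent vs proxy:     {angle_degrees(parent, proxy):.1f} degrees")
print(f"parent vs unrelated: {angle_degrees(parent, unrelated):.1f} degrees")
```

A small angle indicates that two codes stress the hardware in similar proportions, while a large angle indicates dissimilar behavior. Because the angle depends only on the direction of each vector, scaling every counter in one vector by a constant (for instance, a longer run) does not change the result.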
