Assist Threads for Data Prefetching in IBM XL Compilers

Processor chips in development today typically support multiple hardware
threads of execution. When an application does not exhibit enough parallelism
to effectively use all available threads, the extra threads can be used as
assist threads to prefetch data for the main thread, and thus improve
performance. In our model, the main thread performs all useful work in the
application. Work done in the assist thread is not necessary for correct
execution of the application, nor does it interfere with any results generated
by the application. Thus, we can throttle assist thread execution or skip work
in the assist thread in order to synchronize with the main thread. In this
paper, we describe the IBM XL compiler transformation that automatically
generates prefetching code for the assist thread, and optimizes the resulting
multi-threaded code by inserting synchronization. We also describe the runtime
system that is used to control execution of the assist thread with respect to
the main thread. We present experimental results that show the potential
benefit of using assist threads to prefetch data in a system with POWER5
processors.
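
As an illustration only, the model described above can be sketched by hand in
C with POSIX threads. This is not the code the XL compiler generates, nor the
interface of its runtime system: the names work_t, assist_thread,
sum_with_assist and AHEAD_MAX are hypothetical, the run-ahead limit of 64
iterations is an arbitrary assumption, and __builtin_prefetch is the GCC/Clang
builtin rather than anything specific to XL. The main thread does all useful
work and publishes its progress; the assist thread only issues prefetches,
skips iterations the main thread has already passed, and throttles itself when
it runs too far ahead.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

/* All names below are hypothetical; this is not the XL runtime's API. */
enum { AHEAD_MAX = 64 }; /* assumed cap on how far the assist thread runs ahead */

typedef struct {
    const double *data;     /* values reached through the index array */
    const size_t *idx;      /* indirection that makes accesses irregular */
    size_t n;               /* number of iterations */
    atomic_size_t main_pos; /* progress published by the main thread */
    atomic_int done;        /* set once the main thread has finished */
} work_t;

/* Assist thread: warms the cache only; it never writes application results. */
static void *assist_thread(void *arg)
{
    work_t *w = arg;
    size_t i = 0;
    while (i < w->n && !atomic_load_explicit(&w->done, memory_order_acquire)) {
        size_t m = atomic_load_explicit(&w->main_pos, memory_order_relaxed);
        if (i < m) {              /* fell behind: skip work main already did */
            i = m;
            continue;
        }
        if (i - m >= AHEAD_MAX) { /* too far ahead: throttle */
            sched_yield();
            continue;
        }
        __builtin_prefetch(&w->data[w->idx[i]], 0, 3); /* read, keep in cache */
        i++;
    }
    return NULL;
}

/* Main thread: performs the whole computation; correct even with no assist. */
double sum_with_assist(const double *data, const size_t *idx, size_t n)
{
    assert(n == 0 || (data != NULL && idx != NULL));

    work_t w = { .data = data, .idx = idx, .n = n };
    atomic_init(&w.main_pos, 0);
    atomic_init(&w.done, 0);

    pthread_t tid;
    int have_assist = (pthread_create(&tid, NULL, assist_thread, &w) == 0);

    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        sum += data[idx[i]];
        /* Publish progress so the assist thread can skip or throttle. */
        atomic_store_explicit(&w.main_pos, i + 1, memory_order_relaxed);
    }

    atomic_store_explicit(&w.done, 1, memory_order_release);
    if (have_assist)
        pthread_join(tid, NULL);
    return sum;
}
```

Because the assist thread never writes application data, the result is the
same whether it runs, lags, or fails to start. Publishing progress on every
iteration keeps the sketch short; a tuned version would presumably synchronize
far less often.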
Greg Steffan
Last modified: Wed Aug 26 17:52:21 EDT 2009