Krzysztof Rojek,

Czestochowa University of Technology


This talk will address an efficient and portable adaptation of the stencil-based 3D MPDATA algorithm to GPU clusters. We propose a performance model that allows for the efficient distribution of computation across GPU resources. Since MPDATA is strongly memory-bound, the main challenge in providing a high-performance implementation is to reduce the number of GPU global memory transactions. To this end, our performance model provides a comprehensive analysis of these transactions based on local memory utilization, the sizes of halo areas (ghost zones), and data dependencies between and within stencils. The analysis performed with the proposed model yields the number of GPU kernels, the distribution of stencils across kernels, and the sizes of CUDA blocks for each kernel.


Date: 2014-Nov-27     Time: 10:00:00     Room: 336

For more information: