Title
Modeling the performance of geometric multigrid stencils on multicore computer architectures Modeling the performance of geometric multigrid stencils on multicore computer architectures
Author
Faculty/Department
Faculty of Sciences. Mathematics and Computer Science
Publication type
article
Publication
Philadelphia, Pa ,
Subject
Mathematics
Computer. Automation
Source (journal)
SIAM journal on scientific computing. - Philadelphia, Pa
Volume/pages
37(2015) :2 , p. C194-C216
ISSN
1064-8275
ISI
000353838400035
Carrier
E
Target language
English (eng)
Full text (Publishers DOI)
Affiliation
University of Antwerp
Abstract
The basic building blocks of the classic geometric multigrid algorithm all have a low ratio of executed floating point operations per byte fetched from memory. On modern computer architectures, such computational kernels are typically bound by memory traffic and achieve only a small percentage of the theoretical peak floating point performance of the underlying hardware. We suggest the use of state-of-the-art (stencil) compiler techniques to improve the flop per byte ratio, also called the arithmetic intensity, of the steps in the algorithm. Our focus will be on the smoother which is a repeated stencil application. With a tiling approach based on the polyhedral loop optimization framework, data reuse in the smoother can be improved, leading to a higher effective arithmetic intensity. For an academic constant coefficient Poisson problem, we present a performance model for the multigrid $V$-cycle solver based on the tiled smoother. For increasing numbers of smoothing steps, there is a trade-off between the improved efficiency due to better data reuse and the additional flops required for extra smoothing steps. Our performance model predicts time to solution by linking convergence rate to arithmetic intensity via the roofline model. We show results for two-dimensional (2D) and three-dimensional (3D) simulations on Intel Sandy Bridge and for 2D simulations on Intel Xeon Phi architectures. The actual performance is compared with the theoretical predictions.
Full text (open access)
https://repository.uantwerpen.be/docman/irua/e52f99/a7ee861e.pdf
E-info
http://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcApp=PARTNER_APP&SrcAuth=LinksAMR&KeyUT=WOS:000353838400035&DestLinkType=RelatedRecords&DestApp=ALL_WOS&UsrCustomerID=ef845e08c439e550330acc77c7d2d848
http://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcApp=PARTNER_APP&SrcAuth=LinksAMR&KeyUT=WOS:000353838400035&DestLinkType=FullRecord&DestApp=ALL_WOS&UsrCustomerID=ef845e08c439e550330acc77c7d2d848
http://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcApp=PARTNER_APP&SrcAuth=LinksAMR&KeyUT=WOS:000353838400035&DestLinkType=CitingArticles&DestApp=ALL_WOS&UsrCustomerID=ef845e08c439e550330acc77c7d2d848
Handle