Publication:
Parallel Fast Multipole Method accelerated FFT on HPC clusters

dc.contributor.affiliationDA-IICT, Gandhinagar
dc.contributor.authorMehta, Chahak
dc.contributor.authorKarthi, Amarnath
dc.contributor.authorJetly, Vishrut
dc.contributor.authorChaudhury, Bhaskar
dc.contributor.authorChaudhury, Bhaskar
dc.contributor.authorChaudhury, Bhaskar
dc.contributor.authorChaudhury, Bhaskar
dc.contributor.authorChaudhury, Bhaskar
dc.contributor.authorChaudhury, Bhaskar
dc.contributor.researcherMehta, Chahak (201501422)
dc.contributor.researcherKarthi, Amarnath (201501005)
dc.contributor.researcherJetly, Vishrut (201601449)
dc.date.accessioned2025-08-01T13:09:36Z
dc.date.issued07-01-2021
dc.description.abstractWith increasing sizes of distributed systems, there comes an increased risk of communication bottlenecks. In the past decade there has been a growing interest in communication-avoiding algorithms. The distributed memory Fast Fourier Transform is an important algorithm which suffers from major communication bottlenecks. In this work, we take a look at an existing communication-avoiding algorithm FMM-FFT, an alternative to FFT which utilizes the Fast Multipole Method (FMM) to reduce communications to a single all-to-all communication. We present a detailed implementation of FMM-FFT relying on modern libraries and demonstrate it on two distinct distributed memory architectures notably a traditional Intel Xeon based HPC cluster and then a Beowulf cluster. We show that while the FMM-FFT is significantly slower than FFT on the traditional HPC cluster, on the Beowulf cluster it outperforms standard FFT, consistently getting speedups of 1.5x or more against FFTW. We then proceed to show how the communication to computation cost metric is important and useful in explaining the performance results of FMM-FFT against standard FFT. The source code pertaining to this work is being made publicly available under a permissive open source licence at Github.
dc.format.extent102783
dc.identifier.citationChahak Mehta, Amarnath Karthi, Vishrut Jetly and Chaudhury, Bhaskar, ."Parallel Fast Multipole Method accelerated FFT on HPC clusters," Parallel Computing, vol. 104–105, Elsevier, pp. 102783, Jul. 2021. doi: 10.1016/j.parco.2021.102783
dc.identifier.doi10.1016/j.parco.2021.102783
dc.identifier.issn1872-7336
dc.identifier.scopus2-s2.0-85105086761
dc.identifier.urihttps://ir.daiict.ac.in/handle/dau.ir/2058
dc.identifier.wosWOS:000654719400003
dc.language.isoen
dc.publisherElsevier
dc.relation.ispartofseriesVol. 104-105; No.
dc.source Parallel Computing
dc.source.urihttps://www.sciencedirect.com/science/article/pii/S0167819121000405?via%3Dihub
dc.titleParallel Fast Multipole Method accelerated FFT on HPC clusters
dspace.entity.typePublication
relation.isAuthorOfPublicationd0ffe8b6-980b-4a74-bb54-7408522e6da7
relation.isAuthorOfPublicationd0ffe8b6-980b-4a74-bb54-7408522e6da7
relation.isAuthorOfPublication.latestForDiscoveryd0ffe8b6-980b-4a74-bb54-7408522e6da7

Files

Collections