Nonuniform Fast Fourier Transform on TPUs

Chao Ma
Thibault Marin
TJ Lu
Yi-fan Chen
Yue Zhuo
IEEE International Symposium on Biomedical Imaging 2021 (to appear)

Abstract

In this work, we present a parallel algorithm for implementing the nonuniform fast Fourier transform (NUFFT) on Google's Tensor Processing Units (TPUs). The TPU is a hardware accelerator originally designed for deep learning applications. NUFFT is considered the main computational bottleneck in magnetic resonance (MR) image reconstruction, so implementing it on TPUs is a promising route to accelerating MR image reconstruction and achieving clinically practical runtimes. The computation of NUFFT consists of three operations: an apodization, an FFT, and an interpolation. All three are formulated as tensor operations in order to fully utilize the TPU's strength in matrix multiplication. The algorithm is implemented in TensorFlow. Numerical examples show a satisfactory acceleration of NUFFT on TPUs. A breakdown of the computation time shows that interpolation is the most computationally expensive of the three operations. A strong scaling analysis demonstrates the high parallel efficiency of the implementation.
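The three operations named in the abstract can be illustrated with a minimal one-dimensional sketch. This is not the paper's TPU or TensorFlow implementation: it uses NumPy, assumes a Gaussian interpolation kernel (one common choice for NUFFT), and keeps the interpolation matrix dense and untruncated for readability. The function names `nufft_1d` and `ndft_1d` and the parameters `oversamp` and `m_sp` are hypothetical and chosen only for this example.

```python
import numpy as np


def nufft_1d(image, omega, oversamp=2, m_sp=12):
    """Approximate c_j = sum_n image[n] * exp(-1j * omega_j * n), n = -N/2..N/2-1.

    Illustrative sketch only: assumes an even-length input, frequencies
    omega in [-pi, pi), and a Gaussian kernel whose width is set by m_sp.
    """
    n = image.shape[0]
    idx = np.arange(-n // 2, n // 2)          # centered sample indices
    m_r = oversamp * n                        # oversampled grid size
    tau = np.pi * m_sp / (n**2 * oversamp * (oversamp - 0.5))

    # 1) Apodization: pointwise scaling that pre-compensates for the
    #    smoothing introduced by the Gaussian kernel in step 3.
    scaled = image * np.sqrt(np.pi / tau) * np.exp(tau * idx**2)

    # 2) FFT: zero-pad onto the oversampled grid (negative indices wrap
    #    around) and transform to uniformly spaced frequencies.
    padded = np.zeros(m_r, dtype=complex)
    padded[idx % m_r] = scaled
    grid_vals = np.fft.fft(padded)

    # 3) Interpolation: written as a single matrix-vector product, so the
    #    uniform-to-nonuniform step is a plain tensor operation.
    grid = 2 * np.pi * np.arange(m_r) / m_r
    diff = omega[:, None] - grid[None, :]
    diff = (diff + np.pi) % (2 * np.pi) - np.pi   # periodic distance
    weights = np.exp(-diff**2 / (4 * tau))        # shape (len(omega), m_r)
    return weights @ grid_vals / m_r


def ndft_1d(image, omega):
    """Direct (slow) nonuniform DFT, used here only as a reference."""
    n = image.shape[0]
    idx = np.arange(-n // 2, n // 2)
    return np.exp(-1j * np.outer(omega, idx)) @ image


rng = np.random.default_rng(0)
image = rng.standard_normal(32) + 1j * rng.standard_normal(32)
omega = rng.uniform(-np.pi, np.pi, size=50)

fast = nufft_1d(image, omega)
direct = ndft_1d(image, omega)
max_err = np.max(np.abs(fast - direct))
```

In this sketch the interpolation matrix has one row per nonuniform sample and one column per oversampled grid point, which makes it the largest object in the computation. A practical implementation would typically truncate the kernel to a few neighboring grid points per sample rather than keep the matrix dense.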