MPI Cartesian Fast Fourier Transform¶
The MPI Object and how to run MPI processes¶
MPI enables us to run processes across multiply processors. When writing an MPI python script, remember
that each process will see the same script and run through that same script. To get the benefits of MPI
you must split the data you are working on or ensure that each process (called a rank in MPI) works on
different pieces of the data. With this in mind SHIFT will construct 2D/3D grids across multiple processors.
The FFT routines will be computed using a slab decomposition, meaning the grid is divided along the x-axis
in real space and along the y-axis in Fourier space. To run SHIFT with MPI we need to make use of the
MPI object located in shift.mpiutils. This is discussed in MPIutils guide and should
be read before going any further.
The TL;DR, each MPI script should begin with the following:
import sys
from os import environ
# Set thread environment variables FIRST
N_THREADS = '1'
environ['OMP_NUM_THREADS'] = N_THREADS
environ['OPENBLAS_NUM_THREADS'] = N_THREADS
environ['MKL_NUM_THREADS'] = N_THREADS
environ['VECLIB_MAXIMUM_THREADS'] = N_THREADS
environ['NUMEXPR_NUM_THREADS'] = N_THREADS
import shift
MPI = shift.mpiutils.MPI()
and should end with:
MPI.end()
Which closes the MPI environment before ending the script.
To ensure MPI processes are synced use the wait functionality
MPI.wait()
To run the script with 4 processors:
mpirun -n 4 python mpi_script.py
Note
mpirun should be replaced with the correct mpi executable, for instance on some
HPCs this should be srun.
Defining a distributed cartesian grid¶
Before we compute the Fourier transforms it is useful to define the grid the Fourier transform will be computed on. The utility of this will become much more apparent when we go into Fourier space.
First let’s define the size of our box (boxsize) and the resolution of the grid
(ngrid).
Note
Both the boxsize and ngrid can either be single numbers meaning the box is
actually a square/cube or a list for each axes if you would like a rectagle/cuboid shape.
boxsize = 100.
ngrid = 128
xedged, x = shift.cart.mpi_grid1D(boxsize, ngrid, MPI)
Where xedges tells us the xedges of the bin and x the centers.
Note
Notice the MPI object is entered as an argument here. This enables the code to distribute the array to the specific MPI setup.
In 2D and 3D this can be defined as
# in 2D
x2D, y2D = shift.cart.mpi_grid2D(boxsize, ngrid, MPI)
# in 3D
x3D, y3D, z3D = shift.cart.mpi_grid3D(boxsize, ngrid, MPI)
Each process or MPI.rank sees a different part of the grid.
MPI Fourier transforms forward/backwards¶
Let’s now define some field which we call fgrid which is defined in the same
cartesian grid we defined above. You must ensure that fgrid has the same shape
as the real grids defined above – easy to setup if they are directly related. Fourier
transforming this data via an FFT is simple:
# in 2D
fkgrid = shift.cart.mpi_fft2D(fgrid, boxsize, ngrid, MPI)
# in 3D
fkgrid = shift.cart.mpi_fft3D(fgrid, boxsize, ngrid, MPI)
Note
Because we are using slab decomposition we do not support FFTs of 1D grids.
Note
The FFT functions now require the global ngrid as an input as well as the MPI object.
We can find the corresponding Fourier modes to this array using the kgrid functions like so
# in 1D
kx1D = shift.cart.mpi_kgrid1D(boxsize, ngrid)
# in 2D
kx2D, ky2D = shift.cart.mpi_kgrid2D(boxsize, ngrid)
# in 3D
kx3D, ky3D, kz3D = shift.cart.mpi_kgrid3D(boxsize, ngrid)
Computations involving the Fourier modes k can now be easily computed. Once the analysis
in Fourier space is complete, we can return to real space by running
# in 2D
fgrid = shift.cart.mpi_ifft2D(fkgrid, boxsize, ngrid, MPI)
# in 3D
fgrid = shift.cart.mpi_ifft3D(fkgrid, boxsize, ngrid, MPI)
For FFTs using the Discrete Cosine or Sine Transform the analysis can be repeated in a similar fashion.
Note
The DCT/DST produce almost identical results to the FFT, except near the boundaries where the dirichlet and neumann boundary conditions will enforce zeros for the DST and zero derivatives in DCT. Keep this in mind and depending on your setup, you may choose to ignore these boundaries from the analysis.
# Discrete Cosine Transform
# FORWARD DCT Transform
# in 2D
fkgrid = shift.cart.mpi_dct2D(fgrid, boxsize, ngrid, MPI, type=2)
# in 3D
fkgrid = shift.cart.mpi_dct3D(fgrid, boxsize, ngrid, MPI, type=2)
# BACKWARD DCT Transform
# in 2D
fgrid = shift.cart.mpi_idct2D(fkgrid, boxsize, ngrid, MPI, type=2)
# in 3D
fgrid = shift.cart.mpi_idct3D(fkgrid, boxsize, ngrid, MPI, type=2)
# Discrete Sine Transform
# FORWARD DST Transform
# in 2D
fkgrid = shift.cart.mpi_dst2D(fgrid, boxsize, ngrid, MPI, type=2)
# in 3D
fkgrid = shift.cart.mpi_dst3D(fgrid, boxsize, ngrid, MPI, type=2)
# BACKWARD DST Transform
# in 2D
fgrid = shift.cart.mpi_idst2D(fkgrid, boxsize, ngrid, MPI, type=2)
# in 3D
fgrid = shift.cart.mpi_idst3D(fkgrid, boxsize, ngrid, MPI, type=2)
Note
The default is type=2. This can be changed and follows the scipy definitions for DCT/DST. Just
ensure you use the same type for both the forwards and backward transforms.
The corresponding Fourier grid is slightly different for the DCT/DST and can be accessed by running
# Discrete Cosine Transform
# in 2D
kx2D, ky2D = shift.cart.mpi_kgrid2D_dct(boxsize, ngrid, MPI)
# in 3D
kx3D, ky3D, kz3D = shift.cart.mpi_kgrid3D_dct(boxsize, ngrid, MPI)
# Discrete Sine Transform
# in 2D
kx2D, ky2D = shift.cart.mpi_kgrid2D_dst(boxsize, ngrid, MPI)
# in 3D
kx3D, ky3D, kz3D = shift.cart.mpi_kgrid3D_dst(boxsize, ngrid, MPI)