How is the frequency basis chosen for the 2D Fourier transform in NumPy?


I’m converting 2D (spatial) images to the frequency domain using tf.signal.fft2d (in NumPy: np.fft.fft2) and notice that the input and output shapes are the same, although I don’t see why they have to be. For example:

import numpy as np

test_img = np.random.rand(100, 100)  # shape (100, 100)
spectral = np.fft.fft2(test_img)

# -> spectral.shape == (100, 100)

Given that the image is now in the spectral basis, how are the basis elements chosen in NumPy (and TensorFlow, as the implementations are the same)? Specifically, what are the starting (lowest) frequencies, and how are the higher-frequency ones chosen?


Why do you expect the output shape to differ from the input shape?

By default, the FFT is computed on the points you supply, resulting in a 2D array that (correctly) has the same shape as the input. To change this behavior, you must provide the s parameter to fft2 (see the docs). For example, in your case, calling np.fft.fft2(test_img, s=(200, 100)) will result in an output of shape (200, 100). Internally, this is obtained by zero-padding your input (i.e. appending 100 trailing rows of zeros along the first dimension) and computing the FFT of the resulting matrix.
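To make the zero-padding claim concrete, here is a small check (a sketch, not the library's internal code path): it compares fft2 called with s=(200, 100) against padding the array by hand and then transforming. The variable names and the fixed random seed are my own choices.

```python
import numpy as np

rng = np.random.default_rng(0)            # fixed seed, only for reproducibility
test_img = rng.random((100, 100))

# Let fft2 extend the first axis to 200 samples via the s parameter.
padded_by_fft = np.fft.fft2(test_img, s=(200, 100))

# Pad by hand with 100 trailing rows of zeros, then transform.
manual = np.zeros((200, 100))
manual[:100, :] = test_img
padded_by_hand = np.fft.fft2(manual)

shapes_match = padded_by_fft.shape == (200, 100)
values_match = np.allclose(padded_by_fft, padded_by_hand)
print(shapes_match, values_match)
```

Note that padding does not add information to the image; it only samples the same spectrum on a finer frequency grid along that axis.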

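As a general rule, for an FFT output of shape (N, M), the bins are spaced 1/N apart in normalized frequency (cycles per sample) on axis 0 and 1/M apart on axis 1. Index 0 is the zero-frequency (DC) term, the lowest nonzero frequency is 1/N (respectively 1/M), and every other bin is an integer multiple of it. To convert these to actual frequencies, multiply each by the sampling frequency of the respective dimension.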
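As an illustration of what a single basis element looks like, the sketch below rebuilds one coefficient of fft2 by hand as a sum against a complex plane wave, then lists the bin frequencies with np.fft.fftfreq. The image size, the chosen bin (k, l) and the 0.5 mm sample spacing are arbitrary values picked for the example, not anything taken from the question.

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.random((8, 6))                  # small N x M image keeps the check cheap
N, M = img.shape
spectral = np.fft.fft2(img)

# Basis element for bin (k, l): a complex plane wave with k cycles along
# axis 0 and l cycles along axis 1 over the whole image.
k, l = 3, 2
n = np.arange(N)[:, None]                 # row indices as a column vector
m = np.arange(M)[None, :]                 # column indices as a row vector
plane_wave = np.exp(-2j * np.pi * (k * n / N + l * m / M))
coefficient = np.sum(img * plane_wave)

matches_fft2 = bool(np.isclose(coefficient, spectral[k, l]))

# Normalized frequency (cycles per sample) of every bin along each axis.
freq_axis0 = np.fft.fftfreq(N)            # spacing 1/N
freq_axis1 = np.fft.fftfreq(M)            # spacing 1/M

# Physical frequencies, assuming a hypothetical sample spacing of 0.5 mm.
spacing_mm = 0.5
freq_axis0_per_mm = np.fft.fftfreq(N, d=spacing_mm)
print(matches_fft2, freq_axis0[1], freq_axis0_per_mm[1])
```

Passing d to fftfreq is equivalent to multiplying the normalized frequencies by the sampling frequency 1/d.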

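Be aware that when you compute a double-sided FFT (as you are doing), index 0 of each axis holds the zero-frequency (DC) term, the first half holds the positive frequencies up to Nyquist, and the second half holds the negative frequencies (see this page). np.fft.fftfreq returns the frequency of each bin in this same order, and np.fft.fftshift reorders an array so that the zero frequency sits in the center.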
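The ordering is easier to see on a short axis. The sketch below prints the bin frequencies for N = 8 before and after np.fft.fftshift, and checks where the DC term of a 2D spectrum ends up; the 8 x 8 random image is just a stand-in.

```python
import numpy as np

N = 8
freqs = np.fft.fftfreq(N)
# -> [ 0.     0.125  0.25   0.375 -0.5   -0.375 -0.25  -0.125]
centered = np.fft.fftshift(freqs)
# -> [-0.5   -0.375 -0.25  -0.125  0.     0.125  0.25   0.375]

rng = np.random.default_rng(0)
img = rng.random((N, N))
spectral = np.fft.fft2(img)

# The DC term lives at index (0, 0) and equals the sum of all pixels.
dc_is_sum = bool(np.isclose(spectral[0, 0], img.sum()))

# After fftshift the DC term sits at the center index (N // 2, N // 2).
shifted = np.fft.fftshift(spectral)
dc_is_centered = bool(shifted[N // 2, N // 2] == spectral[0, 0])
print(freqs, centered, dc_is_sum, dc_is_centered)
```

For even N, fftfreq labels the Nyquist bin as -0.5 rather than +0.5; the two are the same frequency for sampled data, so this is only a labeling convention.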

Answered By – rveronese

Answer Checked By – Timothy Miller (Easybugfix Admin)
