[FIXED] Access images after tf.keras.utils.image_dataset_from_directory

Issue

I’m using tf.keras.utils.image_dataset_from_directory to load my images into a TensorFlow dataset, but I’m confused about how it works. I simply want to be able to display each image from the dataset with imshow.

data = tf.keras.utils.image_dataset_from_directory('/content/gdrive/MyDrive/Skyrmion Vision/testFiles/train/',batch_size=1,image_size=(171,256))
data_iterator = data.as_numpy_iterator()
batch = data_iterator.next()
batch[0].shape
for image in batch:
  print(type(image))

This was my attempt, but it only prints numpy.ndarray twice. I expected it to show all 12 images as NumPy arrays (there are only 12 images in the dataset).

How can I access the images after they are in the tensorflow dataset?

Solution

You created batches of one element each by setting batch_size=1.

So if you do:

data_iterator = data.as_numpy_iterator()
batch = data_iterator.next()

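Each batch is a tuple (images, labels), which is why looping over it prints two numpy.ndarray types: one for the image array and one for the label array. The image array holds only one image because your batch only has one image in it. To get the next batch, and therefore the next image, call data_iterator.next() again.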
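
To make the structure of a batch concrete, here is a minimal sketch that uses plain NumPy arrays as a stand-in for one element of the iterator. The zero-filled arrays and the int32 label type are assumptions for illustration; the shapes follow from batch_size=1 and image_size=(171,256) with three color channels.

```python
import numpy as np

# Stand-in for one element of data.as_numpy_iterator() with batch_size=1:
# a tuple of (images, labels), not a list of images.
images = np.zeros((1, 171, 256, 3), dtype=np.float32)  # (batch, height, width, channels)
labels = np.zeros((1,), dtype=np.int32)                # one label per image (assumed int32)
batch = (images, labels)

# Looping over the tuple visits its two members, hence two ndarray types.
member_types = [type(member).__name__ for member in batch]
print(member_types)

# The images live in batch[0]; looping over it visits one image at a time.
image_shapes = [image.shape for image in batch[0]]
print(image_shapes)
```

So batch[0] is the array to loop over or index when you want the images themselves.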

However, if you only want to iterate through your dataset and print the images, you can do it like this:

import tensorflow as tf

data = tf.keras.utils.image_dataset_from_directory('img',batch_size=1,image_size=(171,256))
for x, y in data:  # x: batch of images, y: batch of labels
  print("image: {}, label: {}".format(x,y))

Of course you can also set a higher value of batch_size to access more images at a time.

If you want to access one image at a time, you can use either of these methods:


Method 1 (advised): iterate through your batches, and through the images within a batch:

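import numpy as np
import tensorflow as tf
from google.colab.patches import cv2_imshow  # Colab-only display helper

data = tf.keras.utils.image_dataset_from_directory('img',batch_size=1,image_size=(171,256))
for batch_x, batch_y in data:   # one (images, labels) pair per batch
    for x in batch_x:           # one image per iteration
        x = np.asarray(x)       # convert the tensor to a NumPy array
        cv2_imshow(x)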

The inner loop lets you access each image in the batch separately. With for batch_x, batch_y in data you get two tensors whose first dimension is the number of samples in the batch, so iterating over batch_x yields one image at a time. Here each batch holds only one image, but the inner loop keeps the approach general for any batch_size.
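
The nested loop can also be wrapped in a small generator. This is a sketch, not part of the original answer: iter_images is a hypothetical helper name, and the list of zero-filled NumPy batches stands in for the real dataset (12 images in 3 batches of 4).

```python
import numpy as np

def iter_images(batches):
    """Yield (image, label) pairs one at a time from an iterable of (batch_x, batch_y)."""
    for batch_x, batch_y in batches:
        # Iterating over a batch walks its first axis: one sample per step.
        for x, y in zip(batch_x, batch_y):
            yield np.asarray(x), y

# Stand-in for the dataset: 12 blank images in 3 batches of 4.
fake_batches = [
    (np.zeros((4, 171, 256, 3), dtype=np.float32), np.zeros((4,), dtype=np.int32))
    for _ in range(3)
]

pairs = list(iter_images(fake_batches))
print(len(pairs), pairs[0][0].shape)
```

Passing the real data object instead of fake_batches should behave the same way when TensorFlow runs eagerly, since its batches can also be iterated along the first axis.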


Method 2: access the images within a batch using an index.

With batch_size=1, each batch has only one image, so each batch_x in the loop is an array of shape (1, 171, 256, 3) and index 1 does not exist. Let’s say you set batch_size=2 instead. This gives batch_x a shape of (2, 171, 256, 3). If you then want to display the second image (i.e. index 1) of each batch, you can index it directly:

import numpy as np
import tensorflow as tf
from google.colab.patches import cv2_imshow  # Colab-only display helper

# batch_size=2 so that index 1 exists in each batch
data = tf.keras.utils.image_dataset_from_directory('img',batch_size=2,image_size=(171,256))
for batch_x, batch_y in data:
    x = np.asarray(batch_x[1])  # access second image of batch
    cv2_imshow(x)
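
Indexing a fixed position assumes every batch is long enough. When the number of images is not a multiple of batch_size, the last batch is smaller, so a guard on the batch length may be worth adding. The sketch below assumes 5 images and batch_size=2, with zero-filled NumPy arrays standing in for batch_x.

```python
import numpy as np

index = 1  # position inside each batch to pick (the second image)

# Stand-in for 5 images loaded with batch_size=2: batch lengths 2, 2 and 1.
fake_batches = [np.zeros((n, 171, 256, 3), dtype=np.float32) for n in (2, 2, 1)]

picked = []
for batch_x in fake_batches:
    if index < batch_x.shape[0]:  # skip batches too short to contain this index
        picked.append(batch_x[index])

print(len(picked))
```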

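The function cv2_imshow displays images in Google Colab, where cv2.imshow is not available. Note that it interprets three-channel arrays as BGR, while image_dataset_from_directory yields RGB images by default, so colors appear swapped unless you reverse the channel axis first (for example with x[..., ::-1]).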
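
Before displaying, it can help to convert each image explicitly. The sketch below assumes the dataset's default output (float32 RGB values in the range 0 to 255) and that the display function expects uint8 BGR, as cv2_imshow does. to_bgr_uint8 is a hypothetical helper name, and a solid red test image stands in for a real one.

```python
import numpy as np

def to_bgr_uint8(image):
    """Convert a float RGB image with values in [0, 255] to a uint8 BGR image."""
    rgb = np.clip(np.asarray(image), 0, 255).astype(np.uint8)
    return rgb[..., ::-1]  # reverse the channel axis: RGB -> BGR

# Stand-in for one image from the dataset: pure red in RGB.
red_rgb = np.zeros((171, 256, 3), dtype=np.float32)
red_rgb[..., 0] = 255.0

red_bgr = to_bgr_uint8(red_rgb)
print(red_bgr.dtype, red_bgr[0, 0].tolist())
```

In Colab you would then call cv2_imshow(to_bgr_uint8(x)) inside the loop.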

Answered By – ClaudiaR

Answer Checked By – Candace Johnson (Easybugfix Volunteer)
