Good evening. I want to train a Keras CNN that can tell me whether an image contains a sewer or not.
I have 2 dataset files (one of positive images with a sewer and another with no sewer), each an 8000x60 matrix of decoded depth images. Each image is 80x60, so each dataset has about 100 images.
My problem is that I don't know how to feed that input into the CNN for training. I have always worked with PNG datasets and never with this type. If you have questions, just ask.
Thanks in advance.
If your images are already decoded into a matrix, you can use the tf.data.Dataset.from_tensor_slices() method (tf.data.Dataset | TensorFlow v2.16.1) to create inputs for your model. You pass a tuple into this method: the first element is the decoded image matrix (a NumPy array or another array-like type), and the second element is an array of integer labels (0 and 1 in your case).
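As a rough illustration of the call described above, here is a minimal sketch. The random array and the four labels are made-up stand-in data, not the real sewer images; only tf.data.Dataset.from_tensor_slices itself comes from the answer.

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in data: 4 "images" of 80x60 depth values with made-up labels.
images = np.random.rand(4, 80, 60).astype("float32")
labels = np.array([0, 1, 0, 1], dtype="int32")

# Slicing happens along the first axis, so each dataset element
# is one (image, label) pair.
ds = tf.data.Dataset.from_tensor_slices((images, labels))

for image, label in ds.take(1):
    print(image.shape, int(label))
```

The important point is that the first axis of both arrays must have the same length: one label per image.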
That could work, but how do I tell tf.data.Dataset.from_tensor_slices() to slice my dataset every 80 lines? Or will it do that automatically because there is a blank separator line between the image matrices?
I also didn't understand "the second element is an array of integer labels (0 and 1 in your case)". How do I tell the CNN that the first dataset, for example, has label 1 (positive, sewer) and the second one has label 0 (no sewer)?
Thank you so much for your response.
If your image data is an np.array of shape (8000, 60) and every 80 rows represent a separate image, you can do: new_data = data.reshape((100, 80, 60))
Then you create two arrays with target values (one for each of your original arrays): y_1 = np.zeros(100) and y_2 = np.ones(100)
You create a dataset by passing a tuple where the first element is your input data and the second element contains the target values: ds = tf.data.Dataset.from_tensor_slices((new_data, y_1))
In your case you'll have to create two datasets, then concatenate them and shuffle the result.
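To check that the reshape really groups every 80 rows into one image, here is a small NumPy-only sketch. The array filled with np.arange is a stand-in for a loaded file, and the names y_negative and y_positive are illustrative.

```python
import numpy as np

rows_per_image, cols = 80, 60
n_images = 100

# Stand-in for one loaded file: 8000 rows x 60 columns of distinct values.
data = np.arange(n_images * rows_per_image * cols, dtype="float32")
data = data.reshape(n_images * rows_per_image, cols)

# Row-major reshape: image k is exactly rows 80*k .. 80*k + 79 of the matrix.
images = data.reshape((n_images, rows_per_image, cols))
k = 3
same = np.array_equal(images[k], data[k * rows_per_image:(k + 1) * rows_per_image])
print(images.shape, same)

# One label per image: 0 for the "no sewer" file, 1 for the "sewer" file.
y_negative = np.zeros(n_images, dtype="int32")
y_positive = np.ones(n_images, dtype="int32")

# Keras Conv2D layers expect a trailing channel axis.
images_with_channel = images[..., np.newaxis]
print(images_with_channel.shape)
```

The labels never need to be stored in the files themselves; they come from knowing which file each image was read from.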
Thank you so much, Ekaterina, for your help. With this information I was able to make a lot of progress, and I have written this:
img_width, img_height = 80, 60
n_positives_img, n_negatives_img = 17874, 26308
ds_negatives = ["negative_depth.txt"]
ds_positives = ["positive_depth.txt"]
arrayceros = np.zeros(n_negatives_img)
arrayunos = np.ones(n_positives_img)
arraynegativos= ds_negatives.reshape(( n_negatives_img, img_width, img_height))
arraypositivos= ds_positives.reshape((n_positives_img, img_width, img_height))
ds_negatives_target = tf.data.Dataset.from_tensor_slices((arraynegativos, arrayceros))
ds_positives_target = tf.data.Dataset.from_tensor_slices((arraypositivos, arrayunos))
dataset = pd.concat(ds_negatives_target, ds_positives_target)
datasetfinal = np.random.shuffle(dataset)
I'm uploading the files to Google Colab right now to try this. Do you think this is good, or do I have to change something? Love your work.
Thanks in advance.
You should concatenate tensorflow datasets directly and then randomly shuffle the result:
ds_combined = ds1.concatenate(ds2).shuffle(n_samples)
n_samples should be the total number of images in the two datasets.
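A minimal sketch of that concatenate-then-shuffle step, using tiny constant arrays as stand-ins for the real images (the counts 6 and 4 are arbitrary):

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in arrays: 6 negative and 4 positive 80x60 images.
n_neg, n_pos = 6, 4
neg = np.zeros((n_neg, 80, 60), dtype="float32")
pos = np.ones((n_pos, 80, 60), dtype="float32")

ds_neg = tf.data.Dataset.from_tensor_slices((neg, np.zeros(n_neg, dtype="int32")))
ds_pos = tf.data.Dataset.from_tensor_slices((pos, np.ones(n_pos, dtype="int32")))

# A shuffle buffer as large as the whole dataset mixes both classes fully.
n_samples = n_neg + n_pos
ds_combined = ds_neg.concatenate(ds_pos).shuffle(n_samples)

labels = [int(label) for _, label in ds_combined]
print(labels)
```

With a buffer much smaller than n_samples, the shuffle only mixes nearby elements, so the concatenated dataset would still start almost entirely with negatives.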
Thank you for the clarification, but when I run the code it gives me this error:
25 arraynegativos= ds_negatives.reshape(( n_negatives_img, img_width, img_height))
26 arraypositivos= ds_positives.reshape((n_positives_img, img_width, img_height))
AttributeError: 'list' object has no attribute 'reshape'
So I converted ds_negatives to a NumPy array like this:
ds_negatives1 = np.array(ds_negatives)
But it gives me this error:
cannot reshape array of size 1 into shape (26308,80,60)
So now I'm a bit confused. How do I transform my dataset so that it can be reshaped like that?
Thanks in advance.
Link to the Google Colab script: Google Colab
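A likely explanation for the "size 1" error: ["negative_depth.txt"] is a list holding one file name, so np.array of it is an array with a single string, not the file contents. The numbers have to be read from the file first. Below is a sketch using np.loadtxt, assuming the files contain whitespace-separated numbers, 60 per line, with a blank line between images; that layout is a guess from the description in this thread, and the demo file written here is made up.

```python
import os
import tempfile
import numpy as np

rows_per_image, cols = 80, 60
n_images = 3  # tiny stand-in; the real files hold thousands of images

# Write a small demo file in the ASSUMED format (not the real data).
path = os.path.join(tempfile.mkdtemp(), "demo_depth.txt")
with open(path, "w") as f:
    for k in range(n_images):
        for _ in range(rows_per_image):
            f.write(" ".join([str(float(k))] * cols) + "\n")
        f.write("\n")  # blank separator line between images

# A list with the file name is just one string: this is the "array of size 1".
print(np.array([path]).size)

# np.loadtxt reads the numbers and skips the blank lines.
data = np.loadtxt(path, dtype="float32")
images = data.reshape((-1, rows_per_image, cols))
print(data.shape, images.shape)
```

Using -1 for the first dimension lets NumPy infer the number of images, so images.shape[0] can be compared against the expected counts such as n_negatives_img. If the file uses a different delimiter, the delimiter argument of np.loadtxt would need to be set accordingly.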