My neural network is not training and the accuracy stays at 0

Please give it a try and see if the accuracy goes up a bit when you take only 2 classes and provide more than 100 examples each. Let's say you take only:

  • one label for all 00xx captchas (label = Zeros)
  • and a second for all 12xx captchas (label = twelve)

you might see better accuracy (assuming the first 2 chars always appear around the same pixel locations …)
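
A minimal sketch of that two-class setup (the directory name is a placeholder, and I'm assuming the label is the filename without its extension, as elsewhere in this thread):

import os
import numpy as np
from PIL import Image

images, labels = [], []
for filename in os.listdir('captchas'):
    if not filename.endswith('.png'):
        continue
    label = os.path.splitext(filename)[0]
    if label.startswith('00'):
        labels.append(0)              # class "Zeros"
    elif label.startswith('12'):
        labels.append(1)              # class "twelve"
    else:
        continue                      # skip everything else for this sanity check
    images.append(np.array(Image.open(os.path.join('captchas', filename))))

x = np.array(images, dtype='float32') / 255.0
y = np.array(labels)
# train any small binary classifier on (x, y); with 100+ examples per class,
# accuracy should climb well above 50% if the first 2 chars stay in place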

Let me know …

If you need a solution by tomorrow, please have a look here (Try the demo API with your images):
Google Cloud Vision API
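
If you'd rather call it from code than use the web demo, a minimal sketch with the google-cloud-vision Python client might look like this (assumes credentials are already configured; the file name is a placeholder):

from google.cloud import vision

client = vision.ImageAnnotatorClient()  # needs GOOGLE_APPLICATION_CREDENTIALS set
with open('captcha.png', 'rb') as f:
    image = vision.Image(content=f.read())
response = client.text_detection(image=image)
if response.text_annotations:
    print(response.text_annotations[0].description)  # full detected text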

You can also try pre-trained OCR models from TF Hub to speed things up:
e.g.: keras-ocr on Kaggle Models
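
A quick keras-ocr smoke test might look like this (a sketch; the pre-trained weights download on first use and the image path is a placeholder):

import keras_ocr  # pip install keras-ocr

pipeline = keras_ocr.pipeline.Pipeline()        # pre-trained detector + recognizer
images = [keras_ocr.tools.read('captcha.png')]  # placeholder path
predictions = pipeline.recognize(images)        # list of (word, box) pairs per image
for word, box in predictions[0]:
    print(word)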

Hi @whaile, I have trained the model in Colab with the images in the rar file you shared, but for the labels I used random data. I got an accuracy of around 50%. Please refer to this gist for a working code example. Thank You.

Hi @Kiran_Sai_Ramineni I understand that you have an array that stores, for example:

1. [0, 1, 1, 0, 0, 1, 1, 1, 0, ...]
2. [1, 0, 0, 1, 1, 0, 1, 0, 1, ...]
3. [0, 0, 1, 1, 0, 1, 0, 0, 1, ...]
4. [1, 1, 0, 0, 1, 1, 1, 0, 0, ...]

And I have:

21489N
jbd63J
jjdYSm92

Is there a mistake in this?

Hi @whaile, You have to do preprocessing to convert those string labels to categorical. I did not do that preprocessing so that I could train on those images quickly; I took labels that were already categorical-like.
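
For reference, a quick sketch of that conversion with tf.keras.layers.StringLookup (note that mapping whole strings this way makes every distinct captcha text its own class):

import tensorflow as tf

labels = ['21489N', 'jbd63J', 'jjdYSm92']  # the strings from your post
lookup = tf.keras.layers.StringLookup(vocabulary=sorted(set(labels)), num_oov_indices=0)
int_labels = lookup(labels)                # indices follow the vocabulary order
one_hot = tf.one_hot(int_labels, depth=lookup.vocabulary_size())

Thank You.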

Hi @whaile, As @Dennis said you can use OCR for your use case. Please refer to this gist for OCR implementation. Thank You.


Hi @Kiran_Sai_Ramineni I don't think using OCR is suitable for me; I need to train a large model to recognize the ugliest characters possible. As for the array, how can I do this preprocessing if I have the English and Russian alphabets (uppercase and lowercase) and also all the digits?
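
Something like this, per character over the full alphabet, is what I mean (just a sketch; the alphabet below and max_len are examples):

import string
import numpy as np

russian = 'абвгдежзийклмнопрстуфхцчшщъыьэюя'
alphabet = string.ascii_letters + string.digits + russian + russian.upper()
char_to_int = {c: i for i, c in enumerate(alphabet)}

def encode(label, max_len=8):
    # one row per character position, one column per alphabet symbol
    out = np.zeros((max_len, len(alphabet)), dtype='float32')
    for i, c in enumerate(label[:max_len]):
        out[i, char_to_int[c]] = 1.0
    return out

print(encode('21489N').shape)  # (max_len, len(alphabet))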

Hi @whaile, I have trained the model with a CNN architecture while converting the labels to categorical and got the expected results. Please refer to this gist for working code examples. Thank You!


Hi @Kiran_Sai_Ramineni, I used your code, for example:

import os
import numpy as np
from PIL import Image
import tensorflow as tf
from tqdm import tqdm
from sklearn.model_selection import train_test_split

image_list = []
labels = []

def load_images_as_numpy(directory, max_images=None):
    filenames = os.listdir(directory)[:max_images] if max_images else os.listdir(directory)
    for filename in tqdm(filenames, desc="Loading files"):
        if filename.endswith(".png"):
            # os.path.splitext removes the extension safely; strip('.png') would
            # also eat leading/trailing '.', 'p', 'n', 'g' characters from the label
            labels.append(os.path.splitext(filename)[0])
            image_path = os.path.join(directory, filename)
            image = Image.open(image_path)
            image_list.append(np.array(image))
    return np.array(image_list)

images_directory = input("Enter the path to the image directory: ")

total_images = len(os.listdir(images_directory))
print(f"Total number of available images: {total_images}")

max_images = int(input("Enter the maximum number of images to load (or leave blank): ") or total_images)
epochs = int(input("Enter the number of epochs: "))
print("Loading images…")

images_numpy_array = load_images_as_numpy(images_directory, max_images)
print("Images loaded successfully.")

# Normalize pixel values to the range [0, 1]
images_numpy_array = images_numpy_array.astype('float32') / 255.0

# Every captcha string is unique, so each image ends up as its own class
label_to_int = {label: index for index, label in enumerate(labels)}
int_labels = [label_to_int[label] for label in labels]
one_hot_labels = tf.keras.utils.to_categorical(int_labels, num_classes=max_images)

# Split data into training and validation sets
x_train, x_val, y_train, y_val = train_test_split(images_numpy_array, one_hot_labels, test_size=0.2, random_state=42)

model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(50, 130, 3)))
model.add(tf.keras.layers.MaxPooling2D((2, 2)))
model.add(tf.keras.layers.Conv2D(64, (3, 3), activation='relu'))
model.add(tf.keras.layers.MaxPooling2D((2, 2)))
model.add(tf.keras.layers.Conv2D(128, (3, 3), activation='relu'))
model.add(tf.keras.layers.MaxPooling2D((2, 2)))
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(128, activation='relu'))
model.add(tf.keras.layers.Dense(max_images, activation='softmax'))

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

model.fit(x_train, y_train, epochs=epochs, batch_size=32)

but as you can see the result is 0. I fed in 10,000 pictures for training; if you feed in 100, everything works within the first 3 epochs.

Logs:

Epoch 118/200
250/250 [==============================] - 27s 108ms/step - loss: 8.9944 - accuracy: 0.0000e+00
Epoch 119/200
250/250 [==============================] - 27s 110ms/step - loss: 8.9944 - accuracy: 1.2500e-04
Epoch 120/200
250/250 [==============================] - 28s 114ms/step - loss: 8.9944 - accuracy: 0.0000e+00
Epoch 121/200
250/250 [==============================] - 26s 106ms/step - loss: 8.9944 - accuracy: 0.0000e+00
Epoch 122/200
250/250 [==============================] - 29s 115ms/step - loss: 8.9944 - accuracy: 0.0000e+00
Epoch 123/200
250/250 [==============================] - 27s 106ms/step - loss: 8.9944 - accuracy: 0.0000e+00
Epoch 124/200
139/250 [===============>…] - ETA: 11s - loss: 8.9909 - accuracy: 0.0000e+00

@whaile please feel free to provide more training examples per class label. Currently it's just 1 example per class. E.g.: the folder name (ij8HUyadf) is the class label and contains 100 images (0-100.png) of the same captcha in different variations.

Try experimenting with the batch size, different network architectures, learning rates and data preprocessing to optimise the overall accuracy.
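
For example, one such experiment might look like this (a sketch reusing model, x_train, etc. from your code above; the learning rate and batch size are just guesses to try, not tuned values):

import tensorflow as tf

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),  # lower than the default 1e-3
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, validation_data=(x_val, y_val),
          epochs=20, batch_size=16)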

Hi @Dennis then I would need huge capacity. What machine specs are needed to make 100 pictures for each class? Even now, when loading 50k pictures, I get an out-of-memory error (8 GB RAM, 4 cores). What specs are needed?

Hi @Kiran_Sai_Ramineni So what should I do if I need to use about a million images, but the neural network only trains normally with a maximum of about 1,000?

Hi @whaile, There is no limitation on the training data. To make more images for the same label you can use data augmentation techniques. To overcome memory issues, please pass the dataset in batches. Thank You.

Hi @Kiran_Sai_Ramineni Oh, I don't know how to pass the dataset in batches. How can I do it?

Hi @whaile, To perform data augmentation on the images and save them to a directory, use the code below.

import os
import numpy as np
import tensorflow as tf
from PIL import Image

for filename in os.listdir('/content/captchas'):
    # convert('RGB') guards against 4-channel PNGs, which the jpeg op rejects
    img = Image.open(os.path.join('/content/captchas', filename)).convert('RGB')
    # os.path.splitext removes the extension safely; strip('.png') would also
    # eat leading/trailing '.', 'p', 'n', 'g' characters from the label
    subdirectory_path = os.path.join('/content/captchas_dataset', os.path.splitext(filename)[0])
    os.makedirs(subdirectory_path, exist_ok=True)

    # tf.image ops want a tensor/array, not a PIL Image
    img_arr = np.array(img)

    brightness = tf.image.stateless_random_brightness(img_arr, 0.2, (1, 2))
    Image.fromarray(brightness.numpy()).save(os.path.join(subdirectory_path, 'brightness.png'))

    contrast = tf.image.stateless_random_contrast(img_arr, 0.2, 0.5, (1, 2))
    Image.fromarray(contrast.numpy()).save(os.path.join(subdirectory_path, 'contrast.png'))

    r_h = tf.image.random_hue(img_arr, 0.2)
    Image.fromarray(r_h.numpy()).save(os.path.join(subdirectory_path, 'r_h.png'))

    quality = tf.image.random_jpeg_quality(img_arr, 35, 100)
    Image.fromarray(quality.numpy()).save(os.path.join(subdirectory_path, 'quality.png'))

The captchas_dataset directory then looks like this (one subdirectory per captcha label, each holding the augmented variants):

[screenshot of the directory structure]

To get more images, add more augmentation functions.
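
For example, one more augmentation in the same pattern (saturation is a guess at a text-safe transform; flips or rotations would distort the characters):

# inside the same loop as above, reusing img_arr and subdirectory_path
saturation = tf.image.random_saturation(img_arr, 0.5, 1.5)
Image.fromarray(saturation.numpy()).save(os.path.join(subdirectory_path, 'saturation.png'))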

Now read the captchas_dataset using tf.keras.utils.image_dataset_from_directory() and train the model.
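
A sketch of that final step (the image size matches the (50, 130, 3) input shape used earlier in this thread; the batch size is a guess):

import tensorflow as tf

train_ds = tf.keras.utils.image_dataset_from_directory(
    '/content/captchas_dataset',
    label_mode='categorical',
    image_size=(50, 130),
    batch_size=32)  # streams images from disk in batches, so memory stays bounded

num_classes = len(train_ds.class_names)
# build the same CNN as above but with Dense(num_classes) as the last layer,
# then: model.fit(train_ds, epochs=epochs)

Thank you!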