List index out of range while saving a trained model

anon33452701 · January 6, 2022, 7:48am

PLEASE HELP !!

I’m trying to fine-tune a pre-trained DistilBERT model from Huggingface using Tensorflow. Everything runs smoothly and the model builds and trains without error. But when I try to save the model it stops with the error “IndexError: list index out of range”. I’m using pycharm with TPU.

Any help would be much appreciated!

Code:

import h5py
import numpy as np
import pandas as pd
import pydot
import simplejson as simplejson
import tensorflow as tf
import os
from transformers import pipeline
from tensorflow import keras
train = pd.read_csv("train.csv")
print("Training_dataset_shape:", train.shape)

MAX_LEN=100

from transformers import BertTokenizer, TFBertModel, TFAutoModel,AutoTokenizer

#model_name = "bert-base-multilingual-cased"
#tokenizer = BertTokenizer.from_pretrained(model_name) # FC: this is the tokenizer we will use on our text data to tokenize it

model_name = "huggingface/distilbert-base-uncased-finetuned-mnli"
tokenizer = AutoTokenizer.from_pretrained(model_name)

# FC we make a function in order to have a list of the id for each word and the separator
def encode_sentence(s):
   tokens = list(tokenizer.tokenize(s)) # FC: split the sentence into tokens that are either words or sub-words
   tokens.append('[SEP]') # FC: a token called [SEP] (=separator) is added to mark end of each sentence
   return tokenizer.convert_tokens_to_ids(tokens) # # FC: instead of returning the list of tokens, a list of each token ID is returned


def bert_encode(hypotheses, premises,
                tokenizer):  # FC: for RoBERTa we remove the input_type_ids from the inputs of the model

    num_examples = len(hypotheses)

    sentence1 = tf.ragged.constant([  # FC: constructs a constant ragged tensor. every entry has a different length
        encode_sentence(s) for s in np.array(hypotheses)])

    sentence2 = tf.ragged.constant([
        encode_sentence(s) for s in np.array(premises)])

    cls = [tokenizer.convert_tokens_to_ids(['[CLS]'])] * sentence1.shape[
        0]  # FC: list of IDs for the token '[CLS]' to denote each beginning

    input_word_ids = tf.concat([cls, sentence1, sentence2],
                               axis=-1)  # FC: put everything together. every row still has a different length.

    # input_word_ids2 = tf.concat([cls, sentence2, sentence1], axis=-1)

    # input_word_ids = tf.concat([input_word_ids1, input_word_ids2], axis=0) # we duplicate the dataset inverting sentence 1 and 2

    input_mask = tf.ones_like(
        input_word_ids).to_tensor()  # FC: first, a tensor with just ones in it is constructed in the same size as input_word_ids. Then, by applying to_tensor the ends of each row are padded with zeros to give every row the same length
    type_cls = tf.zeros_like(cls)  
    type_s1 = tf.zeros_like(sentence1)
    type_s2 = tf.ones_like(sentence2)  
    input_type_ids = tf.concat(
        [type_cls, type_s1, type_s2], axis=-1).to_tensor()  # FC: concatenates everything and again adds padding
    inputs = {
        'input_word_ids': input_word_ids.to_tensor(),  # FC: input_word_ids hasn't been padded yet - do it here now
        'input_mask': input_mask}

    return inputs

train_input = bert_encode(train.premise.values, train.hypothesis.values, tokenizer)
# total_train_input = bert_encode(total_train.premise.values, total_train.hypothesis.values, tokenizer)


max_len = 136  #: FC 50 in the initial tutorial


def build_model():
    encoder = TFAutoModel.from_pretrained(model_name)
    input_word_ids = tf.keras.Input(shape(max_len,),dtype=tf.int32,name="input_word_ids")  
    input_mask = tf.keras.Input(shape=(max_len,), dtype=tf.int32,name="input_mask") 
    embedding = encoder([input_word_ids, input_mask])[0]  # FC: add_input_type_ids for the BERT model
    output = tf.keras.layers.Dense(3, activation='softmax')(embedding[:, 0, :])
    model = tf.keras.Model(inputs=[input_word_ids, input_mask],outputs=output)  # FC: based on the code in the lines above, a model is now constructed and passed into the variable model
    model.compile(tf.keras.optimizers.Adam(learning_rate=1e5),loss='sparse_categorical_crossentropy', metrics=['accuracy'])  
    return model


with strategy.scope(): 
    model = build_model() 
    model.summary()       


# print("model.layers[2]:-------". model.layers[2])
# model.layers[2].trainable=True


for key in train_input.keys():
    train_input[key] = train_input[key][:,:max_len]

print("train the model now")
early_stop = tf.keras.callbacks.EarlyStopping(patience=3,restore_best_weights=True)


model.fit(train_input, train.label.values, epochs = 3, verbose = 1, validation_split = 0.01,
         batch_size=16*strategy.num_replicas_in_sync,
          callbacks=[early_stop])
print("Training is completeted")


model.save("saved_model/trackers/1")```

Ekaterina_Dranitsyna · January 6, 2022, 1:10pm

Try using model.save_weights() instead of model.save()

anon33452701 · January 6, 2022, 1:24pm

But i want to save the whole model as i need to deploy this in production.

ck37 · January 10, 2022, 3:52pm

Hi Chitra,

I ran into this problem as well and HuggingFace has shown that it can be resolved by changing how the inputs are passed to the transformer layer:

github.com/huggingface/huggingface_hub

Can't upload custom TF models e.g. TFBERT, TFRoBERTa due to model.save() failing

opened 08:22PM - 09 Jan 22 UTC

closed 05:08PM - 17 Aug 22 UTC

ck37

Hello, I'm interested in sharing models on the HF Hub but it appears that cus…tom TF models can't be shared because their model.save() doesn't work (appears to be a known issue: https://github.com/huggingface/transformers/issues/13610#issuecomment-921134250). I've [uploaded a Jupyter notebook to colab](https://colab.research.google.com/drive/12uUvOTKUsPTcBMJphhhgglQRT_8pcQmC#scrollTo=VhGQbQvYevrM) which shows the error for simple RoBERTA and BERT models. Note: I ran this in jupyter rather than Colab so it would probably require a little editing to work directly in Colab. Is it possible to also use save_pretrained() for this use case as well? Even if so it would be good to resolve the model.save() issue for the TF user base, e.g. others are struggling with this https://discuss.tensorflow.org/t/list-index-out-of-range-while-saving-a-trained-model/6901 cc: @Rocketknight1 @LysandreJik Thanks, Chris

If you change this line:

embedding = encoder([input_word_ids, input_mask])[0]  # FC: add_input_type_ids for the BERT model

To this (i.e. pass the inputs as separate arguments rather than as a list):

embedding = encoder(input_word_ids, input_mask)[0]  # FC: add_input_type_ids for the BERT model

Are you able to save()?

Cheers,
Chris

anon33452701 · January 11, 2022, 10:31am

Thank you so much criss. i will try this.

Sidharth_Ramachandra · January 24, 2022, 5:51am

Hi ck37,
I did the way you suggested and ended up with an UninplementedError

Sidharth_Ramachandra · January 24, 2022, 5:53am

Hi Chitra,
Were you able to solve the issue somehow?

Ross_Claytor · January 26, 2022, 1:26am

To any interested - I ran across this same issue and ck37’s solution above worked for me.

Kalpana_Rani · March 8, 2022, 3:18am

Thank you soo much @ck37, I was lost for days with this issue, it resolved for me

Tada_Yatabe · March 22, 2022, 1:18pm

@ck37 's solution did it for me. Thank you folks for bringing this up and finding the solution

Topic		Replies	Views
Model optimizer state isn't saved TensorFlow models , help_request	3	1128	April 27, 2023
Trying to create a english story generator with BERT model and graph execution error came unwanted Keras models	1	865	December 19, 2024
Tensorflow dataset has () shape General Discussion models , nlp , datasets , help_request	1	2315	May 12, 2022
BertEncoder inputs? General Discussion models , nlp , model_garden , help_request	2	1534	September 3, 2022
ValueError: `logits` and `labels` must have the same shape, received ((None, 256, 256, 4) vs (None,)) ValueError: `logits` and `labels` must have the same shape, received ((None, 256, 256, 4) vs (None,)) General Discussion models , keras , help_request	2	5294	June 7, 2024

List index out of range while saving a trained model

Related topics