Hello TensorFlow Community,
I am experiencing a protobuf-related error in TensorFlow 1.15 when attempting to save my model using tf.train.Saver() with a large dataset. The same process works fine with a smaller subset of the dataset. Both subsets are sampled from a 20GB dataset, and I am confident that the data processing flow is correct.
Error Message:
libprotobuf ERROR google/protobuf/wire_format_lite.cc:581] String field 'tensorflow.TensorShapeProto.Dim.name' contains invalid UTF-8 data when parsing a protocol buffer. Use the 'bytes' type if you intend to send raw bytes.
Traceback (most recent call last):
File "main_time_series_deconfounder.py", line 67, in <module>
test_time_series_deconfounder(dataset=dataset, num_substitute_confounders=args.num_substitute_hidden_confounders,
File "/root/autodl-tmp/Conformity_Casual_Inferance/time_series_deconfounder.py", line 370, in test_time_series_deconfounder
rmse_without_confounders = train_rmsn(dataset_map, 'rmsn_' + str(exp_name), b_use_predicted_confounders=False)
File "/root/autodl-tmp/Conformity_Casual_Inferance/time_series_deconfounder.py", line 201, in train_rmsn
rnn_fit(dataset_map=dataset_map, networks_to_train='propensity_networks', MODEL_ROOT=MODEL_ROOT,
File "/root/autodl-tmp/Conformity_Casual_Inferance/rmsn/script_rnn_fit.py", line 162, in rnn_fit
hyperparam_opt = train(net_name, expt_name,
File "/root/autodl-tmp/Conformity_Casual_Inferance/rmsn/core_routines.py", line 219, in train
helpers.save_network(sess, model_folder, cp_name, optimisation_summary)
File "/root/autodl-tmp/Conformity_Casual_Inferance/rmsn/libs/net_helpers.py", line 148, in save_network
save_path = saver.save(tf_session, os.path.join(model_folder, "{0}.ckpt".format(cp_name)))
File "/root/miniconda3/envs/casual/lib/python3.8/site-packages/tensorflow_core/python/training/saver.py", line 1200, in save
File "/root/miniconda3/envs/casual/lib/python3.8/site-packages/tensorflow_core/python/training/saver.py", line 1246, in export_meta_graph
File "/root/miniconda3/envs/casual/lib/python3.8/site-packages/tensorflow_core/python/framework/ops.py", line 3238, in as_graph_def
result, _ = self._as_graph_def(from_version, add_shapes)
File "/root/miniconda3/envs/casual/lib/python3.8/site-packages/tensorflow_core/python/framework/ops.py", line 3166, in _as_graph_def
google.protobuf.message.DecodeError: Error parsing message with type 'tensorflow.GraphDef'
The error occurs during the model-saving step and seems to be related to how protobuf handles certain data. I came across another post in which the user faced the same error. The traceback points to graph.ParseFromString in TensorFlow’s internal operations.
Additional Observations:
- The error suggests that ‘invalid UTF-8 data’ was sent to protobuf.
- The issue is reproducible only with a larger subset (2.5GB) of the data, not with a smaller subset (1.5GB).
- Before the execution of the failing call, it was confirmed that data is of type <class 'bytes'>.
- TensorFlow version: 1.15
- Operating System: Ubuntu 20.04.3 LTS
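
One thing that may be worth ruling out: protobuf cannot serialize or parse a single message larger than 2147483647 bytes (2**31 - 1), and my working and failing subsets sit on either side of that limit. This only matters if the data ends up embedded in the graph itself (for example as constants), which is an assumption I have not verified. The sketch below is only a size sanity check. The helper name exceeds_protobuf_limit is hypothetical, and the byte counts are the nominal subset sizes, not measured GraphDef sizes.

```python
# Minimal sketch: compare a payload size against protobuf's per-message cap.
# Assumption: the dataset bytes are baked into the GraphDef, so the subset
# size is a rough proxy for the serialized graph size.

PROTOBUF_MAX_BYTES = 2**31 - 1  # 2147483647 bytes, just under 2 GiB
GIB = 1024**3


def exceeds_protobuf_limit(num_bytes, limit=PROTOBUF_MAX_BYTES):
    """Return True if a message of num_bytes could not fit in one protobuf."""
    return num_bytes > limit


# Nominal sizes taken from the observations above.
for label, size in [("1.5GB subset", int(1.5 * GIB)), ("2.5GB subset", int(2.5 * GIB))]:
    print(label, exceeds_protobuf_limit(size))
# 1.5GB subset False
# 2.5GB subset True
```

If the real GraphDef does cross this limit, that would be consistent with the failure appearing only on the larger subset, but I have not confirmed it.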
Has anyone encountered a similar problem or can offer any advice on why this error might be occurring with larger datasets in TensorFlow 1.15? Any thoughts or suggestions would be highly appreciated.
Thank you!