Help With Converting NumPy Function To TensorFlow Ops (graph execution issue)

I’m trying to export my command recognition model for deployment on embedded devices. However, I’m having trouble encapsulating the preprocessing function in the model itself, which would make inference on the exported model much easier.

Here’s my attempt at integrating the preprocessing within my model:

import tensorflow as tf
import numpy as np
import librosa

model = tf.saved_model.load("saved_model")
class_labels = ["background", "down", "go", "left", "no", "off", "on", "right", "stop", "up", "yes", "unknown"]


class ExportModel(tf.Module):
    def __init__(self, model):
        super().__init__()
        self.model = model
        self.__call__.get_concrete_function(
            x=tf.TensorSpec(shape=(), dtype=tf.string))
        self.__call__.get_concrete_function(
            x=tf.TensorSpec(shape=[None, 16000], dtype=tf.float32))

    @staticmethod
    def preprocess_audio(audio_input, num_hops=98, sample_rate=16000, segment_duration=1,
                         hop_duration=0.010, num_bands=50):
        if isinstance(audio_input, str):
            audio_input = tf.io.read_file(audio_input)
            audio_input, _ = tf.audio.decode_wav(audio_input, desired_channels=1, desired_samples=16000)
            audio_input = tf.squeeze(audio_input, axis=-1)

        segment_samples = int(segment_duration * sample_rate)
        hop_samples = int(hop_duration * sample_rate)

        audio_data = audio_input.numpy()

        audio_data_padded = np.pad(audio_data, (0, max(0, segment_samples - len(audio_data))), mode='constant')
        audio_data_normalized = librosa.util.normalize(audio_data_padded)

        bark_spectrogram = librosa.feature.melspectrogram(y=audio_data_normalized, sr=sample_rate, n_fft=512,
                                                          hop_length=hop_samples, n_mels=num_bands)

        log_bark_spectrogram = np.log10(bark_spectrogram + 1e-6)

        if log_bark_spectrogram.shape[1] > num_hops:
            log_bark_spectrogram = log_bark_spectrogram[:, :num_hops]
        elif log_bark_spectrogram.shape[1] < num_hops:
            pad_width = num_hops - log_bark_spectrogram.shape[1]
            log_bark_spectrogram = np.pad(log_bark_spectrogram, ((0, 0), (0, pad_width)), mode='constant')

        log_bark_spectrogram = np.transpose(log_bark_spectrogram)

        log_bark_spectrogram = np.expand_dims(log_bark_spectrogram, axis=0)
        log_bark_spectrogram = np.expand_dims(log_bark_spectrogram, axis=0)
        log_bark_spectrogram = tf.convert_to_tensor(log_bark_spectrogram, dtype=tf.float32)

        return log_bark_spectrogram

    @tf.function
    def __call__(self, x):
        x = self.preprocess_audio(x)
        result = self.model(x, training=False)

        class_ids = tf.argmax(result, axis=-1)
        class_names = tf.gather(class_labels, class_ids)
        return {'predictions': result,
                'class_ids': class_ids,
                'class_names': class_names}


export = ExportModel(model)
tf.saved_model.save(export, "saved_model1")

The code produces an AttributeError:

File "C:\Users\Aamar\PycharmProjects\pythonProject\main.py", line 55, in __call__  *
    x = self.preprocess_audio(x)
File "C:\Users\Aamar\PycharmProjects\pythonProject\main.py", line 29, in preprocess_audio  *
    audio_data = audio_input.numpy()

AttributeError: 'SymbolicTensor' object has no attribute 'numpy'
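A minimal sketch of what this error is pointing at, independent of the model. The names `describe` and `trace_log` are made up for illustration. Inside a `tf.function`, the Python body runs once at trace time on a symbolic placeholder, which has no concrete value and therefore no `.numpy()` method:

```python
import tensorflow as tf

trace_log = []  # hypothetical helper: records what the Python body sees


@tf.function
def describe(x):
    # This Python code runs at trace time, on a symbolic tensor,
    # not on the concrete value passed by the caller.
    trace_log.append((tf.executing_eagerly(), hasattr(x, "numpy")))
    return x * 2.0


eager_value = tf.constant([1.0, 2.0])
result = describe(eager_value)

print("outside tf.function, has .numpy():", hasattr(eager_value, "numpy"))
print("inside tf.function (eager?, has .numpy()):", trace_log[0])
```

Any step that needs the actual sample values, such as `np.pad` or a librosa call, hits the same limitation during tracing.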

After some research and testing, I tried forcing eager execution with tf.config.run_functions_eagerly(True) and removed these two calls:

self.__call__.get_concrete_function(
    x=tf.TensorSpec(shape=(), dtype=tf.string))
self.__call__.get_concrete_function(
    x=tf.TensorSpec(shape=[None, 16000], dtype=tf.float32))

With those changes the export ran without errors. However, when I tried inference on the exported model, it threw a TypeError for string input, which suggests the preprocessing isn’t being integrated properly.
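One likely contributor to the string-input failure is the `isinstance(audio_input, str)` check: a tensor with dtype `tf.string` is never a Python `str`, so that branch is skipped during tracing. A sketch of branching on the tensor's dtype instead, assuming a mono 16 kHz WAV file. `load_waveform` and the temporary file are illustrative names, not part of the original code:

```python
import os
import tempfile

import tensorflow as tf


def load_waveform(x):
    """Return a float32 [batch, 16000] waveform from a file path or a waveform."""
    # x.dtype is static metadata, so this Python `if` is resolved once for
    # each traced input signature. isinstance(x, str) is False for a tensor.
    if x.dtype == tf.string:
        contents = tf.io.read_file(x)
        audio, _ = tf.audio.decode_wav(
            contents, desired_channels=1, desired_samples=16000)
        x = tf.transpose(audio)  # [16000, 1] -> [1, 16000]
    return x


# Demo input: one second of silence written to a temporary WAV file.
wav_path = os.path.join(tempfile.mkdtemp(), "silence.wav")
tf.io.write_file(wav_path, tf.audio.encode_wav(tf.zeros([16000, 1]), 16000))

traced = tf.function(load_waveform)
from_path = traced(tf.constant(wav_path))   # traced with a tf.string scalar
from_array = traced(tf.zeros([2, 16000]))   # traced with a float32 batch
print(from_path.shape, from_array.shape)
```

Separately, with eager execution forced and no concrete functions traced, the SavedModel may end up without a graph for `__call__` at all. That part is a guess and has not been checked against this model.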

My suspicion is that tf.function runs its code in graph execution, which is more efficient and optimized but behaves differently from eager execution.

Quoting the TensorFlow guide on graphs and tf.function:
In the previous three guides, you ran TensorFlow eagerly. This means TensorFlow operations are executed by Python, operation by operation, and return results back to Python.
While eager execution has several unique advantages, graph execution enables portability outside Python and tends to offer better performance. Graph execution means that tensor computations are executed as a TensorFlow graph, sometimes referred to as a tf.Graph or simply a “graph.”
Graphs are data structures that contain a set of tf.Operation objects, which represent units of computation; and tf.Tensor objects, which represent the units of data that flow between operations. They are defined in a tf.Graph context. Since these graphs are data structures, they can be saved, run, and restored all without the original Python code.

I believe the solution is to replace the NumPy array operations with TensorFlow operations so that everything can run in graph mode. I tried to change the preprocessing logic that way, but my attempts were unsuccessful.
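For reference, a rough sketch of what a TensorFlow-only version of the preprocessing could look like, built on `tf.signal.stft` and `tf.signal.linear_to_mel_weight_matrix`. It rests on several assumptions: the function name and constants are illustrative, the STFT centering uses zero padding (which librosa mode applies depends on the librosa version), and TensorFlow's mel filterbank uses the HTK mel scale without librosa's default Slaney normalization. The output shape matches the original, but the values will not match librosa numerically:

```python
import tensorflow as tf

SAMPLE_RATE = 16000
N_FFT = 512
HOP = 160        # 10 ms hop at 16 kHz
NUM_BANDS = 50
NUM_HOPS = 98


def preprocess_audio_tf(waveform):
    """float32 [batch, samples] -> float32 [batch, 1, NUM_HOPS, NUM_BANDS]."""
    # Crop or zero-pad every clip to exactly one second.
    waveform = waveform[:, :SAMPLE_RATE]
    pad = SAMPLE_RATE - tf.shape(waveform)[1]
    waveform = tf.pad(waveform, [[0, 0], [0, pad]])
    waveform = tf.ensure_shape(waveform, [None, SAMPLE_RATE])

    # Peak normalization, similar in spirit to librosa.util.normalize.
    peak = tf.reduce_max(tf.abs(waveform), axis=-1, keepdims=True)
    waveform = tf.math.divide_no_nan(waveform, peak)

    # Pad N_FFT // 2 zeros on both sides to emulate a centered STFT (assumption).
    waveform = tf.pad(waveform, [[0, 0], [N_FFT // 2, N_FFT // 2]])
    stft = tf.signal.stft(waveform, frame_length=N_FFT, frame_step=HOP,
                          fft_length=N_FFT)
    power = tf.square(tf.abs(stft))  # [batch, frames, N_FFT // 2 + 1]

    # HTK-style mel filterbank; NOT identical to librosa's default filterbank.
    mel_matrix = tf.signal.linear_to_mel_weight_matrix(
        num_mel_bins=NUM_BANDS,
        num_spectrogram_bins=N_FFT // 2 + 1,
        sample_rate=SAMPLE_RATE,
        lower_edge_hertz=0.0,
        upper_edge_hertz=SAMPLE_RATE / 2)
    mel = tf.tensordot(power, mel_matrix, 1)  # [batch, frames, NUM_BANDS]

    # log10 with the same 1e-6 offset as the NumPy version.
    log_mel = tf.math.log(mel + 1e-6) / tf.math.log(10.0)

    # Keep the first NUM_HOPS frames and add the extra axis the model expects.
    log_mel = log_mel[:, :NUM_HOPS, :]
    log_mel = log_mel[:, tf.newaxis, :, :]
    return tf.ensure_shape(log_mel, [None, 1, NUM_HOPS, NUM_BANDS])


features = preprocess_audio_tf(tf.random.uniform([2, SAMPLE_RATE], -1.0, 1.0))
print(features.shape)
```

Because the filterbank differs, a model trained on librosa features would probably need retraining on these features. Another option might be to compute the filterbank once in Python with `librosa.filters.mel` and embed it as a constant in place of `mel_matrix`.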

Any feedback or insights on how to do this would be greatly appreciated.

P.S. Just let me know if you need the standalone model or any other code to further understand the problem.