We have reimplemented the Video Swin Transformer model in #Keras, with multi-backend framework support planned for the future. The pretrained weights are available in both SavedModel and H5 formats.
#VideoSwin is a pure transformer-based video modeling architecture that attains top accuracy on the major video recognition benchmarks.
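Since the weights ship in both formats, the SavedModel variant can be loaded with the standard Keras API as well. A minimal sketch, assuming the export directory is named after the checkpoint (the directory name here is an assumption):

import tensorflow as tf

# Hypothetical path to the exported SavedModel directory.
model = tf.keras.models.load_model(
    'TFVideoSwinT_K400_IN1K_P244_W877_32x224'
)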
An inference highlight:
>>> import numpy as np
>>> import tensorflow as tf
>>> from videoswin import VideoSwinT

>>> # Build the model and load the Kinetics-400 pretrained weights.
>>> model = VideoSwinT(num_classes=400)
>>> model.load_weights(
...     'TFVideoSwinT_K400_IN1K_P244_W877_32x224.h5'
... )

>>> # read_video and frame_sampling are sketched after this example.
>>> container = read_video('sample.mp4')
>>> frames = frame_sampling(container, num_frames=32)
>>> y = model(frames)
>>> y.shape
TensorShape([1, 400])

>>> # Softmax over the logits, then sort classes by descending probability.
>>> probabilities = tf.nn.softmax(y).numpy().squeeze(0)
>>> confidences = {
...     label_map_inv[i]: float(probabilities[i])
...     for i in np.argsort(probabilities)[::-1]
... }
>>> confidences  # classification result on a sample from Kinetics-400 (top-5 shown)
{
'playing_cello': 0.9941741824150085,
'playing_violin': 0.0016851733671501279,
'playing_recorder': 0.0011555481469258666,
'playing_clarinet': 0.0009695519111119211,
'playing_harp': 0.0007713600643910468
}
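The transcript above assumes three helpers that are not shown: read_video, frame_sampling, and label_map_inv. A minimal sketch of what they might look like, using OpenCV for decoding and uniform temporal sampling; all names and details below are assumptions, not the repository's actual implementation.

import cv2  # assumed dependency for video decoding
import numpy as np

def read_video(path):
    # Decode every frame into a (T, H, W, 3) RGB array.
    cap = cv2.VideoCapture(path)
    frames = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        frames.append(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    cap.release()
    return np.stack(frames)

def frame_sampling(container, num_frames=32, size=224):
    # Uniformly sample num_frames frames, resize to size x size,
    # scale to [0, 1], and add a batch axis:
    # output shape (1, num_frames, size, size, 3).
    idx = np.linspace(0, len(container) - 1, num_frames).astype(int)
    clip = np.stack(
        [cv2.resize(container[i], (size, size)) for i in idx]
    )
    return clip[None].astype('float32') / 255.0

# label_map_inv maps a class index back to its Kinetics-400 name,
# e.g. built from a list of the 400 class names whose order is
# assumed to match the model's output head:
# label_map_inv = {i: name for i, name in enumerate(class_names)}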