Learning multimodal entailment

Sayak_Paul · August 15, 2021, 1:36pm

Sentence 1: Sourav Ganguly is the greatest captain in BCCI.
Sentence 2: Ricky Ponting is the greatest captain in Cricket Australia.

Do these two sentences contradict/entail each other or are they neutral? In NLP, this problem is known as textual entailment and is a part of the GLUE benchmark for language understanding.

On social media platforms, curating and moderating content well often means combining multiple sources of data to understand what a piece of content actually means. This is where multimodal entailment can be useful. In my latest post, I introduce the basics of this topic and present a set of baseline models for the Multimodal Entailment dataset recently introduced by Google. Some of the recipes covered are “modality dropout”, cross-attention, and class-imbalance mitigation.
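To give a flavor of the class-imbalance part, here is a minimal sketch of one common mitigation: inverse-frequency class weights. This is not necessarily what the post itself does. The helper name `balanced_class_weights`, the label names, and the toy counts are all made up for illustration.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Map each label to n_samples / (n_classes * count_of_label).

    Rare classes get weights above 1, frequent classes get weights below 1.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in counts.items()
    }


# Toy labels, invented for this sketch (not the real dataset distribution).
toy_labels = ["NoEntailment"] * 6 + ["Implies"] * 3 + ["Contradictory"] * 1
weights = balanced_class_weights(toy_labels)

for label, weight in sorted(weights.items()):
    print(f"{label}: {weight:.3f}")
```

Weights like these are typically used to scale each example's contribution to the loss. With Keras, for instance, a dictionary of this shape can be passed as `class_weight` to `model.fit`, provided the keys are integer class indices rather than strings.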

Fun fact: This marks the 100th example on keras.io.

Sayak_Paul · August 16, 2021, 4:04am

Thanking @markdaoust, @lgusm, and @jbgordon for the amazing tutorial on

My post on multimodal entailment uses a fair bit of code from that tutorial (with due citation, of course). I wanted to take the opportunity to thank you, folks, for the tutorial, since it DEFINITELY helps make solving GLUE tasks more accessible and approachable.

lgusm · August 16, 2021, 10:43am

Thanks Sayak!!

I’m very glad that it was helpful!
