Hi there!
I understand what the embedding process represents: turning the integer that identifies a unique word into an n-dimensional vector, and doing this for each word in the vocabulary. However, I am trying to understand how it actually clusters words using gradient descent. For similar words to end up clustered, the distance between two words (in vector space) would have to contribute to the error of the classification model (which would act as a proxy for the “similarity”, or clustering criterion). At this point, I don’t see how that is the case.
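
For concreteness, here is roughly the setup I have in mind, as a minimal Keras sketch; the vocab size, dimensions, layer names and the random data are just placeholders I made up:

```python
import numpy as np
from tensorflow.keras import layers, models

vocab_size, seq_len, embed_dim = 1000, 10, 8  # placeholder sizes

model = models.Sequential([
    layers.Input(shape=(seq_len,), dtype="int32"),
    # The embedding is a (vocab_size, embed_dim) weight matrix used as a
    # lookup table: row i is the vector for word ID i, and every row is an
    # ordinary trainable parameter of the model.
    layers.Embedding(input_dim=vocab_size, output_dim=embed_dim, name="embed"),
    layers.GlobalAveragePooling1D(),
    layers.Dense(16, activation="relu"),
    layers.Dense(1, activation="sigmoid"),  # e.g. binary sentiment
])
model.compile(optimizer="adam", loss="binary_crossentropy")

# Random placeholder data: 256 "sentences" of word IDs with binary labels.
x = np.random.randint(0, vocab_size, size=(256, seq_len))
y = np.random.randint(0, 2, size=(256, 1))

before = model.get_layer("embed").get_weights()[0].copy()
model.fit(x, y, epochs=2, verbose=0)
after = model.get_layer("embed").get_weights()[0]

# The classification loss back-propagates into the embedding matrix, so the
# rows for word IDs that appear in the training data get nudged by SGD.
print("embedding rows updated:", not np.allclose(before, after))
```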
Can someone help me along the way?
Thanks
EDIT: Could you think of it as the network changing the embedding weights so that the resulting inputs to the subsequent dense layers better reflect the similarity between words of the same “class”? Or, in other words, that the network trains the embedding to make its input conform better to the classification task?
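
To make the EDIT concrete: after training the sketch above, is “similarity” between two words really nothing more than a comparison of their learned rows, e.g. by cosine similarity? (The word IDs below are made up; with real data I would expect words that push the classifier the same way, like "great" and "excellent", to score higher than words that push it in opposite directions.)

```python
# Continuing the sketch above: pull out the trained embedding matrix.
emb = model.get_layer("embed").get_weights()[0]   # shape (vocab_size, embed_dim)

def cosine(u, v):
    # Cosine similarity between two embedding rows.
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

print(cosine(emb[42], emb[87]))  # hypothetical word IDs
```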