Is there any way to get embeddings from one-hot vectors? As far as I know, embedding layers only accept integer tokens, but is there a way to input one-hot vectors and get embeddings?
Is there a reason you need to use one-hot vectors? If there’s no explicit constraint, you can just convert the vectors to integer tokens:
import tensorflow as tf
from tensorflow.keras import backend as K

# Random integer tokens: a batch of 32 sequences, 128 tokens each, vocab size 100
integer_tokens = tf.random.uniform((32, 128), minval=0, maxval=100, dtype=tf.int64)

# One-hot encode: shape (32, 128, 100)
one_hot = tf.keras.utils.to_categorical(integer_tokens, num_classes=100)

# Recover the tokens with argmax and feed them to a standard Embedding layer
embedding = tf.keras.layers.Embedding(input_dim=100, output_dim=32)
embedding(K.argmax(one_hot, axis=-1))  # shape (32, 128, 32)
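As a side note, the same round trip can be done entirely with native TF ops instead of Keras utils; here is a minimal sketch (variable names are just illustrative):

tokens = tf.random.uniform((32, 128), minval=0, maxval=100, dtype=tf.int64)
one_hot_native = tf.one_hot(tokens, depth=100)    # (32, 128, 100)
recovered = tf.argmax(one_hot_native, axis=-1)    # back to (32, 128) integer tokens
tf.keras.layers.Embedding(100, 32)(recovered)     # (32, 128, 32)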
If you do have a constraint, here's an example one-hot embedding layer using Layer subclassing (adapted from: Is there anyway to pass one-hot binary vector as input and embed it? · Issue #2505 · keras-team/keras · GitHub):
class OnehotEmbedding(tf.keras.layers.Layer):
    def __init__(self, Nembeddings, **kwargs):
        self.Nembeddings = Nembeddings
        super(OnehotEmbedding, self).__init__(**kwargs)

    def build(self, input_shape):
        # Create a trainable weight variable for this layer.
        # input_shape is (batch, sequence_length, vocab_size), so the
        # kernel maps vocab_size -> Nembeddings.
        self.kernel = self.add_weight(name='kernel',
                                      shape=(input_shape[2], self.Nembeddings),
                                      initializer='uniform',
                                      trainable=True)
        super(OnehotEmbedding, self).build(input_shape)  # Be sure to call this at the end

    def call(self, x):
        # Multiplying a one-hot vector by the kernel selects the matching
        # row, which is exactly an embedding lookup.
        return K.dot(x, self.kernel)

    def compute_output_shape(self, input_shape):
        return (input_shape[0], input_shape[1], self.Nembeddings)
onehot_embedding = OnehotEmbedding(32, input_shape=(128, 100))  # input_shape excludes the batch dim
onehot_embedding(one_hot)  # shape (32, 128, 32)
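To see that this really behaves like an embedding lookup, you can compare the one-hot matmul against indexing the kernel directly (a quick illustrative check, reusing the tensors from above):

dense = onehot_embedding(one_hot)                                # (32, 128, 32)
lookup = tf.gather(onehot_embedding.kernel, K.argmax(one_hot, axis=-1))
print(tf.reduce_max(tf.abs(dense - lookup)).numpy())             # ~0.0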