Word2vec skip-gram with negative sampling implementation
Note that labels is two-dimensional, as required by tf.nn.sampled_softmax_loss, which is used here for negative sampling.
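As a minimal sketch of what that looks like (assuming TensorFlow 1.x and placeholder names inputs and labels, which are not shown in the original answer):
import tensorflow as tf
# Integer-encoded center words; shape (batch_size,)
inputs = tf.placeholder(tf.int32, [None], name='inputs')
# Context-word targets; kept 2-D because tf.nn.sampled_softmax_loss
# expects labels of shape [batch_size, num_true]
labels = tf.placeholder(tf.int32, [None, None], name='labels')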
The embedding matrix has a size of the number of words in the vocabulary by the number of units in the hidden layer. So, if you have 10,000 words and 300 hidden units, the matrix will have size 10,000×300. Remember that the inputs are tokenized, usually as integers, and the number of distinct tokens equals the number of words in our vocabulary.
n_vocab = len(int_to_vocab)   # vocabulary size
n_embedding = 300             # number of hidden units / embedding dimension
# Embedding matrix of shape (n_vocab, n_embedding), initialized uniformly in [-1, 1)
embedding = tf.Variable(tf.random_uniform((n_vocab, n_embedding), -1, 1))
# Look up the embedding vectors for the integer-encoded input words
embed = tf.nn.embedding_lookup(embedding, inputs)
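To tie this back to the negative-sampling loss mentioned above, here is a hedged sketch of the output layer and training op. The names softmax_w, softmax_b, n_sampled, and the choice of Adam are assumptions for illustration, not part of the original answer.
n_sampled = 100  # number of negative samples per example (assumed value)
# Output-layer weights and biases: (n_vocab, n_embedding) and (n_vocab,)
softmax_w = tf.Variable(tf.truncated_normal((n_vocab, n_embedding), stddev=0.1))
softmax_b = tf.Variable(tf.zeros(n_vocab))
# sampled_softmax_loss draws n_sampled negative classes per example instead of
# computing the full softmax over all n_vocab words; labels must be 2-D.
loss = tf.nn.sampled_softmax_loss(
    weights=softmax_w,
    biases=softmax_b,
    labels=labels,
    inputs=embed,
    num_sampled=n_sampled,
    num_classes=n_vocab)
cost = tf.reduce_mean(loss)
optimizer = tf.train.AdamOptimizer().minimize(cost)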