The training process has nothing to do with vector arithmetic, but the arrays it produces turn out to have remarkably nice properties, so one can reasonably think of a "linear space of words".
For example, which words have embeddings closest to a given word in this space?
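As a concrete illustration, here is a minimal sketch of such a nearest-neighbour query with gensim, assuming you have a pretrained word2vec model on disk (the file name is a placeholder):

```python
# Minimal sketch: nearest neighbours in embedding space.
# Assumes a pretrained word2vec model; "vectors.txt" is a placeholder path.
from gensim.models import KeyedVectors

vectors = KeyedVectors.load_word2vec_format("vectors.txt")

# The five words whose embeddings are closest (by cosine similarity) to "frog":
for word, similarity in vectors.most_similar("frog", topn=5):
    print(f"{word}\t{similarity:.3f}")
```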
To put it differently, words with similar meanings form a cloud. Here's a 2-D t-SNE representation:
Another example: the difference between the "woman" and "man" vectors is very close to the difference between the "aunt" and "uncle" vectors:

As a result, you get quite sensible vector arithmetic:
W("woman") ? W("man") ? W("aunt") ? W("uncle")
W("woman") ? W("man") ? W("queen") ? W("king")
So it's not far-fetched to call them vectors. All pictures are from this wonderful post, which I very much recommend reading.