Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - How do I subtract and add vectors with gensim KeyedVectors?

I need to add and subtract word vectors for a project in which I use gensim.models.KeyedVectors (from the word2vec-google-news-300 model).

Unfortunately, I've tried but can't manage to do it correctly.

Let's look at the popular example queen ~= king - man + woman.
To subtract man from king and add woman,
I can do this with gensim:

# model is loaded using gensim.models.KeyedVectors.load()
model.wv.most_similar(positive=["king", "woman"], negative=["man"])[0]

which, as expected, returns ('queen', 0.7118192911148071) for the model I use.

Now, to achieve the same with adding and subtracting vectors (all of them are unit-normed), I've tried the following code:

vec1, vec2, vec3 = model.wv["king"], model.wv["man"], model.wv["woman"]
result = model.similar_by_vector(vec1 - vec2 + vec3)[0]

The result in the code above is ('king', 0.7992597222328186), which is not what I'd expect.

What is my mistake?



1 Answer


You're generally doing the right thing, but note:

  • The most_similar() method excludes from its results any of the words you passed in as inputs, so even if 'king' is (still) the closest word to the computed vector, it is skipped. Your formulation might very well have 'queen' as the next-closest word once the input words are ignored, which is all that the 'analogy' tests need.

  • The most_similar() method also does its vector arithmetic on versions of the vectors that are normalized to unit length, which can give slightly different answers. model.wv['king'] returns the raw, un-normalized vector; if you change it to model.get_vector('king', norm=True), you'll get the unit-normed vector instead.
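
Both points can be combined in a few lines. The sketch below is not gensim's own implementation: analogy_by_arithmetic is a hypothetical helper name, and it assumes gensim 4.x, where KeyedVectors provides get_vector(key, norm=True) and similar_by_vector(vector, topn=...). It sums the unit-normed vectors, asks for a few extra neighbours, and then drops the input words from the results:

```python
import numpy as np


def analogy_by_arithmetic(kv, positive, negative, topn=1):
    """Hypothetical helper: explicit vector arithmetic plus input-word filtering.

    `kv` is assumed to be a gensim 4.x KeyedVectors (anything exposing
    get_vector(key, norm=True) and similar_by_vector(vector, topn=...) works).
    """
    inputs = set(positive) | set(negative)

    # Use the unit-length version of every input vector.
    pos = [kv.get_vector(word, norm=True) for word in positive]
    neg = [kv.get_vector(word, norm=True) for word in negative]
    target = np.sum(pos, axis=0) - np.sum(neg, axis=0)

    # Request extra neighbours so that enough remain once the inputs are dropped.
    candidates = kv.similar_by_vector(target, topn=topn + len(inputs))
    return [(word, score) for word, score in candidates if word not in inputs][:topn]


# Usage sketch (assumes the question's model; the download is large):
# import gensim.downloader as api
# kv = api.load("word2vec-google-news-300")
# print(analogy_by_arithmetic(kv, ["king", "woman"], ["man"]))
```

Because cosine similarity ignores the length of the query vector, summing the inputs instead of averaging them should not change the ranking.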

See also this similar earlier answer: https://stackoverflow.com/a/65065084/130288

