Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
609 views
in Technique[技术] by (71.8m points)

java - How do I calculate the cosine similarity of two vectors?

How do I find the cosine similarity between vectors?

I need to find the similarity to measure the relatedness between two lines of text.

For example, I have two sentences like:

system for user interface

user interface machine

…?and their respective vectors after tF-idf, followed by normalisation using LSI, for example [1,0.5] and [0.5,1].

How do I measure the smiliarity between these vectors?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

If you want to avoid relying on third-party libraries for such a simple task, here is a plain Java implementation:

public static double cosineSimilarity(double[] vectorA, double[] vectorB) {
    double dotProduct = 0.0;
    double normA = 0.0;
    double normB = 0.0;
    for (int i = 0; i < vectorA.length; i++) {
        dotProduct += vectorA[i] * vectorB[i];
        normA += Math.pow(vectorA[i], 2);
        normB += Math.pow(vectorB[i], 2);
    }   
    return dotProduct / (Math.sqrt(normA) * Math.sqrt(normB));
}

Note that the function assumes that the two vectors have the same length. You may want to explictly check it for safety.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

2.1m questions

2.1m answers

60 comments

56.9k users

...