I calculated the tf-idf values for two documents. These are the values for each file:
1.txt
0.0
0.5
2.txt
0.0
0.5
The documents contain:
1.txt => dog cat
2.txt => cat elephant
How can I use these values to calculate cosine similarity?
I know that I should calculate the dot product of the two vectors, then find the magnitude (length) of each vector and divide the dot product by the product of the magnitudes. How can I calculate this using my values?
One more question: does it matter whether both documents have the same number of words?
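
For reference, here is a minimal sketch of the calculation described above: the dot product divided by the product of the two vector magnitudes. It rests on an assumption the post does not state, namely that each file's values are listed in alphabetical term order (cat, dog for 1.txt and cat, elephant for 2.txt). The function name `cosine_similarity` and the dict layout are illustrative only.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two {term: weight} dicts."""
    # Align both documents on the union of their terms;
    # a term missing from a document has weight 0.0.
    terms = set(a) | set(b)
    dot = sum(a.get(t, 0.0) * b.get(t, 0.0) for t in terms)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        # Cosine is undefined for an all-zero vector; returning 0.0
        # here is a convention chosen for this sketch.
        return 0.0
    return dot / (norm_a * norm_b)

# ASSUMPTION: which term each value belongs to is guessed, not given in the post.
doc1 = {"cat": 0.0, "dog": 0.5}       # 1.txt
doc2 = {"cat": 0.0, "elephant": 0.5}  # 2.txt

print(cosine_similarity(doc1, doc2))
```

Under that assumed mapping, the only shared term (cat) has weight 0.0 in both files, so the dot product is zero. Because both vectors are built over the combined vocabulary, they always have the same length as vectors even when the documents differ in word count.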