Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
593 views
in Technique[技术] by (71.8m points)

python - Proximity Matrix in sklearn.ensemble.RandomForestClassifier

I'm trying to perform clustering in Python using Random Forests. In the R implementation of Random Forests, there is a flag you can set to get the proximity matrix. I can't seem to find anything similar in the python scikit version of Random Forest. Does anyone know if there is an equivalent calculation for the python version?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

We don't implement proximity matrix in Scikit-Learn (yet).

However, this could be done by relying on the apply function provided in our implementation of decision trees. That is, for all pairs of samples in your dataset, iterate over the decision trees in the forest (through forest.estimators_) and count the number of times they fall in the same leaf, i.e., the number of times apply give the same node id for both samples in the pair.

Hope this helps.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...