I have a dataset containing 70k rows, and I want to apply jaccard_score to compute the similarity of every row with every other row. The dataset looks like this:
[image of the dataset]
Here is my code:
from sklearn.metrics import jaccard_score

mail_list = []
for index in range(df.shape[0]):
    sub_list = []
    # compare this row with every later row
    for row in range(index + 1, df.shape[0]):
        sub_list.append(round(jaccard_score(df.iloc[index, :], df.iloc[row, :], average='macro'), 1))
    # the last row has no later rows, so sub_list can be empty
    if sub_list:
        mail_list.append(max(sub_list))
This code works, but it takes far too long on 70k rows. How can I modify it to run faster?
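For comparison, below is a minimal sketch of one common way to vectorize this, assuming the columns are binary 0/1 indicators and that plain binary Jaccard similarity (rather than average='macro') is acceptable; the results will therefore differ slightly from the loop above. The names chunk and max_sim are introduced here for illustration, and unlike the original loop this version compares each row against all other rows, not only later ones.

import numpy as np
from sklearn.metrics import pairwise_distances

# Assumption: df holds binary 0/1 indicator columns
X = df.to_numpy(dtype=bool)
n = X.shape[0]

chunk = 500               # rows per block; tune to available memory
max_sim = np.empty(n)

for start in range(0, n, chunk):
    stop = min(start + chunk, n)
    # pairwise_distances returns Jaccard *distance* for boolean data,
    # so similarity = 1 - distance
    sim = 1.0 - pairwise_distances(X[start:stop], X, metric='jaccard', n_jobs=-1)
    # mask each row's self-similarity so it doesn't win the max
    sim[np.arange(stop - start), np.arange(start, stop)] = -np.inf
    max_sim[start:stop] = sim.max(axis=1)

The chunking matters because the full 70,000 x 70,000 similarity matrix would need roughly 39 GB as floats; computing it block by block keeps memory bounded while still replacing the per-pair Python calls with vectorized distance computations.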
question from:
https://stackoverflow.com/questions/65920717/loop-large-data-in-efficient-way