Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


pandas - Remove rows from a couple of dataframes with equal values

I have two data frames that I loaded from two CSV files. For example:

df1
    data1   data2   data3
0   cow     cat     53
1   girl    boy     12  
2   monkey  island  30
3   lucas   arts    14

df2
    data1   data2   data3
0   girl    boy     50
1   cover   disc    45  
2   girl    boy     47
3   pen     pencil  15
4   book    note    30
5   lucas   arts    15


df2 (desired result)
    data1   data2   data3
0   cover   disc    45
1   pen     pencil  15
2   book    note    30

I would like to delete from df2 every row whose data1 and data2 values both match a row in df1, so that I end up with the updated df2 shown above. So far, I have done the following:

    import pandas as pd

    file1="fileA.csv"
    file2="fileB.csv"
    df1=pd.read_csv(file1)
    df2=pd.read_csv(file2)
    cond = df1[['data1','data2']].isin(df2[['data1','data2']])
    df2.drop(df1[cond].index,inplace=True)
    df2.to_csv("fileBUpdtated.csv",index=False)

But I am not getting the results that I need. What am I missing?



1 Answer


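You need DataFrame.merge with how='left' and indicator=True, then keep only the rows whose _merge value is not 'both' (that is, the rows of df2 with no match in df1):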

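cols = ['data1', 'data2']
# Left-join df2 against df1's key columns; indicator=True adds a _merge column.
# The parentheses let the method chain span several lines.
df2 = (df2.merge(df1[cols], on=cols, how='left', indicator=True)
          .loc[lambda x: x._merge.ne('both')]   # drop rows that matched df1
          .drop('_merge', axis=1))
print(df2)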


   data1   data2  data3
1  cover    disc     45
3    pen  pencil     15
4   book    note     30
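
As for what was missing in the original attempt: when DataFrame.isin is given another DataFrame, it aligns on index and column labels, so row i of df1 is compared only with row i of df2 rather than with every row. The sketch below rebuilds the sample data inline (the CSV files are not available here), shows that the resulting mask is all False, and then wraps the merge approach in a helper. The name anti_join, the drop_duplicates call, and the final reset_index are assumptions added for illustration, not part of the answer above.

```python
import pandas as pd

# Sample frames copied from the question.
df1 = pd.DataFrame({
    "data1": ["cow", "girl", "monkey", "lucas"],
    "data2": ["cat", "boy", "island", "arts"],
    "data3": [53, 12, 30, 14],
})
df2 = pd.DataFrame({
    "data1": ["girl", "cover", "girl", "pen", "book", "lucas"],
    "data2": ["boy", "disc", "boy", "pencil", "note", "arts"],
    "data3": [50, 45, 47, 15, 30, 15],
})
cols = ["data1", "data2"]

# isin with a DataFrame argument compares cells at the same index/column label,
# so no cell matches here even though (girl, boy) occurs in both frames.
mask = df1[cols].isin(df2[cols])
print(mask.any().any())


def anti_join(left, right, on):
    """Hypothetical helper: rows of `left` whose `on` values never occur in `right`."""
    # De-duplicate the keys so the left join cannot multiply rows of `left`.
    keys = right[on].drop_duplicates()
    merged = left.merge(keys, on=on, how="left", indicator=True)
    # 'left_only' marks rows of `left` that found no partner in `right`.
    kept = merged.loc[merged["_merge"].eq("left_only")].drop(columns="_merge")
    return kept.reset_index(drop=True)


result = anti_join(df2, df1, cols)
print(result)
```

The reset_index(drop=True) step renumbers the surviving rows from 0, which matches the desired df2 in the question; leave it out to keep the index values 1, 3, 4 as in the output above.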
