Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
252 views
in Technique[技术] by (71.8m points)

python - How can I combine two dataframes based on a column of lists in Pandas

import pandas as pd

Reproducible setup

I have two dataframes:

df=
pd.DataFrame.from_dict({'A':['xy','yx','zy','zz'],
                        'B':[[1, 3],[4, 3, 5],[3],[2, 6]]})

df2=
pd.DataFrame.from_dict({'B':[1,3,4,5,6],
                        'C':['pq','rs','pr','qs','sp']})

df looks like:

    A          B
0  xy     [1, 3]
1  yx  [4, 3, 5]
2  zy        [3]
3  zz     [2, 6]

df2 looks like:

   B   C
0  1  pq
1  3  rs
2  4  pr
3  5  qs
4  6  sp

Aim

I would like to combine these two to form res:

res=
pd.DataFrame.from_dict({'A':['xy','yx','zy','zz'],
                        'C':['pq','pr','rs','sp']})

ie

    A   C
0  xy  pq
1  yx  pr
2  zy  rs
3  zz  sp

The row with xy in df has the lsit [1,3]. There is a row with value 1 in column B in df2. The C column has value pq in that row, so I combine xy with pq. Same for the next two rows. Last row: there is no value with 2 in column B in df2, so I go for the value 6 (the last row in df has the list [2,6]).


Question

How can I achieve this without iterating through the dataframe?


A very similar post in Spanish SO, which inspired this post.

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

You can explode "B" into separate rows, then merge on "B" and drop duplicates.

Big thanks to Asish M. in the comments for pointing out a potential bug with the ordering.

(df.explode('B')
   .merge(df2, on='B', how='left')
   .dropna(subset=['C'])
   .drop_duplicates('A'))

    A  B   C
0  xy  1  pq
2  yx  4  pr
5  zy  3  rs
7  zz  6  sp

Ideally, the following should have worked:

df.explode('B').merge(df2).drop_duplicates('A')

However, pandas (as of writing, version 1.2dev) does not preserve the ordering of the left keys on a merge which is a bug, see GH18776.

In the meantime, we can use the workaround of a left merge as shown above.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...