import pandas as pd
Reproducible setup
I have two dataframes:
df=
pd.DataFrame.from_dict({'A':['xy','yx','zy','zz'],
'B':[[1, 3],[4, 3, 5],[3],[2, 6]]})
df2=
pd.DataFrame.from_dict({'B':[1,3,4,5,6],
'C':['pq','rs','pr','qs','sp']})
df
looks like:
A B
0 xy [1, 3]
1 yx [4, 3, 5]
2 zy [3]
3 zz [2, 6]
df2
looks like:
B C
0 1 pq
1 3 rs
2 4 pr
3 5 qs
4 6 sp
Aim
I would like to combine these two to form res
:
res=
pd.DataFrame.from_dict({'A':['xy','yx','zy','zz'],
'C':['pq','pr','rs','sp']})
ie
A C
0 xy pq
1 yx pr
2 zy rs
3 zz sp
The row with xy
in df
has the lsit [1,3]
. There is a row with value 1
in column B
in df2
. The C
column has value pq
in that row, so I combine xy
with pq
. Same for the next two rows. Last row: there is no value with 2 in column B
in df2
, so I go for the value 6
(the last row in df
has the list [2,6]
).
Question
How can I achieve this without iterating through the dataframe?
A very similar post in Spanish SO, which inspired this post.