Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - pandas new column with difference from series object list

I have a list, let's say:

x_lst = [1,2,3,4,5,6,7,8,9]

and a DataFrame like:

col_A, col_B
3      [4,3,2]
9      [1,2,3,4,5]
1      [5]
2      [3]

with data types:

col_A            int
col_B            object
dtype: object

Now I am trying to create a new column col_C that holds the elements of x_lst that are not in col_B, or False when col_A equals the length of col_B.

Expected result:

col_A, col_B         col_C
3      [4,3,2]       False
9      [1,2,3,4,5]   [6,7,8,9]
1      [5]           False
2      [3]           [1,2,4,5,6,7,8,9]

For this I tried:

df['col_C'] = np.where(df.col_A != df.col_B.str.len(), x_lst - df.col_B, 'FALSE')

but this raised an error:

ValueError: operands could not be broadcast together with shapes (9,) (7964,) 

and if I use tolist():

df['col_C'] = np.where(df.col_A != df.col_B.str.len(), x_lst - df.col_B.tolist(), 'FALSE')

I get:

TypeError: unsupported operand type(s) for -: 'list' and 'list'
question from:https://stackoverflow.com/questions/65952978/pandas-new-column-with-difference-from-series-object-list


1 Answer


Python lists do not support the - operator, so try using a set instead. Replace

x_lst - df.col_B

with

df.col_B.map(set(x_lst).difference).map(list)
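To see what the mapped function does to each cell, here is a minimal sketch in plain Python. The names diff and missing are only illustrative, and sorted() is an addition used here to make the order deterministic, since converting a set to a list does not guarantee any order.

```python
x_lst = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Bound method: diff(other) is the same as set(x_lst).difference(other).
# It accepts any iterable, so a plain list from col_B works as the argument.
diff = set(x_lst).difference

# sorted() is added only for a stable order; list(...) on a set is unordered.
missing = sorted(diff([1, 2, 3, 4, 5]))
print(missing)  # [6, 7, 8, 9]
```

Passing this bound method to Series.map calls it once per row, which is why no explicit loop is needed.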

import numpy as np
df['col_C'] = np.where(df.col_A != df.col_B.str.len(), df.col_B.map(set(x_lst).difference).map(list), 'FALSE')

Resulting df:

  col_A col_B               col_C
0   3   [4, 3, 2]           FALSE
1   9   [1, 2, 3, 4, 5]     [6, 7, 8, 9]
2   1   [5]                 FALSE
3   2   [3]                 [1, 2, 4, 5, 6, 7, 8, 9]
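If the order of x_lst matters, or a real False is preferred over the string 'FALSE', one possible alternative is a list comprehension in place of np.where. This is only a sketch under those assumptions, not the method from the answer; the sample DataFrame is rebuilt from the question and the variable result is hypothetical.

```python
import pandas as pd

x_lst = [1, 2, 3, 4, 5, 6, 7, 8, 9]
df = pd.DataFrame({
    "col_A": [3, 9, 1, 2],
    "col_B": [[4, 3, 2], [1, 2, 3, 4, 5], [5], [3]],
})

# For each row: False when col_A equals len(col_B); otherwise the items of
# x_lst that are absent from col_B, kept in the original order of x_lst.
df["col_C"] = [
    False if a == len(b) else [x for x in x_lst if x not in set(b)]
    for a, b in zip(df["col_A"], df["col_B"])
]

result = df["col_C"].tolist()
print(result)
```

This trades the vectorized np.where call for a Python-level loop, which is usually acceptable because a column of lists is stored as object dtype and is processed element by element either way.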
