Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - Most efficient way to return Column name in a pandas df

I have a pandas df that contains 4 different columns. In every row there's one value that's of importance, and I want to return the name of the column where that value appears. So for the df below I want to return the column name wherever the value 2 appears.

import pandas as pd

# In this example each row holds the value of interest (2) in one column
d = {
    'A': [2, 0, 0, 2],
    'B': [0, 0, 2, 0],
    'C': [0, 2, 0, 0],
    'D': [0, 0, 0, 0],
}

df = pd.DataFrame(data=d)

Output:

   A  B  C  D
0  2  0  0  0
1  0  0  2  0
2  0  2  0  0
3  2  0  0  0

So it would be A,C,B,A

I'm doing this via

m = (df == 2).idxmax(axis=1)[0]

And then changing the row index, one row at a time. But this isn't very efficient.

I'd also like to produce the output as a pandas Series.


1 Answer

Use DataFrame.dot:

df.astype(bool).dot(df.columns).str.cat(sep=',')

Or,

','.join(df.astype(bool).dot(df.columns))

'A,C,B,A'
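
Why this works, as far as I can tell: the dot product multiplies each boolean cell by its column name and sums along the row. A rough plain-Python sketch of that idea for a single row (the names `columns`, `row`, `pieces` and `label` are made up for illustration, and this is not what pandas does internally):

```python
# Plain-Python sketch of the boolean "dot product" for one row.
columns = ['A', 'B', 'C', 'D']
row = [0, 0, 2, 0]  # row 1 of the example frame

# bool(value) * name keeps the name for a non-zero value, otherwise gives ''
pieces = [bool(value) * name for value, name in zip(row, columns)]

# "summing" the pieces is just string concatenation
label = ''.join(pieces)
print(label)
```

Note that astype(bool) flags every non-zero cell, not only 2, so a row with several non-zero values would get its column names concatenated. If only the value 2 should count, df.eq(2) can probably take the place of df.astype(bool).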

Or, as a list:

df.astype(bool).dot(df.columns).tolist()
['A', 'C', 'B', 'A']

...or a Series:

df.astype(bool).dot(df.columns)

0    A
1    C
2    B
3    A
dtype: object
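
For comparison, the idxmax approach from the question should also work on the whole frame at once if the trailing [0] is dropped, so there's no need to step through the rows. A minimal sketch, assuming each row has at most one cell equal to 2 (`mask` and `labels` are names picked for illustration):

```python
import pandas as pd

df = pd.DataFrame({
    'A': [2, 0, 0, 2],
    'B': [0, 0, 2, 0],
    'C': [0, 2, 0, 0],
    'D': [0, 0, 0, 0],
})

mask = df.eq(2)               # True where a cell equals 2
labels = mask.idxmax(axis=1)  # first column holding True, for every row at once

# idxmax falls back to the first column when a row has no match,
# so blank those rows out (not needed for this data, but safer in general)
labels = labels.where(mask.any(axis=1))

print(labels.tolist())
```

The result is already a Series indexed like df, which is what the question asks for.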
