python - Pandas unstack with multiindex columns

asked Oct 7, 2021 in Technique by 深蓝 (71.8m points)

I have a pandas dataframe, which can be created with:

import pandas as pd
df = pd.DataFrame([[1,'a','green'],[2,'b','blue'],[2,'b','green'],[1,'e','green'],[2,'b','blue']], columns=['sales','product','color'], index=['01-01-2020','01-01-2020','01-02-2020','01-03-2020','01-04-2020'])

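and looks like:

            sales product  color
01-01-2020      1       a  green
01-01-2020      2       b   blue
01-02-2020      2       b  green
01-03-2020      1       e  green
01-04-2020      2       b   blue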

I would like to unstack the dataframe on the 'color' column and build a column MultiIndex from the product of [green, blue] and [sales, product], with the existing columns as the second level of the column MultiIndex. The index of the dataframe is a date. The dataframe I want can be created with:

import numpy as np
pd.DataFrame([[1,'a',2,'b'],[2,'b',np.nan,np.nan],[1,'e',np.nan,np.nan],[np.nan,np.nan,2,'b']], columns=pd.MultiIndex.from_product([['green','blue'],['sales','product']]), index=['01-01-2020','01-02-2020','01-03-2020','01-04-2020'])

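and looks like:

           green          blue
           sales product sales product
01-01-2020   1.0       a   2.0       b
01-02-2020   2.0       b   NaN     NaN
01-03-2020   1.0       e   NaN     NaN
01-04-2020   NaN     NaN   2.0       b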

Please note that the resulting dataframe is shorter than the original, because rows that share a date are combined into a single row.

For the life of me, I have been unable to figure out how to pivot/unstack correctly to get this result. I need to apply this to a very large dataframe, so performance is key. Many thanks for any and all help!

Question from: https://stackoverflow.com/questions/65904085/pandas-unstack-with-multiindex-columns

1 Answer

深蓝 · Answer 1 · 2021-10-06T19:15:09+0000

Try this:

df.set_index('color', append=True).unstack().swaplevel(0, 1, axis=1).sort_index(axis=1)

Output:

color         blue         green      
           product sales product sales
01-01-2020       b   2.0       a   1.0
01-02-2020     NaN   NaN       b   2.0
01-03-2020     NaN   NaN       e   1.0
01-04-2020       b   2.0     NaN   NaN

Details:

Add 'color' to your existing index with append=True.
Unstack the innermost index level, 'color', to move it into the columns.
Swap the column MultiIndex levels and sort the columns.
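
The three steps above can be sketched one at a time, so each intermediate result can be inspected. The names step1, step2 and result are only illustrative; the data is the sample frame from the question.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 'a', 'green'], [2, 'b', 'blue'], [2, 'b', 'green'],
     [1, 'e', 'green'], [2, 'b', 'blue']],
    columns=['sales', 'product', 'color'],
    index=['01-01-2020', '01-01-2020', '01-02-2020', '01-03-2020', '01-04-2020'],
)

# Step 1: the index becomes a two-level MultiIndex of (date, color)
step1 = df.set_index('color', append=True)

# Step 2: the innermost index level ('color') moves into the columns,
# so the column levels are now (original column, color)
step2 = step1.unstack()

# Step 3: put color on the outer column level, then sort the columns
result = step2.swaplevel(0, 1, axis=1).sort_index(axis=1)

print(result)
```

Note that unstack needs each (date, color) pair to be unique, which holds for the sample data.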

As @QuangHoang points out, this also works:

df.set_index('color', append=True).stack().unstack([1,2])

which is faster:

4.13 ms ± 274 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)

versus

2.78 ms ± 44.7 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
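
Timings like these depend on the data size and the pandas version, so it is worth measuring on your own frame. Below is a minimal sketch using the standard-library timeit module; the function names unstack_swap and stack_unstack are made up for illustration, and the tiny sample frame stands in for the real data.

```python
import timeit

import pandas as pd

df = pd.DataFrame(
    [[1, 'a', 'green'], [2, 'b', 'blue'], [2, 'b', 'green'],
     [1, 'e', 'green'], [2, 'b', 'blue']],
    columns=['sales', 'product', 'color'],
    index=['01-01-2020', '01-01-2020', '01-02-2020', '01-03-2020', '01-04-2020'],
)


def unstack_swap(frame):
    # First approach: unstack 'color', then swap and sort the column levels
    return (frame.set_index('color', append=True)
                 .unstack()
                 .swaplevel(0, 1, axis=1)
                 .sort_index(axis=1))


def stack_unstack(frame):
    # Second approach: stack the columns into one long Series, then unstack
    # both the color level and the original-column level.
    # Stacking mixed-type columns goes through a single Series, so numeric
    # columns may come back as object dtype.
    return frame.set_index('color', append=True).stack().unstack([1, 2])


n = 100  # number of repetitions per measurement
for func in (unstack_swap, stack_unstack):
    total = timeit.timeit(lambda: func(df), number=n)
    print(f"{func.__name__}: {total / n * 1e3:.3f} ms per call")
```

Replace df with a representative slice of the large dataframe before drawing conclusions, since the relative speed on five rows may not carry over.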
