I have a pandas dataframe, which can be created with:
import pandas as pd

df = pd.DataFrame(
    [[1, 'a', 'green'], [2, 'b', 'blue'], [2, 'b', 'green'], [1, 'e', 'green'], [2, 'b', 'blue']],
    columns=['sales', 'product', 'color'],
    index=['01-01-2020', '01-01-2020', '01-02-2020', '01-03-2020', '01-04-2020'],
)
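and looks like:

                sales product  color
    01-01-2020      1       a  green
    01-01-2020      2       b   blue
    01-02-2020      2       b  green
    01-03-2020      1       e  green
    01-04-2020      2       b   blue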
I would like to unstack the dataframe on the 'color' column so that the columns become a MultiIndex, the product of [green, blue] and [sales, product], with the already existing columns as the second level. The index of the dataframe is a date. The dataframe that I would like to end up with can be created with:
import numpy as np
import pandas as pd

expected = pd.DataFrame(
    [[1, 'a', 2, 'b'], [2, 'b', np.nan, np.nan], [1, 'e', np.nan, np.nan], [np.nan, np.nan, 2, 'b']],
    columns=pd.MultiIndex.from_product([['green', 'blue'], ['sales', 'product']]),
    index=['01-01-2020', '01-02-2020', '01-03-2020', '01-04-2020'],
)
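and looks like:

               green          blue
               sales product sales product
    01-01-2020   1.0       a   2.0       b
    01-02-2020   2.0       b   NaN     NaN
    01-03-2020   1.0       e   NaN     NaN
    01-04-2020   NaN     NaN   2.0       b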
Please note that the resulting dataframe has fewer rows than the original, because rows that share a date are merged into a single row.
For the life of me, I have been unable to figure out how to pivot/unstack correctly to get this result. I need to apply this to a very large dataframe, so performance is key for me. Many thanks for any and all help!
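For reference, here is a minimal sketch of one possible way to get this shape. It assumes each (date, color) pair appears at most once, which holds for the sample data but is only an assumption about the real dataset. The name `wide` is just illustrative. The idea is to append 'color' to the index, unstack it into the columns, swap the two column levels, and then reindex the columns into the order shown above.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame(
    [[1, 'a', 'green'], [2, 'b', 'blue'], [2, 'b', 'green'], [1, 'e', 'green'], [2, 'b', 'blue']],
    columns=['sales', 'product', 'color'],
    index=['01-01-2020', '01-01-2020', '01-02-2020', '01-03-2020', '01-04-2020'],
)

# Desired column layout: color on the first level, original columns on the second.
target_columns = pd.MultiIndex.from_product([['green', 'blue'], ['sales', 'product']])

wide = (
    df.set_index('color', append=True)   # index becomes (date, color)
      .unstack('color')                  # columns become (original column, color)
      .swaplevel(axis=1)                 # columns become (color, original column)
      .reindex(columns=target_columns)   # impose the green/blue, sales/product order
)

print(wide)
```

If the same date and color can occur more than once, `unstack` raises a `ValueError` about duplicate entries, and some aggregation step (for example `pivot_table` with an `aggfunc`) would be needed first. I have not measured how this performs on a very large dataframe.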
question from:
https://stackoverflow.com/questions/65904085/pandas-unstack-with-multiindex-columns