I have a pandas DataFrame. For every row, I want to keep only the top N (N=3) values and set the others to NaN.
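import pandas as pd
import numpy as np

# Mixing strings and numbers makes NumPy store every cell as a string
data = np.array([['', 'day1', 'day2', 'day3', 'day4', 'day5'],
                 ['larry', 1, 4, 4, 3, 5],
                 ['gunnar', 2, -1, 3, 4, 4],
                 ['tin', -2, 5, 5, 6, 7]])
# First row holds the column names, first column holds the row labels
df = pd.DataFrame(data=data[1:, 1:],
                  index=data[1:, 0],
                  columns=data[0, 1:]).astype(int)  # back to integers
print(df)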
The output is:
        day1  day2  day3  day4  day5
larry      1     4     4     3     5
gunnar     2    -1     3     4     4
tin       -2     5     5     6     7
I want to get:
        day1  day2  day3  day4  day5
larry    NaN     4     4   NaN     5
gunnar   NaN   NaN     3     4     4
tin      NaN     5   NaN     6     7
This is similar to "pandas: Keep only top n values and set others to 0", but I need to keep only the N highest available values in each row; otherwise the average is not correct.
In the result above, the row tin contains the value 5 twice, and I want to keep only the first 5.
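One possible way to get this result is sketched below. It is not taken from the linked question; the helper name `keep_top_n` and its parameters `frame` and `n` are made up for illustration. The idea is to sort each row in descending order with a stable sort, so that equal values stay in their original left-to-right order, and then mask everything outside the first N positions. It assumes all columns are numeric.

```python
import numpy as np
import pandas as pd


def keep_top_n(frame: pd.DataFrame, n: int = 3) -> pd.DataFrame:
    """Keep the n largest values in each row and set the rest to NaN.

    Hypothetical helper: ties are broken left to right, so the
    earliest column among equal values is the one that is kept.
    """
    values = frame.to_numpy(dtype=float)
    # A stable sort on the negated values lists the columns of each row
    # from largest to smallest and leaves equal values in column order.
    order = np.argsort(-values, axis=1, kind="stable")
    keep = np.zeros(values.shape, dtype=bool)
    # Mark the first n sorted positions of every row as kept.
    np.put_along_axis(keep, order[:, :n], True, axis=1)
    return pd.DataFrame(np.where(keep, values, np.nan),
                        index=frame.index, columns=frame.columns)


# Same numbers as in the question, built directly as integers.
df = pd.DataFrame(
    [[1, 4, 4, 3, 5], [2, -1, 3, 4, 4], [-2, 5, 5, 6, 7]],
    index=["larry", "gunnar", "tin"],
    columns=["day1", "day2", "day3", "day4", "day5"],
)
result = keep_top_n(df, n=3)
print(result)

# mean() skips NaN by default, so each row average uses only the kept values.
row_means = result.mean(axis=1)
```

The result comes back as floats, because an integer column cannot hold NaN. Since exactly N cells survive per row, the row averages are taken over N values rather than over all five days.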
Question from: https://stackoverflow.com/questions/66048632/how-to-keep-the-only-the-top-n-values-in-a-dataframe