Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
253 views
in Technique[技术] by (71.8m points)

python - Fill in missing pandas data with previous non-missing value, grouped by key

I am dealing with pandas DataFrames like this:

   id    x
0   1   10
1   1   20
2   2  100
3   2  200
4   1  NaN
5   2  NaN
6   1  300
7   1  NaN

I would like to replace each NAN 'x' with the previous non-NAN 'x' from a row with the same 'id' value:

   id    x
0   1   10
1   1   20
2   2  100
3   2  200
4   1   20
5   2  200
6   1  300
7   1  300

Is there some slick way to do this without manually looping over rows?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

You could perform a groupby/forward-fill operation on each group:

import numpy as np
import pandas as pd

df = pd.DataFrame({'id': [1,1,2,2,1,2,1,1], 'x':[10,20,100,200,np.nan,np.nan,300,np.nan]})
df['x'] = df.groupby(['id'])['x'].ffill()
print(df)

yields

   id      x
0   1   10.0
1   1   20.0
2   2  100.0
3   2  200.0
4   1   20.0
5   2  200.0
6   1  300.0
7   1  300.0

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

2.1m questions

2.1m answers

60 comments

56.9k users

...