python 3.x - How to calculate aggregated mean based on time window in Pandas?

Question


asked Oct 7, 2021 in Technique[技术] by 深蓝 (71.8m points)

I want to calculate, for each member, the mean of every feature over a time window of 3 years. For example, given the following data frame:

member_id	Feature 1	Feature 2	Feature 3	Date
1	0.1	0.5	0.2	1/2/20
1	0.2	0.3	0.3	1/2/18
1	0.3	0.2	0.2	1/2/16
1	0.1	0.2	0.1	1/4/17
2	0.4	0.1	0.4	1/2/18
2	0.5	0.1	0.2	1/2/15

question from:https://stackoverflow.com/questions/65920966/how-to-calculate-aggregated-mean-based-on-time-window-in-pandas

1 Answer

深蓝 · Answer 1 · 2021-10-06T19:08:06+0000

Proceed as follows (the Date column is assumed to be of datetime type, as in the result shown at the end):

Define a function that computes the feature means for a particular row, using the rows of the current group that fall within the required period:
```
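import pandas as pd

def getRowMeans(row, grp):
    # The window ends at the current row's date.
    dTo = row.Date
    # Rows dated within the 3 years up to and including dTo.
    in_window = grp.Date.between(dTo - pd.DateOffset(years=3), dTo)
    return grp.loc[in_window, 'Feature 1':'Feature 3'].mean()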
```
The idea is:
- the current object is a row from a group of rows (for one member_id),
- the function also needs the whole group, so the group is passed as a second parameter (grp),
- from grp, take the rows dated within the 3 years up to and including the Date of the current row,
- from these rows, take all 3 Feature columns and return their means.
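
To make the window concrete, here is a minimal sketch for a single row. It assumes the dates in the question are month/day/two-digit-year (inferred from the result table in this answer, the question itself does not state the format). The inline frame holding only member 1 and the names `in_window`, `window_rows` and `window_means` are set up here purely for illustration.

```python
import pandas as pd

# Rows of member_id 1 from the question's sample data.
grp = pd.DataFrame({
    'member_id': [1, 1, 1, 1],
    'Feature 1': [0.1, 0.2, 0.3, 0.1],
    'Feature 2': [0.5, 0.3, 0.2, 0.2],
    'Feature 3': [0.2, 0.3, 0.2, 0.1],
    'Date': ['1/2/20', '1/2/18', '1/2/16', '1/4/17'],
})
# Assumption: dates are month/day/two-digit-year.
grp['Date'] = pd.to_datetime(grp['Date'], format='%m/%d/%y')

# Window for the first row: 3 years back from its date, both ends included.
dTo = grp.loc[0, 'Date']
in_window = grp.Date.between(dTo - pd.DateOffset(years=3), dTo)

# Rows inside the window and the mean of each Feature column over them.
window_rows = grp[in_window]
window_means = window_rows.loc[:, 'Feature 1':'Feature 3'].mean()
print(window_rows.index.tolist())
print(window_means)
```

Rows 0, 1 and 3 fall inside the window (2017-01-04 is later than 2017-01-02), so the means for the first row are taken over three values.
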
Define a function to be called for each group of rows (for one member_id). It returns a copy of the group with all Feature columns replaced by their means (computed by getRowMeans):
```
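def FeaturesToMeans(grp):
    # One row of windowed feature means for each row of the group.
    means = grp.apply(getRowMeans, axis=1, grp=grp)
    # Work on a copy so the original group is left untouched.
    rv = grp.copy()
    # update() overwrites the matching columns in place and returns None.
    rv.update(means)
    return rv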
```
The first step is to compute feature means.

In order not to alter the original group, the object to be finally returned (rv) must be created as a copy of the original group.

Then it is updated with the just computed means. Note however that update operates in place and does not return any result.

The returned object is the updated group.
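
The in-place behaviour of update can be checked with a small sketch. The two toy frames and their column names (`a`, `b`) are hypothetical and unrelated to the question's data.

```python
import pandas as pd

original = pd.DataFrame({'a': [1.0, 2.0], 'b': [10.0, 20.0]})
# Hypothetical replacement values for column 'a' only.
means = pd.DataFrame({'a': [1.5, 1.5]})

rv = original.copy()
returned = rv.update(means)  # modifies rv in place

print(returned)   # update() has no return value
print(rv)         # column 'a' replaced, column 'b' unchanged
print(original)   # untouched, because rv is a copy
```

Assigning the return value of update back to rv would therefore lose the data, which is why the function returns rv itself.
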

Generate the actual result, as a new DataFrame:

result = df.groupby('member_id', group_keys=False).apply(FeaturesToMeans)

The result, for your sample data, is:

   member_id  Feature 1  Feature 2  Feature 3       Date
0          1   0.133333   0.333333       0.20 2020-01-02
1          1   0.200000   0.233333       0.20 2018-01-02
2          1   0.300000   0.200000       0.20 2016-01-02
3          1   0.200000   0.200000       0.15 2017-01-04
4          2   0.450000   0.100000       0.30 2018-01-02
5          2   0.500000   0.100000       0.20 2015-01-02
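
To reproduce the table above end to end, the following self-contained sketch rebuilds the sample data and runs both functions. Two things here are assumptions rather than part of the original answer: the dates are parsed as month/day/two-digit-year, and the groups are processed with a plain loop plus pd.concat instead of groupby(...).apply(...), a substitution made because the way apply treats the grouping column has changed across pandas releases.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    'member_id': [1, 1, 1, 1, 2, 2],
    'Feature 1': [0.1, 0.2, 0.3, 0.1, 0.4, 0.5],
    'Feature 2': [0.5, 0.3, 0.2, 0.2, 0.1, 0.1],
    'Feature 3': [0.2, 0.3, 0.2, 0.1, 0.4, 0.2],
    'Date': ['1/2/20', '1/2/18', '1/2/16', '1/4/17', '1/2/18', '1/2/15'],
})
# Assumption: dates are month/day/two-digit-year.
df['Date'] = pd.to_datetime(df['Date'], format='%m/%d/%y')

def getRowMeans(row, grp):
    # Feature means over the 3 years up to and including the row's date.
    dTo = row.Date
    in_window = grp.Date.between(dTo - pd.DateOffset(years=3), dTo)
    return grp.loc[in_window, 'Feature 1':'Feature 3'].mean()

def FeaturesToMeans(grp):
    # Copy of the group with Feature columns replaced by windowed means.
    means = grp.apply(getRowMeans, axis=1, grp=grp)
    rv = grp.copy()
    rv.update(means)
    return rv

# Process each member's rows separately, then stitch the pieces together.
pieces = [FeaturesToMeans(grp) for _, grp in df.groupby('member_id')]
result = pd.concat(pieces)
print(result)
```

Because every row is compared against its whole group, the cost grows quadratically with group size, which is fine for small groups like these.
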
