>>> import pandas
>>> y = pandas.Series([0,0,1,1,1,0,0,1,0,1,1])
The following may seem a little magical, but it actually uses some common idioms: since pandas doesn't yet have nice native support for a contiguous groupby, you often find yourself needing something like this.
>>> y * (y.groupby((y != y.shift()).cumsum()).cumcount() + 1)
0 0
1 0
2 1
3 2
4 3
5 0
6 0
7 1
8 0
9 1
10 2
dtype: int64
Some explanation: first, we compare y against a shifted version of itself to find where the contiguous groups begin:
>>> y != y.shift()
0 True
1 False
2 True
3 False
4 False
5 True
6 False
7 True
8 True
9 True
10 False
dtype: bool
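The first entry is always True because shift() leaves a missing value (NaN) in position 0, and NaN compares unequal to everything. A minimal sketch of that behaviour on a shorter, made-up series (the names `s`, `shifted` and `starts` are only for illustration):

```python
import pandas

s = pandas.Series([0, 0, 1, 1, 0])

# shift() moves every value down one position and puts NaN in position 0,
# which also upcasts the integers to float.
shifted = s.shift()

# NaN != anything is True, so position 0 is always marked as a group start.
starts = s != shifted
```

Here `starts` is True at positions 0, 2 and 4, which are exactly the places where a new run of equal values begins.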
Then (since False == 0 and True == 1) we can apply a cumulative sum to get a number for the groups:
>>> (y != y.shift()).cumsum()
0 1
1 1
2 2
3 2
4 2
5 3
6 3
7 4
8 5
9 6
10 6
dtype: int32
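To see the cumulative sum in isolation, here is a small sketch on a hand-written boolean series (`flags` and `group_ids` are illustrative names, not part of the answer above). The exact integer dtype of the result can vary with platform and library version, so only the values matter here.

```python
import pandas

# True marks the first row of each contiguous group.
flags = pandas.Series([True, False, True, False, False])

# Each True adds 1 to the running total, so every row in the same
# contiguous group ends up with the same integer label.
group_ids = flags.cumsum()
```

The labels come out as 1, 1, 2, 2, 2: one distinct number per group, which is all groupby needs.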
We can use groupby and cumcount to get an integer counting up within each group:
>>> y.groupby((y != y.shift()).cumsum()).cumcount()
0 0
1 1
2 0
3 1
4 2
5 0
6 1
7 0
8 0
9 0
10 1
dtype: int64
Add one:
>>> y.groupby((y != y.shift()).cumsum()).cumcount() + 1
0 1
1 2
2 1
3 2
4 3
5 1
6 2
7 1
8 1
9 1
10 2
dtype: int64
And finally zero the values where we had zero to begin with:
>>> y * (y.groupby((y != y.shift()).cumsum()).cumcount() + 1)
0 0
1 0
2 1
3 2
4 3
5 0
6 0
7 1
8 0
9 1
10 2
dtype: int64
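If you need this in more than one place, the steps above can be wrapped in a small helper. This is only a sketch: the function name `consecutive_ones` is made up here, and it assumes the input is a Series containing only 0s and 1s.

```python
import pandas


def consecutive_ones(y):
    """Running count of consecutive 1s in a 0/1 Series, reset at each 0."""
    # A new group id starts every time the value differs from the previous row.
    group_ids = (y != y.shift()).cumsum()
    # 1-based position of each row within its contiguous group.
    position = y.groupby(group_ids).cumcount() + 1
    # Multiplying by y zeroes out the rows that were 0 to begin with.
    return y * position


y = pandas.Series([0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1])
result = consecutive_ones(y)
```

Naming the intermediate steps keeps the one-liner readable without changing what it computes.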