Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - How does .apply work on a Pandas DataFrame.groupby?

              Count
League  Result         
EPL     H      16
        D      9
        A      10
Champ   H      67
        D      15
        A      57
        H      87
La Liga D      35
        A      40
        

I have a breakdown of football results for different leagues and a count of how many times that result occurred.

I want to see the proportion of home wins, draws, away wins as a percentage of the total games played. I have seen a solution to this below:

df.groupby("League").apply(lambda g: g / g.sum() * 100)

At first glance, this made sense, but what exactly is g here? I assumed it was the H, D or A count and then the g.sum() summed all of the H,D,A counts grouped by each division. But, if g is just a value, how are we calling the method g.sum()? What exactly is g here?


1 Answer


g is a DataFrame, not a single value. Grouping on 'League' splits the DataFrame into separate chunks, one per unique value of 'League'. To illustrate this, we can iterate over the GroupBy object.

for idx, g in df.groupby('League'):  # `idx` is the unique group key
    print(g, '\n')

               Count
League Result       
Champ  H          67
       D          15
       A          57
       H          87

               Count
League Result       
EPL    H          16
       D           9
       A          10

                Count
League  Result       
La Liga D          35
        A          40
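To try this yourself, here is a minimal sketch that rebuilds the table from the question and records what each g is. The construction via pd.MultiIndex.from_tuples and the name group_shapes are my own assumptions for illustration; the values and row order are copied from the question as posted.

```python
import pandas as pd

# Rebuild the question's table (values and row order copied from the post).
index = pd.MultiIndex.from_tuples(
    [
        ("EPL", "H"), ("EPL", "D"), ("EPL", "A"),
        ("Champ", "H"), ("Champ", "D"), ("Champ", "A"), ("Champ", "H"),
        ("La Liga", "D"), ("La Liga", "A"),
    ],
    names=["League", "Result"],
)
df = pd.DataFrame({"Count": [16, 9, 10, 67, 15, 57, 87, 35, 40]}, index=index)

# Record the type and shape of every chunk the GroupBy object yields.
group_shapes = {}
for key, g in df.groupby("League"):
    group_shapes[key] = (type(g).__name__, g.shape)

print(group_shapes)
```

Each entry should report a DataFrame with one column and as many rows as the league has in the table.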

apply then calls your function on each of these DataFrames separately. Inside the lambda, g.sum() is a Series holding the sum of each column within the group, so g / g.sum() * 100 divides every Count by its group's total. The per-group sums look like this:

for idx, g in df.groupby('League'):
    print(g.sum(), '\n')

Count    226
dtype: int64 

Count    35
dtype: int64 

Count    75
dtype: int64 
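Putting the pieces together, the sketch below spells out by hand roughly what apply does with the lambda: run it on each chunk, then glue the results back together. It is an illustration under my own assumptions, not pandas' internal implementation, and the names pieces, manual, totals and pct are hypothetical. It also shows transform("sum") as an alternative to apply for the same percentages.

```python
import pandas as pd

# Same table as in the question (values and row order copied from the post).
index = pd.MultiIndex.from_tuples(
    [
        ("EPL", "H"), ("EPL", "D"), ("EPL", "A"),
        ("Champ", "H"), ("Champ", "D"), ("Champ", "A"), ("Champ", "H"),
        ("La Liga", "D"), ("La Liga", "A"),
    ],
    names=["League", "Result"],
)
df = pd.DataFrame({"Count": [16, 9, 10, 67, 15, 57, 87, 35, 40]}, index=index)

# Manual split-apply-combine: g is a DataFrame and g.sum() is a Series
# indexed by column name, so the division lines up on the 'Count' column.
pieces = []
for key, g in df.groupby("League"):
    pieces.append(g / g.sum() * 100)
manual = pd.concat(pieces)  # rows come back in group order, not original order

# Alternative without apply: broadcast each group's total back onto its rows.
totals = df.groupby("League")["Count"].transform("sum")
pct = df["Count"] / totals * 100  # keeps the original row order

print(manual)
print(pct)
```

Within each league the percentages should add up to 100. The transform version keeps the rows in their original order, which can be convenient if you want to attach the result as a new column.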
