Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Pandas - column is not found as key to groupby

This is driving me crazy.

I have a dataframe, and printing df.columns gives:

None Index(['id', 'gamweek', ... 'FF', 'price'],
      dtype='object')

And df.info() prints:

Data columns (total 26 columns):
 #   Column      Non-Null Count  Dtype  
---  ------      --------------  -----  
 0   id          9000 non-null   int64  
 1   gameweek    9000 non-null   int64  
 2   FF          9000 non-null   float64
 ...
 25  price       9000 non-null   float64

When I try to group by 'id' on the line of code right after these prints, like so:

df[columns].groupby('id').mean(axis=1, skipna=True)

I am getting the error:

KeyError: 'id'

How is this possible? What am I missing?

Question from: https://stackoverflow.com/questions/66052056/pandas-column-is-not-found-as-key-to-groupby


1 Answer


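If columns is a list used to select a subset of df.columns, and 'id' is not in that list (which seems to be the case here), then df[columns] has no 'id' column, and groupby('id') raises the KeyError. The column still exists in df; it is only missing from the selected subset.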

import pandas as pd

# sample dataframe
data = {'id': [1, 1, 1, 2, 2, 2, 3, 3, 3, 4], 'b': [18, 18, 22, 22, 22, 15, 23, 19, 18, 17], 'c': [36, 32, 36, 38, 39, 36, 31, 36, 35, 37], 'd': [52, 51, 50, 54, 48, 53, 52, 54, 54, 45], 'e': [61, 64, 60, 69, 66, 65, 66, 69, 65, 63]}
df = pd.DataFrame(data)

columns = ['b', 'c', 'd']

# groupby columns without id
df[columns].groupby('id').mean()

# results in
KeyError: 'id'
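
A quick way to confirm this diagnosis is to check whether the grouping key survived the column selection before calling groupby. The following is a minimal sketch; the small dataframe and the names subset and missing are illustrative assumptions, not taken from the question.

```python
import pandas as pd

# Illustrative data: 'id' exists in df but is left out of the selection list
df = pd.DataFrame({'id': [1, 1, 2], 'b': [18, 18, 22], 'c': [36, 32, 38]})
columns = ['b', 'c']

# Selecting df[columns] returns a new dataframe with only those columns
subset = df[columns]

# List every grouping key that is absent from the selected subset
group_keys = ['id']
missing = [key for key in group_keys if key not in subset.columns]

print('id' in df.columns)      # the key is present in the full dataframe
print(missing)                 # keys that groupby would fail to find
```

If missing is non-empty, groupby on the subset cannot find those keys, even though df.columns and df.info() still list them for the full dataframe.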

Alternatives that will work:

# 1
df[['b', 'c', 'd']].groupby(df['id']).mean()

# 2
df[columns + ['id']].groupby('id').mean()
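
A third option, beyond the two alternatives above, is to group the full dataframe first and select the value columns afterwards, so the key never has to be part of the selection. This is a sketch using the same sample data; the names result and expected are illustrative.

```python
import pandas as pd

# Same sample dataframe as in the answer
data = {'id': [1, 1, 1, 2, 2, 2, 3, 3, 3, 4],
        'b': [18, 18, 22, 22, 22, 15, 23, 19, 18, 17],
        'c': [36, 32, 36, 38, 39, 36, 31, 36, 35, 37],
        'd': [52, 51, 50, 54, 48, 53, 52, 54, 54, 45],
        'e': [61, 64, 60, 69, 66, 65, 66, 69, 65, 63]}
df = pd.DataFrame(data)

columns = ['b', 'c', 'd']

# 3: group on the full dataframe, then pick the columns to aggregate
result = df.groupby('id')[columns].mean()

# For comparison, alternative 2 from the answer gives the same table
expected = df[columns + ['id']].groupby('id').mean()

print(result)
```

Because the grouping happens on df itself, 'id' is always available as a key, and columns only controls which columns are averaged.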
