Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share

Python pandas: histogram to display the binning ranges of qcut

I used qcut to bin the data into ranges, and now I want to display those output ranges in a pandas histogram. How do I do that? PS: the data is read from a CSV file (linked in the original question).

I wrote the following code:

import matplotlib.pyplot as plt
import pandas as pd
from sklearn.metrics import r2_score

dataset = pd.read_csv("datasets.csv")
print(dataset)


qc = pd.qcut(dataset['Active'], q=8, precision=0)
qc_val = qc.value_counts().sort_index()
print(qc_val)

The binning ranges output is:

(-1.0, 63.0]          5
(63.0, 212.0]         5
(212.0, 827.0]        4
(827.0, 1465.0]       8
(1465.0, 1959.0]      2
(1959.0, 4545.0]      4
(4545.0, 8594.0]      5
(8594.0, 221447.0]    5
Name: Active, dtype: int64

So, is there any way to display a histogram from the above binning ranges?

Question from: https://stackoverflow.com/questions/65642235/python-pandas-histogram-to-display-binning-ranges-of-qcut


1 Answer

You can pass the bins parameter directly to the hist method of a Series. Note that bins=8 produces eight equal-width bins, not the quantile ranges that qcut computes:

import pandas as pd

url = 'https://drive.google.com/file/d/1lYZqeYH_AtUAUG5947Bd51JXJBrOP5Lp/view?usp=sharing'
path = 'https://drive.google.com/uc?export=download&id='+url.split('/')[-2]
df = pd.read_csv(path)
df['Active'].hist(bins=8)

[Figure: possible histogram]
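To make the bars follow the qcut ranges themselves, one option is to ask qcut for its bin edges with retbins=True and pass those edges to the histogram. The following is only a minimal sketch: the synthetic `active` series is an assumption standing in for the 'Active' column of the CSV, and the Agg backend line is there only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line to view the plot interactively
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic, right-skewed stand-in for df['Active'] (an assumption, not the real data)
rng = np.random.default_rng(0)
active = pd.Series(rng.lognormal(mean=6, sigma=2, size=40), name="Active")

# retbins=True also returns the quantile edges that qcut computed
binned, edges = pd.qcut(active, q=8, retbins=True)

fig, ax = plt.subplots()
# Passing the edges as `bins` makes each bar span one qcut interval
counts, used_edges, _ = ax.hist(active, bins=edges, edgecolor="black")
ax.set_xscale("log")  # the edges span several orders of magnitude
ax.set_xlabel("Active")
ax.set_ylabel("Count")
# Call plt.show() or fig.savefig(...) to view the result
```

Because quantile bins hold roughly the same number of rows, the bars come out at similar heights and differ mainly in width. The log scale is a readability choice for skewed data and can be dropped.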

Alternatively, you can give the qcut bins labels and plot one histogram per label, like this:

levels = [f'Level_{i}' for i in range(8)]
df['Active_bins'] = pd.qcut(df['Active'], q=8, precision=0, labels=levels)
df.head()

[Figure: data with labels]

# from https://stackoverflow.com/a/58288640/7752347
import matplotlib.pyplot as plt

fig,ax = plt.subplots()

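# Fill patterns; the backslash must be escaped, and every entry uses valid matplotlib hatch characters
hatches = ('\\', '//', '..', '**', '||', '--', '++', 'xx')

# Draw one histogram per qcut label on the same axes
for (label, group), hatch in zip(df.groupby('Active_bins', observed=True), hatches):
    group['Active'].hist(alpha=0.7, ax=ax, label=label, hatch=hatch)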

ax.legend()

[Figure: patterned histogram]
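If the goal is to show the qcut ranges themselves on the axis, another option is a bar chart of the value counts, with each bar labelled by its interval. This is a minimal sketch rather than part of the original answer: the synthetic `active` series is an assumption standing in for the 'Active' column, and the Agg backend line only keeps the sketch runnable without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line to view the plot interactively
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for df['Active'] (an assumption, not the real data)
rng = np.random.default_rng(0)
active = pd.Series(rng.lognormal(mean=6, sigma=2, size=40), name="Active")

# Same binning as in the question
qc = pd.qcut(active, q=8, precision=0)
qc_val = qc.value_counts().sort_index()

# Turn each Interval into its string form, e.g. "(a, b]", to use as a tick label
labels = [str(interval) for interval in qc_val.index]

fig, ax = plt.subplots()
ax.bar(labels, qc_val.to_numpy())
ax.set_xlabel("Active (qcut range)")
ax.set_ylabel("Count")
ax.tick_params(axis="x", rotation=45)
fig.tight_layout()
# Call plt.show() or fig.savefig(...) to view the result
```

This keeps every bar the same width, so narrow and wide ranges are equally readable. The trade-off is that the x-axis is categorical and no longer to scale.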

