Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - Pandas: What determines index order when creating a frame from a dict?

Why is the index of the data frame below not sorted when the frame is created from a nested dict of dicts? I expected the row containing the year 2000 data to come first, followed by the rows for 2001 and 2002. I realize that I can run frame.sort_index() to get the desired result, but I'm wondering why it doesn't happen automatically.

In [1]: import pandas as pd

In [2]: pop = {'Nevada': {2001: 2.4, 2002: 2.9},
   ...:        'Ohio': {2000: 1.5, 2001: 1.7, 2002: 3.6}}

In [3]: frame = pd.DataFrame(pop)

In [4]: frame
Out[4]:
      Nevada  Ohio
2001     2.4   1.7
2002     2.9   3.6
2000     NaN   1.5

The above was produced with Python 3.8.3 and IPython 7.18.1. The example comes from chapter 5 of Python for Data Analysis by Wes McKinney (the index is sorted in the book).

question from:https://stackoverflow.com/questions/65929412/pandas-what-determines-index-order-when-creating-a-frame-from-a-dict


1 Answer


I think a good way to understand what is going on is to try it with the order of the states flipped:

pop = {'Ohio': {2000: 1.5, 2001: 1.7, 2002: 3.6}, 'Nevada': {2001: 2.4, 2002: 2.9}}

Now you get:

      Ohio  Nevada
2000   1.5     NaN
2001   1.7     2.4
2002   3.6     2.9

So what happened in the original? pandas goes through Nevada first, which only has the index labels 2001 and 2002. Then it goes through Ohio, which has one new label (2000) that is appended to the bottom, and two existing labels (2001 and 2002), whose values are placed in the rows that already exist.
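To make that walk-through concrete, here is a minimal pure-Python sketch of the "first seen, first placed" ordering described above. The helper name union_in_first_seen_order is made up for illustration; this only models the behaviour and is not how pandas is implemented internally.

```python
def union_in_first_seen_order(nested):
    """Collect the inner keys in the order they are first encountered.

    Hypothetical helper: it only mimics the ordering described above,
    it is not pandas code.
    """
    seen = {}  # dicts keep insertion order (Python 3.7+)
    for inner in nested.values():
        for key in inner:
            seen.setdefault(key, None)  # a key already seen keeps its slot
    return list(seen)


pop = {'Nevada': {2001: 2.4, 2002: 2.9},
       'Ohio': {2000: 1.5, 2001: 1.7, 2002: 3.6}}
flipped = {'Ohio': pop['Ohio'], 'Nevada': pop['Nevada']}

print(union_in_first_seen_order(pop))      # [2001, 2002, 2000]
print(union_in_first_seen_order(flipped))  # [2000, 2001, 2002]
```

With Nevada first, 2000 is only discovered while reading Ohio, so it lands at the end. With Ohio first, all three years are seen in ascending order straight away.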

As for why it shows up sorted in the book, it is probably a pandas version difference. Modern pandas (post v0.25; see the docs) maintains the key order as specified. The book was probably written for an older version of pandas, which happened to (randomly) process Ohio first.
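If you want the book's row order regardless of the pandas version or the key order of the dict, sorting explicitly is probably the safest route. A short sketch using the frame.sort_index() call already mentioned in the question (the variable name sorted_frame is just for illustration):

```python
import pandas as pd

pop = {'Nevada': {2001: 2.4, 2002: 2.9},
       'Ohio': {2000: 1.5, 2001: 1.7, 2002: 3.6}}

frame = pd.DataFrame(pop)

# sort_index() returns a new frame with rows ordered by index label;
# the original frame is left unchanged.
sorted_frame = frame.sort_index()

print(sorted_frame)
```

Since sort_index() returns a new object, assign the result (as above) if you want to keep working with the sorted frame.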

