Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
1.0k views
in Technique[技术] by (71.8m points)

pandas - Merge rows with same index in a DataFrame

I have a DataFrame with species and their occurence. An excerpt follows:

+-------------+------------+-------------------------------------------+--------------+-----------------+-------------+-----------+------------+----------+
| Family      | Genus      | Species                                   | Species code | Nearctic Mexico | Mesoamerica | Caribbean | Los Llanos | Amazonia |
+-------------+------------+-------------------------------------------+--------------+-----------------+-------------+-----------+------------+----------+
| Acanthaceae | Aphelandra | Aphelandra castaneifolia Britton ex Rusby | Aphecast     |                 |             |           |            |          |
+-------------+------------+-------------------------------------------+--------------+-----------------+-------------+-----------+------------+----------+
| Acanthaceae | Aphelandra | Aphelandra pilosa Leonard                 | Aphpilo      |                 | +           |           | +          | +        |
+-------------+------------+-------------------------------------------+--------------+-----------------+-------------+-----------+------------+----------+
| Acanthaceae | Aphelandra | Aphelandra pulcherrima (Jacq.) Kunth      | Aphepulc     |                 | +           | +         | +          | +        |
+-------------+------------+-------------------------------------------+--------------+-----------------+-------------+-----------+------------+----------+
| Acanthaceae | Aphelandra | Aphelandra sinclairiana Nees              | Aphesinc     |                 | +           |           |            |          |
+-------------+------------+-------------------------------------------+--------------+-----------------+-------------+-----------+------------+----------+

I need to analyze genera only, but preserve their occurences as follows. I'm sorry if that's a real rookie question, but I'm having trouble putting it in words what I intend to do. This is the expected result:

+-------------+------------+-----------------+-------------+-----------+------------+----------+
| Family      | Genus      | Nearctic Mexico | Mesoamerica | Caribbean | Los Llanos | Amazonia |
+-------------+------------+-----------------+-------------+-----------+------------+----------+
| Acanthaceae | Aphelandra |                 | +           | +         | +          | +        |
+-------------+------------+-----------------+-------------+-----------+------------+----------+

I have found similar problems, but dealing with numeric data and I couldn't manage to adapt the solution.

question from:https://stackoverflow.com/questions/65932394/merge-rows-with-same-index-in-a-dataframe

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

Use GroupBy.first for first non misisng values for 5th+ columns:

#if necessary replace empty space to missing values
df = df.replace('',np.nan)

cols = df.columns[4:].tolist()
df = df.groupby(['Family','Genus'], as_index=False)[cols].first()
print (df)
        Family       Genus  Nearctic Mexico Mesoamerica Caribbean Los Llanos  
0  Acanthaceae  Aphelandra              NaN           +         +          +   

  Amazonia  
0        +  

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...