r - Merge rows in a dataframe where the rows are disjoint and contain NAs

Question

Welcome To Ask or Share your Answers For Others

r - Merge rows in a dataframe where the rows are disjoint and contain NAs

asked Oct 17, 2021 in Technique[技术] by 深蓝 (71.8m points)

r - Merge rows in a dataframe where the rows are disjoint and contain NAs

I have a dataframe that has two rows:

| code | name  | v1 | v2 | v3 | v4 |
|------|-------|----|----|----|----|
| 345  | Yemen | NA | 2  | 3  | NA |
| 346  | Yemen | 4  | NA | NA | 5  |

Is there an easy way to merge these two rows? What if I rename "345" in "346", would that make things easier?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙；凝视深渊过久,深渊将回以凝视…

1 Answer

深蓝 · Answer 1 · 2021-10-17T00:08:26+0000

You can use aggregate. Assuming that you want to merge rows with identical values in column name:

aggregate(x=DF[c("v1","v2","v3","v4")], by=list(name=DF$name), min, na.rm = TRUE)
   name v1 v2 v3 v4
1 Yemen  4  2  3  5

This is like the SQL SELECT name, min(v1) GROUP BY name. The min function is arbitrary, you could also use max or mean, all of them return the non-NA value from an NA and a non-NA value if na.rm = TRUE. (An SQL-like coalesce() function would sound better if existed in R.)

However, you should check first if all non-NA values for a given name is identical. For example, run the aggregate both with min and max and compare, or run it with range.

Finally, if you have many more variables than just v1-4, you could use DF[,!(names(DF) %in% c("code","name"))] to define the columns.

Categories

r - Merge rows in a dataframe where the rows are disjoint and contain NAs

r - Merge rows in a dataframe where the rows are disjoint and contain NAs

Please log in or register to add a comment.

Please log in or register to answer this question.

1 Answer

Please log in or register to add a comment.

Just Browsing Browsing

Most popular tags