group by - R Standard deviation across columns and rows by id

Question

asked Oct 7, 2021 in Technique[技术] by 深蓝 (71.8m points)

I have several data frames that look similar to the following one (with many more columns):

id col1 col2 col3 col4 col5
1   4    3    5    4    A
1   3    5    4    9    Z
1   5    8    3    4    H
2   6    9    2    1    B
2   4    9    5    4    K
3   2    1    7    5    J
3   5    8    4    3    B
3   6    4    3    9    C

I want to calculate the standard deviation across specific columns (let's say col2 to col4), grouped by id. I do not know the column indices in every data frame. I only know the names of the columns I want to calculate the standard deviation for.

Is there a way I could do that easily? My original data frames contain around 20 columns and I only want the standard deviation for 10 columns with specific column names grouped by the id.

On top of that, it would be nice if I could add the calculated standard deviations directly to my data frame as a new column, matched by id, like this:

id col1 col2 col3 col4 col5 SD
1   4    3    5    4    A   SD1
1   3    5    4    9    Z   SD1
1   5    8    3    4    H   SD1
2   6    9    2    1    B   SD2
2   4    9    5    4    K   SD2
3   2    1    7    5    J   SD3
3   5    8    4    3    B   SD3
3   6    4    3    9    C   SD3

question from:https://stackoverflow.com/questions/65885374/r-standard-deviation-across-columns-and-rows-by-id

1 Answer

深蓝 · Answer 1 · 2021-10-06T19:20:34+0000

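You can try the following, which pools all values of col2 to col4 within each id and takes their standard deviation: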

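library(dplyr)

# For each id, flatten every value in col2..col4 into one vector and take its SD.
# Note: in dplyr >= 1.1.0, cur_data() is deprecated in favor of pick().
df %>%
  group_by(id) %>%
  mutate(SD = sd(unlist(select(cur_data(), col2:col4))))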

#    id  col1  col2  col3  col4 col5     SD
#  <int> <int> <int> <int> <int> <chr> <dbl>
#1     1     4     3     5     4 A      2.12
#2     1     3     5     4     9 Z      2.12
#3     1     5     8     3     4 H      2.12
#4     2     6     9     2     1 B      3.41
#5     2     4     9     5     4 K      3.41
#6     3     2     1     7     5 J      2.62
#7     3     5     8     4     3 B      2.62
#8     3     6     4     3     9 C      2.62
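
Since the question says only the column names are known, not their positions, here is a minimal base R sketch of the same calculation driven by a character vector of names. The vector name `sd_cols` and the intermediate `sd_by_id` are hypothetical names chosen for illustration, and the data frame is rebuilt from the example in the question:

```r
# Example data from the question
df <- data.frame(
  id   = c(1, 1, 1, 2, 2, 3, 3, 3),
  col1 = c(4, 3, 5, 6, 4, 2, 5, 6),
  col2 = c(3, 5, 8, 9, 9, 1, 8, 4),
  col3 = c(5, 4, 3, 2, 5, 7, 4, 3),
  col4 = c(4, 9, 4, 1, 4, 5, 3, 9),
  col5 = c("A", "Z", "H", "B", "K", "J", "B", "C")
)

# Hypothetical vector holding the names of the columns of interest
sd_cols <- c("col2", "col3", "col4")

# Split the selected columns by id, flatten each block to one vector, take its SD
sd_by_id <- sapply(split(df[sd_cols], df$id), function(block) sd(unlist(block)))

# Look up each row's group SD by id and attach it as a new column
df$SD <- unname(sd_by_id[as.character(df$id)])

print(df)
```

With dplyr, the same name vector should also work in the answer's call by replacing `col2:col4` with `all_of(sd_cols)` inside `select()`.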
