Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

r - Compute the mean of two columns in a dataframe

I have a dataframe storing different values. Sample:

a$open  a$high  a$low   a$close

1.08648 1.08707 1.08476 1.08551
1.08552 1.08623 1.08426 1.08542
1.08542 1.08572 1.08453 1.08465
1.08468 1.08566 1.08402 1.08554
1.08552 1.08565 1.08436 1.08464
1.08463 1.08543 1.08452 1.08475
1.08475 1.08504 1.08427 1.08436
1.08433 1.08438 1.08275 1.08285
1.08275 1.08353 1.08275 1.08325
1.08325 1.08431 1.08315 1.08378
1.08379 1.08383 1.08275 1.08294
1.08292 1.08338 1.08271 1.08325

I want to create a new column, a$mean, storing the mean of a$high and a$low for each row.

Here is how I achieved that:

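highlowmean <- function(highs, lows){
  # Preallocate the result vector, one slot per row
  m <- numeric(length(highs))
  for (i in seq_along(highs)){
    # Wrap the pair in c(): mean(x, y) would treat y as the 'trim' argument
    m[i] <- mean(c(highs[i], lows[i]))
  }
  return(m)
}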

a$mean <- highlowmean(a$high, a$low)

However, I'm fairly new to R and to functional languages in general, so I'm pretty sure there is a more efficient or simpler way to do this.

What is the smartest way to achieve this?

1 Answer

We can use rowMeans

 a$mean <- rowMeans(a[,c('high', 'low')], na.rm=TRUE)
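
As a minimal sketch, assuming the first three rows of the question's sample are loaded into a data frame named a (the subset is chosen here only for brevity, and the name manual is just illustrative), the call above can be checked against the element-wise formula:

```r
# First three rows of the question's sample (subset chosen for brevity)
a <- data.frame(
  open  = c(1.08648, 1.08552, 1.08542),
  high  = c(1.08707, 1.08623, 1.08572),
  low   = c(1.08476, 1.08426, 1.08453),
  close = c(1.08551, 1.08542, 1.08465)
)

# Row-wise mean of the two columns, ignoring NA values
a$mean <- rowMeans(a[, c("high", "low")], na.rm = TRUE)

# Element-wise equivalent when neither column contains NA
manual <- (a$high + a$low) / 2

# Compare the two results within floating-point tolerance
all.equal(unname(a$mean), manual)
```

Both forms are vectorized, so no explicit loop over the rows is needed.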

NOTE: If there are NA values, it is better to use rowMeans with na.rm=TRUE, since it averages only the non-missing values in each row.

For example

 a <- data.frame(High= c(NA, 3, 2), low= c(3, NA, 0))
 rowMeans(a, na.rm=TRUE)    
 #[1] 3 3 1

and using +

 a1 <- replace(a, is.na(a), 0)
 (a1[1] + a1[2])/2
#  High
#1  1.5
#2  1.5
#3  1.0
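
To make the difference concrete, here is a small sketch on the same toy data frame that puts three behaviours side by side: + on the raw columns, + after replacing NA with 0, and rowMeans with na.rm=TRUE. The variable names plain, zeroed and skipped are only illustrative.

```r
a <- data.frame(High = c(NA, 3, 2), low = c(3, NA, 0))

# 1. Plain arithmetic: any NA in a row makes that row's result NA
plain <- (a$High + a$low) / 2

# 2. Replace NA with 0 first: the missing value is counted as a real 0
a1 <- replace(a, is.na(a), 0)
zeroed <- (a1$High + a1$low) / 2

# 3. rowMeans with na.rm = TRUE: averages only the non-missing values
skipped <- unname(rowMeans(a, na.rm = TRUE))

# Show the three results next to each other
print(data.frame(plain, zeroed, skipped))
```

Which of the three is appropriate depends on whether a missing value should invalidate the row, count as zero, or simply be skipped.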

NOTE: This is in no way an attempt to tarnish the other answer. The + approach works in most cases and is fast as well.

