r - apply with ifelse statement and is.na does not 'sum' but outputs matrix - where is my logical mistake?

Question

asked Oct 7, 2021 in Technique by 深蓝 (71.8m points)

Probably a stupid question, but I clearly can't see the problem and would appreciate your help.

Here is a fictional dataset:

dat <- data.frame(ID = c(101, 202, 303, 404),
                  var1 = c(1, NA, 0, 1),
                  var2 = c(NA, NA, 0, 1))

Now I need to create a variable that sums the values per subject. The following works, but it returns 0 instead of NA when both var1 and var2 are NA:

try1 <- apply(dat[,c(2:3)], MARGIN=1, function(x) {sum(x==1, na.rm=TRUE)})

I would like the script to write NA if both var1 and var2 are NA, but if one of the two variables has an actual value, I'd like the script to treat the NA as 0. I have tried this:

check1 <- apply(dat[,2:3], MARGIN=1, function(x) 
{ifelse(x== is.na(dat$var1) & is.na(dat$var2), NA, {sum(x==1, na.rm=TRUE)})})

This, however, produces a 4x4 matrix (int[1:4,1:4]). The real dataset has hundreds of observations, so the output quickly becomes a mess. Does anybody see where I went wrong?

Thank you!

question from:https://stackoverflow.com/questions/65939225/apply-with-ifelse-statement-and-is-na-does-not-sum-but-outputs-matrix-where


1 Answer

深蓝 · Answer 1 · 2021-10-06T18:57:24+0000

Here's a working version:

apply(dat[,2:3], MARGIN=1, function(x) 
  {
    if(all(is.na(x))) {
      NA
    } else {
      sum(x==1, na.rm=TRUE)
    }
  }
)
#[1]  1 NA  0  2
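
For a dataset with many rows, a vectorized variant may be worth considering. The following is only a minimal sketch and not part of the original answer. It assumes the same dat as in the question, and the names vars, all_na, and total are made up for illustration.

```r
dat <- data.frame(ID = c(101, 202, 303, 404),
                  var1 = c(1, NA, 0, 1),
                  var2 = c(NA, NA, 0, 1))

# Columns to be summed (hypothetical helper name)
vars <- dat[, c("var1", "var2")]

# TRUE for rows where every selected variable is NA
all_na <- rowSums(!is.na(vars)) == 0

# Count the 1s per row, treating NA as 0
total <- rowSums(vars == 1, na.rm = TRUE)

# Restore NA for rows that had no observed value at all
total[all_na] <- NA
total
```

On the example data this gives the same values as the apply() version (1, NA, 0, 2), returned as doubles rather than integers.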

Issues with yours:

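Inside your function(x), x holds the var1 and var2 values for one particular row. You don't want to go back and reference dat$var1 and dat$var2, which are whole columns. Just use x.
x== is.na(dat$var1) & is.na(dat$var2) is strange. Because == binds more tightly than &, it is evaluated as (x == is.na(dat$var1)) & is.na(dat$var2), so it checks whether x is the same as is.na(dat$var1), which is not what you meant. The length-2 x is also recycled against the length-4 columns, so the test has four elements, ifelse returns four values for every row, and apply binds those into the 4x4 matrix.
For a given row, we want to check whether all the values are NA. ifelse is vectorized and returns a vector as long as its test, but we don't want a vector; we want a single TRUE or FALSE indicating whether all values are NA. So we use all(is.na(x)) with if() instead of ifelse().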
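
The length mismatch described above can be checked directly. This is a small sketch using the first row of the question's dat. The names x, test_original, n_test, n_ifelse, and row_is_all_na are illustrative only.

```r
dat <- data.frame(ID = c(101, 202, 303, 404),
                  var1 = c(1, NA, 0, 1),
                  var2 = c(NA, NA, 0, 1))

# x as apply() would pass it for the first row: a length-2 vector
x <- unlist(dat[1, 2:3])

# The original test mixes the length-2 row with the length-4 columns,
# so x is recycled and the test has four elements instead of one
test_original <- x == is.na(dat$var1) & is.na(dat$var2)
n_test <- length(test_original)

# ifelse() returns one value per element of its test
n_ifelse <- length(ifelse(test_original, NA, sum(x == 1, na.rm = TRUE)))

# all(is.na(x)) collapses the row to a single TRUE or FALSE
row_is_all_na <- all(is.na(x))

c(n_test = n_test, n_ifelse = n_ifelse)
row_is_all_na
```

Both lengths come out as 4, which is why apply() ends up with four values per row, while all(is.na(x)) yields exactly one logical value that if() can use.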
