Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


0 votes
376 views
in Technique by (71.8m points)

Removing empty files in R: Error in if (!file.size(x) == 0) { : missing value where TRUE/FALSE needed



1 Answer

0 votes
by (71.8m points)

Here are two fixes for your code.

  1. You are looking in a subdirectory for files, but the default action (as bad as it is) is to return just the file names, not the full path required to actually access those files. For instance,

    list.files(path = "dirname")
    # [1] "file1"   "file2"
    list.files(path = "dirname", full.names = TRUE)
    # [1] "dirname/file1"   "dirname/file2"
    

    So however you are calling list.files, just add full.names = TRUE. Without it, the bare file names are resolved against the working directory, so none of them will point at a file that exists. This fixes a problem you may not yet know you have, and it is the real cause of the error you see.

  2. Your test on file.size is flawed: calling file.size on a file that is not found returns NA, and an if statement cannot act on NA. When I have a problem with code, my troubleshooting technique is to actually try each of the sub-components to see what is broken. In your case, since the error occurs in the if statement, I would run each piece of its condition separately.

    I'm going to guess that if you did that, it would look something like:

    (!file.size(x) == 0)
    # [1] NA
    file.size(x) == 0
    # [1] NA
    file.size(x)
    # [1] NA
    

    From there, and with the function file.exists in hand, you might then try

    x
    # [1] "somefile.txt"
    read.table(x)
    # Warning in file(file, "rt") :
    #   cannot open file 'somefile.txt': No such file or directory
    # Error in file(file, "rt") : cannot open the connection
    file.exists(x)
    # [1] FALSE
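
Both points can be reproduced in a throwaway directory. This is only a minimal sketch, not your actual data: the directory name and the file names empty.bed, full.bed and does_not_exist.bed are made up for the illustration.

```r
# Hypothetical setup: a temporary directory holding one empty and one non-empty file.
demo_dir <- file.path(tempdir(), "ctcf_demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, "empty.bed"))
writeLines("chr1\t100\t200", file.path(demo_dir, "full.bed"))

# Bare names: only usable if the working directory happens to be demo_dir.
bare <- list.files(demo_dir)
# Full paths: usable from anywhere.
full <- list.files(demo_dir, full.names = TRUE)

file.exists(full)   # TRUE for both files
file.size(full)     # 0 for empty.bed, a positive size for full.bed

# A path that was never created: file.size() quietly returns NA.
missing_path <- file.path(demo_dir, "does_not_exist.bed")
file.size(missing_path)        # NA
!file.size(missing_path) == 0  # NA, which if () cannot act on
```

The last line is the condition from your if statement; once it evaluates to NA, the error you saw follows.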
    

Taking those notes and adjusting your code, I suggest:

CTCF_intersect_files <- list.files(paste(intersect_bed_path, "CTCF/", sep = ""),
                                   full.names = TRUE)
CTCF_intersection <- lapply(
  CTCF_intersect_files, 
  function(x) {
    if (isTRUE(file.size(x) > 0)) {
      read.table(x, header=FALSE)
    }
  })

We don't need to actually call file.exists(x) in this case, since isTRUE(file.size(x) > 0) will correctly handle the times when x does not exist.
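
Why isTRUE() is enough can be checked step by step. A small sketch, where missing_file is a made-up path that is assumed not to exist:

```r
# Hypothetical path inside the session's temp directory; nothing creates it.
missing_file <- file.path(tempdir(), "no_such_file_for_demo.txt")

sz  <- file.size(missing_file)  # NA: the file does not exist
cmp <- sz > 0                   # NA: a comparison with NA stays NA
ok  <- isTRUE(cmp)              # FALSE: isTRUE() is TRUE only for a single, non-NA TRUE

# The original condition, by contrast, hands NA straight to if () and fails.
msg <- tryCatch(
  if (!file.size(missing_file) == 0) "non-empty",
  error = function(e) conditionMessage(e)
)
msg  # the error message raised by if ()
```

So the isTRUE() version simply skips a missing file, just as it skips an empty one, instead of stopping with an error.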

However, I would find it annoying to go through this and get no indication that some files were missing. This quick check will assure you that every listed file actually exists:

stopifnot(all(file.exists(CTCF_intersect_files)))
CTCF_intersection <- lapply( ... )
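
If stopping outright is too blunt, a softer variant is to report which files are missing or empty and read only the rest. This is just a sketch under assumptions: report_unreadable is a hypothetical helper (not part of base R or of your code), and the demo files are made up.

```r
# Hypothetical helper: warn about missing paths, mention empty ones,
# and return only the paths that are safe to hand to read.table().
report_unreadable <- function(paths) {
  sizes   <- file.size(paths)        # NA for files that do not exist
  missing <- is.na(sizes)
  empty   <- !missing & sizes == 0
  if (any(missing)) warning("missing: ", paste(paths[missing], collapse = ", "))
  if (any(empty))   message("empty, skipped: ", paste(paths[empty], collapse = ", "))
  paths[!missing & !empty]
}

# Made-up demo files: one with content, one empty, one never created.
d <- file.path(tempdir(), "report_demo")
dir.create(d, showWarnings = FALSE)
good  <- file.path(d, "good.txt");  writeLines("a 1", good)
blank <- file.path(d, "blank.txt"); file.create(blank)
gone  <- file.path(d, "gone.txt")

keep   <- report_unreadable(c(good, blank, gone))
tables <- lapply(keep, read.table, header = FALSE)
```

With your data, the same helper would be called on CTCF_intersect_files before the lapply; unlike the isTRUE() version, the result then has no NULL entries for skipped files.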
