Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


r - Saving a data frame as a binary file

I would like to save a whole bunch of relatively large data frames while minimizing the space that the files take up. When opening the files, I need to be able to control what names they are given in the workspace.

Basically I'm looking for the semantics of dput and dget, but with binary files.

Example:

n<-10000

for(i in 1:100){
    dat<-data.frame(a=rep(c("Item 1","Item 2"),n/2),b=rnorm(n),
        c=rnorm(n),d=rnorm(n),e=rnorm(n))
    dput(dat,paste("data",i,sep=""))
}


##much later


##extract 3 random data sets and bind them
for(i in 1:10){
    nums<-sample(1:100,3)
    comb<-rbind(dget(paste("data",nums[1],sep="")),
            dget(paste("data",nums[2],sep="")),
            dget(paste("data",nums[3],sep="")))
    ##do stuff here
}


1 Answer


Your best bet is to use rda files. You can use the save() and load() commands to write and read:

set.seed(101)
a = data.frame(x1=runif(10), x2=runif(10), x3=runif(10))

save(a, file="test.rda")
load("test.rda")

Edit: For completeness, just to cover what Harlan's suggestion might look like (i.e. wrapping the load command to return the data frame):

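# Load the file into a private environment and return the object called
# `name`, so the caller decides what to call it in the workspace.
loadx <- function(name, file) {
  env <- new.env()
  load(file, envir = env)
  get(name, envir = env)
}

b <- loadx("a", "test.rda")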
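For the dput/dget-style workflow in the question, base R's saveRDS() and readRDS() may be a closer fit than a wrapper around load(): they write a single object to a compressed binary file, and readRDS() returns the object, so the caller picks the name. A minimal sketch, where the temporary directory, the file names, and the smaller sizes are assumptions made purely for illustration:

```r
# Sketch: a binary analogue of dput()/dget() using saveRDS()/readRDS().
# The temp directory, file names, and sizes are illustrative assumptions.
out_dir <- tempfile("frames")
dir.create(out_dir)

n <- 1000
for (i in 1:5) {
  dat <- data.frame(a = rep(c("Item 1", "Item 2"), n / 2),
                    b = rnorm(n), c = rnorm(n))
  saveRDS(dat, file.path(out_dir, paste0("data", i, ".rds")))
}

# Much later: read three random data sets back and bind them
nums <- sample(1:5, 3)
parts <- lapply(nums, function(k) {
  readRDS(file.path(out_dir, paste0("data", k, ".rds")))
})
comb <- do.call(rbind, parts)
```

Because readRDS() hands back the value rather than restoring a name into the workspace, no loading wrapper is needed.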

Alternatively, have a look at the hdf5, RNetCDF and ncdf packages. I've experimented with the hdf5 package in the past; this uses the NCSA HDF5 library. It's very simple:

hdf5save(fileout, ...)
hdf5load(file, load = TRUE, verbosity = 0, tidy = FALSE)

A last option is to use binary file connections, but that won't work well in your case because readBin and writeBin only support vectors.

Here's a trivial example. First, write some data by opening the connection in write mode ("w") with "b" appended for binary:

zz <- file("testbin", "wb")
writeBin(1:10, zz)
close(zz)

Then read the data back by opening the connection in read mode ("r"), again with "b" appended for binary:

zz <- file("testbin", "rb")
readBin(zz, integer(), 4)
close(zz)
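To illustrate why vector-only connections are awkward for data frames, here is a rough sketch that round-trips the numeric columns of a small data frame by writing each column as its own vector. The file layout (a row count followed by the columns) and the variable names are assumptions invented for this example, not a standard format:

```r
# Sketch: round-trip the numeric columns of a data frame through a
# binary connection. The layout (row count, then one double vector per
# column) is an assumption made for this example.
df <- data.frame(x1 = c(0.5, 1.5, 2.5), x2 = c(10, 20, 30))
path <- tempfile(fileext = ".bin")

con <- file(path, "wb")
writeBin(nrow(df), con)            # integer header: number of rows
for (col in names(df)) {
  writeBin(df[[col]], con)         # each column is a plain double vector
}
close(con)

con <- file(path, "rb")
n_rows <- readBin(con, integer(), 1)
cols <- lapply(names(df), function(col) readBin(con, double(), n_rows))
close(con)

# Column names and types are not stored in the file; they are reused
# from `df` here, which a real reader would have to know in advance.
restored <- as.data.frame(setNames(cols, names(df)))
```

The reader has to supply the column names and types itself, which is the bookkeeping that save() and load() handle for you.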
