Parsing the file is the first step; ordering from there is straightforward with order or dplyr::arrange.
txt <- readLines("quux.txt")
txt
# [1] "Question 3" "1 6" "2 9" "Question 1" "1 2" "3 5" "Question 2" "2 5" "1 2"
lst_of_frames <- lapply(
split(txt, cumsum(grepl("^Question", txt))),
function(z) {
out <- read.table(header = FALSE, text = z[-1])
cbind(question = z[1], out)
})
lst_of_frames
# $`1`
# question V1 V2
# 1 Question 3 1 6
# 2 Question 3 2 9
# $`2`
# question V1 V2
# 1 Question 1 1 2
# 2 Question 1 3 5
# $`3`
# question V1 V2
# 1 Question 2 2 5
# 2 Question 2 1 2
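The grouping trick inside split() may be easier to follow one step at a time. This is a small sketch using the same lines as above, supplied inline so it runs without quux.txt; the names is_header, grp and chunks are only illustrative.

```r
# Same content as the example file, typed in directly
txt <- c("Question 3", "1 6", "2 9",
         "Question 1", "1 2", "3 5",
         "Question 2", "2 5", "1 2")

is_header <- grepl("^Question", txt)  # TRUE on each "Question N" line
grp <- cumsum(is_header)              # running count of headers seen so far
grp
# [1] 1 1 1 2 2 2 3 3 3

# Lines sharing a group id belong to the same question block;
# each chunk starts with its header, followed by its data lines
chunks <- split(txt, grp)
chunks[["1"]]
```

Each element of chunks is what the anonymous function receives as z, which is why z[1] is the question label and z[-1] is the data handed to read.table.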
We now have a list of multiple frames. If you want them combined, then multiple options exist:
results <- do.call(rbind, lst_of_frames)
results
# question V1 V2
# 1.1 Question 3 1 6
# 1.2 Question 3 2 9
# 2.1 Question 1 1 2
# 2.2 Question 1 3 5
# 3.1 Question 2 2 5
# 3.2 Question 2 1 2
dplyr::bind_rows(lst_of_frames) # similar results
data.table::rbindlist(lst_of_frames) # similar results
I'll use the result of the first option (do.call(rbind, ...)) and then order it with
results[order(results$question, results$V1),]
# question V1 V2
# 2.1 Question 1 1 2
# 2.2 Question 1 3 5
# 3.2 Question 2 1 2
# 3.1 Question 2 2 5
# 1.1 Question 3 1 6
# 1.2 Question 3 2 9
dplyr::arrange(results, question, V1) # similar results
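One caveat: question is a character column, so it sorts as text, and with ten or more questions "Question 10" would typically land before "Question 2". A possible workaround, sketched here on made-up data (the rows and the qnum column are assumptions, not part of the original file), is to pull the number out and order on that instead.

```r
# Hypothetical data with a two-digit question number
results <- data.frame(
  question = c("Question 10", "Question 10", "Question 2", "Question 2"),
  V1 = c(2L, 1L, 3L, 1L),
  V2 = c(6L, 9L, 2L, 5L)
)

# qnum is an illustrative helper column: the integer after "Question"
results$qnum <- as.integer(sub("^Question\\s+", "", results$question))

# Order numerically by question, then by V1 within each question
sorted <- results[order(results$qnum, results$V1), ]
sorted
```

With the three single-digit questions in the example this makes no difference, so it only matters if the real file is longer.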
Note: this is sensitive to the number of columns within each question. If the number of columns differs between questions ...
Question 3
1 6
2 9
Question 1
1 2 10
3 5 11
Question 2
2 5
1 2
Then you have some options.
Keep it wide. The simple base R do.call(rbind, ...) no longer works, because the frames no longer have the same columns:
do.call(rbind, lst_of_frames)
# Error in rbind(deparse.level, ...) :
# numbers of columns of arguments do not match
But the others work fine (rbindlist needs fill = TRUE):
dplyr::bind_rows(lst_of_frames)
# question V1 V2 V3
# 1 Question 3 1 6 NA
# 2 Question 3 2 9 NA
# 3 Question 1 1 2 10
# 4 Question 1 3 5 11
# 5 Question 2 2 5 NA
# 6 Question 2 1 2 NA
data.table::rbindlist(lst_of_frames, fill = TRUE) # similar results
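If you would rather stay in base R, one possible approach is to pad every frame with the columns it lacks before calling rbind. This is only a sketch: the two frames are rebuilt inline so it runs on its own, and all_cols, padded and wide are illustrative names.

```r
# Two frames with different column counts, mirroring the example above
lst_of_frames <- list(
  data.frame(question = "Question 3", V1 = c(1L, 2L), V2 = c(6L, 9L)),
  data.frame(question = "Question 1", V1 = c(1L, 3L), V2 = c(2L, 5L),
             V3 = c(10L, 11L))
)

# Every column name that appears in any frame
all_cols <- unique(unlist(lapply(lst_of_frames, names)))

padded <- lapply(lst_of_frames, function(z) {
  # Add each missing column, filled with NA
  for (nm in setdiff(all_cols, names(z))) z[[nm]] <- NA
  z[all_cols]  # same column order in every frame
})

wide <- do.call(rbind, padded)
wide
```

This does by hand what bind_rows and rbindlist(fill = TRUE) already do, so it is mostly useful when adding a dependency is not an option.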
Pivot to long. (This is a "wide-vs-long" data discussion.)
dplyr::bind_rows(lapply(lst_of_frames, function(z) tidyr::pivot_longer(z, -question)))
# # A tibble: 14 x 3
# question name value
# <chr> <chr> <int>
# 1 Question 3 V1 1
# 2 Question 3 V2 6
# 3 Question 3 V1 2
# 4 Question 3 V2 9
# 5 Question 1 V1 1
# 6 Question 1 V2 2
# 7 Question 1 V3 10
# 8 Question 1 V1 3
# 9 Question 1 V2 5
# 10 Question 1 V3 11
# 11 Question 2 V1 2
# 12 Question 2 V2 5
# 13 Question 2 V1 1
# 14 Question 2 V2 2
# similar results
library(data.table)
rbindlist(lapply(lst_of_frames, function(z) melt(as.data.table(z), id.vars = "question")))
The long format has several advantages elsewhere (e.g., ggplot2, tidy data management, easy summarization, etc.).
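As a small illustration of the summarization point, here is a base R sketch that averages value per question and column name. The long frame below is typed in by hand from the pivoted output shown earlier, and long and means are illustrative names.

```r
# The 14 rows of the long result, rebuilt inline
long <- data.frame(
  question = rep(c("Question 3", "Question 1", "Question 2"), times = c(4, 6, 4)),
  name = c("V1", "V2", "V1", "V2",
           "V1", "V2", "V3", "V1", "V2", "V3",
           "V1", "V2", "V1", "V2"),
  value = c(1L, 6L, 2L, 9L,
            1L, 2L, 10L, 3L, 5L, 11L,
            2L, 5L, 1L, 2L)
)

# One mean per (question, name) pair; the call does not care that
# only "Question 1" has a V3
means <- aggregate(value ~ question + name, data = long, FUN = mean)
means
```

The same call works unchanged however many columns each question has, which is much of the appeal of the long layout.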