You could do this with extract from tidyr, using one capture group per character:
library(tidyr)
extract(df, sequence, into=paste0('V', 1:5), '(.)(.)(.)(.)(.)')
# category V1 V2 V3 V4 V5
#1 X A A T . G
#2 Y C C G - T
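The calls in this answer assume a data frame named df that is not shown. A minimal sketch of an input consistent with the printed output follows; the construction itself is an assumption reconstructed from that output, not the asker's actual data.

```r
# Hypothetical input, reconstructed from the output above:
# two rows, each with a 5-character sequence.
df <- data.frame(
  category = c("X", "Y"),
  sequence = c("AAT.G", "CCG-T"),
  stringsAsFactors = FALSE
)

# Each '(.)' in the extract() regex captures exactly one character,
# so the number of groups must match the sequence length.
nchar(df$sequence)
```

If the sequences are longer or shorter than five characters, both the regex and the into vector need to change to match.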
Or insert a delimiter between the characters with gsub and pass that delimiter as sep to separate:
library(dplyr)
library(tidyr)
df %>%
  # insert a comma at every position that has a character on both sides
  mutate(sequence=gsub('(?<=.)(?=.)', ',', sequence, perl=TRUE)) %>%
  separate(sequence, into=paste0('V', 1:5), sep=",")
# category V1 V2 V3 V4 V5
#1 X A A T . G
#2 Y C C G - T
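To see what the gsub step does on its own, here is a small sketch on a single string (the variable names are illustrative only). The pattern matches the empty position that has a character behind it and a character ahead of it, so a comma goes between characters but not at either end.

```r
# Illustrative single-string check of the lookaround pattern.
x <- "AAT.G"
delimited <- gsub("(?<=.)(?=.)", ",", x, perl = TRUE)
delimited

# Splitting on the inserted comma recovers the individual characters.
pieces <- strsplit(delimited, ",", fixed = TRUE)[[1]]
pieces
```

Note that this relies on the sequences not already containing a comma; if they might, pick a delimiter that cannot occur in the data.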
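Or you can use cSplit from splitstackshape, which returns a data.table:
library(splitstackshape)
library(data.table)  # for setnames; recent splitstackshape versions may not attach data.table
setnames(cSplit(df, 'sequence', '', stripWhite=FALSE),
         2:6, paste0('V', 1:5))[]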
# category V1 V2 V3 V4 V5
#1: X A A T . G
#2: Y C C G - T