I'm trying to split a string in R (using strsplit) at specific points (each dash, -), but not when the dash falls inside a substring in square brackets ([ ]).
Example:
xx <- c("Radio Stations-Listened to Past Week-Toronto [FM-CFXJ-93.5 (93.5 The Move)]","Total Internet-Time Spent Online-Past 7 Days")
xx
[1] "Radio Stations-Listened to Past Week-Toronto [FM-CFXJ-93.5 (93.5 The Move)]"
[2] "Total Internet-Time Spent Online-Past 7 Days"
The result should look something like:
list(c("Radio Stations","Listened to Past Week","Toronto [FM-CFXJ-93.5 (93.5 The Move)]"), c("Total Internet","Time Spent Online","Past 7 Days"))
[[1]]
[1] "Radio Stations" "Listened to Past Week"
[3] "Toronto [FM-CFXJ-93.5 (93.5 The Move)]"
[[2]]
[1] "Total Internet" "Time Spent Online" "Past 7 Days"
Is there a way to do this with a regular expression? The position and number of dashes vary across the elements of the vector, and brackets are not always present. When there are brackets, however, they are always at the end.
I've tried different things, but none are working:
## Trying to match "-" before "[" in Perl
strsplit(xx, split = "-(?=\\[)", perl = TRUE)
# does nothing
## Trying to first extract what follows "[", then split what precedes it
temp <- strsplit(xx, "[", fixed = T)
temp <- lapply(temp, function(yy) strsplit(head(yy, -1), "-"))
# doesn't work as there are some elements with no brackets...
Any help would be appreciated.
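For reference, one possible approach is a negative lookahead that only accepts a dash when it is not followed by a closing bracket without an opening bracket in between. This is a minimal sketch, not a definitive answer: the helper name split_outside_brackets is hypothetical, and it assumes the brackets are balanced, not nested, and only occur at the end of the string, as described above.

```r
# Hypothetical helper: split on dashes that are not inside a [...] group.
# Assumes brackets are balanced, not nested, and appear only at the end.
split_outside_brackets <- function(x) {
  # Match "-" only when it is NOT followed by a run of non-"[" characters
  # and then "]", i.e. when the dash does not sit between "[" and "]".
  strsplit(x, "-(?![^\\[]*\\])", perl = TRUE)
}

xx <- c("Radio Stations-Listened to Past Week-Toronto [FM-CFXJ-93.5 (93.5 The Move)]",
        "Total Internet-Time Spent Online-Past 7 Days")

result <- split_outside_brackets(xx)
print(result)
```

The lookbehind-free form matters here: the earlier attempt with "-(?=\\[)" only matches a dash immediately before "[", which never occurs in this data, whereas the negative lookahead checks the whole remainder of the string.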