Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

r - trimws bug? leading whitespace not removed

Edit: Thanks to R Yoda, I was finally able to create a reproducible example of the issue I am facing:

x = rawToChar(as.raw(c(0xa0, 0x31, 0x31, 0x2e, 0x31, 0x33, 0x32, 0x35, 0x39, 0x32)))
trimws(x)

=> Question: How can I trim x?

Old text of the question:
Please see the attached screenshot. Unfortunately, I am not able to create a reproducible example, as dput affects the result...

Does anyone have an idea how to investigate what's going wrong with x? The leading whitespace doesn't seem to be a standard one!

charToRaw(x) gives a0 31 31 2e 31 33 32 35 39 32
dput(charToRaw(x)) gives as.raw(c(0xa0, 0x31, 0x31, 0x2e, 0x31, 0x33, 0x32, 0x35, 0x39, 0x32))
Encoding(x) gives "unknown" (same as Encoding(" 11.132592"))

1 Answer

0xa0 encodes a different kind of space, the non-breaking space (in Latin-1 and related single-byte encodings), while 0x20 is the ordinary space.
By default, trimws only searches for spaces, tabs, line feeds and carriage returns (the character class [ \t\r\n]), not for non-breaking spaces, so it leaves x unchanged.
You can use sub (to strip either leading or trailing white space) or gsub (to strip both) with the \s class, which can also match the space encoded as 0xa0. Whether \s matches that byte depends on the session's locale and encoding, so the result may differ on other systems:

sub("^\\s+", "", x)
[1] "11.132592"

And for removing leading and trailing spaces:

gsub("(^\\s+)|(\\s+$)", "", x)
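
If the \s approach does not work in your locale, a variant that does not rely on the locale may help. This is only a minimal sketch: it assumes the bytes in x are Latin-1 (the question only shows that Encoding(x) is "unknown") and that you are on R 3.6.0 or later, where trimws accepts a whitespace argument. The class [\h\v] is the generalization suggested in ?trimws for Unicode horizontal and vertical white space; the names x_utf8 and trimmed are just illustrative.

```r
# Same bytes as in the question: 0xa0 followed by "11.132592".
x <- rawToChar(as.raw(c(0xa0, 0x31, 0x31, 0x2e, 0x31, 0x33, 0x32, 0x35, 0x39, 0x32)))

# Assumption: the bytes are Latin-1, where 0xa0 is the non-breaking space.
# Converting to UTF-8 turns that byte into the character U+00A0.
x_utf8 <- iconv(x, from = "latin1", to = "UTF-8")

# "[\\h\\v]" covers Unicode horizontal and vertical white space,
# including U+00A0 (the whitespace argument needs R >= 3.6.0).
trimmed <- trimws(x_utf8, whitespace = "[\\h\\v]")
print(trimmed)
```

If the data actually come from a different encoding, the from argument of iconv would need to be changed accordingly.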
