Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


java - Replacing double backslashes with single backslash

I have a string "\\u003c", which is in the UTF-8 charset. I am unable to decode it to Unicode because of the double backslashes. How do I get "\u003c" from "\\u003c"? I am using Java.

I tried:

myString.replace("\\\\", "\\");

but could not achieve what I wanted.

This is my code:

String myString = FileUtils.readFileToString(file);
String a = myString.replace("\\\\", "\\");
byte[] utf8 = a.getBytes();

// Convert from UTF-8 to Unicode
a = new String(utf8, "UTF-8");
System.out.println("Converted string is:"+a);

and the content of the file is

\\u003c


1 Answer


You can use String#replaceAll:

String str = "\\\\u003c";
str = str.replaceAll("\\\\\\\\", "\\\\");
System.out.println(str);

It looks weird because the first argument is a string defining a regular expression, and \ is a special character both in string literals and in regular expressions. To actually put a \ in our search string, we need to escape it (\\) in the literal. But to actually put a \ in the regular expression, we have to escape it at the regular expression level as well. So to literally get \\ in a string, we need to write \\\\ in the string literal; and to get two literal \ to the regular expression engine, we need to escape those as well, so we end up with \\\\\\\\. That is:

String Literal        String                      Meaning to Regex
--------------------- --------------------------- -----------------
\                     Escape the next character   Would depend on next char
\\                    \                           Escape the next character
\\\\                  \\                          Literal \
\\\\\\\\              \\\\                        Literal \\
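The two escaping layers in the table can be checked directly. The following is only a small sketch; the class name EscapeLayers and its members are made up for illustration and are not part of the original answer.

```java
import java.util.regex.Pattern;

// Hypothetical demo class: shows how many characters survive each escaping layer.
public class EscapeLayers {
    // Four backslashes in the literal: the String holds two backslashes.
    static final String TWO_BACKSLASHES = "\\\\";

    // Eight backslashes in the literal: the String holds four backslashes,
    // which the regex engine reads as "two literal backslashes".
    static final String REGEX_FOR_TWO = "\\\\\\\\";

    // True when the regex above matches a string made of exactly two backslashes.
    static boolean regexMatchesTwoBackslashes() {
        return Pattern.matches(REGEX_FOR_TWO, TWO_BACKSLASHES);
    }

    public static void main(String[] args) {
        System.out.println(TWO_BACKSLASHES.length());      // prints 2
        System.out.println(REGEX_FOR_TWO.length());        // prints 4
        System.out.println(regexMatchesTwoBackslashes());  // prints true
    }
}
```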

In the replacement parameter, even though it's not a regex, it still treats \ and $ specially, so we have to escape them in the replacement as well. So to get one backslash in the replacement, we need four in that string literal.
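For comparison, here is a hedged sketch of three ways to collapse a doubled backslash, assuming the input really contains two backslash characters. The class and method names are hypothetical. String#replace takes its arguments literally, and Matcher.quoteReplacement escapes \ and $ in a replacement string for you.

```java
import java.util.regex.Matcher;

// Hypothetical helper class: three equivalent ways to turn two backslashes into one.
public class CollapseBackslashes {
    // Regex version: the pattern matches two backslashes, the replacement yields one.
    static String withReplaceAll(String s) {
        return s.replaceAll("\\\\\\\\", "\\\\");
    }

    // Literal version: String#replace does no regex or replacement-string processing.
    static String withReplace(String s) {
        return s.replace("\\\\", "\\");
    }

    // Regex pattern, but the replacement is escaped by quoteReplacement.
    static String withQuotedReplacement(String s) {
        return s.replaceAll("\\\\\\\\", Matcher.quoteReplacement("\\"));
    }

    public static void main(String[] args) {
        // Two backslashes followed by u003c (seven characters in total).
        String input = "\\\\u003c";
        System.out.println(withReplaceAll(input));
        System.out.println(withReplace(input));
        System.out.println(withQuotedReplacement(input));
    }
}
```

All three calls should produce the same six-character result: a single backslash followed by u003c.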

