Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


regex unicode character in vim

I'm being an idiot.

Someone cut and pasted some text from Microsoft Word into my lovely HTML files.

I now have these Unicode characters instead of regular quote symbols (i.e. quotes appear as <92> in the text).

I want to do a regex replace but I'm having trouble selecting them.

:%s/u92/'/g
:%s/u5C/'/g
:%s/x92/'/g
:%s/x5C/'/g

...all fail. My google-fu has failed me.

question from:https://stackoverflow.com/questions/3016965/regex-unicode-character-in-vim


1 Answer


From :help regexp (lightly edited): to match a Unicode character in a Vim regular expression, you need a specific syntax:

\%u match specified multibyte character (eg \%u20ac)

That is, to search for the Unicode character with hex code 20AC, put this in your search pattern:

\%u20ac

The full table of character search patterns includes some additional options:

\%d    match specified decimal character (eg \%d123)
\%x    match specified hex character (eg \%x2a)
\%o    match specified octal character (eg \%o040)
\%u    match specified multibyte character (eg \%u20ac)
\%U    match specified large multibyte character (eg \%U12345678)
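
Applied to the question, the patterns above might be used as in the sketch below. It assumes the character Vim displays as <92> really has code point hex 92; if the file instead contains the true Unicode right single quotation mark (U+2019), the second form would be the one to try. Pressing ga with the cursor on the character shows its actual code, which tells you which case applies.

```vim
" Assumption: the character shown as <92> has code point hex 92.
" Replace every occurrence in the buffer with a plain apostrophe.
:%s/\%x92/'/g

" Assumption: the file holds the real curly quote U+2019 instead.
" \%u takes up to four hex digits.
:%s/\%u2019/'/g
```

The attempts in the question fail because u92 and x92 without the leading \% are matched as literal text ("u92", "x92"), not as character codes.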
