While JavaScript regexes recognize non-ASCII characters in some cases (like \s), they're hopelessly inadequate when it comes to \w and \b. If you want those to work with anything beyond the ASCII word characters, you'll have to either use a different language or install Steve Levithan's XRegExp library with the Unicode plugin.
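To make the ASCII limitation concrete, here is a small sketch. The sample strings are made up for illustration. The last pattern is not from the original answer: it assumes an engine that supports Unicode property escapes and lookbehind (ES2018 or later), and it is only one possible way to approximate a Unicode-aware word boundary without a library.

```javascript
// \s is Unicode-aware: it matches the no-break space U+00A0.
const nbspIsSpace = /\s/.test("\u00A0");

// \w is ASCII-only: it does not match "é" (U+00E9).
const eAcuteIsWord = /\w/.test("\u00E9");

// \b is defined in terms of \w, so it sees a "boundary" between "a" and "ñ"
// and wrongly finds the two-letter "word" "ca" inside "caña".
const sample = "la ca\u00F1a";
const asciiMatches = sample.match(/\b[a-z]{2}\b/g);

// Assumption: ES2018+ engine. Replace \b with lookarounds that treat any
// Unicode letter, digit, or underscore as a word character.
const unicodeMatches = sample.match(
  /(?<![\p{L}\p{N}_])[a-z]{2}(?![\p{L}\p{N}_])/gu
);

console.log(nbspIsSpace, eAcuteIsWord); // true false
console.log(asciiMatches);              // [ 'la', 'ca' ]
console.log(unicodeMatches);            // [ 'la' ]
```

On older engines that lack these features, a library such as XRegExp remains the practical route.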
By the way, there's an error in your regex. You have a \b after the optional trailing comma, but it should be in front:
"\b([a-z]{2})\b,?"
I also removed the square brackets; you would only need those if the comma had a special meaning in regexes, which it doesn't. But I suspect you don't need to match the comma at all; \b should be sufficient to make sure you're at the end of the word. And if you don't need the comma, you don't need the capturing group either:
"\b[a-z]{2}\b"
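As a rough illustration, the two patterns above can be compared on ASCII input. The sample text is invented for this sketch. Note that if you build the regex from a JavaScript string literal, each backslash has to be doubled, because "\b" in a plain string is a backspace character.

```javascript
const text = "ab, cde, fg";

// Version with the capturing group and optional trailing comma.
const withComma = /\b([a-z]{2})\b,?/g;
const fullMatches = [...text.matchAll(withComma)].map((m) => m[0]);
const captured = [...text.matchAll(withComma)].map((m) => m[1]);

// Simplified version: no comma, no capturing group.
const simplified = text.match(/\b[a-z]{2}\b/g);

// Same pattern built from a string: the backslashes must be escaped.
const fromString = new RegExp("\\b[a-z]{2}\\b", "g");
const viaString = text.match(fromString);

console.log(fullMatches); // [ 'ab,', 'fg' ]
console.log(captured);    // [ 'ab', 'fg' ]
console.log(simplified);  // [ 'ab', 'fg' ]
console.log(viaString);   // [ 'ab', 'fg' ]
```

The captured groups of the first pattern and the whole matches of the second are the same, which is why the group and the comma can be dropped when only the two-letter words are needed.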