Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
183 views
in Technique[技术] by (71.8m points)

java - Regex for validating alphabetics and numbers in the localized string

I have an input field which is localized. I need to add a validation using a regex that it must take only alphabets and numbers. I could have used [a-z0-9] if I were using only English.

As of now, I am using the method Character.isLetterOrDigit(name.charAt(i)) (yes, I am iterating through each character) to filter out the alphabets present in various languages.

Are there any better ways of doing it? Any regex or other libraries available for this?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

Since Java 7 you can use Pattern.UNICODE_CHARACTER_CLASS

String s = "Müller";

Pattern p = Pattern.compile("^\w+$", Pattern.UNICODE_CHARACTER_CLASS);
Matcher m = p.matcher(s);
if (m.find()) {
    System.out.println(m.group());
} else {
    System.out.println("not found");
}

with out the option it will not recognize the word "Müller", but using Pattern.UNICODE_CHARACTER_CLASS

Enables the Unicode version of Predefined character classes and POSIX character classes.

See here for more details

You can also have a look here for more Unicode information in Java 7.

and here on regular-expression.info an overview over the Unicode scripts, properties and blocks.

See here a famous answer from tchrist about the caveats of regex in Java, including an updated what has changed with Java 7 (of will be in Java 8)


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...