Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


java - Regular Expressions on Punctuation

So I'm completely new to regular expressions, and I'm trying to use Java's java.util.regex to find punctuation in input strings. I won't know ahead of time what kind of punctuation I might get, except that (1) !, ?, ., ... are all valid punctuation, and (2) "<" and ">" mean something special and don't count as punctuation. The program itself builds phrases pseudo-randomly, and I want to strip off the punctuation at the end of a sentence before it goes through the random process.

I can match entire words with any punctuation, but the matcher just gives me indexes for that word. In other words:

Pattern p = Pattern.compile("(.*\\!)*?");
Matcher m = p.matcher([some input string]);

will grab any words with a "!" on the end. For example:

String inputString = "It is a warm Summer day!";
Pattern p = Pattern.compile("(.*\\!)*?");
Matcher m = p.matcher(inputString);
String match = inputString.substring(m.start(), m.end());

results in --> String match ~ "day!"

But I want to have Matcher index just the "!", so I can just split it off.

I could probably make cases, and use String.substring(...) for each kind of punctuation I might get, but I'm hoping there's some mistake in my use of regular expressions to do this.



1 Answer


Java does support POSIX character classes, in a roundabout way. For punctuation, the Java equivalent of [:punct:] is \p{Punct}, which has to be written as "\\p{Punct}" inside a Java string literal.

Please see the following link for details.
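
As a quick illustration of what the class covers, here is a minimal sketch. The class and method names (PunctClassDemo, isPunct, isPunctNoAngles) are made up for this example. It assumes the default matching mode, in which \p{Punct} means the ASCII punctuation characters, so "<" and ">" are included. Since the question treats those two as special, the second pattern removes them with a character class intersection (&&).

```java
import java.util.regex.Pattern;

public class PunctClassDemo {
    // \p{Punct} in default mode: the ASCII punctuation characters, including < and >.
    static final Pattern ANY_PUNCT = Pattern.compile("\\p{Punct}");

    // Intersection with a negated class removes < and > from the set.
    static final Pattern PUNCT_NO_ANGLES = Pattern.compile("[\\p{Punct}&&[^<>]]");

    // True if the single character c is matched by \p{Punct}.
    static boolean isPunct(char c) {
        return ANY_PUNCT.matcher(String.valueOf(c)).matches();
    }

    // True if c is punctuation other than < or >.
    static boolean isPunctNoAngles(char c) {
        return PUNCT_NO_ANGLES.matcher(String.valueOf(c)).matches();
    }

    public static void main(String[] args) {
        for (char c : new char[] {'!', '?', '.', '<', '>', 'a'}) {
            System.out.println(c + " punct=" + isPunct(c)
                    + " punctWithoutAngles=" + isPunctNoAngles(c));
        }
    }
}
```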

Here is a concrete, working example that uses the expression in the comments:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexFindPunctuation {

    public static void main(String[] args) {
        Pattern p = Pattern.compile("\\p{Punct}");

        Matcher m = p.matcher("One day! when I was walking. I found your pants? just kidding...");
        int count = 0;
        while (m.find()) {
            count++;
            System.out.println("\nMatch number: " + count);
            System.out.println("start() : " + m.start());
            System.out.println("end()   : " + m.end());
            System.out.println("group() : " + m.group());
        }
    }
}
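
The example above reports every punctuation character, while the question asks only for the index of the punctuation at the end of a sentence, with "<" and ">" left alone. One possible way to do that is sketched below. It is not part of the original answer, and the class and method names are hypothetical. It assumes "end of a sentence" means the very end of the input string, so the pattern is anchored with \z.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TrailingPunctuation {
    // One or more punctuation characters other than < and >,
    // anchored with \z at the very end of the input.
    private static final Pattern TRAILING =
            Pattern.compile("[\\p{Punct}&&[^<>]]+\\z");

    // Index where the trailing punctuation starts, or -1 if there is none.
    static int trailingPunctuationStart(String s) {
        Matcher m = TRAILING.matcher(s);
        return m.find() ? m.start() : -1;
    }

    // The input without its trailing punctuation (unchanged if there is none).
    static String stripTrailingPunctuation(String s) {
        int start = trailingPunctuationStart(s);
        return start < 0 ? s : s.substring(0, start);
    }

    public static void main(String[] args) {
        String input = "It is a warm Summer day!";
        System.out.println(trailingPunctuationStart(input));   // 23, the index of "!"
        System.out.println(stripTrailingPunctuation(input));   // It is a warm Summer day
        System.out.println(stripTrailingPunctuation("just kidding..."));
        System.out.println(stripTrailingPunctuation("<tag>")); // left as-is
    }
}
```

Because the match covers only the punctuation run, m.start() gives the split point directly, and no separate case is needed for each kind of punctuation.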
