Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
290 views
in Technique[技术] by (71.8m points)

java Regex - split but ignore text inside quotes?

using only regular expression methods, the method String.replaceAll and ArrayList how can i split a String into tokens, but ignore delimiters that exist inside quotes? the delimiter is any character that is not alphanumeric or quoted text

for example: The string :

hello^world'this*has two tokens'

should output:

  • hello
  • worldthis*has two tokens
See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

I know there is a damn good and accepted answer already present but I would like to add another regex based (and may I say simpler) approach to split the given text using any non-alphanumeric delimiter which not inside the single quotes using

Regex:

/(?=(([^']+'){2})*[^']*$)[^a-zA-Z\d]+/

Which basically means match a non-alphanumeric text if it is followed by even number of single quotes in other words match a non-alphanumeric text if it is outside single quotes.

Code:

String string = "hello^world'this*has two tokens'#2ndToken";
System.out.println(Arrays.toString(
     string.split("(?=(([^']+'){2})*[^']*$)[^a-zA-Z\d]+"))
);

Output:

[hello, world'this*has two tokens', 2ndToken]

Demo:

Here is a live working Demo of the above code.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...