Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
289 views
in Technique[技术] by (71.8m points)

java - Splitting on comma outside quotes


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

You can try out this regex:

str.split(",(?=(?:[^"]*"[^"]*")*[^"]*$)");

This splits the string on , that is followed by an even number of double quotes. In other words, it splits on comma outside the double quotes. This will work provided you have balanced quotes in your string.

Explanation:

,           // Split on comma
(?=         // Followed by
   (?:      // Start a non-capture group
     [^"]*  // 0 or more non-quote characters
     "      // 1 quote
     [^"]*  // 0 or more non-quote characters
     "      // 1 quote
   )*       // 0 or more repetition of non-capture group (multiple of 2 quotes will be even)
   [^"]*    // Finally 0 or more non-quotes
   $        // Till the end  (This is necessary, else every comma will satisfy the condition)
)

You can even type like this in your code, using (?x) modifier with your regex. The modifier ignores any whitespaces in your regex, so it's becomes more easy to read a regex broken into multiple lines like so:

String[] arr = str.split("(?x)   " + 
                     ",          " +   // Split on comma
                     "(?=        " +   // Followed by
                     "  (?:      " +   // Start a non-capture group
                     "    [^"]* " +   // 0 or more non-quote characters
                     "    "     " +   // 1 quote
                     "    [^"]* " +   // 0 or more non-quote characters
                     "    "     " +   // 1 quote
                     "  )*       " +   // 0 or more repetition of non-capture group (multiple of 2 quotes will be even)
                     "  [^"]*   " +   // Finally 0 or more non-quotes
                     "  $        " +   // Till the end  (This is necessary, else every comma will satisfy the condition)
                     ")          "     // End look-ahead
                         );

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
...