Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

php - Remove nested quotes

I have this text and I'm trying to remove all the inner quotes, keeping only one level of quoting. The text inside a quote can contain any characters, including line feeds. Is this possible with a regex, or do I have to write a small parser?

[quote=foo]I really like the movie. [quote=bar]World 

War Z[/quote] It's amazing![/quote]
This is my comment.
[quote]Hello, World[/quote]
This is another comment.
[quote]Bye Bye Baby[/quote]

Here is the text I want:

[quote=foo]I really like the movie.  It's amazing![/quote]
This is my comment.
[quote]Hello, World[/quote]
This is another comment.
[quote]Bye Bye Baby[/quote]

This is the regex I'm using in PHP:

%\[quote\s*(=[a-zA-Z0-9\-_]*)?\](.*)\[/quote\]%si

I also tried this variant, but it doesn't match "." or ",", and I can't work out what else might appear inside a quote:

%\[quote\s*(=[a-zA-Z0-9\-_]*)?\]([\w\s]+)\[/quote\]%i

The problem is located here:

(.*)
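
To make the problem with (.*) concrete, here is a minimal sketch. It assumes the first pattern is meant to carry the usual backslash escapes (\[, \s, \]) and uses a shortened version of the sample text; the variable names are only illustrative. It compares the greedy form with a lazy (.*?) variant, and neither one respects nesting.

```php
<?php
// Shortened sample: one nested quote plus one plain quote.
$text = "[quote=foo]I really like the movie. [quote=bar]World War Z[/quote] It's amazing![/quote]\n"
      . "This is my comment.\n"
      . "[quote]Hello, World[/quote]";

// The asker's pattern with escapes written out (assumed reconstruction).
$greedy = '%\[quote\s*(=[a-zA-Z0-9\-_]*)?\](.*)\[/quote\]%si';

// Greedy .* with the s modifier runs to the LAST [/quote] in the text,
// so a single match swallows everything from the first tag onward.
$greedyCount = preg_match_all($greedy, $text, $greedyMatches);

// Lazy .*? stops at the FIRST [/quote], which is the inner closing tag,
// so the outer quote is cut short instead.
$lazy = '%\[quote\s*(=[a-zA-Z0-9\-_]*)?\](.*?)\[/quote\]%si';
$lazyCount = preg_match_all($lazy, $text, $lazyMatches);

echo "greedy matches: $greedyCount\n";                 // greedy matches: 1
echo "lazy first body: {$lazyMatches[2][0]}\n";
```

Since neither quantifier can count opening and closing tags, the pattern needs recursion (or a small parser) to pair them correctly.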

1 Answer

You can use this:

$result = preg_replace('~\G(?!\A)(?>(\[quote[^]]*](?>[^[]+|\[(?!/?quote)|(?1))*\[/quote])|(?<!\[)(?>[^[]+|\[(?!/?quote))+\K)|\[quote[^]]*]\K~', '', $text);
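
A quick way to try that replacement on the sample text from the question. The pattern is written out here with its backslash escapes (\G, \A, \K, \[), which is an assumption about the intended form; the null check is an addition, since preg_replace returns null when PCRE hits an error such as the backtrack limit.

```php
<?php
// Sample input from the question (the inner quote spans a blank line).
$text = "[quote=foo]I really like the movie. [quote=bar]World \n\nWar Z[/quote] It's amazing![/quote]\n"
      . "This is my comment.\n"
      . "[quote]Hello, World[/quote]\n"
      . "This is another comment.\n"
      . "[quote]Bye Bye Baby[/quote]";

// Pattern with escapes written out (assumed reconstruction),
// split over two string literals only for readability.
$pattern = '~\G(?!\A)(?>(\[quote[^]]*](?>[^[]+|\[(?!/?quote)|(?1))*\[/quote])'
         . '|(?<!\[)(?>[^[]+|\[(?!/?quote))+\K)|\[quote[^]]*]\K~';

// Every nested quote block is matched as a whole and replaced by nothing.
$result = preg_replace($pattern, '', $text);

if ($result === null) {
    // preg_last_error() returns one of the PREG_*_ERROR constants.
    echo 'PCRE error code: ' . preg_last_error() . "\n";
} else {
    echo $result, "\n";
}
```

Note that removing the inner quote leaves the two spaces that surrounded it, which matches the desired output shown in the question.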

Details:

\G(?!\A)              # contiguous to the previous match (and not at the start of the string)
(?>                   ## content inside "quote" tags at level 0
  (                    ## nested "quote" tags (group 1)
    \[quote[^]]*]
    (?>                ## content inside "quote" tags at any level
      [^[]+
     |                  # OR
      \[(?!/?quote)
     |                  # OR
      (?1)              # repeat the capture group 1 (recursive)
    )*
    \[/quote]
  )
 |
  (?<!\[)           # not preceded by an opening square bracket
  (?>              ## content that is not a quote tag
    [^[]+           # all that is not a [
   |                # OR
    \[(?!/?quote)   # a [ not followed by "quote" or "/quote"
  )+\K              # repeat 1 or more times and reset the match start
)
|                   # OR
\[quote[^]]*]\K   # "quote" tag at level 0
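
The question also asks whether a small parser would be needed. As an alternative to the recursive pattern, here is a minimal sketch of that approach, not the method described above: split the text on quote tags and track the nesting depth. The function name stripNestedQuotes and the exact tag syntax it accepts ([quote], [quote=name], [/quote]) are assumptions made for illustration.

```php
<?php
/**
 * Keep only top-level [quote] blocks and drop any quote nested inside one.
 * Hypothetical helper; a depth counter replaces the recursive regex.
 */
function stripNestedQuotes(string $text): string
{
    // With one capture group, preg_split returns text at even indexes
    // and the captured quote tags at odd indexes.
    $tokens = preg_split(
        '~(\[quote\s*(?:=[^\]]*)?\]|\[/quote\])~i',
        $text,
        -1,
        PREG_SPLIT_DELIM_CAPTURE
    );

    $depth = 0;   // current quote nesting level
    $out = '';

    foreach ($tokens as $i => $token) {
        if ($i % 2 === 0) {
            // Plain text: keep it outside quotes and at level 1 only.
            if ($depth <= 1) {
                $out .= $token;
            }
        } elseif ($token[1] === '/') {
            // Closing tag: kept at level 1 (and when stray at level 0).
            if ($depth <= 1) {
                $out .= $token;
            }
            if ($depth > 0) {
                $depth--;
            }
        } else {
            // Opening tag: kept only when it starts a level-1 quote.
            $depth++;
            if ($depth === 1) {
                $out .= $token;
            }
        }
    }

    return $out;
}

$sample = "[quote=foo]I really like the movie. [quote=bar]World \n\nWar Z[/quote] It's amazing![/quote]\n"
        . "This is my comment.\n"
        . "[quote]Hello, World[/quote]";

echo stripNestedQuotes($sample), "\n";
```

This version makes a single pass over the tokens and does not depend on PCRE recursion or backtracking limits, at the cost of a few more lines than the one-line preg_replace.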
