Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

php - Regular Expression to get all words before a single certain character (=, with or without whitespaces before)

We offer a template engine to the customers on our portal. Attributes in a template are replaced with information from a data source.

Here's an example of what such a template might look like, with different variations of quotes and whitespace, and some nesting here and there:

<html>
    <body>
        <h1 class="title">Some Title</h1>
        <div id="output">
            [%if findthis1='123']
                Bla bla bla ["findthis2"] Bla Bla Bla
            [%elseif (findthis3 = "123")]
                Bla Bla Bla ['findthis4'] Bla Bla Bla
            [%elseif ( findthis5 = "123" )]
                Bla Bla Bla [findthis6] Bla Bla Bla
            [%elseif (   findthis7   =   "123" OR findthis8   =   123   )    AND       findthis9='123']
                [findthis10] Bla Bla Bla
            [%elseif ( findthis11 = "123" OR ( findthis12=123 AND findthis13='123' ) ]
                 Bla Bla Bla [findthis14]
            [%endif]

            [%uppercase findthis15]
            [%lowercase findthis16 ]
        </div>
    </body>
</html>

Our goal is to get every word that precedes the character = inside a [% ... ] tag, where whitespace may occur around the =.

We found a related thread and answer, but since its pattern is made to find HTML attributes, we couldn't manage to reduce it to the parts between [% and ]. Also, once there is whitespace between the attribute and the =, it no longer matches.

How should we modify the regular expression as seen in the thread/answer to get the attributes like findthis1/3/5/7/8/9/11/12/13 without getting class and id, considering anything between [% and ] and with possible whitespaces? As for attributes findthis15 and findthis16 where there is no =, we would like to find another regular expression for that.
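For tags without an operator, such as [%uppercase findthis15], one possible pattern is sketched below in PHP. It is only a sketch: it assumes the tag keyword is either uppercase or lowercase (the two shown in the template) and that an attribute consists of word characters only. The function name operatorlessAttributes is made up for illustration.

```php
<?php
// Hypothetical helper: collects the attribute of every [%uppercase ...] or
// [%lowercase ...] tag. The keyword list is an assumption taken from the
// sample template; extend the alternation for other operator-less tags.
function operatorlessAttributes(string $template): array
{
    // \[%(?:uppercase|lowercase)  opening "[%" plus keyword
    // \s+(\w+)                    whitespace, then the attribute (group 1)
    // \s*\]                       optional whitespace before the closing "]"
    preg_match_all('/\[%(?:uppercase|lowercase)\s+(\w+)\s*\]/', $template, $matches);
    return $matches[1];
}

$template = "[%uppercase findthis15]\n[%lowercase findthis16 ]";
print_r(operatorlessAttributes($template));
// findthis15, findthis16
```

Because the keyword is spelled out, [%if ...] and [%elseif ...] tags are never matched by this pattern.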

EDIT: I forgot to mention 2 things:

  • findthis-Attributes can be anything like "email" or "firstname"
  • There are also Operators like <=, >= and !=

EDIT 2: Right now, I am thinking about using multiple regular expressions. The first one would be \[\%(.)*\], which would get me all tags starting with [% and ending with ]. I am trying to figure out the next regular expression, which checks whether a tag contains operators or whether it is one of the tags like [%uppercase findthis15].

EDIT 3: The 2nd regular expression of 3 would look like this:

(\S+)+[ ]*((=|<>|!=))

EDIT 4: Okay, after some experimenting, we still couldn't manage to improve the regular expression to achieve our goals.

By using /\[\%(if|elseif)(.)*?(\])/, we are getting something like this (please ignore the fact that I am using a different line compared to the ones above):

[%if hello="abc" OR ( (stack=123 AND overflow = "bla") OR (how= 'bla' AND are ='bla') AND you = 'xyz' )]
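A minimal PHP sketch of this first step, assuming the pattern is meant to match a literal [% ... ] pair and that a condition never contains a literal ]. The helper name extractConditionTags and the shortened sample template are illustrative only.

```php
<?php
// Hypothetical helper: returns every [%if ...] / [%elseif ...] tag in a template.
// Assumes a condition never contains a literal "]".
function extractConditionTags(string $template): array
{
    // \[%            literal "[%"
    // (?:if|elseif)  tag keyword, followed by a word boundary
    // [^\]]*\]       everything up to and including the closing bracket
    preg_match_all('/\[%(?:if|elseif)\b[^\]]*\]/', $template, $matches);
    return $matches[0];
}

$template = <<<'TPL'
<div id="output">
    [%if findthis1='123']
        Bla ["findthis2"] Bla
    [%elseif ( findthis5 = "123" )]
        Bla [findthis6] Bla
    [%endif]
    [%uppercase findthis15]
</div>
TPL;

print_r(extractConditionTags($template));
```

Using a negated character class instead of (.)*? avoids one capture group per character and keeps each match inside a single tag.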

But now, the final step is to get the words "hello", "stack", "overflow", "how", "are" and "you" by using PHP's preg_match function.

The following (wrong) regular expression is way too greedy:

( |\()+(?:(?!(=|<|&lt;|>|&gt;|<=|>=|<>|&lt;&gt;|!=)).)*

What are we missing in this final regular expression?

Question from: https://stackoverflow.com/questions/65643254/regular-expression-to-get-all-words-before-a-single-certain-character-with-o


1 Answer

As for the last step that you have already arrived at, you may use

\w[^=\s]*(?=\s*=)

See the regex demo

Details:

  • \w - a word char (letter, digit or _)
  • [^=\s]* - zero or more chars other than = and whitespace
  • (?=\s*=) - a positive lookahead that matches a location immediately followed by zero or more whitespace chars and then a = char.
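A short PHP sketch of how this pattern could be applied with preg_match_all to the tag from the question. The helper namesInTag and both constant names are made up for illustration. The second pattern is not part of the answer: it is an assumed variant for the operators mentioned in the question's edit (<, >, <=, >=, <>, !=), where the pattern above would return "age>" for age>=18 and skip name != "x".

```php
<?php
// Pattern from the answer: a word char, then any run of chars that are neither
// "=" nor whitespace, provided optional whitespace and "=" follow.
const NAME_BEFORE_EQUALS = '/\w[^=\s]*(?=\s*=)/';

// Hypothetical variant (an assumption, not from the answer): match word chars
// only and accept =, !=, < or > (which also covers <=, >= and <>) as what follows.
const NAME_BEFORE_OPERATOR = '/\w+(?=\s*(?:[<>]|!?=))/';

// Hypothetical helper: returns all matches of $pattern in one template tag.
function namesInTag(string $pattern, string $tag): array
{
    preg_match_all($pattern, $tag, $matches);
    return $matches[0];
}

$tag = '[%if hello="abc" OR ( (stack=123 AND overflow = "bla") OR '
     . '(how= \'bla\' AND are =\'bla\') AND you = \'xyz\' )]';

print_r(namesInTag(NAME_BEFORE_EQUALS, $tag));
// hello, stack, overflow, how, are, you
```

Run against a whole template, either pattern would also pick up class and id from the HTML, so it should only be applied to the text of the [% ... ] tags found in the earlier step.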
