Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

regex - Regexp "does not contain attribute" in html

I'm looking for a (I think) simple regular expression that returns all HTML tags that do not have a "name" attribute, but my weak regexp skills aren't getting me far.

Finding an HTML tag is not a problem, but the "which does not contain" part is. I simply have no idea how to do it (well, I had a few ideas, but none of them work).

Any clue?

question from:https://stackoverflow.com/questions/65829760/how-to-string-replace-multiple-matches-unless-string-already-exists


1 Answer

First of all, you should not use regex for this task. An HTML parser surely exists in whatever language you are using and is way better suited for this.
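To illustrate the parser route, here is a minimal sketch using Python's standard-library html.parser. The language choice, the class name TagsWithoutName, the helper tags_without_name, and the sample markup are all assumptions made for illustration; the question does not say which language is in use.

```python
from html.parser import HTMLParser


class TagsWithoutName(HTMLParser):
    """Collects the names of opening tags that carry no "name" attribute."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (attribute, value) pairs; the parser lowercases
        # attribute names, so a plain membership test is enough here.
        if "name" not in (attr for attr, _ in attrs):
            self.found.append(tag)


def tags_without_name(html):
    """Hypothetical helper: return tag names of opening tags lacking "name"."""
    parser = TagsWithoutName()
    parser.feed(html)
    parser.close()
    return parser.found


# Made-up sample markup, not taken from the question.
sample = (
    '<form name="f"><input name="q"><input type="submit">'
    '<p title="my name">hi</p></form>'
)
print(tags_without_name(sample))  # ['input', 'p']
```

Note that the p tag is reported correctly even though the word "name" appears inside one of its attribute values, because the parser looks at attribute names rather than raw text.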

Now, if you need to use regex for whatever reason, you could use a negative lookahead if your implementation supports it. The expression

<\w+(?![^>]*\bname\b)

identifies an opening HTML tag by <\w+ and matches this only if the string "name" (enclosed by word boundaries) does not appear before the next closing bracket.
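A quick way to try this out is sketched below with Python's re module, assuming the expression is meant to read <\w+(?![^>]*\bname\b), that is, with the backslashes before w and b in place. The sample markup is made up for illustration.

```python
import re

# Opening tag, followed by a negative lookahead that rejects the match
# when the word "name" occurs before the next ">".
pattern = re.compile(r"<\w+(?![^>]*\bname\b)")

# Made-up sample markup: two tags with a name attribute, two without.
html = (
    '<form name="f"><input name="q"><input type="submit">'
    '<div class="box"></div></form>'
)

matches = pattern.findall(html)
print(matches)  # ['<input', '<div']
```

Only the start of each tag is matched, because the lookahead consumes no characters; closing tags such as </div> are skipped since "/" is not a word character.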

See it in action with RegExr.

This works only on well-behaved HTML, and expanding it to respect quoted strings, JavaScript, or comments will be either impossible or very, very ugly. Did I mention HTML parsers? =)
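A few inputs where the expression gives the wrong answer can be sketched as follows, again in Python and again assuming the pattern reads <\w+(?![^>]*\bname\b). The case labels and the sample snippets are invented for illustration.

```python
import re

pattern = re.compile(r"<\w+(?![^>]*\bname\b)")

# Made-up inputs that are awkward for the regex.
cases = {
    # No name attribute, but an attribute value contains the word "name".
    "value_mentions_name": '<p title="my name">',
    # Has a name attribute, but a ">" inside quotes ends the lookahead early.
    "gt_inside_quotes": '<a title="a > b" name="x">',
    # Not a real tag at all: it sits inside a comment.
    "tag_in_comment": "<!-- <b> -->",
}

results = {label: pattern.findall(html) for label, html in cases.items()}

print(results["value_mentions_name"])  # []      (the tag is wrongly skipped)
print(results["gt_inside_quotes"])     # ['<a']  (the tag is wrongly reported)
print(results["tag_in_comment"])       # ['<b']  (commented-out markup is reported)
```

Each of these could probably be patched individually, but the pattern grows quickly with every special case, which is the point about parsers being the better tool.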

