Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

0 votes
914 views
in Technique by (71.8m points)

javascript - Regex to match certain characters and exclude certain characters but without negative lookahead

I want a regex that matches all emojis (or most of them) but excludes certain characters (such as “|”|‘|’|…|—).

This regex does the job via negative lookahead:

/(?!\u201C|\u201D|\u2018|\u2019|\u2026|\u2014)(\u00a9|\u00ae|[\u2000-\u3300]|\ud83c[\ud000-\udfff]|\ud83d[\ud000-\udfff]|\ud83e[\ud000-\udfff])/

But apparently Google Scripts doesn't support this. Error:

Invalid regular expression pattern (?!“|”|‘|’|…|—)(?|?|[?-?]|?[?-?]|?[?-?]|?[?-?])

Is there another way to achieve my goal (a regex that works with Google Script's findText)?

1 Answer

0 votes
by (71.8m points)

Option 1

The following character class may cover the emojis you want (in JavaScript, the \u{...} escapes require the u flag):

[\u{1f300}-\u{1f5ff}\u{1f900}-\u{1f9ff}\u{1f600}-\u{1f64f}\u{1f680}-\u{1f6ff}\u{2600}-\u{26ff}\u{2700}-\u{27bf}\u{1f1e6}-\u{1f1ff}\u{1f191}-\u{1f251}\u{1f004}\u{1f0cf}\u{1f170}-\u{1f171}\u{1f17e}-\u{1f17f}\u{1f18e}\u{3030}\u{2b50}\u{2b55}\u{2934}-\u{2935}\u{2b05}-\u{2b07}\u{2b1b}-\u{2b1c}\u{3297}\u{3299}\u{303d}\u{00a9}\u{00ae}\u{2122}\u{23f3}\u{24c2}\u{23e9}-\u{23ef}\u{25b6}\u{23f8}-\u{23fa}]
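As a quick sanity check, here is a minimal sketch in plain JavaScript (not Google Apps Script's findText, whose escape syntax is not verified here). The names emojiClass and samples are made up for illustration. It only shows that the class accepts a few emojis and rejects the six punctuation characters from the question.

```javascript
// The character class from Option 1, compiled with the `u` flag so that
// \u{...} escapes and astral code points are understood.
const emojiClass = /[\u{1f300}-\u{1f5ff}\u{1f900}-\u{1f9ff}\u{1f600}-\u{1f64f}\u{1f680}-\u{1f6ff}\u{2600}-\u{26ff}\u{2700}-\u{27bf}\u{1f1e6}-\u{1f1ff}\u{1f191}-\u{1f251}\u{1f004}\u{1f0cf}\u{1f170}-\u{1f171}\u{1f17e}-\u{1f17f}\u{1f18e}\u{3030}\u{2b50}\u{2b55}\u{2934}-\u{2935}\u{2b05}-\u{2b07}\u{2b1b}-\u{2b1c}\u{3297}\u{3299}\u{303d}\u{00a9}\u{00ae}\u{2122}\u{23f3}\u{24c2}\u{23e9}-\u{23ef}\u{25b6}\u{23f8}-\u{23fa}]/u;

// Illustrative samples: three characters the class should accept,
// then the six punctuation characters the question wants excluded.
const samples = [
  "\u{1f600}", // grinning face
  "\u{1f680}", // rocket
  "\u00a9",    // copyright sign
  "\u201c", "\u201d", "\u2018", "\u2019", "\u2026", "\u2014",
];

for (const s of samples) {
  console.log(JSON.stringify(s), emojiClass.test(s));
}
```

Because the class lists explicit ranges rather than the broad [\u2000-\u3300] block, the excluded punctuation never falls inside it and no lookahead is needed.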

Option 2

Otherwise, you could exclude the undesired characters with character-class intersection, in engines that support it (Java does; JavaScript only with the v flag):

[these unicode ranges &&[^these unicodes]]

This gets fairly complicated, but it is possible.
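For illustration, a hedged sketch of the intersection idea in JavaScript. It assumes an engine that supports the v flag (recent browsers and Node.js releases); the helper name buildIntersectionRegex is hypothetical. Nothing here implies that findText accepts this syntax.

```javascript
// Pattern source: everything in \u2000-\u3300 that is NOT one of the six
// undesired punctuation characters. Under the `v` flag each operand of &&
// has to be a nested class, hence the extra brackets.
const source =
  "[[\\u2000-\\u3300]&&[^\\u2014\\u2018\\u2019\\u201C\\u201D\\u2026]]";

// Hypothetical helper: returns the compiled regex, or null when the engine
// does not know the `v` flag (the RegExp constructor throws in that case).
function buildIntersectionRegex() {
  try {
    return new RegExp(source, "v");
  } catch (err) {
    return null;
  }
}

const intersectionRegex = buildIntersectionRegex();
if (intersectionRegex === null) {
  console.log("This engine does not support the v flag.");
} else {
  console.log(intersectionRegex.test("\u2600")); // sun symbol, inside the range
  console.log(intersectionRegex.test("\u2014")); // em dash, excluded
}
```

The pattern is built from a string rather than a regex literal so that older engines fail at construction time, where the error can be caught, instead of at parse time.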

Option 3

This option is most likely the simplest way to solve your problem. I suspect the undesired punctuation characters already fall inside one of your desired ranges (all six you listed, from \u2014 to \u2026, lie inside [\u2000-\u3300]). Check whether that is the case. For example, in

[\u0100-\u0200]

you might have \u0150 and \u0175 as undesired characters that you want removed from the range.

You can then simply split the range around them (the code points are hexadecimal, so the value just below \u0150 is \u014F):

[\u0100-\u014F\u0151-\u0174\u0176-\u0200]

and the problem is solved without any lookahead.
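Applied to the ranges in the question, the splitting can be sketched as follows. This is a minimal illustration in plain JavaScript; splitRange, esc and toClassBody are hypothetical helper names, and whether findText accepts \uXXXX escapes is an assumption that has not been checked here.

```javascript
// Split the inclusive range [start, end] into sub-ranges that skip
// every excluded code point.
function splitRange(start, end, excluded) {
  const holes = [...new Set(excluded)]
    .filter((cp) => cp >= start && cp <= end)
    .sort((a, b) => a - b);
  const pieces = [];
  let lo = start;
  for (const hole of holes) {
    if (hole > lo) pieces.push([lo, hole - 1]);
    lo = hole + 1;
  }
  if (lo <= end) pieces.push([lo, end]);
  return pieces;
}

// Format a BMP code point as a \uXXXX escape.
function esc(cp) {
  return "\\u" + cp.toString(16).toUpperCase().padStart(4, "0");
}

// Turn [[a, b], ...] into the body of a character class.
function toClassBody(pieces) {
  return pieces
    .map(([a, b]) => (a === b ? esc(a) : esc(a) + "-" + esc(b)))
    .join("");
}

// The six characters from the question's lookahead.
const excluded = [0x201c, 0x201d, 0x2018, 0x2019, 0x2026, 0x2014];
const classBody = toClassBody(splitRange(0x2000, 0x3300, excluded));
console.log(classBody);
// \u2000-\u2013\u2015-\u2017\u201A-\u201B\u201E-\u2025\u2027-\u3300

// The question's pattern with the lookahead dropped and the range split.
const emojiRegex = new RegExp(
  "\\u00a9|\\u00ae|[" + classBody + "]" +
    "|\\ud83c[\\ud000-\\udfff]|\\ud83d[\\ud000-\\udfff]|\\ud83e[\\ud000-\\udfff]"
);
```

The resulting expression matches the same characters as the original minus the six excluded ones, using only alternation and character classes.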

javascript unicode emoji regular expressions

