The question is simple: I have a string str. How do I check if str is one single emoji, and nothing else?
Additionally, I would prefer not to use another library.
Match "??", "????♂?", "3??" but not "??a", "??", "????"
I'm having trouble finding a solution but here are some things I've tried so far:
Attempted Solution 1 - Play around with lengths and the ... (spread) operator
I learned that emojis occupy more than one byte, some even occupy 4 bytes, or more... and we can measure that via the string's length property:
console.log("??".length); // 2
console.log("???".length); // 3
console.log("????♂?".length); // 6
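A note on what length measures: it counts UTF-16 code units, not bytes. The sketch below compares code units, code points and UTF-8 bytes for three inputs. The helper name measure is hypothetical, and escape sequences stand in for the emoji (a face, a keycap sequence, and a gendered, skin-toned ZWJ sequence) so the example does not depend on file encoding.

```javascript
// Hypothetical helper: report three different "sizes" of the same string.
function measure(s) {
  return {
    codeUnits: s.length,                           // UTF-16 code units
    codePoints: [...s].length,                     // Unicode code points
    utf8Bytes: new TextEncoder().encode(s).length, // bytes when encoded as UTF-8
  };
}

const face = "\u{1F602}";                             // one code point outside the BMP
const keycap = "3\uFE0F\u20E3";                       // digit + variation selector + combining keycap
const shrug = "\u{1F937}\u{1F3FB}\u200D\u2642\uFE0F"; // base + skin tone + ZWJ + male sign + variation selector

console.log(measure(face));   // { codeUnits: 2, codePoints: 1, utf8Bytes: 4 }
console.log(measure(keycap)); // { codeUnits: 3, codePoints: 3, utf8Bytes: 7 }
console.log(measure(shrug));  // { codeUnits: 7, codePoints: 5, utf8Bytes: 17 }
```

Note that the keycap sequence has the same number of code units and code points, so comparing the two lengths cannot detect it.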
Then I found out that the ... operator takes this into account and correctly separates emojis in the array - I could then compare the resulting array's length property with the string's length and detect if they were different.
let str = "????♂?";
if (str.length !== [...str].length) {
  // is emoji?
} else {
  // is not emoji
}
But this doesn't check for other multi-byte characters such as ??, whose length is 2. Plus, some emojis were still getting separated in a weird way.
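The odd splitting is probably because the spread operator iterates code points, while many emoji are sequences of several code points (skin tone modifiers, zero-width joiners, variation selectors). A minimal sketch, assuming an engine that ships the built-in Intl.Segmenter; the helper name graphemes is hypothetical:

```javascript
// Intl.Segmenter is part of the ECMAScript Internationalization API (no extra library).
const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });

// Hypothetical helper: split a string into user-perceived characters.
function graphemes(s) {
  return Array.from(segmenter.segment(s), (part) => part.segment);
}

const shrug = "\u{1F937}\u{1F3FB}\u200D\u2642\uFE0F"; // one visible emoji

console.log([...shrug].length);           // 5 (code points)
console.log(graphemes(shrug).length);     // 1 (grapheme cluster)
console.log(graphemes("e\u0301").length); // 1 as well, and this is not an emoji
```

Counting grapheme clusters answers "is this one character?", but as the last line shows it does not answer "is this character an emoji?".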
Attempted Solution 2 - Regular expressions
Of course regex would be a thing to look into but I've yet to find a viable solution.
This answer's regex (\u00a9|\u00ae|[\u2000-\u3300]|\ud83c[\ud000-\udfff]|\ud83d[\ud000-\udfff]|\ud83e[\ud000-\udfff]) works perfectly fine to detect whether a string contains any emoji, but applied to my situation it produces a lot of problems.
Here are my tests:
Part A - Without start/end-of-string anchors (^ and $)
let regex = /(\u00a9|\u00ae|[\u2000-\u3300]|\ud83c[\ud000-\udfff]|\ud83d[\ud000-\udfff]|\ud83e[\ud000-\udfff])/;
console.log("5??".match(regex)); // [ '?', '?', index: 2, input: '5??' ]
console.log("??".match(regex)); // [ '??', '??', index: 0, input: '??' ]
console.log("??????".match(regex)); // [ '??', '??', index: 0, input: '??????' ]
console.log("a?".match(regex)); // [ '?', '?', index: 1, input: 'a?' ]
// Same regex, this time with test()
console.log(regex.test("5??")); // true - correct
console.log(regex.test("a")); // false - correct
console.log(regex.test("??????")); // true - should be false
console.log(regex.test("hello ?!")); // true - should be false
Part B - With start/end-of-string anchors (^ and $)
let regex = /^(\u00a9|\u00ae|[\u2000-\u3300]|\ud83c[\ud000-\udfff]|\ud83d[\ud000-\udfff]|\ud83e[\ud000-\udfff])$/;
console.log("5??".match(regex)); // null
console.log("??".match(regex)); // [ '??', '??', index: 0, input: '??' ]
console.log("???".match(regex)); // null
console.log("?".match(regex)); // [ '?', '?', index: 0, input: '?' ]
console.log("????".match(regex)); // null
// Same regex, this time with test()
console.log(regex.test("5??")); // false - should be true
console.log(regex.test("??")); // true - correct
console.log(regex.test("???")); // false - should be true
console.log(regex.test("?")); // true - correct
console.log(regex.test("????")); // false - correct
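One likely reason the anchored version rejects valid emoji: between ^ and $ the alternation can consume only a single BMP character or a single surrogate pair, whereas keycaps and ZWJ sequences are longer than that. A small sketch using the same pattern (the variable names are illustrative, and escape sequences stand in for the emoji):

```javascript
const anchored = /^(\u00a9|\u00ae|[\u2000-\u3300]|\ud83c[\ud000-\udfff]|\ud83d[\ud000-\udfff]|\ud83e[\ud000-\udfff])$/;

const face = "\u{1F602}";                             // exactly one surrogate pair
const keycap = "5\uFE0F\u20E3";                       // three code units, starts with a plain digit
const shrug = "\u{1F937}\u{1F3FB}\u200D\u2642\uFE0F"; // seven code units

console.log(anchored.test(face));   // true: the whole string is one surrogate pair
console.log(anchored.test(keycap)); // false: the leading digit is not in the alternation
console.log(anchored.test(shrug));  // false: one pair matches, but five code units remain before $
```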
Part C - Other regular expressions
let regex = /^(?:[\u2700-\u27bf]|(?:\ud83c[\udde6-\uddff]){2}|[\ud800-\udbff][\udc00-\udfff]|[\u0023-\u0039]\ufe0f?\u20e3|\u3299|\u3297|\u303d|\u3030|\u24c2|\ud83c[\udd70-\udd71]|\ud83c[\udd7e-\udd7f]|\ud83c\udd8e|\ud83c[\udd91-\udd9a]|\ud83c[\udde6-\uddff]|[\ud83c[\ude01\uddff]|\ud83c[\ude01-\ude02]|\ud83c\ude1a|\ud83c\ude2f|[\ud83c[\ude32\ude02]|\ud83c\ude1a|\ud83c\ude2f|\ud83c[\ude32-\ude3a]|[\ud83c[\ude50\ude3a]|\ud83c[\ude50-\ude51]|\u203c|\u2049|[\u25aa-\u25ab]|\u25b6|\u25c0|[\u25fb-\u25fe]|\u00a9|\u00ae|\u2122|\u2139|\ud83c\udc04|[\u2600-\u26FF]|\u2b05|\u2b06|\u2b07|\u2b1b|\u2b1c|\u2b50|\u2b55|\u231a|\u231b|\u2328|\u23cf|[\u23e9-\u23f3]|[\u23f8-\u23fa]|\ud83c\udccf|\u2934|\u2935|[\u2190-\u21ff])$/g;
console.log(regex.test("5??")); // true - correct
console.log(regex.test("??")); // false - should be true
console.log(regex.test("???")); // false - should be true
console.log(regex.test("?")); // true - correct
console.log(regex.test("????")); // false - correct
Also, this breaks horribly (the second test's result changes based on the first test?):
let regex = /^(?:[\u2700-\u27bf]|(?:\ud83c[\udde6-\uddff]){2}|[\ud800-\udbff][\udc00-\udfff]|[\u0023-\u0039]\ufe0f?\u20e3|\u3299|\u3297|\u303d|\u3030|\u24c2|\ud83c[\udd70-\udd71]|\ud83c[\udd7e-\udd7f]|\ud83c\udd8e|\ud83c[\udd91-\udd9a]|\ud83c[\udde6-\uddff]|[\ud83c[\ude01\uddff]|\ud83c[\ude01-\ude02]|\ud83c\ude1a|\ud83c\ude2f|[\ud83c[\ude32\ude02]|\ud83c\ude1a|\ud83c\ude2f|\ud83c[\ude32-\ude3a]|[\ud83c[\ude50\ude3a]|\ud83c[\ude50-\ude51]|\u203c|\u2049|[\u25aa-\u25ab]|\u25b6|\u25c0|[\u25fb-\u25fe]|\u00a9|\u00ae|\u2122|\u2139|\ud83c\udc04|[\u2600-\u26FF]|\u2b05|\u2b06|\u2b07|\u2b1b|\u2b1c|\u2b50|\u2b55|\u231a|\u231b|\u2328|\u23cf|[\u23e9-\u23f3]|[\u23f8-\u23fa]|\ud83c\udccf|\u2934|\u2935|[\u2190-\u21ff])$/g;
console.log(regex.test("????♂?")); // false
console.log(regex.test("?")); // true
// Second run with a fresh regex object
regex = /^(?:[\u2700-\u27bf]|(?:\ud83c[\udde6-\uddff]){2}|[\ud800-\udbff][\udc00-\udfff]|[\u0023-\u0039]\ufe0f?\u20e3|\u3299|\u3297|\u303d|\u3030|\u24c2|\ud83c[\udd70-\udd71]|\ud83c[\udd7e-\udd7f]|\ud83c\udd8e|\ud83c[\udd91-\udd9a]|\ud83c[\udde6-\uddff]|[\ud83c[\ude01\uddff]|\ud83c[\ude01-\ude02]|\ud83c\ude1a|\ud83c\ude2f|[\ud83c[\ude32\ude02]|\ud83c\ude1a|\ud83c\ude2f|\ud83c[\ude32-\ude3a]|[\ud83c[\ude50\ude3a]|\ud83c[\ude50-\ude51]|\u203c|\u2049|[\u25aa-\u25ab]|\u25b6|\u25c0|[\u25fb-\u25fe]|\u00a9|\u00ae|\u2122|\u2139|\ud83c\udc04|[\u2600-\u26FF]|\u2b05|\u2b06|\u2b07|\u2b1b|\u2b1c|\u2b50|\u2b55|\u231a|\u231b|\u2328|\u23cf|[\u23e9-\u23f3]|[\u23f8-\u23fa]|\ud83c\udccf|\u2934|\u2935|[\u2190-\u21ff])$/g;
console.log(regex.test("?")); // true
console.log(regex.test("?")); // false
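The changing results look consistent with how test() behaves on a regex that has the g flag: after a successful match it stores the end position in lastIndex, and the next call starts searching from there. A minimal sketch with a stand-in pattern (/^a$/ is an assumption for illustration; only the g flag matters):

```javascript
const withG = /^a$/g; // stand-in pattern with the g flag
const results = [];
for (let i = 0; i < 3; i++) {
  // test() resumes from lastIndex; a failed match resets lastIndex to 0
  results.push([withG.test("a"), withG.lastIndex]);
}
console.log(results); // [ [ true, 1 ], [ false, 0 ], [ true, 1 ] ]

const withoutG = /^a$/; // same pattern, no g flag: no state is kept between calls
console.log(withoutG.test("a"), withoutG.test("a")); // true true
```

Dropping the g flag (or resetting lastIndex to 0 before each call) should make repeated test() calls independent of each other.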
Is there a way around all this emoji/unicode/regex mess?
Are libraries/APIs the only way?
How do they do it?
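For what it's worth, one library-free direction is to combine the built-in Intl.Segmenter with Unicode property escapes. The sketch below is a heuristic under stated assumptions, not a definitive implementation: isSingleEmoji and emojiStart are hypothetical names, the engine is assumed to support Intl.Segmenter and the Extended_Pictographic and Regional_Indicator properties, and "emoji" is taken to mean a pictographic character, a flag pair, or a keycap sequence.

```javascript
const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });

// Assumed definition of "starts like an emoji": a pictographic character,
// two regional indicators (a flag), or a keycap sequence such as 3 + U+FE0F + U+20E3.
const emojiStart = /^(?:\p{Extended_Pictographic}|\p{Regional_Indicator}{2}|[0-9#*]\uFE0F?\u20E3)/u;

// Hypothetical helper: true when str is exactly one grapheme cluster
// and that cluster begins like an emoji.
function isSingleEmoji(str) {
  const clusters = [...segmenter.segment(str)];
  return clusters.length === 1 && emojiStart.test(str);
}

console.log(isSingleEmoji("\u{1F602}"));                             // true
console.log(isSingleEmoji("\u{1F937}\u{1F3FB}\u200D\u2642\uFE0F")); // true
console.log(isSingleEmoji("3\uFE0F\u20E3"));                         // true
console.log(isSingleEmoji("\u{1F602}a"));                            // false
console.log(isSingleEmoji("\u{1F602}\u{1F602}"));                    // false
console.log(isSingleEmoji("3"));                                     // false
```

Extended_Pictographic is used rather than the Emoji property because the latter also covers plain digits, # and *. Known limits of this sketch: symbols such as the copyright sign count as emoji, and a pictograph followed by an arbitrary combining mark would still pass.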
Asked by luxluxdev (translated from Stack Overflow)