Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
672 views
in Technique[技术] by (71.8m points)

unicode - What is proper way to test if the input is Korean or Chinese using JavaScript?

My application was relying on this function to test if a string is Korean or not :

const isKoreanWord = (input) => {
  const match = input.match(/[u3131-uD79D]/g);
  return match ? match.length === input.length : false;
}

isKoreanWord('??'); // true
isKoreanWord('mandu'); // false

until I started to include Chinese support and now this function is incoherent :

isKoreanWord('幹嘛'); // true

I believe this is caused by the fact that Korean characters and Chinese ones are intermingled into the same Unicode range.

How should I correct this function to make it returns true if the input contains only Korean characters ?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

Here is the unicode range you need for Hangul (Taken from their wikipedia page).

U+AC00–U+D7AF
U+1100–U+11FF
U+3130–U+318F
U+A960–U+A97F
U+D7B0–U+D7FF

So your regex .match should look like this:

const match = input.match(/[uac00-ud7af]|[u1100-u11ff]|[u3130-u318f]|[ua960-ua97f]|[ud7b0-ud7ff]/g);

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

2.1m questions

2.1m answers

60 comments

56.9k users

...