javascript - What is this type of string called?

asked Oct 7, 2021 in Technique[技术] by 深蓝 (71.8m points)

In Python, we can do something like print("some random string".encode().decode('utf-16')), which will output: 潳敭爠湡潤?瑳楲杮.

I feel like that is UTF-16, but I'm not really sure, because I can't reproduce it in any other language. My goal is to create a function that does exactly this, but in JavaScript. The problem is that I can't find out what this type of string is called.

Does anyone know what this is called, and/or how I could reproduce it in JS?

question from:https://stackoverflow.com/questions/65852658/what-is-this-type-of-string-called

1 Answer

深蓝 · Answer 1 · 2021-10-06T19:28:48+0000

A string is a sequence of characters (sometimes called runes). Unicode is a standard that assigns a numeric value, a code point, to each of those characters. UTF-8 and UTF-16 are standards for encoding a sequence of characters, as represented by their Unicode code points, as a sequence of bytes.
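To make the distinction concrete, here is a minimal sketch that prints the code points of a string next to its UTF-8 bytes and its UTF-16 code units. The helper name describe is hypothetical and not part of the original answer; TextEncoder is the standard API available in browsers and Node.js.

```javascript
// Hypothetical helper (not from the original answer): three views of one string.
function describe(text) {
  return {
    // Unicode code points: one number per character.
    codePoints: Array.from(text, (ch) => ch.codePointAt(0)),
    // The bytes produced by encoding those code points as UTF-8.
    utf8Bytes: Array.from(new TextEncoder().encode(text)),
    // The 16-bit code units that JavaScript strings are made of (UTF-16).
    utf16Units: Array.from({ length: text.length }, (_, i) => text.charCodeAt(i)),
  };
}

console.log(describe("\u00e9"));    // e with an acute accent
console.log(describe("\u{1F600}")); // an emoji outside the 16-bit range
```

The same character can therefore be one code point, two or four UTF-8 bytes, and one or two UTF-16 code units, which is why bytes written in one encoding rarely mean the same thing when read in another.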

What you did there is call encode with its default encoding, UTF-8, to get a sequence of bytes, and then decode those bytes back to characters as if they had come from a UTF-16 encoding. Because every character in your input is ASCII, and so takes exactly one byte in UTF-8, you are effectively taking pairs of characters from the input, jamming their bytes together into one 16-bit value, and hoping that the result is a legal UTF-16 encoding of something (which in general you cannot count on). Judging by your output, the bytes are combined in little-endian order: the first character of each pair supplies the low byte and the second supplies the high byte. You'll also run into issues if the UTF-8 encoding does not have an even number of bytes, of course.
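As a small worked example of that pairing, the first two characters of the question's input, "s" and "o", can be combined by hand. This sketch assumes little-endian order, which is what the output in the question implies; the variable names are illustrative only.

```javascript
// "s" is 0x73 and "o" is 0x6f; both fit in a single UTF-8 byte.
const lowByte = "s".charCodeAt(0);
const highByte = "o".charCodeAt(0);

// Little-endian: the first byte is the low half of the 16-bit value.
const codeUnit = lowByte | (highByte << 8);

console.log(codeUnit.toString(16));          // 6f73
console.log(String.fromCharCode(codeUnit));  // the first character of the question's output
```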

If you really need to do this thing in javascript, you could do something like this:

const str = "some random string";
// One 16-bit code unit per pair of input characters.
// A trailing unpaired character is dropped.
const units = new Uint16Array(Math.floor(str.length / 2));
for (let i = 0; i + 1 < str.length; i += 2) {
  const c1 = str.charCodeAt(i);
  const c2 = str.charCodeAt(i + 1);
  if (c1 > 127 || c2 > 127) {
    // This will be a problem: non-ASCII characters take more than one
    // byte in UTF-8. How you handle it is up to you.
  }
  // Little-endian: the first byte is the low byte, matching the Python output.
  units[i / 2] = c1 | (c2 << 8);
}
console.log(String.fromCharCode.apply(String, units));
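An alternative sketch that also copes with non-ASCII input is to let the standard TextEncoder and TextDecoder APIs do both steps: encode the string to UTF-8 bytes, then decode those bytes as little-endian UTF-16. The function name reinterpretUtf8AsUtf16 is hypothetical, and the choice of "utf-16le" assumes the little-endian order seen in the question's output.

```javascript
// Hypothetical helper: mimic Python's s.encode().decode('utf-16') on a
// little-endian machine, for input that carries no byte order mark.
function reinterpretUtf8AsUtf16(text) {
  // Step 1: the string's UTF-8 bytes (what Python's encode() returns by default).
  const bytes = new TextEncoder().encode(text);
  // Step 2: read those same bytes back as little-endian UTF-16.
  return new TextDecoder("utf-16le").decode(bytes);
}

console.log(reinterpretUtf8AsUtf16("some random string"));
```

One difference to be aware of: by default TextDecoder replaces malformed input, such as a leftover single byte at the end, with U+FFFD rather than raising an error as Python does. Passing { fatal: true } as the second constructor argument asks it to throw instead.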
