A string is a sequence of runes. Unicode is a standard for assigning numeric values to those runes. UTF-8 and UTF-16 are standards for encoding a sequence of runes, as represented by their Unicode numeric values, as a sequence of bytes.
What you did there is use encode with the default encoding, which is UTF-8, to get a sequence of bytes, which you then tried to decode back to runes as if those bytes had come from a UTF-16 encoding. Because your input string fits in a one-byte-per-character UTF-8 encoding, you are effectively taking pairs of characters from the input, jamming their bytes together, and hoping that the resulting value is a legal UTF-16 encoding of something, which in general you cannot count on being true. You'll also run into trouble if the UTF-8 encoding is not an even number of bytes, of course.
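For illustration only (your original code may not have been JavaScript, and the exact encodings and byte order involved are assumptions), the same mismatch looks like this with the standard TextEncoder/TextDecoder APIs:

const bytes = new TextEncoder().encode("some random string"); // UTF-8 bytes
// Decoding those same bytes as UTF-16 (little-endian here) pairs them up
// into code units they were never meant to form.
const misread = new TextDecoder("utf-16le").decode(bytes);
console.log(misread); // mojibake: each pair of ASCII bytes is read as one unrelated code unit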
If you really need to do this in JavaScript, you could do something like this:
const str = "some random string";
// One 16-bit slot for every pair of input characters. (Building the view
// directly avoids the RangeError you'd get from viewing an odd-length
// ArrayBuffer as a Uint16Array.)
const bufView = new Uint16Array(Math.floor(str.length / 2));
for (let i = 0; i < str.length - 1; i += 2) {
  const c1 = str.charCodeAt(i);
  const c2 = str.charCodeAt(i + 1);
  if (c1 > 127 || c2 > 127) {
    // This will be a problem. How you handle it is up to you.
  }
  // Jam the two code units together into a single 16-bit value.
  bufView[i / 2] = (c1 << 8) | c2;
}
console.log(String.fromCharCode(...bufView));
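If you later need to undo that packing, here is a sketch of the reverse direction (unpack is a hypothetical helper name, not a standard API); it simply mirrors the c1 << 8 | c2 step above:

function unpack(view) {
  // Split each 16-bit value back into the two characters it was built from.
  let out = "";
  for (const v of view) {
    out += String.fromCharCode(v >> 8, v & 0xFF);
  }
  return out;
}
console.log(unpack(bufView)); // "some random string" (any dropped odd trailing character is gone)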