This is arguably a HotSpot JVM bug.
The problem is in the string literal interning mechanism.
java.lang.String instances for the string literals are created lazily during constant pool resolution.
- Initially a string literal is represented in the constant pool by a CONSTANT_String_info structure that points to a CONSTANT_Utf8_info structure.
- Each class has its own constant pool. That is, MyClass and PrintStream each have their own pair of CONSTANT_String_info / CONSTANT_Utf8_info cpool entries for the literal 'true'.
- When a CONSTANT_String_info entry is accessed for the first time, the JVM initiates the process of resolution. String interning is a part of this process.
- To find a match for a literal being interned, the JVM compares the contents of CONSTANT_Utf8_info with the contents of the string instances in the StringTable.
- And here is the problem: the raw UTF data from the cpool is compared to the contents of a Java char[] array, which can be spoofed by a user via Reflection.
So, what's happening in your test?
- f.set("true", f.get("false")) initiates the resolution of the literal 'true' in MyClass.
- The JVM discovers no instances in StringTable matching the sequence 'true', and creates a new java.lang.String, which is stored in StringTable.
- The value of that String from StringTable is replaced via Reflection.
- System.out.println(true) initiates the resolution of the literal 'true' in the PrintStream class.
- The JVM compares the UTF sequence 'true' with the Strings from StringTable, but finds no match, since that String already has the value 'false'. Another String for 'true' is created and placed in StringTable.
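The steps above can be modeled in plain Java without touching the real JVM internals. This is only a sketch: FakeString, StringTableModel and resolve are hypothetical names invented for illustration, the table is a simple list rather than HotSpot's hash table, and standard UTF-8 stands in for the modified UTF-8 used in class files. The point it shows is that a lookup which compares constant pool bytes against a mutable backing array stops finding an entry once that array has been overwritten.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StringTableModel {
    // Hypothetical stand-in for java.lang.String: the backing array is
    // mutable, just as String.value is reachable via Reflection.
    static final class FakeString {
        char[] value;
        FakeString(char[] value) { this.value = value; }
    }

    // Hypothetical stand-in for the JVM's StringTable.
    private final List<FakeString> table = new ArrayList<>();

    // Models literal resolution: the UTF data of a cpool entry is compared
    // with the CURRENT contents of every string already in the table.
    FakeString resolve(byte[] cpoolUtf8) {
        char[] wanted = new String(cpoolUtf8, StandardCharsets.UTF_8).toCharArray();
        for (FakeString s : table) {
            if (Arrays.equals(s.value, wanted)) {
                return s; // match found, reuse the interned instance
            }
        }
        FakeString created = new FakeString(wanted);
        table.add(created);
        return created;
    }

    // Replays the five steps of the walk-through against the model.
    public static boolean literalsShareInstanceAfterSpoofing() {
        StringTableModel jvm = new StringTableModel();
        byte[] cpoolOfMyClass = "true".getBytes(StandardCharsets.UTF_8);
        byte[] cpoolOfPrintStream = "true".getBytes(StandardCharsets.UTF_8);

        FakeString first = jvm.resolve(cpoolOfMyClass);      // steps 1 and 2
        first.value = "false".toCharArray();                 // step 3: spoofed contents
        FakeString second = jvm.resolve(cpoolOfPrintStream); // steps 4 and 5
        return first == second;
    }

    public static void main(String[] args) {
        System.out.println("same instance: " + literalsShareInstanceAfterSpoofing());
        // prints "same instance: false"
    }
}
```

In this model the second resolve call creates a new entry, which mirrors why println(true) still prints 'true' while the literal in MyClass now reads 'false'.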
Why do I think this is a bug?
JLS §3.10.5 and JVMS §5.1 require that string literals containing the same sequence of characters must point to the same instance of java.lang.String.
However, in the following code the resolution of two string literals with the same sequence of characters results in different instances:
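    import java.lang.reflect.Field;

    // Assumes a JDK where String.value is a char[] reachable via Reflection (e.g. JDK 8).
    public class Test {

        static class Inner {
            // Resolved lazily, only when Inner is initialized.
            static String trueLiteral = "true";
        }

        public static void main(String[] args) throws Exception {
            Field f = String.class.getDeclaredField("value");
            f.setAccessible(true);
            // Resolves "true" in Test's constant pool, then overwrites its contents.
            f.set("true", f.get("false"));

            // The same literal from two classes should be the same instance.
            if ("true" == Inner.trueLiteral) {
                System.out.println("OK");
            } else {
                System.out.println("BUG!");
            }
        }
    }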
A possible fix for the JVM is to store a pointer to the original UTF sequence in StringTable along with the java.lang.String object, so that the interning process does not compare cpool data (inaccessible to the user) with value arrays (accessible via Reflection).
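A minimal sketch of that idea, again as a plain Java model rather than actual HotSpot code. All names here (StringTableFixModel, Entry, FakeString, resolve) are hypothetical, and standard UTF-8 is used as a simplification of the class file encoding. Each table entry keeps a private copy of the UTF sequence it was created from, and lookups compare against that copy instead of the mutable array.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StringTableFixModel {
    // Hypothetical stand-in for java.lang.String with a spoofable backing array.
    static final class FakeString {
        char[] value;
        FakeString(char[] value) { this.value = value; }
    }

    // A table entry remembers the UTF sequence the string was created from.
    static final class Entry {
        final byte[] originalUtf8; // never exposed to user code
        final FakeString string;
        Entry(byte[] originalUtf8, FakeString string) {
            this.originalUtf8 = originalUtf8;
            this.string = string;
        }
    }

    private final List<Entry> table = new ArrayList<>();

    // Lookup compares cpool bytes with the stored bytes, not with string.value.
    FakeString resolve(byte[] cpoolUtf8) {
        for (Entry e : table) {
            if (Arrays.equals(e.originalUtf8, cpoolUtf8)) {
                return e.string;
            }
        }
        char[] chars = new String(cpoolUtf8, StandardCharsets.UTF_8).toCharArray();
        FakeString created = new FakeString(chars);
        table.add(new Entry(cpoolUtf8.clone(), created));
        return created;
    }

    // Same scenario as in the test: resolve, spoof, resolve again.
    public static boolean literalsShareInstanceAfterSpoofing() {
        StringTableFixModel jvm = new StringTableFixModel();
        FakeString first = jvm.resolve("true".getBytes(StandardCharsets.UTF_8));
        first.value = "false".toCharArray(); // contents replaced by the user
        FakeString second = jvm.resolve("true".getBytes(StandardCharsets.UTF_8));
        return first == second;
    }

    public static void main(String[] args) {
        System.out.println("same instance: " + literalsShareInstanceAfterSpoofing());
        // prints "same instance: true"
    }
}
```

With this layout both literals resolve to one instance, as the specifications require, even though its contents have been changed. The cost in this model is one extra reference per table entry.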