It seems you are using the COMPUTE_FRAMES
option which will cause the ASM library to merge the types encountered through the possible code paths via getCommonSuperClass
similar to what the old verifier did and somewhat perverting the concept of stack map tables.
As you have already noted, ASM’s implementation of getCommonSuperClass
may return an actually inaccessible type, like a JRE internal base class, and ignores interface relationships. The bigger problem is that you can’t fix this with a different implementation of this method, as the information passed to this method is not sufficient to determine the right type.
The right type is what will be subsequently needed, which, of course, should also be compatible with what is provided via all possible code paths, which is what the verifier will/ought to check. If your code generator is designed in a way that it produces valid code, specifying the subsequently required types should be sufficient to create a valid stack map table entry, but the incoming types passed to getCommonSuperClass
are not sufficient to tell you what will be the required type.
To illustrate the issue, consider the following example class
class Example {
public static CharSequence problematicMethod() {
return Math.random()>0.5? new StringBuilder("x"): new StringBuffer("y");
}
}
and the following code analyzing the compiled (e.g. by javac
) class and what ASM would generate by default when being told to recalculate stack map frames from scratch:
static void printFrame(int nLocal, Object[] local, int nStack, Object[] stack) {
StringBuilder sb = decode(new StringBuilder().append("Locals: "), local, nLocal);
System.out.println(decode(sb.append(", Stack: "), stack, nStack));
}
private static StringBuilder decode(StringBuilder sb, Object[] array, int num) {
if(num==0) return sb.append("[]");
sb.append('[');
for(int ix = 0; ix<num; ix++) {
Object o = array[ix];
if(o==Opcodes.UNINITIALIZED_THIS) sb.append("this <uninit>");
else if(o==Opcodes.INTEGER) sb.append("int");
else if(o==Opcodes.FLOAT) sb.append("float");
else if(o==Opcodes.DOUBLE) sb.append("double");
else if(o==Opcodes.LONG) sb.append("long");
else if(o==Opcodes.NULL) sb.append("null");
else if(o==Opcodes.TOP) sb.append("-");
else sb.append(Type.getObjectType(o.toString()).getClassName());
sb.append(",");
}
sb.setCharAt(sb.length()-1, ']');
return sb;
}
public static void main(String[] args) throws IOException {
final MethodVisitor printFramesMV = new MethodVisitor(Opcodes.ASM5) {
@Override public void visitFrame(int type, int nLocal, Object[] local, int nStack, Object[] stack) {
printFrame(nLocal, local, nStack, stack);
}
};
final ClassVisitor printFrames = new ClassVisitor(Opcodes.ASM5) {
@Override
public MethodVisitor visitMethod(int access, String name, String desc, String signature, String[] exceptions) {
return name.equals("problematicMethod")? printFramesMV: null;
}
};
ClassReader cr = new ClassReader(Example.class.getName());
System.out.println("##original");
cr.accept(printFrames, ClassReader.EXPAND_FRAMES);
ClassWriter cw = new ClassWriter(ClassWriter.COMPUTE_FRAMES);
cr.accept(cw, ClassReader.SKIP_FRAMES);
System.out.println("##from ASM");
new ClassReader(cw.toByteArray()).accept(printFrames, ClassReader.EXPAND_FRAMES);
}
This will print
##original
Locals: [], Stack: []
Locals: [], Stack: [java.lang.CharSequence]
##from ASM
Locals: [], Stack: []
Locals: [], Stack: [java.lang.AbstractStringBuilder]
This shows the same problem you have explained in your question, ASM will generate a frame referring to an implementation specific class. The code generated by javac
refers to the required type, which is compatible with the method’s return type. You could study StringBuilder
and StringBuffer
in getCommonSuperClass
and find out that both implement CharSequence
, but this is not sufficient to understand that CharSequence
is the right type here, as we could simply change the example to
class Example {
public static Appendable problematicMethod() {
return Math.random()>0.5? new StringBuilder("x"): new StringBuffer("y");
}
}
and get
##original
Locals: [], Stack: []
Locals: [], Stack: [java.lang.Appendable]
##from ASM
Locals: [], Stack: []
Locals: [], Stack: [java.lang.AbstractStringBuilder]
Since the incoming classes implement both interfaces, you can’t find out whether CharSequence
or Appendable
is the right merge type, just by looking at the incoming StringBuilder
and StringBuffer
types.
To evaluate this issue further, look at
class Example {
public static Comparable problematicMethod() {
return Math.random()>0.5? BigInteger.valueOf(123): Double.valueOf(1.23);
}
}
which produces
##original
Locals: [], Stack: []
Locals: [], Stack: [java.lang.Comparable]
##from ASM
Locals: [], Stack: []
Locals: [], Stack: [java.lang.Number]
Here, ASM’s result is a public
type, but this common base class doesn’t implement the required Comparable
, so this code is actually broken.
It’s a great luck for all code generators using ASM’s COMPUTE_FRAMES
option, that HotSpot’s verifier has a great tolerance towards interface types or, in other words, that it doesn’t verify the correctness of assignments at all (this includes the receivers of method invocations) when at least one of the two types is an interface.
If you want to generate code that survives verifiers strictly doing their job even for interfaces, you shouldn’t use that option and start to generate stack map frames yourself, by not using the COMPUTE_FRAMES
option and emitting the right visitFrame
calls (or inserting the appropriate nodes if you’re using the tree API).
There seem to be a widespread fear of doing that, but it’s not that complicated. As said before, it basically implies stating what your code generator already knows. It’s actually not about trying to find a common type, it’s about specifying what you will use afterwards and if your code generator is correct, that’s already it, but if not, ASM’s calculation can’t fix the code either.
To stay at your specific example, when dealing with ThaiBuddhistDate
and HijrahDate
you already know that you are processing them as ChronoLocalDate
after the branch merge point (I suppose), whereas ASM ends up at an implementation specific non-public
type, but if that type didn’t exist, ASM just used java.lang.Object
as it doesn’t consider interfaces. If ASM considered interfaces, it had to decide between ChronoLocalDate
and Serializable
, neither being more specific than the other. It’s simply not solvable with this design.
To illustrate further, how different the outcome between “merge the incoming types” and “what will be used” can be, look at
class Example {
public static void problematicMethod() {
if(Math.random()>0.5) {
java.awt.ScrollPane b = new java.awt.ScrollPane();
}
else {
javax.swing.JTabbedPane t = new javax.swing.JTabbedPane();
}
}
}
##original
Locals: [], Stack: []
Locals: [], Stack: []
##from ASM
Locals: [], Stack: []
Locals: [java.awt.Container], Stack: []
Here, ASM wasted resources to find out the common base class in a deep class hierarchy tree, whereas just stating “drop the variable” would be sufficient…