c# - Why the compiler emits box instructions to compare instances of a reference type?

Question

Welcome To Ask or Share your Answers For Others

c# - Why the compiler emits box instructions to compare instances of a reference type?

asked Oct 6, 2021 in Technique[技术] by 深蓝 (71.8m points)

c# - Why the compiler emits box instructions to compare instances of a reference type?

Here is a simple generic type with a unique generic parameter constrained to reference types:

class A<T> where T : class
{
    public bool F(T r1, T r2)
    {
        return r1 == r2;
    }
}

The generated IL by csc.exe is :

ldarg.1
box        !T
ldarg.2
box        !T
ceq

So each parameter is boxed before proceeding with the comparison.

But if the constraint indicates that "T" should never be a value type, why is the compiler trying to box r1 and r2 ?

question from:https://stackoverflow.com/questions/4475576/why-the-compiler-emits-box-instructions-to-compare-instances-of-a-reference-type

与恶龙缠斗过久,自身亦成为恶龙；凝视深渊过久,深渊将回以凝视…

1 Answer

深蓝 · Answer 1 · 2021-10-06T05:05:43+0000

It's required to satisfy the verifiability constraints for the generated IL. Note that unverifiable doesn't necessarily mean incorrect. It works just fine without the box instruction as long as its security context allows running unverifiable code. Verification is conservative and is based on a fixed rule set (like reachability). To simplify things, they chose not to care about presence of generic type constraints in the verification algorithm.

Common Language Infrastructure Specification (ECMA-335)

Section 9.11: Constraints on generic parameters

... Constraints on a generic parameter only restrict the types that the generic parameter may be instantiated with. Verification (see Partition III) requires that a field, property or method that a generic parameter is known to provide through meeting a constraint, cannot be directly accessed/called via the generic parameter unless it is first boxed (see Partition III) or the callvirt instruction is prefixed with the constrained prefix instruction. ...

Removing the box instructions will result in unverifiable code:

.method public hidebysig instance bool 
       F(!T r1,
         !T r2) cil managed
{
   ldarg.1
   ldarg.2
   ceq
   ret
}


c:UsersMehrdadScratch>peverify sc.dll

Microsoft (R) .NET Framework PE Verifier.  Version  4.0.30319.1
Copyright (c) Microsoft Corporation.  All rights reserved.

[IL]: Error: [c:UsersMehrdadScratchsc.dll : A`1[T]::F][offset 0x00000002][fo
und (unboxed) 'T'] Non-compatible types on the stack.
1 Error(s) Verifying sc.dll

UPDATE (Answer to the comment): As I mentioned above, verifiability is not equivalent to correctness (here I'm talking about "correctness" from a type-safety point of view). Verifiable programs are a strict subset of correct programs (i.e. all verifiable programs are demonstrably correct, but there are correct programs that are not verifiable). Thus, verifiability is a stronger property than correctness. Since C# is a Turing-complete language, Rice's theorem states that proving that programs are correct is undecidable in general case.

Let's get back to my reachability analogy as it's easier to explain. Assume you were designing C#. One thing have thought about is when to issue a warning about unreachable code, and to remove that piece of code altogether in the optimizer, but how you are going to detect all unreachable code? Again, Rice's theorem says you can't do that for all programs. For instance:

void Method() {
    while (true) {
    }
    DoSomething();  // unreachable code
}

This is something that C# compiler actually warns about. But it doesn't warn about:

bool Condition() {
   return true;
}

void Method() {
   while (Condition()) {
   }
   DoSomething();  // no longer considered unreachable by the C# compiler
}

A human can prove that control flow never reaches that line in the latter case. One could argue that the compiler could statically prove DoSomething is unreachable in this case too, but it doesn't. Why? The point is you can't do that for all programs, so you should draw the line at some point. At this stage, you have to define a decidable property and call it "reachability". For instance, for reachability, C# sticks to constant expressions and won't look at the contents of the functions at all. Simplicity of analysis and design consistency are important goals in deciding where to draw the line.

Going back to our verifiability concept, it's a similar problem. Verifiability, unlike correctness, is a decidable property. As the runtime designer, you have to decide how to define verifiability, based on performance considerations, easy of implementation, ease of specification, consistency, making it easy for the compiler to confidently generate verifiable code. Like most design decisions, it involves a lot of trade-offs. Ultimately, the CLI designers have decided that they prefer not too look at generic constraints at all when they are checking for verifiability.

Categories

c# - Why the compiler emits box instructions to compare instances of a reference type?

c# - Why the compiler emits box instructions to compare instances of a reference type?

Please log in or register to add a comment.

Please log in or register to answer this question.

1 Answer

Common Language Infrastructure Specification (ECMA-335)

Section 9.11: Constraints on generic parameters

Please log in or register to add a comment.

Just Browsing Browsing

Most popular tags