generics - Why was IEnumerable<T> made covariant in C# 4?

Question

Welcome To Ask or Share your Answers For Others

generics - Why was IEnumerable<T> made covariant in C# 4?

asked Oct 17, 2021 in Technique[技术] by 深蓝 (71.8m points)

generics - Why was IEnumerable<T> made covariant in C# 4?

In earlier versions of C# IEnumerable was defined like this:

public interface IEnumerable<T> : IEnumerable

Since C# 4 the definition is:

public interface IEnumerable<out T> : IEnumerable

Is it just to make the annoying casts in LINQ expressions go away?
Won't this introduce the same problems like with string[] <: object[] (broken array variance) in C#?
How was the addition of the covariance done from a compatibility point of view? Will earlier code still work on later versions of .NET or is recompilation necessary here? What about the other way around?
Was previous code using this interface strictly invariant in all cases or is it possible that certain use cases will behave different now?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙；凝视深渊过久,深渊将回以凝视…

1 Answer

深蓝 · Answer 1 · 2021-10-17T00:13:32+0000

Marc's and CodeInChaos's answers are pretty good, but just to add a few more details:

First off, it sounds like you are interested in learning about the design process we went through to make this feature. If so, then I encourage you to read my lengthy series of articles that I wrote while designing and implementing the feature. Start from the bottom of the page:

Covariance and contravariance blog posts

Is it just to make the annoying casts in LINQ expressions go away?

No, it is not just to avoid Cast<T> expressions, but doing so was one of the motivators that encouraged us to do this feature. We realized that there would be an uptick in the number of "why can't I use a sequence of Giraffes in this method that takes a sequence of Animals?" questions, because LINQ encourages the use of sequence types. We knew that we wanted to add covariance to IEnumerable<T> first.

We actually considered making IEnumerable<T> covariant even in C# 3 but decided that it would be strange to do so without introducing the whole feature for anyone to use.

Won't this introduce the same problems like with string[] <: object[] (broken array variance) in C#?

It does not directly introduce that problem because the compiler only allows variance when it is known to be typesafe. However, it does preserve the broken array variance problem. With covariance, IEnumerable<string[]> is implicitly convertible to IEnumerable<object[]>, so if you have a sequence of string arrays, you can treat that as a sequence of object arrays, and then you have the same problem as before: you can try to put a Giraffe into that string array and get an exception at runtime.

How was the addition of the covariance done from a compatibility point of view?

Carefully.

Will earlier code still work on later versions of .NET or is recompilation necessary here?

Only one way to find out. Try it and see what fails!

It's often a bad idea to try to force code compiled against .NET X to run against .NET Y if X != Y, regardless of changes to the type system.

What about the other way around?

Same answer.

Is it possible that certain use cases will behave different now?

Absolutely. Making an interface covariant where it was invariant before is technically a "breaking change" because it can cause working code to break. For example:

if (x is IEnumerable<Animal>)
    ABC();
else if (x is IEnumerable<Turtle>)
    DEF();

When IE<T> is not covariant, this code chooses either ABC or DEF or neither. When it is covariant, it never chooses DEF anymore.

Or:

class B     { public void M(IEnumerable<Turtle> turtles){} }
class D : B { public void M(IEnumerable<Animal> animals){} }

Before, if you called M on an instance of D with a sequence of turtles as the argument, overload resolution chooses B.M because that is the only applicable method. If IE is covariant, then overload resolution now chooses D.M because both methods are applicable, and an applicable method on a more-derived class always beats an applicable method on a less-derived class, regardless of whether the argument type match is exact or not.

Or:

class Weird : IEnumerable<Turtle>, IEnumerable<Banana> { ... }
class B 
{ 
    public void M(IEnumerable<Banana> bananas) {}
}
class D : B
{
    public void M(IEnumerable<Animal> animals) {}
    public void M(IEnumerable<Fruit> fruits) {}
}

If IE is invariant then a call to d.M(weird) resolves to B.M. If IE suddenly becomes covariant then both methods D.M are applicable, both are better than the method on the base class, and neither is better than the other, so, overload resolution becomes ambiguous and we report an error.

When we decided to make these breaking changes, we were hoping that (1) the situations would be rare, and (2) when situations like this arise, almost always it is because the author of the class is attempting to simulate covariance in a language that doesn't have it. By adding covariance directly, hopefully when the code "breaks" on recompilation, the author can simply remove the crazy gear trying to simulate a feature that now exists.

Categories

generics - Why was IEnumerable<T> made covariant in C# 4?

generics - Why was IEnumerable<T> made covariant in C# 4?

Please log in or register to add a comment.

Please log in or register to answer this question.

1 Answer

Please log in or register to add a comment.

Just Browsing Browsing

Most popular tags