
IEnumerable<T> is covariant, but it does not support value types, only reference types. The simple code below compiles successfully:

IEnumerable<string> strList = new List<string>();
IEnumerable<object> objList = strList;

But changing string to int produces a compile error:

IEnumerable<int> intList = new List<int>();
IEnumerable<object> objList = intList;

The reason is explained on MSDN:

Variance applies only to reference types; if you specify a value type for a variant type parameter, that type parameter is invariant for the resulting constructed type.

I have searched and found some questions saying the reason is boxing between value types and reference types, but it is still not clear to me why boxing is the reason.

Could someone please give a simple and detailed explanation of why covariance and contravariance do not support value types, and how boxing affects this?


4 Answers


Basically, variance applies when the CLR can ensure that it doesn't need to make any representational change to the values. References all look the same - so you can use an IEnumerable<string> as an IEnumerable<object> without any change in representation; the native code itself doesn't need to know what you're doing with the values at all, so long as the infrastructure has guaranteed that it will definitely be valid.

For value types, that doesn't work - to treat an IEnumerable<int> as an IEnumerable<object>, the code using the sequence would have to know whether to perform a boxing conversion or not.
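To make that concrete (my example, not part of the original answer; it assumes a using System.Linq; directive), the boxing has to be requested explicitly, for instance with LINQ's Cast operator, and then each element is boxed as it is enumerated:

IEnumerable<int> ints = new List<int> { 1, 2, 3 };
// IEnumerable<object> objects = ints;             // does not compile: no variance for value types
IEnumerable<object> objects = ints.Cast<object>(); // fine: boxes each int lazily during enumeration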

You might want to read Eric Lippert's blog post on representation and identity for more on this topic in general.

EDIT: Having reread Eric's blog post myself, it's at least as much about identity as representation, although the two are linked. In particular:

This is why covariant and contravariant conversions of interface and delegate types require that all varying type arguments be of reference types. To ensure that a variant reference conversion is always identity-preserving, all of the conversions involving type arguments must also be identity-preserving. The easiest way to ensure that all the non-trivial conversions on type arguments are identity-preserving is to restrict them to be reference conversions.
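As a small illustration of identity preservation (my addition, not from the blog post): converting a string reference to object preserves identity, while boxing the same int twice produces two distinct objects:

string s = "abc";
object o = s;                             // reference conversion: no new object
Console.WriteLine(ReferenceEquals(s, o)); // True: identity preserved

int i = 42;
object a = i;                             // boxing: first heap object
object b = i;                             // boxing again: a second, distinct heap object
Console.WriteLine(ReferenceEquals(a, b)); // False: identity not preserved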

  • @CuongLe: Well it's an implementation detail in some senses, but it's the underlying reason for the restriction, I believe.
    – Jon Skeet
    Commented Sep 17, 2012 at 8:25
  • @AndréCaron: Eric's blog post is important here - it's not just representation, but also identity preservation. But representation preservation means the generated code doesn't need to care about this at all.
    – Jon Skeet
    Commented Sep 17, 2012 at 12:54
  • Precisely, identity cannot be preserved because int is not a subtype of object. The fact that a representational change is required is just a consequence of this. Commented Sep 17, 2012 at 17:16
  • How is int not a subtype of object? Int32 inherits from System.ValueType, which inherits from System.Object. Commented Nov 6, 2018 at 3:39
  • @DavidKlempfner I think the @AndréCaron comment is poorly phrased. Any value type such as Int32 has two representational forms, "boxed" and "unboxed". The compiler has to insert code to convert from one form to the other, even though this is normally invisible at the source code level. In effect, only the "boxed" form is considered by the underlying system to be a subtype of object, but the compiler automatically deals with this whenever a value type is assigned to a compatible interface or to something of type object; see the sketch after these comments.
    – Steve
    Commented Jul 20, 2020 at 14:03
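A minimal sketch of the two forms described in the comment above (my addition; the variable names are illustrative):

int unboxed = 5;                    // unboxed form: just the 32-bit value
object boxed = unboxed;             // compiler emits a box instruction: a new heap object
IComparable viaInterface = unboxed; // assigning to an implemented interface boxes too
int copy = (int)boxed;              // explicit unbox copies the value back out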

It is perhaps easier to understand if you think about the underlying representation (even though this really is an implementation detail). Here is a collection of strings:

IEnumerable<string> strings = new[] { "A", "B", "C" };

You can think of the strings as having the following representation:

[0] : string reference -> "A"
[1] : string reference -> "B"
[2] : string reference -> "C"

It is a collection of three elements, each being a reference to a string. You can cast this to a collection of objects:

IEnumerable<object> objects = (IEnumerable<object>) strings;

Basically it is the same representation except now the references are object references:

[0] : object reference -> "A"
[1] : object reference -> "B"
[2] : object reference -> "C"

The representation is the same. The references are just treated differently; you can no longer access the string.Length property but you can still call object.GetHashCode(). Compare this to a collection of ints:

IEnumerable<int> ints = new[] { 1, 2, 3 };

The representation is:

[0] : int = 1
[1] : int = 2
[2] : int = 3

To convert this to an IEnumerable<object> the data has to be converted by boxing the ints:

[0] : object reference -> 1
[1] : object reference -> 2
[2] : object reference -> 3

This conversion requires more than a cast.
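In code (my illustration, assuming using System.Linq;), the conversion must be spelled out so each element can be boxed:

IEnumerable<int> ints = new[] { 1, 2, 3 };
// IEnumerable<object> objects = (IEnumerable<object>) ints;  // compiles, but throws InvalidCastException at run time
IEnumerable<object> objects = ints.Select(i => (object)i);    // boxes each int into a new heap object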

  • Boxing is not just an "implementation detail". Boxed value types are stored the same way as class objects, and behave, as far as the outside world can tell, like class objects. The only difference is that within the definition of a boxed value type, this refers to a struct whose fields overlay those of the heap object that stores it, rather than referring to the object which holds them. There is no clean way for a boxed value type instance to get a reference to the enclosing heap object.
    – supercat
    Commented Nov 20, 2012 at 22:31

I think everything starts from the definition of the LSP (Liskov Substitution Principle), which claims:

if q(x) is a property provable about objects x of type T then q(y) should be true for objects y of type S where S is a subtype of T.

But value types, for example int, cannot be substituted for object in C#. The proof is very simple:

int myInt = new int();
object obj1 = myInt; // boxing: allocates a new object on the heap
object obj2 = myInt; // boxing again: a second, distinct heap object
Console.WriteLine(ReferenceEquals(obj1, obj2)); // False

This prints False even though we assign the same "reference" to both objects.
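For contrast (my addition), the same test with a reference type prints True, because the assignment is a pure reference conversion:

string myStr = "abc";
object obj1 = myStr; // reference conversion: no new object is created
object obj2 = myStr;
Console.WriteLine(ReferenceEquals(obj1, obj2)); // True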

  • I think you're using the right principle but there's no proof to be made: int is not a subtype of object so the principle does not apply. Your "proof" relies on an intermediate representation Integer, which is a subtype of object and for which the language has an implicit conversion (object obj1 = myInt; is actually expanded to object obj1 = new Integer(myInt);). Commented Sep 17, 2012 at 12:05
  • The language takes care of correct casting between types, but int's behaviour does not correspond to what we would expect from a subtype of object.
    – Tigran
    Commented Sep 17, 2012 at 12:10
  • My whole point is precisely that int is not a subtype of object. Moreover, LSP does not apply because myInt, obj1 and obj2 refer to three different objects: one int and two (hidden) Integers. Commented Sep 17, 2012 at 17:13
  • @André: C# is not Java. C#'s int keyword is an alias for the BCL's System.Int32, which is in fact a subtype of object (an alias of System.Object). In fact, int's base class is System.ValueType, whose base class is System.Object. Try evaluating the following expression and see: typeof(int).BaseType.BaseType. The reason ReferenceEquals returns false here is that the int is boxed into two separate boxes, and each box's identity is different from any other box's. Thus two boxing operations always yield two objects that are never identical, regardless of the boxed value. Commented Sep 18, 2012 at 21:43
@AllonGuralnek: Each value type (e.g. System.Int32 or List<String>.Enumerator) actually represents two kinds of things: a storage-location type, and a heap-object type (sometimes called a "boxed value type"). Storage locations whose types derive from System.ValueType will hold the former; heap objects whose types do likewise will hold the latter. In most languages, a widening cast exists from the former to the latter, and a narrowing cast from the latter to the former. Note that while boxed value types have the same type descriptor as value-type storage locations, ...
    – supercat
    Commented Nov 20, 2012 at 16:37

It does come down to an implementation detail: value types are implemented differently from reference types.

If you force value types to be treated as reference types (i.e. box them, e.g. by referring to them via an interface) you can get variance.
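For example (my sketch): if the type argument is itself a reference type such as an interface, each int is boxed as it is stored, and variance then works as usual:

List<IComparable> values = new List<IComparable> { 1, 2, 3 }; // each int is boxed on insertion
IEnumerable<object> objects = values;                         // fine: IComparable is a reference type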

The easiest way to see the difference is to consider an array: an array of value types is laid out contiguously in memory, with the values stored directly, whereas an array of reference types has only the references (pointers) contiguous in memory; the objects they point to are allocated separately.
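Array covariance (my illustration) follows the same rule; it is allowed exactly when the element representations match:

string[] strings = { "A", "B", "C" };
object[] objects = strings;        // legal: array covariance, references have the same representation
// object[] numbers = new int[3];  // does not compile: int[] has a different memory layout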

The other (related) issue(*) is that (almost) all reference types have the same representation for variance purposes, so much code does not need to know the difference between the types; co- and contra-variance is therefore possible (and easily implemented, often just by omitting extra type checking).

(*) It may be seen to be the same issue...
