Union Types #824

Merged
merged 31 commits into from
Oct 13, 2014

ahejlsberg
Member

This is the starting implementation of Union Types as proposed in #805. It includes all parts of the core proposal, but not the "Possible Next Steps" (yet). The implementation reflects the decisions we made in the 10/3 design meeting.

@RyanCavanaugh The pull request doesn't yet include new test baselines (and thus the Travis build will appear to fail). The deltas all look good, but a number of tests are now incorrect and need to be modified. I'm wondering if I can get you to help out with that.

@@ -1739,6 +1741,22 @@ module ts {
return type;
}

function parseType(): TypeNode {
Contributor

Right now it looks like it's not possible to have syntax for an array of union types. That seems like it could be problematic in the future. For example, say you have this in your .js:

function foo(x: string | number) {
    return [x];
}

This function will have a type that is inexpressible in the syntax of the language (and thus would be a problem for .d.ts files, as well as for anyone who wants to give things explicit typings).

Member

We can write this as Array<string|number>

Contributor

Great point. We'll likely have to do that if we don't add more syntax. As this is a perfectly reasonable workaround, I'd prefer this approach before adding more syntax. Thanks!
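A minimal sketch of the `Array<string|number>` workaround from this thread, applied to the `foo` example above. It is written against current TypeScript; the `items` variable is only there for illustration.

```typescript
// The generic Array<T> form can name an array of a union type
// without needing any array-specific union syntax.
function foo(x: string | number): Array<string | number> {
    return [x];
}

// Both calls produce the same declared element type, so they concatenate cleanly.
const items: Array<string | number> = foo(42).concat(foo("a"));
```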

It would be great to have union type aliases, to keep the length of type names manageable:

define number | Array<number> | Matrix<number> = DynamicMatrix;

Other values for the define keyword could be alias, type or class
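For comparison, later TypeScript releases did add alias declarations using the `type` keyword, one of the keywords suggested here. A rough sketch of how the alias reads in that form, assuming a hypothetical `Matrix<T>` class and a hypothetical `rank` helper (neither comes from this PR):

```typescript
// Hypothetical stand-in for the Matrix type mentioned in the comment.
class Matrix<T> {
    constructor(public rows: T[][]) {}
}

// The alias gives the long union a short name.
type DynamicMatrix = number | Array<number> | Matrix<number>;

// Illustrative helper: report how many dimensions the value has.
function rank(m: DynamicMatrix): number {
    if (typeof m === "number") {
        return 0;
    }
    if (m instanceof Matrix) {
        return 2;
    }
    return 1; // only Array<number> remains here
}
```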

I imagine this will fall out as a default feature, based on the type inference system of TypeScript, but I think it bears repeating. Union types should have anonymous interfaces:

class A {
    commonToBoth: string;
    uniqueToA: string;
}

class B {
    commonToBoth: string;
    uniqueToB: string;
}

var either: A | B = new A();

//Valid because there is an anonymous interface containing the common members
either.commonToBoth

The anonymous interface would have the form:

interface `unnamable {
    commonToBoth: string;
}

Or go wild and give this interface a name like:

Intersection[A | B] 
Common[A | B]
Interface[A | B] or
Infc[A | B] //for brevity

This does not have any use case I can think of, but perhaps I'm not thinking hard enough.
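A minimal sketch of the behavior being asked for, as it works in current TypeScript: members present on every constituent are accessible on the union without a guard, while members unique to one constituent are not. The property initializers and the `readCommon` helper are illustrative additions.

```typescript
class A {
    commonToBoth = "from A";
    uniqueToA = "only on A";
}

class B {
    commonToBoth = "from B";
    uniqueToB = "only on B";
}

function readCommon(either: A | B): string {
    // Allowed: commonToBoth exists on both A and B.
    // Accessing either.uniqueToA here would be a compile-time error.
    return either.commonToBoth;
}
```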


@DanielRosenwasser Yes I do. So consider this an upvote. =)

Member

@drubino you mean like in #957? 😃

This is a relatively old conversation at this point - you'll find quite a bit has happened recently with respect to union types, such as #914.

Haskell has union types denoted by Either A B. From conversations with members of the Haskell community, I've heard that this feature causes a lot of switch statements, leading to verbose, branching code. I'm not sure how to handle this, but perhaps one could create implicit type checks and blocks that use the order of the types as they are listed in the union type. This keeps the syntax lightweight.

Something like:

var x: A | B | C;
for x do {
    // Handle A
} or {
    // Handle B
} or {
    // Handle C
}

When combined with the anonymous interface idea, along with syntax to combine or skip blocks if logic can be implemented by common members in certain cases, you might be able to enable the user to be quite succinct for common scenarios.

It might also be interesting to have a way of defining union types. Perhaps there is one master type that all others can be cast to, and the cast is specified in the definition. The compiler would then handle optimizations where casting can be proven to be unnecessary:

class DynamicMatrix contains number | Array<number> | Matrix<number> {
     super: Matrix<number>
     (x: number): Array<number> =  x => [x];
     (x: Array<number>): Matrix<number> = x => new Matrix([x]);
     (x: Array<number>): number = x => x[0] 
         when x.length === 1
         otherwise throw 'Array to number conversion is not possible';
     (x: Matrix<number>): Array<number> = x => x[0] 
          when x.numberOfRows === 1
          otherwise throw 'Matrix to Array conversion is not possible';
} 

This is probably a bit more complex than the design team had in mind for union types, but I thought I'd throw it out there all the same.

I believe this, or features like it, would provide us with language constructs that would help to bridge the gap between dynamically typed languages and statically typed ones.

@ahejlsberg
Member Author

Latest commit adds support for type guards along the lines of "Local Meaning of Union Types" in #805. Some examples of what is now supported:

function foo(x: number|string) {
    if (typeof x === "string") {
        return x.length;  // x has type string here
    }
    else {
        return x + 1;  // x has type number here
    }
}

function isLongString(obj: any) {
    return typeof obj === "string" && obj.length > 100;  // obj has type string after guard
}

function processData(data: string|() => string) {
    var d = typeof data !== "string" ? data() : data;  // d has type string
    // Process string in d
}

class NamedItem {
    name: string;
}

function getName(obj: any) {
    return obj instanceof NamedItem ? obj.name : "unknown";
}

The type of a local variable or parameter may be _narrowed_ in the following situations:

  • In an if statement with a condition that contains a type guard.
  • In a conditional expression (?:) with a condition that contains a type guard.
  • In the right operand of an && operation where the left operand contains a type guard.
  • In the right operand of an || operation where the left operand contains a type guard.
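The four narrowing positions listed above can be sketched as follows. This assumes current TypeScript, whose narrowing covers at least these cases; the function names are illustrative and not part of the PR.

```typescript
// 1. if statement: x is string inside the block, number after it.
function viaIf(x: string | number): number {
    if (typeof x === "string") {
        return x.length;
    }
    return x;
}

// 2. Conditional expression: each branch sees the narrowed type.
function viaConditional(x: string | number): number {
    return typeof x === "string" ? x.length : x;
}

// 3. Right operand of &&: reached only when the guard is true.
function viaAnd(x: string | number): boolean {
    return typeof x === "string" && x.length > 3;
}

// 4. Right operand of ||: reached only when the guard is false,
//    so x must be a string there.
function viaOr(x: string | number): boolean {
    return typeof x !== "string" || x.length === 0;
}
```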

A _type guard_ is one of the following:

  • An expression of the form typeof x === "..." or typeof x !== "..." where x is the identifier of a variable or parameter and "..." is a string literal. When the string literal is "string", "number", or "boolean", the type guard either changes the type of x to that primitive type or removes that primitive type from the possible types of x (when x is of a union type). When the string literal is something other than a primitive type name, the type guard removes all primitive types from the possible types of x in cases where the guard is known to be true.
  • An expression of the form x instanceof C where x is the identifier of a variable or parameter and C is a constructor function. When the type guard is known to be true, the type of x is changed to the type of C.prototype.
  • An expression of the form !x where x is a type guard.
  • An expression of the form x && y where one or both of x and y are type guards.
  • An expression of the form x || y where one or both of x and y are type guards.
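A sketch of several of the guard forms listed above (`instanceof`, negation with `!`, and an `&&` chain), again assuming current TypeScript behavior. `NamedItem` mirrors the class from the earlier example with a constructor added; `nameOf` and `isShortName` are hypothetical helpers.

```typescript
class NamedItem {
    constructor(public name: string) {}
}

function nameOf(obj: NamedItem | string | number): string {
    // instanceof guard: obj has type NamedItem in this block.
    if (obj instanceof NamedItem) {
        return obj.name;
    }
    // Negated guard: obj is string | number here, so "not string" leaves number.
    if (!(typeof obj === "string")) {
        return obj.toFixed(0);
    }
    return obj; // only string remains
}

function isShortName(obj: NamedItem | string | number): boolean {
    // Each && operand narrows obj for the operands to its right.
    return typeof obj !== "number" && !(obj instanceof NamedItem) && obj.length < 5;
}
```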

A type guard for a variable x has no effect if the statements or expressions it guards contain assignments to x. For example:

function foo(x: string|number) {
    if (typeof x === "string") {
        x = x.length;  // Fails, x still of type string|number here
    }
    var n = typeof x === "string" ? x.length : x;  // Ok, n has type number
}

Note that type guards affect types of simple variables and parameters only (e.g. x). They have no effect on the types of property accesses or other expressions (such as x.y and x()). The latter would require complicated aliasing analysis.
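Under the rule described here, a guard written directly on a property access such as `w.value` would not narrow it. One workaround, sketched below, is to copy the property into a simple local variable and guard that instead. The `Wrapper` interface and `wrappedLength` function are hypothetical.

```typescript
interface Wrapper {
    value: string | number;
}

function wrappedLength(w: Wrapper): number {
    // Copy the property into a plain local so the guard applies to an identifier.
    const v = w.value;
    return typeof v === "string" ? v.length : v;
}
```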

Also note that it is possible to defeat a type guard by calling a function that changes the type of the guarded variable. It would be quite difficult to exhaustively prove that a variable isn't modified by function calls.
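A small sketch of how a function call can defeat a guard. It assumes current TypeScript, which also keeps the narrowed type across the call; `lengthAfterCall` and `overwrite` are hypothetical names.

```typescript
function lengthAfterCall(x: string | number): unknown {
    // This closure changes the guarded variable behind the checker's back.
    const overwrite = () => { x = 42; };

    if (typeof x === "string") {
        overwrite();
        // Still type-checks as string, but x is 42 at runtime,
        // so .length evaluates to undefined.
        return x.length;
    }
    return x;
}
```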

@CyrusNajmabadi
Contributor

The type guard concept looks fantastic. I really like it.

@DanielRosenwasser
Member

So are

"string" === typeof obj

and

"string" !== typeof obj

not type guards?

@ahejlsberg
Member Author

@DanielRosenwasser Regarding "string" === typeof obj, no I don't think those would be type guards. Nor would all the various forms that can be constructed by adding parentheses. I think there is something to be said for keeping the pattern simple and consistent.

mhegazy and others added 6 commits October 7, 2014 22:48
The new baselines all look correct to me, but obviously a number of the
tests need to be updated to reflect union types and the new behavior of
best common type. This commit does not cover that.
@ahejlsberg
Member Author

Latest commit improves type argument inference for union types.

When inferring to a union type, we first infer to those types in the union that aren't naked type parameters. If that produces no new inferences, and if the union type contains a single naked type parameter, we then infer to that type parameter.

To infer from a union type we infer from each of the types in the union.

With these changes the following now works with zero type annotations in the actual code:

declare class Promise<T> {
    static resolve<T>(value: Promise<T>|T): Promise<T>;
    then<U>(onfulfilled?: (value: T) => Promise<U>|U): Promise<U>;
}

var p1 = Promise.resolve("hello");  // p1: Promise<string>
var p2 = Promise.resolve(p1);       // p2: Promise<string>
var p3 = p1.then(i => {             // p3: Promise<number>
    if (true) {
        return Promise.resolve(10);
    }
    else {
        return i.length;
    }
});
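The same two-step inference can be seen with a smaller generic. This is a sketch under current TypeScript; `Box` and `toBox` are hypothetical and only mimic the shape of `Promise.resolve` above.

```typescript
class Box<T> {
    constructor(public value: T) {}
}

// The parameter type is a union of a generic type and a naked type parameter.
function toBox<T>(value: Box<T> | T): Box<T> {
    // Casts keep the sketch independent of how instanceof narrows a type parameter.
    return value instanceof Box ? (value as Box<T>) : new Box(value as T);
}

// Inferring to Box<T> yields nothing for a string, so T is inferred from the naked T.
const b1 = toBox("hello"); // Box<string>

// Here the Box<T> constituent produces the inference T = string.
const b2 = toBox(b1);      // Box<string>
```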
@@ -294,22 +294,22 @@ class f {
>base2 : typeof base2

var b1 = [ baseObj, base2Obj, ifaceObj ];
->b1 : base[]
->[ baseObj, base2Obj, ifaceObj ] : base[]
+>b1 : iface[]
Member

To confirm, this change is because when BCT finds more than one possible candidate it now chooses one based on 'an order' (the type id internally) rather than the first candidate in the list? This seems innocuous in this case, since it's only observable because the base, base2 and iface types are empty types; if they have members, then iface is chosen today.

Member Author

Right, BCT without a contextual type is now the same as a union type of the elements, and the constituent types of a union type are an unordered set.
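A minimal sketch of the effect, assuming current TypeScript: an array literal with no contextual type gets an element type that is the union of its element types. The variable names are illustrative.

```typescript
// No annotation: the element type is inferred as the union string | number.
const mixed = [1, "two", 3];

// Assignable in both directions, which shows the inferred element type matches.
const annotated: Array<string | number> = mixed;

// Each element still needs a guard before type-specific use.
let stringCount = 0;
for (const element of mixed) {
    if (typeof element === "string") {
        stringCount += 1;
    }
}
```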

The tuple type tests from master need to be updated to reflect the new
best common type behavior from union types. This commit simply accepts
the baselines as they are.
-tests/cases/compiler/contextualTyping33.ts(1,66): error TS2345: Argument of type '{}[]' is not assignable to parameter of type '{ (): number; (i: number): number; }[]'.
-Type '{}' is not assignable to type '{ (): number; (i: number): number; }'.
+tests/cases/compiler/contextualTyping33.ts(1,66): error TS2345: Argument of type 'Array<{ (): number; } | { (): string; }>' is not assignable to parameter of type '{ (): number; (i: number): number; }[]'.
+Type '{ (): number; } | { (): string; }' is not assignable to type '{ (): number; (i: number): number; }':
Member

I think this came up elsewhere but it sure would be nice to display this in lambda form although we don't have a good way to do that right now. It's not actually specific to union types but it becomes slightly worse with them. Something like this:

Array<() => string>

feels much closer to what people write than

Array<{ (): string }>

And gets worse as you add unions:

Array<() => string | () => number>

vs

Array<{ (): number } | { (): string }>

The object literal with call signatures feels a bit like displaying the compiler's internal representation.

Member Author

We would basically have to add parentheses to the type grammar:

() => string
(() => string)[]
(() => string) | (() => number)
((() => string) | (() => number))[]

It wouldn't be that complicated, although we'd have to do a bit of lookahead to disambiguate parentheses from parameter lists in function types.
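Current TypeScript accepts parenthesized types of the shape listed in this comment. A short sketch of how they read in declarations; the variable names are illustrative.

```typescript
// Array of functions returning string.
const thunks: (() => string)[] = [() => "a", () => "b"];

// Union of two function types.
const either: (() => string) | (() => number) = () => 1;

// Array whose elements are that union.
const many: ((() => string) | (() => number))[] = [() => "x", () => 2];

// Calling each thunk and joining the results.
const joined = thunks.map(f => f()).join("");
```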

@ahejlsberg
Member Author

Latest commit fixes issue where overly eager subtype reduction in union types would cause a union type to become any because of a circular reference.

Also, the compiler now simply uses union types for the results of the ?: and || operators instead of the more complicated best common type scheme we had before.
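A sketch of what this means for ordinary code, assuming current TypeScript: the result of `?:` and `||` is typed as a union of the operand types, so the caller narrows it with a guard. The function names are hypothetical.

```typescript
// The result of ?: is a union of the two branch types.
function labelOrCode(useLabel: boolean, label: string): string | number {
    return useLabel ? label : 404;
}

// The result of || is a union of the operand types.
function nameOrFallback(name: string, fallback: number): string | number {
    return name || fallback;
}
```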

@jbaron

jbaron commented Oct 12, 2014

I generated a declaration file for a large UI library (Qooxdoo) that has its own OO framework. Of the 3 main things that were missing to get a very accurate declaration file (and to use that library to its full potential), you guys have already addressed 2 in recent weeks. Great job!!

Things that were missing from TypeScript version 1.0:

  • Protected keyword (done)
  • Union type (almost done I guess)
  • Mixin (can always hope for ;)
@ahejlsberg
Member Author

Latest commit corrects contextual typing with union types.

  • When an object literal is contextually typed by a union type U, the contextual type for a property P in the object literal is a union of the types of the P properties from those types in U that have one.
  • When an array literal is contextually typed by a union type U, the contextual type of an element expression is a union of the numeric index signature types from those types in U that have one.
  • When a function expression is contextually typed by a union type U, a contextual call signature is available only if each type in U that provides a contextual call signature provides the same one.

Some examples:

var f: number | (x: string) => number;
f = 5;
f = x => x.length;  // x is of type string

var a: Array<number | (x: string) => number>;
a = [5, x => x.length];  // x is of type string

interface A {
    p: Array<number>;
}

interface B {
    p: Array<(x: string) => number>;
}

var obj: A | B = {
    p: [x => x.length]  // x is of type string
};

This commit also changes array and object literals to never produce a result of the contextual type, but rather to use union types to express the exact shape seen. Without this change the last example above wouldn't work.

ahejlsberg and others added 4 commits October 13, 2014 11:26
Conflicts:
	src/compiler/checker.ts
	src/compiler/types.ts
	src/services/services.ts
	tests/baselines/reference/assignmentCompatBetweenTupleAndArray.errors.txt
	tests/baselines/reference/bestCommonTypeOfTuple.types
	tests/baselines/reference/bestCommonTypeOfTuple2.types
	tests/baselines/reference/castingTuple.errors.txt
	tests/baselines/reference/contextualTypeWithTuple.errors.txt
	tests/baselines/reference/genericCallWithTupleType.errors.txt
	tests/baselines/reference/indexerWithTuple.types
	tests/baselines/reference/numericIndexerConstrainsPropertyDeclarations.errors.txt
Conflicts:
	src/compiler/types.ts
	src/services/services.ts
ahejlsberg added a commit that referenced this pull request Oct 13, 2014
@ahejlsberg ahejlsberg merged commit e22500d into master Oct 13, 2014
@ahejlsberg ahejlsberg deleted the unionTypes branch October 13, 2014 23:24
@@ -82,7 +82,7 @@ module ts {
return array1.concat(array2);
}

-export function uniqueElements<T>(array: T[]): T[] {
+export function deduplicate<T>(array: T[]): T[] {
Contributor

'distinct'.

'deduplicate' indicates you are mutating the array in place. 'distinct' is the LINQ name for getting back the set of unique elements.
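To make the naming point concrete, here is a sketch of a non-mutating helper under the suggested name. The body is an assumption for illustration, not the implementation from this PR.

```typescript
// Returns a new array holding the first occurrence of each element;
// the input array is left untouched.
function distinct<T>(array: T[]): T[] {
    const result: T[] = [];
    for (const item of array) {
        if (result.indexOf(item) < 0) {
            result.push(item);
        }
    }
    return result;
}
```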
