-
Notifications
You must be signed in to change notification settings - Fork 67
Commit
- Loading branch information
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,158 @@ | ||
Author: John Högberg <john(at)erlang(dot)org>, Ilya Klyuchnikov <ilya(dot)klyuchnikov(at)gmail(dot)com> | ||
Status: Draft | ||
Type: Standards Track | ||
Created: 7-Aug-2024 | ||
Erlang-Version: OTP-28 | ||
Post-History: | ||
**** | ||
EEP 71: Clarify the type documentation, especially regarding type variables | ||
---- | ||
|
||
Abstract | ||
======== | ||
|
||
Type variables are an often-used yet very misunderstood part of the type | ||
language. As `dialyzer` has been lax in enforcing their defined semantics owing | ||
to a few bugs, many users have been led to believe that they are simple generics | ||
as in most other languages. As a significant amount of code has been written | ||
with this in mind, OTP cannot fix the bugs that made `dialyzer` lax by accident | ||
without breaking the analysis of a lot of code. | ||
|
||
While that would be fine if `dialyzer` were the only tool out there, all the | ||
other code analyzers we know of interpret type variables as generics which would | ||
fragment the ecosystem should we elect to fix the bugs. One application might | ||
have been written with `eqWAlizer` in mind and another with `dialyzer`, | ||
preventing either tool from being applied to the whole release. | ||
|
||
As `dialyzer` is simply broken in this regard, this EEP aims to solve the | ||
problem by changing the definition of type variables from _equality constraints_ | ||
to _generics_ (parametric polymorphism). We also propose several other changes | ||
to make the type language more useful and less ambiguous. | ||
|
||
Rationale | ||
========= | ||
|
||
As currently documented, variables in `-spec`s and `-type`s act just like in | ||
ordinary code. That is, if the variable `V` appears several times it has the | ||
same _value_ in all places. In a language like C++ or Java where said value is a | ||
type, this works pretty well: | ||
|
||
|
||
Check failure on line 40 in eeps/eep-0071.md GitHub Actions / markdownlintMultiple consecutive blank lines
Check failure on line 40 in eeps/eep-0071.md GitHub Actions / markdownlintMultiple consecutive blank lines
|
||
template<typename V> | ||
V foo(V a, V b) { | ||
return a + b; | ||
} | ||
``` | ||
Check failure on line 45 in eeps/eep-0071.md GitHub Actions / markdownlintFenced code blocks should be surrounded by blank lines
Check failure on line 45 in eeps/eep-0071.md GitHub Actions / markdownlintFenced code blocks should have a language specified
Check failure on line 45 in eeps/eep-0071.md GitHub Actions / markdownlintCode block style
Check failure on line 45 in eeps/eep-0071.md GitHub Actions / markdownlintFenced code blocks should be surrounded by blank lines
Check failure on line 45 in eeps/eep-0071.md GitHub Actions / markdownlintFenced code blocks should have a language specified
Check failure on line 45 in eeps/eep-0071.md GitHub Actions / markdownlintCode block style
|
||
For example, `foo(12, 34)` works because both arguments have the same type `int` | ||
(yielding `int` for `T`), but `foo(12, 34.0)` doesn't because `int` is not | ||
`double`. | ||
However, as the documentation for the Erlang type language says that repeated | ||
variables refer to the same _value at runtime_, tools like `gradualizer` and | ||
`eqWAlizer` are not free to treat them as generics. For example: | ||
-spec my_add(T, T) -> T. | ||
my_add(A, B) -> A + B. | ||
Here, the only possible values of `T` that will work are the various | ||
representations of zero: `foo(12, 34)` will fail because `12` is not `34`, and | ||
`foo(1, 1)` will fail because the result `2` is not `1`. | ||
This makes it impossible to declare that a function takes a value of a certain | ||
type and returns something of the same type (but not necessarily _same value_). | ||
This makes sense in `dialyzer` as values and types are the same thing in its | ||
paradigm[^1], but restricts what other type checkers can do; while the | ||
documentation states that the "extra information" given by type variables may be | ||
ignored, it leaves no room for other interpretations when they are taken into | ||
account. | ||
[^1]: In an untyped language we cannot simply say that the type of `1` is | ||
`integer()` since `1 | foo` is just as much of a superset of `1` as | ||
`integer()` is. The type of `1` is simply `1`. | ||
Changes | ||
---------------- | ||
To solve this issue and make the type language less surprising, we propose the | ||
following changes: | ||
1. Redefine `::` as alias instead of "subtype of." | ||
All currently known tools implement this interpretation[^2]. Should bounded | ||
quantification be needed in the future, the "subtype of" operator can be | ||
introduced later under the more widely used `<:` or `=<` syntaxes. | ||
Multiple aliases under the same name are rejected, making the following illegal: | ||
-spec multiple(X :: integer(), X :: integer()) -> atom(). | ||
-spec when_multiple(X) -> atom() when X :: integer(), X :: number(). | ||
To prevent breaking existing code as a result of this change, the compiler | ||
will _not_ enforce the above, leaving that to the type checking tools (or | ||
optionally a compiler flag). | ||
2. Normalize types in the parser, substituting aliases and treating "annotated | ||
types" the same as `when`. This makes the following signatures equivalent: | ||
-spec xyzzy(A) -> term() when A :: number(). | ||
-spec waldo(B :: number()) -> term(). | ||
-spec fred(number()) -> term(). | ||
Where `B :: number()` is mere shorthand for `B ... when B :: number()`, and | ||
all variants declare that the first argument is a `number()`. | ||
3. Treat type variables as generics rather than as equality constraints. | ||
Example: | ||
The current `lists:append` type spec is: | ||
-spec append(List1, List2) -> List3 when List1 :: [T], List2 :: [T], List3 :: [T], T :: term(). | ||
With the defined semantics, `T :: term()` makes the `append` function to | ||
refer to a top type. Instead, the following is preferred and more in line | ||
with current type systems: | ||
-spec append(List1, List2) -> List3 when List1 :: [T], List2 :: [U], List3 :: [T | U]. | ||
4. Have function contracts (`-spec`) describe argument types as they go in. | ||
A very surprising aspect of function contracts is that they describe the | ||
types of the arguments _when the function returns successfully_, and do not | ||
affect argument types _within_ the function. For example: | ||
-spec foo(integer()) -> integer(). | ||
foo(X) -> | ||
bar(X), %% X is not necessarily an integer here! | ||
X + 1. | ||
Most people take the above to mean that the type passed on to `bar/1` is | ||
`integer()`, yet it can be whatever is passed to `foo/1` including values | ||
that cause `foo/1` to fail. This often hides the fact that there is | ||
unreachable code in callees such as `bar/1` here, and sometimes results in | ||
cryptic warnings describing something the user believes could never happen. | ||
This also makes it much easier for a user to see what the type of a variable | ||
is at any point: just start with what it says in the `-spec`. At present, | ||
the type of `X` depends entirely on how `X` got there which is not always | ||
trivial to figure out. | ||
[^2]: `dialyzer` also does this as a consequence of the bugs mentioned in the abstract. | ||
Reference Implementation | ||
------------------------ | ||
-- TBD, we'll need to make a PR with changes in the documentation to reflect | ||
above once we're happy with the above. -- | ||
Copyright | ||
========= | ||
This document is placed in the public domain or under the CC0-1.0-Universal | ||
license, whichever is more permissive. | ||