Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix syntax #62

Open
pitag-ha opened this issue Jul 13, 2021 · 8 comments
Open

Fix syntax #62

pitag-ha opened this issue Jul 13, 2021 · 8 comments

Comments

@pitag-ha
Copy link
Member

With the syntax type t = [%import: Foo.bla], the [%import: Foo.bla] is written at a place where the OCaml parser expects a core type extension node. But ppx_import doesn't only want to inject a core type (ptype_manifest) into the corresponding type declaration. It also wants to inject the concrete type definition (ptype_kind), which breaks with the idea of an extension node being local.

To express ppx_import in an OCaml AST conform syntax, it would have to be written as a extension node at the level of a structure item. Derivers such as sexp or yojson would just be packed into the payload. That would look as follows:

[%%import : type t = Foo.bla [@@deriving sexp,yojson]]

or, expressed in a nicer way:

type%import t = Foo.bla [@@deriving sexp,yojson]

That syntax would be conform with the OCaml AST and, as a nice "side-effect", it would also be convenient from a source-code point of view.

Of course, releasing that syntax change would break all users. But it doesn't break the semantics or features, it only breaks the syntax and so one could use a simple sed command on a project in order to upgrade to the new ppx_import. You could also prepare for the breaking change by making a 1.x release allowing both syntaxes, but deprecating the old one.

What do you think about the proposed syntax in general? And what do you think about introducing it in a major release?

(cc @NathanReb since we've been talking about this)

@pitag-ha pitag-ha mentioned this issue Jul 13, 2021
@pitag-ha pitag-ha changed the title Fix syntax to: type%import t = Foo.bla [@@deriving sexp,yojson] Fix syntax to: type%import t = Foo.bla Jul 13, 2021
@pitag-ha pitag-ha changed the title Fix syntax to: type%import t = Foo.bla Fix syntax Jul 13, 2021
@pitag-ha
Copy link
Member Author

@gasche has pointed out here (in 3.) what I meant with "ppx_import doesn't conform to the OCaml AST":

When you say that ppx_import does not conform to the OCaml AST, you mean that the interpretation of the extension point depends on its placement in a wider context, and transforms other parts of the program than just the AST node it is placed at.

The end of the sentence captures exactly what I want to say: the expansion of the node spreads onto the rest of the tree. From what I understand, extension nodes are meant to be expanded (as nodes), not to expand the rest of the tree.

@gasche
Copy link
Contributor

gasche commented Jul 14, 2021

The proposed syntax is okay, and with the type%import version it is actually nicer than the current syntax. However I don't understand how [@@deriving] directives would be handled with this version. Currently, if I understand correctly, ppx_import is not aware of any ulterior transformation that occur after it imports the definition; it works with [@@deriving] or other extensions without explicit coordination. Does the proposal you have in mind require explicitly handling these attributes in some way? Or is the idea just to collect those attributes and re-apply them to the expanded structure/signature items? But if it is the latter, why not use [%%import type t = foo] [@@deriving bar]?

@gasche
Copy link
Contributor

gasche commented Jul 14, 2021

From what I understand, extension nodes are meant to be expanded (as nodes), not to expand the rest of the tree.

Outside ppxlib, there are no strict requirements on how extension nodes should be interpreted by processors. There are also several good reasons why the interpretation of an extension node may sometimes want to leak outside its initial AST placement. Consider for example an extension that expands to an expensive computation that does not depend on variables in its lexical scope, and only needs to be done once per program execution. You want

let foo x = [%expensive computation]

to generate something like

let __expensive_computation = lazy ....
let foo x = Lazy.force __expensive_computation

One way to work around this issue is to require that such extension nodes are themselves embedded into larger extensions that span a larger scope than the possible rewriting. So one would have to write

begin%with_expensive_computations_inside
let foo x = [%expensive computation]
end;;

and have the interpretation of the outer extension do the job of lifting the fragments to the toplevel. However this approach:

  • is painful to users, in most cases for no clear reasons to them
  • does not actually improve locality, as you trade a small technically-non-local extension (which morally has a local semantics) into local-in-principle extension with a much larger scope, possibly the whole module. Those local-in-principle extensions may not actually improve predictability or performance.

Remark: this situation I describe is not the situation of ppx_import, for which we may reasonably give a local semantics by just increasing the scope slightly. However it does suggest that supporting non-local expansions is useful/important in the general case.

@pitag-ha
Copy link
Member Author

Ok, I see your point. Thanks for the example! So I take back my general statement about extension nodes only meant to be local in general.

Maybe we can agree on the following? For performance and for predictability, extension nodes ideally don't leak outside their original AST placement unless it's in their nature to do so. It is not in the nature of ppx_import to leak outside its original AST placement: it currently does so due to a choice made about its syntax that restricts its scope one step too much. Does that sound right to you?

@pitag-ha
Copy link
Member Author

Does the proposal you have in mind require explicitly handling these attributes in some way? Or is the idea just to collect those attributes and re-apply them to the expanded structure/signature items?

The latter.

But if it is the latter, why not use [%%import type t = foo] [@@deriving bar]?

That's because if you write [%%import type t = foo] [@@deriving bar], the parser expects bar to be an attribute attached to an extension point, but bar is likely written as a type declaration attribute. You can see in this test that [%%import type t = foo] [@@deriving bar] would fail with "Error: Attributes not allowed here" (I was surprised myself yesterday when I wrote the test).

@gasche
Copy link
Contributor

gasche commented Jul 14, 2021

For performance and for predictability, extension nodes ideally don't leak outside their original AST placement unless it's in their nature to do so. It is not in the nature of ppx_import to leak outside its original AST placement: it currently does so due to a choice made about its syntax that restricts its scope one step too much. Does that sound right to you?

Yes, that sounds perfectly right to me.

(Note that, to someone who is not aware of the current AST structure, it is not clear that the extension handling is non-local: for all they can see, type t = [%import Foo.t] gets turned into something like type t = Foo.t = A | B, which at a source level seems to correspond to filling what's after the =.)

I think that the approach that you propose is nicer than the current one. (Especially now that I'm un-confused about the way attributes would be handled; thanks!). But it is not clear to me that the change is worth breaking all user code. Few things are, and I don't know that this particular case has benefits that outweigh the costs, except of course for the rather directly obvious benefit "it pleases ppxlib people, who maintain the whole ppx infrastructure in practice".

@pitag-ha
Copy link
Member Author

except of course for the rather directly obvious benefit "it pleases ppxlib people, who maintain the whole ppx infrastructure in practice".

Another important benefit would be the improvement in performance!

has benefits that outweigh the costs

I agree that usually the cost of a breaking change is high and therefore there must be really good reasons to accept that cost. But take into account that in this particular case the cost of the breaking change is quite low since upgrading to the new version would really just be as simple as running a command along the lines

sed 's/type\(.*\)\[%import:\([^]]*\)\]/type%import\1\2/g'

one time.

@ejgallego
Copy link
Collaborator

IMVHO, the new syntax has several advantages, and given that there are only 10 rev deps in opam for ppx_import I think it'd be reasonable to deprecate it and remove it in a 2.0 version.

Personally, I am a fan of this kind of "little" improvements which make maintenance easier. On their own may not seem much, but in the wider context they do really add up.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants