fpirsch edited this page Nov 14, 2013 · 2 revisions

Now discussed at #133.

Proposal: CST By Extending/Annotating AST

I've previously described this format above as a possibility in response to @DavidBruant. I will copy it here for convenience. Essentially, we are just adding optional properties to AST nodes:

  • For each position within a node where whitespace/comments are allowed, add a list property containing 0 or more whitespace/comment fragments
  • Add a parens list property to each node that contains 0 or more objects which each contain the syntactic information (whitespace/comment fragments) for a surrounding pair of parentheses. They can be listed outside-in or inside-out, it probably doesn't matter.
  • Add a property to each type of node that supports trailing semicolons, indicating whether one is used in the representation.
  • Something equivalent to escodegen's current "verbatim" support on Literal nodes to specify how numbers/strings are represented.
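
To make the four additions above concrete, here is a minimal TypeScript sketch. Apart from parens, every property name (verbatim, semicolon, beforeSemicolon, afterOpen, beforeClose) and both render functions are hypothetical choices made for illustration; the proposal does not fix any of them. The renderer fills in defaults for whatever is missing, which is one way a generator could treat a plain AST as a partially-filled-in CST.

```typescript
// Hypothetical shapes: these property names are NOT fixed by the proposal.
type Fragment =
  | { type: "Whitespace"; value: string }
  | { type: "Comment"; value: string }; // value includes the comment delimiters

// Syntactic information for one surrounding pair of parentheses.
interface ParenPair {
  afterOpen?: Fragment[];   // fragments between "(" and the inner content
  beforeClose?: Fragment[]; // fragments between the inner content and ")"
}

interface Literal {
  type: "Literal";
  value: number | string;
  verbatim?: string;    // exact source spelling, e.g. "0x10"
  parens?: ParenPair[]; // listed outside-in in this sketch
}

interface ExpressionStatement {
  type: "ExpressionStatement";
  expression: Literal;
  semicolon?: boolean;          // whether a trailing semicolon is present
  beforeSemicolon?: Fragment[]; // fragments between the expression and ";"
}

// Concatenate the source text of a (possibly absent) fragment list.
const text = (fs: Fragment[] = []): string => fs.map(f => f.value).join("");

function renderLiteral(node: Literal): string {
  let out = node.verbatim ?? JSON.stringify(node.value);
  // parens are outside-in, so wrap starting from the innermost (last) pair
  for (const p of [...(node.parens ?? [])].reverse()) {
    out = "(" + text(p.afterOpen) + out + text(p.beforeClose) + ")";
  }
  return out;
}

function renderStatement(node: ExpressionStatement): string {
  const semi = node.semicolon ?? true; // default used when the property is absent
  return renderLiteral(node.expression) + text(node.beforeSemicolon) + (semi ? ";" : "");
}

// A plain AST node: every optional property is missing, so defaults apply.
const plainAst: ExpressionStatement = {
  type: "ExpressionStatement",
  expression: { type: "Literal", value: 16 },
};

// The same node carrying syntactic information.
const cst: ExpressionStatement = {
  type: "ExpressionStatement",
  expression: {
    type: "Literal",
    value: 16,
    verbatim: "0x10",
    parens: [
      {},
      { afterOpen: [{ type: "Comment", value: "/* n */" }, { type: "Whitespace", value: " " }] },
    ],
  },
  semicolon: false,
};
```

Both values go through the same code path: renderStatement(plainAst) falls back to defaults, while renderStatement(cst) reproduces the recorded spelling, parentheses, and missing semicolon.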

Pros

  • current tools will be able to accept a CST
  • syntax-agnostic transformations will preserve syntax when passed a CST; don't have to have two code paths in AST tools
  • do not have to traverse the tree to convert between CST/AST
  • escodegen can simply treat any input as a partially-filled-in CST and enrich it with defaults to create a full CST before rendering it

Cons

  • slightly harder to reason with, as properties require more logic to interpret than separate nodes

===

Proposal: CST With Structural Syntactic Forms

This proposal also remains close to the AST specification. It adds new syntactic node types to the AST spec, as well as properties containing syntactic information.

  • Add a ParenthesisedExpression node, representing a parenthesised expression in expression position. For parentheses in statement position, this node can be wrapped in an ExpressionStatement.
  • Add an optional extras property to each node (@getify: please clarify exactly how this is represented). I believe this includes whitespace/comment/semicolon information all in one property.
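
As a rough illustration of the structural form, the TypeScript sketch below models the new node. Only the names ParenthesisedExpression and extras come from the proposal; the expression field on the new node and the parenDepth helper are assumptions made for this example, and extras is typed as unknown because its representation is still open.

```typescript
// Hypothetical node shapes; only the ParenthesisedExpression and extras names
// come from the proposal.
interface Identifier {
  type: "Identifier";
  name: string;
  extras?: unknown; // representation left open by the proposal
}

interface ParenthesisedExpression {
  type: "ParenthesisedExpression";
  expression: Expression; // assumed field name for the wrapped expression
  extras?: unknown;
}

type Expression = Identifier | ParenthesisedExpression;

interface ExpressionStatement {
  type: "ExpressionStatement";
  expression: Expression;
  extras?: unknown;
}

// `((x))` in statement position: the parenthesised node is wrapped in an
// ExpressionStatement, and nested parentheses become nested nodes.
const stmt: ExpressionStatement = {
  type: "ExpressionStatement",
  expression: {
    type: "ParenthesisedExpression",
    expression: {
      type: "ParenthesisedExpression",
      expression: { type: "Identifier", name: "x" },
    },
  },
};

// Count how many pairs of parentheses surround an expression.
function parenDepth(e: Expression): number {
  return e.type === "ParenthesisedExpression" ? 1 + parenDepth(e.expression) : 0;
}
```

Here the parentheses are visible in the tree shape itself, so a tool can find them by node type rather than by inspecting a property.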

Pros

  • new tools that operate on CSTs should have a slightly easier job reasoning about the syntax represented by the tree
  • parsers may have an easier time generating this format, but more evidence is needed to determine this

Cons

  • need a single-pass transform between AST/CST
  • tools that operate on ASTs only will need either a separate code path or two transformations at their interface, CST to AST on input and AST to CST on output (which would lose all syntactic information)
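
The single-pass CST to AST transform mentioned above could look roughly like the following TypeScript sketch. It assumes the CST differs from the AST only by ParenthesisedExpression nodes (with an assumed expression field) and by extras properties; the toAst function and the sample tree are hypothetical.

```typescript
// A minimal sketch: strip syntactic information from a CST in one traversal.
function toAst(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(toAst);
  if (value === null || typeof value !== "object") return value;

  const node = value as Record<string, unknown>;

  // Replace a parenthesised node by the (converted) expression it wraps.
  if (node["type"] === "ParenthesisedExpression") {
    return toAst(node["expression"]);
  }

  // Copy every property except `extras`, converting children recursively.
  const out: Record<string, unknown> = {};
  for (const key of Object.keys(node)) {
    if (key !== "extras") out[key] = toAst(node[key]);
  }
  return out;
}

// Sample CST for `(x)` in statement position; the contents of `extras` are
// a placeholder, since the proposal leaves them unspecified.
const cstExample = {
  type: "ExpressionStatement",
  extras: {},
  expression: {
    type: "ParenthesisedExpression",
    expression: { type: "Identifier", name: "x" },
  },
};

const astExample = toAst(cstExample);
```

After this pass, astExample holds only the ExpressionStatement and Identifier nodes, which illustrates the second con: once an AST-only tool has done its work, the parentheses and extras cannot be recovered from its output.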