ng code generation #332

Merged
merged 51 commits into from
Dec 11, 2023
a25f0b9
[tool] build path for new code generation
apalala Nov 28, 2023
ad1f639
[walkrers] make walk_children public
apalala Nov 28, 2023
d87ac12
[grammar] fix name to TatSu
apalala Nov 28, 2023
caf280a
[ngcodegen] up to Buffer definition
apalala Nov 28, 2023
d59d2f4
[mixins][indent] allow control over the amount of indentation
apalala Nov 28, 2023
75d43b6
[ngcodegen] refactor
apalala Nov 28, 2023
e8cfa3b
[mixins][indent] clarify and refactor
apalala Nov 29, 2023
feff967
[lint] resolve warnings
apalala Nov 29, 2023
3a849fb
cleanup
apalala Nov 29, 2023
21cbaf9
[ngcodegen] allow naming the parser
apalala Nov 29, 2023
7074718
[python] generate up to rule templates
apalala Nov 29, 2023
fda5915
[lint] solve warnings
apalala Nov 29, 2023
e71087f
[ngcodegen] add more node types to walker
apalala Nov 29, 2023
0bdd945
[ngcodegen] add more node types to walker
apalala Nov 29, 2023
a057888
[ngcodegen] add more node types to walker
apalala Nov 29, 2023
8563ef1
[ngcodegen] add more node types to walker
apalala Nov 29, 2023
c3ab556
[ngcodegen] add more node types to walker
apalala Nov 29, 2023
367f730
[ngcodegen] add more node types to walker
apalala Nov 29, 2023
37c0dc4
[ngcodegen] bug fixes
apalala Nov 29, 2023
82a5b90
[ngcodegen] fix bugs
apalala Nov 29, 2023
09d6139
[mixins][indent] allways trim left spacing in arguments
apalala Nov 29, 2023
18462a5
[ngcodegen][model] bootstrap model generation
apalala Nov 29, 2023
e38a3dc
[ngcodegen][objectmodel] add model generator
apalala Nov 29, 2023
0cc8f0d
[ngcodegen][lint] clear linter warnings
apalala Nov 29, 2023
1deaee3
[test] solve lint warnings and update unit tests
apalala Nov 29, 2023
e6aebf1
[mixins][indent] honor print() kwargs
apalala Nov 30, 2023
b8f4bdd
[ngcodegen][python] debut with what unit tests say
apalala Nov 30, 2023
6064a71
[tool] arg documentation
apalala Dec 8, 2023
4b6d2db
[lint] resolve warnings
apalala Dec 8, 2023
cf7fdfa
[walkers] fix long standing bug
apalala Dec 8, 2023
bae9710
[lint] fix warnings
apalala Dec 8, 2023
0bcdf05
some refactoring
apalala Dec 9, 2023
1d1d059
[docs] bug fix
apalala Dec 9, 2023
58e73aa
[docs] deprecate declarative translation
apalala Dec 9, 2023
48df36e
[walkers] refactor and cleanup
apalala Dec 9, 2023
2f9d487
[docs] update for walkers and refactor
apalala Dec 9, 2023
1d84b25
[docs] refactor translation
apalala Dec 9, 2023
40e9dd2
[ngcodegen] fix bugs
apalala Dec 9, 2023
9cc66af
[tool] replace code generation with ng
apalala Dec 9, 2023
1db6655
[ngcodegen][model] debug
apalala Dec 9, 2023
908549f
[ngcodegen] fix gut with rendering whitespace
apalala Dec 9, 2023
ad1bc9c
[ngcodegen][model] use topological sort for order of model classes
apalala Dec 10, 2023
ef23c64
[util][misc] document topological_sort
apalala Dec 10, 2023
8cefbae
remove debugging statements
apalala Dec 10, 2023
8e3004c
remove debug code
apalala Dec 10, 2023
b356d2a
[lint] resolve warnings
apalala Dec 10, 2023
db369b0
[ngcodegen][model] do not generate model classes for builtins
apalala Dec 10, 2023
3d4b066
remove reference to Py27
apalala Dec 10, 2023
224612c
[bootstrap] make the generated parser be the bootstrap parser
apalala Dec 10, 2023
c48c19f
[ngcodegen][model] refactor and optimize
apalala Dec 10, 2023
59a1ad5
[dist] bump up version number
apalala Dec 11, 2023
4 changes: 2 additions & 2 deletions docs/antlr.rst
@@ -1,8 +1,8 @@
.. include:: links.rst


Using ANTLR Grammars
--------------------
ANTLR Grammars
--------------

.. _grammars: https://github.com/antlr/grammars-v4

25 changes: 0 additions & 25 deletions docs/asjson.rst

This file was deleted.

19 changes: 0 additions & 19 deletions docs/grako.rst

This file was deleted.

3 changes: 0 additions & 3 deletions docs/index.rst
@@ -46,13 +46,10 @@ input, much like the `re`_ module does with regular expressions, or it can gener
ast
semantics
models
asjson
print_translation
translation
left_recursion
mini-tutorial
traces
grako
antlr
examples
support
3 changes: 2 additions & 1 deletion docs/install.rst
@@ -9,5 +9,6 @@ Installation
$ pip install tatsu

.. warning::
Versions of |TatSu| since 5.0.0 may require Python>=3.8. Python 2.7 is no longer supported
Modern versions of |TatSu| require active versions of Python (if the Python
version is more than one and a half years old, things may not work).

49 changes: 39 additions & 10 deletions docs/models.rst
@@ -1,8 +1,12 @@
.. include:: links.rst


Models
------


Building Models
---------------
~~~~~~~~~~~~~~~

Naming elements in grammar rules makes the parser discard uninteresting
parts of the input, like punctuation, to produce an *Abstract Syntax
@@ -41,6 +45,32 @@ You can also use `Python`_'s built-in types as node types, and
default behavior can be overridden by defining a method to handle the
result of any particular grammar rule.



Viewing Models as JSON
~~~~~~~~~~~~~~~~~~~~~~


Models generated by |TatSu| can be viewed by converting them to a JSON-compatible structure
with the help of ``tatsu.util.asjson()``. The protocol tries to provide the best
representation for common types, and can handle any type using ``repr()``. There are provisions for structures with back-references, so there's no infinite recursion.

.. code:: python

    import json

    from tatsu.util import asjson

    print(json.dumps(asjson(model), indent=2))

The ``model``, with richer semantics, remains unaltered.

Conversion to a JSON-compatible structure relies on the protocol defined by
``tatsu.util.AsJSONMixin``. The mixin defines a ``__json__(seen=None)``
method that allows classes to define their best translation. You can use ``AsJSONMixin``
as a base class in your own models to take advantage of ``asjson()``, and you can
specialize the conversion by overriding ``AsJSONMixin.__json__()``.

You can also write your own version of ``asjson()`` to handle special cases that are recurrent in your context.
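
The idea of a ``__json__(seen=None)`` protocol with protection against back-references can be sketched with the standard library alone. The code below is an illustration under stated assumptions, not the implementation in |TatSu|: ``to_jsonable()``, the ``Node`` class, and the ``<ref ...>`` marker are hypothetical names invented for this example.

```python
# Illustrative sketch only: a stand-in for the idea behind asjson(),
# NOT TatSu's actual implementation. All names here are hypothetical.
import json


def to_jsonable(obj, seen=None):
    """Convert obj to a JSON-compatible structure, guarding against cycles."""
    seen = set() if seen is None else seen
    if obj is None or isinstance(obj, (bool, int, float, str)):
        return obj
    if id(obj) in seen:
        # Back-reference: emit a marker instead of recursing forever.
        return f'<ref {type(obj).__name__}>'
    # Track only the objects on the current path from the root.
    seen = seen | {id(obj)}
    if hasattr(obj, '__json__'):
        # Let the class decide its own best translation.
        return obj.__json__(seen=seen)
    if isinstance(obj, dict):
        return {str(k): to_jsonable(v, seen) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [to_jsonable(v, seen) for v in obj]
    # Fallback for any other type.
    return repr(obj)


class Node:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []

    def __json__(self, seen=None):
        return {
            '__class__': type(self).__name__,
            **{k: to_jsonable(v, seen) for k, v in vars(self).items()},
        }


root = Node('root')
child = Node('child', parent=root)
root.children.append(child)  # child.parent points back at root

result = to_jsonable(root)
print(json.dumps(result, indent=2))
```

The ``seen`` set is copied on each step down, so an object is only reported as a back-reference when it appears again on the path from the root, not merely because it was visited elsewhere in the structure.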

Walking Models
~~~~~~~~~~~~~~

@@ -82,19 +112,18 @@ methods such as:
return s

def walk_object(self, o):
raise Exception('Unexpected tyle %s walked', type(o).__name__)
raise Exception(f'Unexpected type {type(o).__name__} walked')

Predeclared classes can be passed to ``ModelBuilderSemantics`` instances
through the ``types=`` parameter:

.. code:: python
Which nodes get *walked* is up to the ``NodeWalker`` implementation. Some
strategies for walking *all* or *most* nodes are implemented as classes
in ``tatsu.walkers``, such as ``PreOrderWalker`` and ``DepthFirstWalker``.

from mymodel import AddOperator, MulOperator
Sometimes nodes must be walked more than once for the purpose at hand, and it's
up to the walker how and when to do that.

semantics=ModelBuilderSemantics(types=[AddOperator, MulOperator])
Take a look at ``tatsu.ngcodegen.PythonCodeGenerator`` for the walker that generates
a parser in Python from the model of a parsed grammar.
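
The dispatch-by-class-name idea behind walkers, including the ``walk_object`` fallback and a walker that decides for itself which children to visit, can be sketched as follows. This is a minimal stand-in and not ``tatsu.walkers.NodeWalker``: ``SketchWalker``, ``Number``, ``Add``, and ``Evaluator`` are hypothetical names used only for illustration.

```python
# Illustrative sketch only: mimics walk_<ClassName> dispatch.
# It is NOT TatSu's NodeWalker; all names here are hypothetical.
class SketchWalker:
    def walk(self, node):
        # Look for walk_<ClassName> on the node's class, then on its bases.
        # Every class has `object` in its MRO, so walk_object is the fallback.
        for cls in type(node).__mro__:
            method = getattr(self, f'walk_{cls.__name__}', None)
            if method is not None:
                return method(node)
        raise TypeError(f'No walker method for {type(node).__name__}')


class Number:
    def __init__(self, value):
        self.value = value


class Add:
    def __init__(self, left, right):
        self.left = left
        self.right = right


class Evaluator(SketchWalker):
    def walk_Number(self, node):
        return node.value

    def walk_Add(self, node):
        # The walker chooses which children to walk, and in which order.
        return self.walk(node.left) + self.walk(node.right)

    def walk_object(self, node):
        raise TypeError(f'Unexpected type {type(node).__name__} walked')


total = Evaluator().walk(Add(Number(1), Add(Number(2), Number(3))))
print(total)
```

Because the lookup follows the method resolution order, a method written for a base class also handles its subclasses unless a more specific method exists.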

``ModelBuilderSemantics`` assumes nothing about ``types=``, so any
constructor (a function, or a partial function) can be used.

Model Class Hierarchies
~~~~~~~~~~~~~~~~~~~~~~~
41 changes: 0 additions & 41 deletions docs/print_translation.rst

This file was deleted.

64 changes: 59 additions & 5 deletions docs/translation.rst
@@ -5,12 +5,68 @@
.. _pegen: https://github.com/we-like-parsers/pegen
.. _PEG parser: https://peps.python.org/pep-0617/

Declarative Translation
-----------------------
Translation
-----------

Translation is one of the most common tasks in language processing.
Analysis often summarizes the parsed input, and *walkers* are good for that.
In translation, the output can often be as verbose as the input, so a systematic approach that avoids bookkeeping as much as possible is convenient.


|TatSu| doesn't impose a way to create translators, but it
exposes the facilities it uses to generate the `Python`_ source code for
parsers.


Print Translation
~~~~~~~~~~~~~~~~~

Translation in |TatSu| is based on subclasses of ``NodeWalker``. Print-based translation
relies on classes that inherit from ``IndentPrintMixin``, a strategy copied from
the new PEG_ parser in Python_ (see `PEP 617`_).

``IndentPrintMixin`` provides an ``indent()`` method, which is a context manager,
and should be used thus:

.. code:: python

    class MyTranslationWalker(NodeWalker, IndentPrintMixin):

        def walk_SomeNodeType(self, node: NodeType):
            self.print('some preamble')
            with self.indent():
                # continue walking the tree
                self.print('something else')


The ``self.print()`` method takes note of the current level of indentation, so
output will be indented by the ``indent`` passed to
the ``IndentPrintMixin`` constructor, or to the ``indent(amount: int)`` method.
The mixin keeps a stack of the indent amounts so it can go back to where it
was after each ``with indent(amount=n):`` statement:


.. code:: python

    def walk_SomeNodeType(self, node: NodeType):
        with self.indent(amount=2):
            self.print(node.exp)

The printed code can be retrieved using the ``printed_text()`` method, but other
possibilities are available by assigning a stream-like object to
``self.output_stream`` in the ``__init__()`` method.
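
The indent-stack mechanism described above can be illustrated with a small, self-contained class. This is a sketch of the technique under stated assumptions and not the code of ``IndentPrintMixin``: the class name ``SketchIndentPrinter`` and its attributes ``default_indent`` and ``indent_stack`` are hypothetical, and only the method names ``indent()``, ``print()``, and ``printed_text()`` mirror the ones named in the text.

```python
# Illustrative sketch only: an indent stack driven by a context manager.
# It is NOT TatSu's IndentPrintMixin; attribute names are hypothetical.
import io
from contextlib import contextmanager


class SketchIndentPrinter:
    def __init__(self, indent=4):
        self.default_indent = indent
        self.indent_stack = [0]  # the top of the stack is the current column
        self.output_stream = io.StringIO()

    @contextmanager
    def indent(self, amount=None):
        amount = self.default_indent if amount is None else amount
        self.indent_stack.append(self.indent_stack[-1] + amount)
        try:
            yield
        finally:
            # Go back to the previous level when the `with` block ends.
            self.indent_stack.pop()

    def print(self, text=''):
        prefix = ' ' * self.indent_stack[-1]
        for line in str(text).splitlines() or ['']:
            # rstrip() keeps blank lines free of trailing spaces.
            self.output_stream.write((prefix + line).rstrip() + '\n')

    def printed_text(self):
        return self.output_stream.getvalue()


out = SketchIndentPrinter(indent=4)
out.print('def f():')
with out.indent():
    out.print('if x:')
    with out.indent(amount=2):
        out.print('return 1')
    out.print('return 0')

text = out.printed_text()
print(text)
```

Pushing the absolute column, rather than the relative amount, means leaving a ``with`` block only needs a ``pop()`` to restore the previous level, whatever amounts were used on the way in.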

A good example of how to do code generation with a ``NodeWalker`` and ``IndentPrintMixin``
is |TatSu|'s own code generator, which can be
found in ``tatsu/ngcodegen/python.py``, or the model
generation found in ``tatsu/ngcodegen/objectmodel.py``.


.. _PEP 617: https://peps.python.org/pep-0617/


Declarative Translation (deprecated)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


|TatSu| provides support for template-based code generation ("translation", see below)
in the ``tatsu.codegen`` module.
@@ -26,8 +82,6 @@ breadth or depth first, using only standard Python_. The procedural code must kn
to navigate it, although other strategies are available with ``PreOrderWalker``, ``DepthFirstWalker``,
and ``ContextWalker``.

**deprecated**

|TatSu| doesn't impose a way to create translators with it, but it
exposes the facilities it uses to generate the `Python`_ source code for
parsers.
2 changes: 1 addition & 1 deletion grammar/tatsu.ebnf
@@ -1,4 +1,4 @@
@@grammar :: Tatsu
@@grammar :: TatSu
@@whitespace :: /\s+/
@@comments :: ?"(?sm)[(][*](?:.|\n)*?[*][)]"
@@eol_comments :: ?"#[^\n]*$"
1 change: 1 addition & 0 deletions ruff.toml
@@ -20,6 +20,7 @@ ignore = [
"PLR0904", # too-many-public-methods
"PLR0913", # too-many-arguments
"PLR0915", # too-many-statements
"PLR0917", # too many possitional arguments
"PLR2004", # magic-value-comparison
"PLW1514", # unspecified-encoding
# "PLW0603", # global-statement
2 changes: 1 addition & 1 deletion tatsu/_version.py
@@ -1 +1 @@
__version__ = '5.10.7b1'
__version__ = '5.11.0b1'