Grako uses Semantic Versioning for its releases, so parts of the version number may increase without any significant changes or backwards incompatibilities in the software.
The format of this Change Log is inspired by keepachangelog.org.
X.Y.Z @ 2017
3.22.0 @ 2017-03-19
- Add objectmodel.Node.has_parseinfo() for querying if ParseInfo is available for the node without raising exceptions.
- Raising NoParseInfo when objectmodel.Node.parseinfo is None broke several existing parsers. The change must wait for a major release. Reverted.
3.21.1 @ 2017-03-16
- Backwards compatibility. Renaming of buffering.LineIndexEntry broke existing parsers.
3.21.0 @ 2017-03-14
- 118 Name all "info" as ...Info, make them classes that descend from namedtuple, have __slots__ = (), and move them to module infos.
- 116 Status information during parsing is now routed through logging.getLogger("grako").
- 119 Methods in objectmodel.Node that require an infos.ParseInfo will now raise exceptions.NoParseInfo if parse information is not available for the node.
- There were broken links in the documentation after the repository was moved from apalala to neogeny on Bitbucket.
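The two mechanisms named in this release (namedtuple descendants with __slots__ = (), and status output routed through the "grako" logger) can be sketched with the standard library alone. PosInfo, ListHandler, and the logged message are hypothetical names chosen for this example; they are not taken from Grako, and the log record is emitted by the example itself rather than by a parser.

```python
import logging
from collections import namedtuple


# Hypothetical "...Info" class: a namedtuple descendant with empty __slots__,
# so instances stay as compact as plain tuples (no per-instance __dict__).
class PosInfo(namedtuple('_PosInfo', ['line', 'col'])):
    __slots__ = ()

    def describe(self):
        return 'line %d, column %d' % (self.line, self.col)


# Hypothetical handler that collects whatever is logged under "grako".
class ListHandler(logging.Handler):
    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


logger = logging.getLogger("grako")
handler = ListHandler()
logger.addHandler(handler)
logger.setLevel(logging.INFO)

info = PosInfo(line=3, col=7)
# Emitted by this example only, to show where such records arrive.
logger.info("status at %s", info.describe())

print(hasattr(info, '__dict__'))
print(handler.messages)
```

An application that wants to silence or redirect parser status output would, under this arrangement, configure the "grako" logger instead of capturing standard output.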
3.20.1 @ 2017-03-12
- 1g Ambiguous use of rule_name and start led to failure if rule_name was passed to grako.parse().
- 2g FailedRef is thrown when a semantic rule raises a KeyError.
3.20.0 @ 2017-03-06
- Added grako.compile(grammar, name=None, **kwargs) as a substitute for grako.genmodel (which remains for backwards compatibility).
- Added a grako.parse(grammar, input, **kwargs) that will compile a grammar and use it to parse the given input. For efficiency, parsed grammars are cached.
- Added a grako.to_python_sourcecode(grammar, name=None, filename=None, **kwargs) that compiles the grammar to the Python source code that implements the parser.
- Rename the existing join (.{}) expression to gather, and use join for a new expression (%{}) that keeps the separator in the resulting AST.
- Added left-join (<{}) and right-join (>{}) expressions to the grammar.
- Enable python setup.py test with pytest.
- Remove the deprecated prefix= argument to ParseContext.closure().
- Document that Grako may be used as a library (with no code generation) by compiling grammars to Grammar objects that can be used to parse any given input, much like Python's re does with regular expressions.
- Refactored grammar so double naming (name1:name2:exp) is disallowed.
- The parseinfo= keyword parameter was unused in GrammarGenerator.__init__().
- The incomplete examples/python example was a source of confusion, as users expected it to work even though the example is not mentioned in the documentation. The directory is removed from the distribution and the main repository branch until the example is complete and working.
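The compile-once, parse-many shape described in this release, including the caching of compiled grammars, is the same shape Python's re offers for regular expressions. The sketch below illustrates that pattern using re only; compile_cached and parse are hypothetical helpers written for this example and are not Grako's implementation.

```python
import re

# Cache of compiled patterns, keyed by the pattern source text.
_compiled = {}


def compile_cached(pattern):
    """Compile a pattern once and reuse the compiled object afterwards."""
    try:
        return _compiled[pattern]
    except KeyError:
        compiled = _compiled[pattern] = re.compile(pattern)
        return compiled


def parse(pattern, text):
    """Compile (or fetch from the cache) and apply to the whole input."""
    match = compile_cached(pattern).fullmatch(text)
    if match is None:
        raise ValueError('input does not match: %r' % text)
    return match.groups()


SUM = r'(\d+)\+(\d+)'
print(parse(SUM, '3+5'))
# The second call reuses the object compiled by the first one.
print(compile_cached(SUM) is compile_cached(SUM))
```

Keying the cache on the source text keeps the convenience function cheap to call repeatedly, at the cost of holding every compiled object in memory.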
3.19.1 @ 2017-02-15
- Python-style raw string literals are now valid: r'text' or r"text".
- A new syntax for patterns: ?'text' or ?"text".
- PyPI (the Python Package Index) now requires there be a single source-code archive in each release. The chosen format is '.zip'.
- PyPI publication requires there be a README, README.rst, or README.txt file. Renamed DISTRIBUTION.rst to README.rst.
- Make parseinfo less special so it can be part of AST and of asjson(ast).
- Move etc/grako.ebnf, the Grako grammar, to a grammar/ directory.
- Deprecated the ?/regexp/? style of patterns.
- Deprecated the (* *) style of comments.
- Renamed classes so the entity names are meaningful instead of just being references to the tool's name (EBNFParser instead of GrakoParser, ParseException instead of GrakoException, ...).
- Removed classes that had become irrelevant (GrakoContext, GrakoParserBase, ...)
- Updated copyright notices.
- 96 Honor regexp concatenation when prettifying.
- 108 Clarify the use of the exp parameter in grammars.Decorator, grammars.Rule, and grammars.BasedRule.
3.18.2 @ 2017-02-04
- Fixed signature of __call__ method in class ListRules(argparse.Action).
- 107 Fixed typo, improve English in "Calc" example (sjbrown on Bitbucket).
- 109 Fixed incorrect definition for the --name command-line argument.
- 104 Wrong shebang in examples/antlr2grako/antlr2grako.py
3.18.1 @ 2016-12-13
- Updated list of contributors
- 101 objectmodel.Node constructor does not set attributes when ast=None.
- 102 MANIFEST.in did not include the calc example, so it wasn't distributed.
- 103 grako.contexts.ParseContext._check_name fails in Python 2.7 with non-ASCII identifiers. For 2.7 compatibility, grako.util.ustr() must be used instead of the built-in str().
3.17.0 @ 2016-12-01
- A new calc example project (examples/calc) based on the calculator example from PLY goes through most Grako features in tutorial form.
- Report the parse exception (grako.exceptions.FailedParse) occurring furthest in the input so error messages are more meaningful even when no cut (~) expressions are used.
- *args followed by keyword parameters is not valid in Python 2.7.
3.16.5 @ 2016-11-04
- Experimental grako.walkers.NodePreOrderWalker, with more predictable and easy to understand semantics.
- The code generator now uses a set literal for the KEYWORDS constant.
- BUG: Parsing () (void/nothing) should clear last_node in Context, so name:() results in ast.name is None.
3.16.4 @ 2016-11-02
- 99 Object model generation would generate repeated classes when the same class name was applied to different rules in the grammar.
3.16.3 @ 2016-10-27
- grako.walkers.NodeWalker was walking base classes in the wrong order.
- grako.walkers.PreOrderWalker.walk() was attempting to walk the children of objects not instances of grako.model.Node.
- grako.walkers.PreOrderWalker.walk() was breaking the NodeWalker protocol by always returning None.
3.16.1 @ 2016-10-16
- Make traces represent recursion failures differently.
- Removed grako.exceptions.FailedParseBase as it served no purpose.
- Refactor the unicode part of traces into a separate module.
- Fix off-by-one preventing multiple include statements from working (gegenschall).
- Left recursion was enabled by default in generated parsers, though the README says it's not. Disabled.
- Bug fixes, in the commit log.
3.16.0 @ 2016-10-01
- Test and publish Grako using Travis CI.
- Added support for case-insensitivity to grako.symtables.
- A base class can now be specified along with the object model class name in grammar rules: integer::Integer::Literal
- Reduced the memory used by symbol tables by replacing symtables.SymbolReference by the referencing objectmodel.Node.
- Now grako.grammars.Decorator is public.
- Demoted support for left recursion to experimental. It has been reported that even some simple cases are not handled.
3.15.1 @ 2016-09-28
- Added symtables.Namespace.get() for completeness.
- The result of grako.model.Node.children() now defaults to Node.children_list(). It was too unexpected that the child nodes might be out of order.
- Simplified the main() function in generated parsers.
- Moved Node into the new grako.objectmodel. Moved NodeWalker and descendants into the new grako.walkers module. Moved ModelBuilderSemantics into the grako.semantics module. The grako.model module was updated for backwards compatibility.
- Generated parsers and models no longer carry the current date as a version tag. The tags served only to confuse version control.
- Use weakref.proxy for back-references (like grako.objectmodel.Node._parent) to make it easier for the Python garbage collector.
- Walker will now also recognize walk methods where the class name has the upper case characters replaced by an underscore followed by the character in lower case (walk_NegativeLookahead() or walk__negative_lookahead()).
- Added a patch for backwards compatibility with parsers generated before the switch from prefix= to sep= in generated calls to closures.
- Restored special treatment of first line in grako.util.trim() (as in Python doc-comments). There were unexpected results without the special treatment.
- Use _args_ and _kwargs_ in generated models to avoid conflicts with grammar elements that use the standard Python names.
- The generated parser was overriding Buffer creation without regard for settings passed to the Parser class. There's now a buffer_class kwarg to the Context, Parser, and the generated parser classes.
- Deprecation warnings were always being enabled by grako.util.
- Found programs that expect grako.ast.AST to be reexported from grako.model, so the re-export was re-instated.
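The walk-method naming rule from this release (each upper case character replaced by an underscore followed by the same character in lower case) can be sketched as follows. walker_method_names, SketchWalker, and MyWalker are hypothetical names for this illustration; the lookup order is an assumption, and the code is not Grako's walker implementation.

```python
import re


def walker_method_names(class_name):
    """Return the candidate walk-method names for a node class name."""
    # Each upper-case character becomes an underscore plus its lower-case form,
    # so 'NegativeLookahead' turns into '_negative_lookahead'.
    lowered = re.sub(r'[A-Z]', lambda m: '_' + m.group(0).lower(), class_name)
    return ['walk_' + class_name, 'walk_' + lowered]


class SketchWalker:
    """Hypothetical walker that tries both spellings (order is an assumption)."""

    def walk(self, node):
        for name in walker_method_names(type(node).__name__):
            method = getattr(self, name, None)
            if callable(method):
                return method(node)
        return None


class NegativeLookahead:
    """Stand-in for a model node class."""


class MyWalker(SketchWalker):
    def walk__negative_lookahead(self, node):
        return 'visited'


print(walker_method_names('NegativeLookahead'))
print(MyWalker().walk(NegativeLookahead()))
```

The doubled underscore in walk__negative_lookahead follows from the rule itself: the walk_ prefix ends in an underscore and the leading capital contributes another.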
3.14.0 @ 2016-08-19
- Now grako.symtables.Namespace supports duplicate entries.
- Use uppercase -V instead of -v to report the tool version, so as to be compatible with almost everyone else.
- The new grako.symtables.Symbol tried to serialize unknown fields to JSON.
- Add grako.symtables.SymbolReference to the JSON representation of namespaces.
- The new grako.model.DepthFirstWalker could reach recursion limit when filtering for Iterables. Now the filter is for list, which is the only container used in models.
- The definition grako.grammars.GrakoBuffer was not overriding the bootstrap buffer, so #include was not working, among other possibly undetected (and serious) consequences.
- The separator character for rules in the trace logs (C_DERIVE in grako.contexts) was undefined for non-POSIX platforms (traces could not be used on Windows).
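One plausible way a depth-first traversal that filters on Iterable can hit the recursion limit is that strings are iterables whose items are again strings. The change log does not state that this was the actual cause, so treat the sketch below as an assumption; count_leaves_iterable and count_leaves_list are hypothetical functions, not Grako code.

```python
from collections.abc import Iterable


def count_leaves_iterable(value):
    """Descend into anything iterable; never terminates on strings."""
    if isinstance(value, Iterable):
        return sum(count_leaves_iterable(item) for item in value)
    return 1


def count_leaves_list(value):
    """Descend into lists only; strings are treated as leaves."""
    if isinstance(value, list):
        return sum(count_leaves_list(item) for item in value)
    return 1


tree = ['ab', ['c', 'd']]

try:
    count_leaves_iterable(tree)
except RecursionError:
    # A one-character string iterates to itself, so the descent never ends.
    print('recursion limit reached')

print(count_leaves_list(tree))
```

Restricting the filter to list sidesteps the problem because, per the entry above, list is the only container used in models.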
3.13.0 @ 2016-08-18
- Added this new change log.
- The --pretty-lean command-line option will produce --pretty output, discarding named: elements and rule ::Parameters.
- A @@parseinfo directive controls the generation of parse information from the grammar.
- Traded memory for simplicity and replaced the line-based line cache in buffering.Buffer for a position-based cache.
- Tab characters are left unchanged by buffering.Buffer so as to keep references to positions in the original text relevant.
- Refactored the ever-growing grammar_test.py into multiple files under grako/test/grammar/.
- In traces, the error column pointer was off when tab characters were involved.
3.12.1 @ 2016-08-06
- Also generate a buffering.Buffer descendant specific to the grammar for parsers that need to customize the parsing.Parser.parse() method.
- Added the grako.synth module which makes synthetic grako.model.Node classes picklable.
- Now patterns may be concatenated to split a complex pattern into parts, possibly across several lines: /regexp/ + /regexp/.
- Added basic support for symbol tables in grako.symtables.
- Syntax file for Sublime Text (vmuriart).
- Distinguish between positive and normal joins: s.{e}+ and s.{e}. Having s.{e} use a positive closure was too unexpected.
- Now model.ParseModel is an alias for model.Node.
- Improved examples/antlr2grako so it generates more usable Grako grammars.
- The latest changes to grako.util.trim() were incomplete.
- Fixed several inconsistencies in the implementation and use of buffering.Buffer line indexing.
- Repeated parameters to object model constructors.
3.10.1 @ 2016-07-17
- Enhancements to grako.tool and the command-line help (siemer).
- Unlink output file before attempting parser generation.
- A -G FILE command-line option forces saving of the object model.
- The function grako.util.trim() now also considers the first text line.
- Tested with Python 3.6.0a3.
- grako.model.Node._adopt_children() was incorrect, so Node.parent was not being set. Adopted a simple-approach solution based on suggestions by linkdd.
- Avoid recovering the same comment against the same line in grako.buffering.Buffer.
- Recovering comments and end-of-line comments together was incorrect.
- model.Node parenting still broken.
- 73 The --draw option did not recognize the new object model node types Join and Constant. Now --draw works with Python 3.x using pygraphviz 1.3.1.
- 77 81 Advance over whitespace before memoization or left recursion.
3.9.3 @ 2016-07-15
- Added @@grammar directive to grammars so as to avoid having to pass a -m NAME through the command line.
- Added the @@namechars directive to allow specifying additional characters that may be part of tokens considered names by @@nameguard :: True.
- Now a choice expression may start with a leading '|'.
- The --object-model command-line option will generate a python module with definitions for the class names specified as rule parameters (untested).
- Simplified the regular expression for floats in the Grako grammar (siemer).
- Set all flake8 options in tox.ini (siemer).
- Simplified __str__() for directives (siemer).
- Now STARTRULE defaults to start in generated parsers.
- Now the AST for a grako.model.Node is saved as Node.ast.
- Several simplifications and refactorings by siemer.
- Fixes and improvements to generation of child sets and list in model.Node (gapag).
- @@keyword not working correctly with @@ignorecase.
- Fix for @@keyword and @name by moving check for FailedSemantics higher up in the parsing chain.
- Several important bug fixes to the object model generator (neumond).
- Both grako.grammars and grako.codegen.python were manipulating the names defined in a grammar rule.
- 74 grako.model.Node.children() returned an empty list even when traversing attributes with names starting in '_'.
- 57 Still bugs in handling of @@whitespace in the generated parsers (gkimbar).
- Guard against recursive structures in grako.util.asjson().
- Cleaned up the grammar in examples/python; still untested.
- Removed outdated information from the README.
3.8.2 @ 2016-04-23
- Added grammar support for keywords in the source language through the @@keyword:: directive and the @name decorator for rules.
- Make ModelBuilderSemantics support built-in types.
- Wrong version number (RC) in this document.
- 73 Keywords were not being passed to the base class of the generated parser.
3.7.0 @ 2016-03-05
- Added support for `constant` expressions which don't consume any input yet return the specified constant.
- Now an empty closure ({}) consumes no input and generates an empty list as AST.
- Added the Python-inspired join operator, s.{e}, as a convenient syntax for parsing sequences with separators.
- Removed the --binary command-line option. It went unused, it was untested, and it was incorrectly implemented.
- Generated parsers pass on KeyboardInterrupt.
- Moved the bulk of the entry code for generated parsers to util.generic_main(). This allows for the verbose code to be verified by the usual tools.
- Deprecate {e}- by removing it from the documentation.
3.6.7 @ 2016-01-27
- Added @@whitespace directive to specify whitespace regular expression within the grammar (starkat).
- Added @@nameguard and @@ignorecase directives to toggle the respective boolean parameters within the grammar (starkat).
- All tests pass with Python 3.5.
- Added basic support for output of an AST in YAML format.
- Applied flake8 suggestions.
- More reasonable treatment for ANTLR token definitions in the antlr2grako example.
- Upgraded development libraries to their latest versions (see requirements.txt).
- Detect and fail promptly on empty tokens in grammars.
- 52 Build with Cython failed on Windows.
- 59 Python keywords can now actually be used as rule names in grammars (drothlis).
- 60 @@ directives were not present in the output of the --pretty option.
- 58 The parameters to the constructor of generated parsers were being ignored (pgebhard).
- grammars.py would call ctx.error() instead of ctx._error() on failed rule references.
- Overall cleanup of the code and of the development requirements.
- 56 Using @@whitespace generated invalid python programs
- The @@whitespace directive was not working for regular expressions (nehz).
- Left recursion in the grammar was checked for in the wrong place when disabled.
3.5.1 @ 2015-03-12
- Added backwards compatibility with Buffer.whitespace.
- Added AST.asjson() to not have to import grako.util.asjson() for the same purpose.
- 45 The grako tool now produces basic statistics about the processed grammar.
- 46 Left recursion support can be turned off using the left_recursion= parameter to parser constructors.
- 47 New @@comments and @@eol_comments can be used within a grammar to specify the respective regular expressions.
- 48 Rules can now be overridden/redefined using the @override decorator.
3.4.3 @ 2014-11-27
- Added a --no-nameguard command-line option to generated parsers.
- Allow Buffer descendants to customize how text is split into lines (starkat).
- Added a --version option to the command-line tool. A grako.__version__ variable is now available.
- A re regular expression is now accepted for whitespace matching. Character sets provided as str, list, or set are converted to the corresponding regular expression (starkat).
- If installed, the regex module will be used instead of re in all pattern matching (starkat). See the section about whitespace above.
- Minor improvements to buffering.Buffer.
- Now the re.UNICODE flag is consistently used in pattern, comment, and whitespace matching.
- 42 setup.py might give errors under some locales because of the non-ASCII characters in README.rst.
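The whitespace handling described in this release (a compiled regular expression is accepted as is, while a str, list, or set of characters is converted to an equivalent expression) could look roughly like the sketch below. whitespace_to_pattern is a hypothetical helper written for this illustration, and the exact expression Grako builds is not given in this document.

```python
import re


def whitespace_to_pattern(whitespace):
    """Return a compiled pattern for a whitespace setting (sketch only)."""
    # Anything that already behaves like a compiled pattern is used unchanged.
    if hasattr(whitespace, 'match'):
        return whitespace
    # str, list, and set are all treated as collections of single characters.
    chars = sorted(set(whitespace))
    if not chars:
        return None
    # Escape every character so ']' or '^' cannot break the character class.
    char_class = '[' + ''.join(re.escape(c) for c in chars) + ']+'
    return re.compile(char_class, re.UNICODE)


blanks = whitespace_to_pattern(' \t')
print(blanks.match(' \t\n x').end())

already_compiled = re.compile(r'\s+')
print(whitespace_to_pattern(already_compiled) is already_compiled)
```

In this sketch an empty character set yields None, meaning that no whitespace is skipped; that convention is an assumption of the example.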
3.3.0 @ 2014-07-22
- 40 The width and the separator used in parse traces are now configurable with keyword arguments.
- 38 Trace output uses color if the colorama package is installed.
- Refactorings to enhance consistency in parsing between models and generated parsers.
- The vertical size of trace logs was reduced to three lines per entry.
- 37 Block comments are preserved when using the --pretty option.
3.2.1 @ 2014-07-21
- Now an eol_comments_re= parameter can be passed to Parser and Buffer.
- Now rule parameters and model.ModelBuilderSemantics are used to produce grammar models with a minimal set of semantic methods.
- Code generation is now separate from the grammar model, so translation targets different from Python are easier to implement.
- Need to allow newline (\n) characters within grammar patterns.
- 36 Keyword arguments in rules were not being parsed correctly (franz_g).
- Removed attribute assignment to the underlying dict in AST. It was the source of obscure bugs for Grako users.
3.1.2 @ 2014-07-14
- Grako now supports direct and indirect left recursion thanks to the implementation done by Paul Sargent of the work by Warth et al. Performance for non-left-recursive grammars is unaffected.
- The old grammar syntax is now supported with deprecation warnings. Use the --pretty option to upgrade a grammar.
- If there are no slashes in a pattern, it can now be specified without the opening and closing question marks.
- 33 Closures were sometimes being treated as plain lists, and that produced inconsistent results for named elements (lambdafu).
- The bootstrap parser contained errors due to the previous bug in util.ustr().
- 30 Make sure that escapes in --whitespace are evaluated before being passed to the model.
- 30 Make sure that --whitespace and --no-nameguard indeed affect the behavior of the generated parser as expected.
3.0.4 @ 2014-07-01
- The bump in the major version number is because the grammar syntax changed to accommodate new features better, and to remove sources of ambiguity and hard-to-find bugs. The naming changes in some of the advanced features (Walker) should impact only complex projects.
- Grammars may include other files using the #include :: directive.
- Grammar rules may now inherit the contents of other rules using the < operator.
- The right hand side of a rule may be included in another rule using the > operator.
- Added a --pretty option to the command-line tool, and refactored pretty-printing (__str__() in grammar models) enough to make its output a norm for grammar format.
- Added compatibility with Cython.
- The cut operator is now ~, the tilde.
- Now name overrides must always be specified with a colon, @:e.
- Grammar rules may declare Python-style arguments that get passed to their corresponding semantic methods.
- Multiple definitions of grammar rules with the same name are now disallowed. They created ambiguity with new features such as rule parameters, based rules, and rule inclusion, and they were an opportunity for hard-to-find bugs (import this).
- Internals and examples were upgraded to use the latest Grako features.
- Parsing exceptions will now show the sequence of rule invocations that led to the failure.
- Renamed Traverser and traverse to Walker and walk.
- Now the keys in grako.ast.AST are ordered like in collections.OrderedDict.
- Grako models are now more JSON-friendly with the help of grako.ast.AST.__json__(), grako.model.Node.__json__() and grako.util.asjson().
- Removed checking for compatibility with Python 3.3 (use 3.4 instead).
- Incorporated Robert Speer's solution to honoring escape sequences without messing up the encoding.
- Honor simple escape sequences in tokens while trying not to corrupt unicode input. Projects using non-ASCII characters in grammars should prefer to use unicode character literals instead of Python \x or \o escape sequences. There is no standard/stable way to unescape a Python string with escaped escape sequences. Unicode is broken in Python 2.x.
- The --list option was not working in Python 3.4.1.
- 22 Always exit with non-zero exit code on failure.
- 23 Incorrect encoding of Python escape sequences in grammar tokens.
- 24 Incorrect template for --pretty of multi-line optionals.
2.4.3 @ 2014-06-08
- Added --whitespace parameter to generated main().
- Applied flake8 to project and to generated parsers.
- Now a _default() method is called in the semantics delegate when no specific method is found. This allows, for example, generating meaningful errors when something in the semantics is missing.
- Changes to allow downstream translators to have different target languages with as little code replication as possible. There's new functionality pulled from downstream in grako.model and grako.rendering. grako.model is now a module instead of a package.
- Added compatibility with tox. Now tests are performed against the latest releases of Python 2.7.x and 3.x, and PyPy 2.x.
- The Visitor Pattern doesn't make much sense in a dynamically typed language, so the functionality was replaced by more flexible Traverser classes. The new _traverse_XX() methods in Traverser classes carry a leading underscore to remind that they shouldn't be used outside of the protocol.
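The _default() fallback mentioned in this release can be pictured as a two-step attribute lookup on the semantics delegate. The sketch below is an assumption about the general shape only: find_semantic_action, apply_semantics, and CalcSemantics are hypothetical names, and the real call signature used by Grako is not documented in this change log.

```python
def find_semantic_action(semantics, rule_name):
    """Prefer a method named after the rule; otherwise fall back to _default."""
    action = getattr(semantics, rule_name, None)
    if callable(action):
        return action
    return getattr(semantics, '_default', None)


def apply_semantics(semantics, rule_name, ast):
    action = find_semantic_action(semantics, rule_name)
    # With no delegate method at all, the AST passes through unchanged.
    return ast if action is None else action(ast)


class CalcSemantics:
    """Hypothetical delegate with one specific action and a fallback."""

    def number(self, ast):
        return int(ast)

    def _default(self, ast):
        # A fallback like this can flag rules the delegate forgot to handle.
        return ('unhandled', ast)


semantics = CalcSemantics()
print(apply_semantics(semantics, 'number', '42'))
print(apply_semantics(semantics, 'expression', ['1', '+', '2']))
```

A fallback that raises instead of returning a marker would give the "meaningful errors" use mentioned in the entry above.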
2.3.0 @ 2013-11-27
- Now the @ operator behaves as a special case of the name: operator, allowing for simplification of the grammar, parser, semantics, and Grako grammars. It also allows for expressions such as @+:e, with the expected semantics.
- Refactoring: The functionality that was almost identical in generated parsers and in models was refactored into Context.
- Improve consistency of use of Unicode between Python 2.7 and 3.x.
- Compatibility between Python 2.7/3.x print() statements.
2.2.2 @ 2013-11-06
- Optionally, do not memoize during positive or negative lookaheads. This allows lookaheads to fail semantically without committing to the fail.
- Added infrastructure for stateful rules (lambdafu, see the pull request).
- Grouping expressions no longer produce a list as CST.
- The bootstrap parser is now the one generated by Grako from the bootstrap grammar.
- Protect the names of methods for rules with a leading and trailing underscore. It's the only way to avoid unexpected name clashes.
- Fixed the implementation of the optional operator so the AST/CST generated when the optional succeeds is exactly the same as if the expression had been mandatory.
- Make sure closures always return a list.
- The choice operator must restore context even when some of the choices match partially and then fail.
- Grammar.parse() needs to initialize the AST stack.
- AST.copy() was too shallow, so an AST could be modified by a closure iteration that matched partially and eventually failed. Now AST.copy() clones AST values of type list to avoid that situation.
- A failed cut must trickle up the rule-call hierarchy so parsing errors are reported as close to their source as possible.
- Several minor bug fixes (lambdafu).
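The AST.copy() fix in this release concerns list values shared between an AST and its copy. A minimal sketch of the difference, using a hypothetical dict subclass named SketchAST instead of Grako's AST class:

```python
class SketchAST(dict):
    """Hypothetical AST-like mapping whose copy() also clones list values."""

    def copy(self):
        clone = SketchAST()
        for key, value in self.items():
            # Lists are duplicated so appends on the copy stay on the copy.
            clone[key] = value[:] if isinstance(value, list) else value
        return clone


original = SketchAST(items=['a'])

# A plain dict copy shares the list: a failed iteration would leak into it.
shallow = dict.copy(original)
shallow['items'].append('leaked')
print(original['items'])

original['items'].remove('leaked')

# The list-cloning copy keeps speculative changes isolated.
trial = original.copy()
trial['items'].append('discarded')
print(original['items'])
```

Only the top-level lists are cloned here; nested structures would still be shared, which matches the limited wording of the entry above but is otherwise an assumption.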
2.0.4 @ 2013-08-15
- Now tokens accept Python escape sequences.
- Added a simple Visitor Pattern for Renderer nodes. Used it to implement diagramming.
- Create a basic diagram of a grammar if pygraphviz is available. Added the --draw option to the command-line tool.
- Added command-line and parser options to specify the buffering treatment of whitespace and nameguard (lambdafu).
- It was not possible to pass buffering parameters such as whitespace to the parser's constructor (lambdafu).
- Grako no longer assumes that parsers implement the semantics. A separate semantics implementation must be provided. This allows for less polluted namespaces and smaller classes.
- A last_node protocol allowed the removal of all mentions of variable _e from generated parsers, which are thus more readable.
- Refactored closures to be more pythonic (there are no anonymous blocks in Python!).
- Improved rendering of grammars by grammar models.
- Fixes to the antlr2grako example to let it convert over 6000 lines of an ANTLR grammar to Grako.
- The AST for a closure might fold repeated symbols (thanks to lambdafu).
- Trace information off by one character (thanks to lambdafu).
- Several improvements and bug fixes mostly by lambdafu.
1.4.0 @ 2013-05-02
- Added the antlr example with an ANTLR-to-Grako grammar translator.
- Semantic actions can now be implemented by a delegate.
- The Grako EBNF grammar and the bootstrap parser now align, so the grammar can be used to bootstrap Grako.
- Proved that grammar models can be pickled, unpickled, and reused.
- Reset synthetic method count and use decorators to increase readability of generated parsers.
- The bootstrap parser was refactored to use semantic delegates.
- Changed the licensing to simplified BSD.
- Sometimes the AST for a closure ({}) was not a list.
1.3.0 @ 2013-04-11
- Optimization: Remove the memoization information that a cut makes obsolete (thanks to Kota Mizushima).
- Report all the rules missing in a grammar before aborting.
- Make sure that cut actually applies to the nearest fork.
- Finish aligning model parsing with generated code parsing.
- Align the sample etc/grako.ebnf grammar to the language parsed by the bootstrap parser.
- Ensure compatibility with Python 2.7.4 and 3.3.1.
- Update credits.
1.2.1 @ 2013-03-19
- Lazy rendering of template fields.
- Rendering of iterables using a specified separator, indent, and format.
- Added a cache of compiled regexps to Buffer.
- Basic documentation of the rendering engine.
- Lint using flake8.
- Optimization of rendering engine's indent() and trim().
- Align bootstrap parser with generated parser framework.
- Add cuts to bootstrap parser so errors are reported closer to their origin.
- Prettify the sample Grako grammar.
- FailedCut exceptions must translate to their nested exception so the reported line and column make sense.
- Remove or comment-out code for tagged/named rule names (they don't work, and their usefulness is doubtful).
- Spell-check this document with Vim spell.
1.1.0 @ 2013-02-22
- Improved performance by also memoizing exception results and advancement over whitespace and comments.
- Improved consistency between the way generated parsers and models parse.
- Added a table of contents to this README.
- Document parseinfo and default it to False.
- Mention the use of context managers.
- Need to preserve state when closure iterations match partially.
- Work with Unicode while rendering.
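Memoizing exception results, as mentioned in this release, means that a rule which failed at a given position is not tried again there: the stored failure is raised straight from the cache. The sketch below shows the idea under simplifying assumptions; MemoParser, ParseFailure, and digit are hypothetical names, and the code is not Grako's memoization.

```python
class ParseFailure(Exception):
    """Hypothetical parse error used by this sketch."""


def digit(parser, pos):
    """Toy rule: match one digit, returning (value, next position)."""
    if pos < len(parser.text) and parser.text[pos].isdigit():
        return parser.text[pos], pos + 1
    raise ParseFailure('expected a digit at position %d' % pos)


class MemoParser:
    def __init__(self, text):
        self.text = text
        self.memo = {}          # (rule name, position) -> result or failure
        self.invocations = 0    # how many times a rule body actually ran

    def call(self, rule, pos):
        key = (rule.__name__, pos)
        if key not in self.memo:
            self.invocations += 1
            try:
                self.memo[key] = rule(self, pos)
            except ParseFailure as failure:
                # Failures are cached just like successful results.
                self.memo[key] = failure
        result = self.memo[key]
        if isinstance(result, ParseFailure):
            raise result
        return result


parser = MemoParser('a1')
for _ in range(2):
    try:
        parser.call(digit, 0)
    except ParseFailure as failure:
        print(failure)
print(parser.invocations)
```

The second failing call is answered from the cache, which is where the performance gain in backtracking parsers comes from.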
1.0.0 @ 2013-02-09
- First public release.