Factorial Presto Parser

This Ruby gem provides a parser for the Presto SQL dialect (the PrestoSQL fork of Presto has since been rebranded as Trino). It was generated with the antlr4-native gem from the Presto grammar specification found in the official Presto repository.

Note that this gem specifically targets Presto version 0.172, since that is the version used by AWS Athena.

Building

bundle install
rake build
gem install pkg/factorial-presto-parser-[version].gem
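If you would rather not type the version by hand, a small script can look up the freshly built gem. This is only a sketch: the helper name `latest_built_gem` is hypothetical, and it assumes `rake build` writes files named `factorial-presto-parser-[version].gem` into `pkg/`, as the install command above suggests.

```ruby
require 'rubygems'

# Hypothetical helper (not part of this gem): returns the path of the
# highest-versioned gem file in `dir`, or nil if none is found.
# Assumes the file-name pattern "<name>-<version>.gem".
def latest_built_gem(dir = 'pkg', name = 'factorial-presto-parser')
  versioned = Dir.glob(File.join(dir, "#{name}-*.gem")).filter_map do |path|
    version = File.basename(path, '.gem').delete_prefix("#{name}-")
    # Skip files whose suffix is not a plain version string.
    [Gem::Version.new(version), path] if Gem::Version.correct?(version)
  end
  # Compare as versions, so 0.10.0 sorts above 0.9.0.
  versioned.max_by(&:first)&.last
end

path = latest_built_gem
puts(path ? "gem install #{path}" : 'No built gem found; run `rake build` first.')
```

The script only prints the install command, so you can check it before running it.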

Trying it out

In irb:

require 'presto_parser'

statement = PrestoParser::Parser.parse('SELECT 1')
puts statement.singleQuery

Messing with the parser's code

Note

We use a manually modified version of the original grammar that excludes write operations and only allows database queries. This is accomplished by lowering the grammar's axiom (its start rule) to the query rule, so the parser does not recognize any other kind of statement.

Also, some constructs in the grammar prevented C++ compilation, so they have been removed or renamed. The changes made to the original grammar can be found in the Presto.g4.patch file.
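Since only queries are recognized, the parser can double as a gate that rejects anything else. How a rejected statement is reported is not documented here, so the sketch below assumes it either raises an error or returns nil. The helper `accepts_query?` is hypothetical, and the parser is passed in as a callable so the example runs with a stand-in instead of the real gem.

```ruby
# Hypothetical helper: true when the given parser accepts the SQL text.
# Assumption: a rejected statement raises StandardError or yields nil.
def accepts_query?(sql, parser)
  !parser.call(sql).nil?
rescue StandardError
  false
end

# Stand-in parser for illustration only; it is NOT the real grammar.
# With the gem installed you might instead pass
#   ->(sql) { PrestoParser::Parser.parse(sql) }
stub_parser = lambda do |sql|
  raise ArgumentError, 'not a query' unless sql.strip.match?(/\A(SELECT|WITH)\b/i)
  :parse_tree
end

puts accepts_query?('SELECT 1', stub_parser)         # prints true
puts accepts_query?('DROP TABLE users', stub_parser) # prints false
```

Check how the real parser signals a syntax error before relying on a wrapper like this.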

Regenerate

To regenerate the parser code from the grammar file, run:

rake generate

Be aware that this will overwrite any changes in the C++ code of the parser, as well as the Makefile found in the ext/presto_parser directory.

Compile

To compile the C++ native extension manually, run:

rake compile