pja35 edited this page Jan 4, 2014 · 50 revisions

Everything that one needs to know in order to understand the code of the project.

Architecture of the system

GOOL grand scheme

Vocabulary:

  • A concrete language is the program as written in a file, for instance System.out.println("Hello") is Java concrete language for printing out to the screen.
  • An abstract language is the computer's representation of the program, as an Abstract Syntax Tree, held in memory.
  • An input language is what the GOOL system can take as input.
  • The source language is what the GOOL system will take as input.
  • An output language is what the GOOL system can produce as output.
  • The target language is what the GOOL system will produce as output.
  • A primitive is any piece of a language.
  • Recognition is when a specific input primitive (e.g. Java Lists) is associated with a specific GOOL primitive (e.g. GOOL Lists).
  • Generation is when a specific GOOL primitive (e.g. GOOL Lists) is associated with a specific output primitive (e.g. C# Lists).
  • Passing on is when a specific primitive is associated with a generic primitive.
  • The core of a language consists of its control flow (if, ...), basic types (int, ...) and basic operations (=, new, ...).
  • The libraries of a language are the available classes.
  • The system libraries of a language are those available in the standard distribution.
  • The default libraries of a language are those system libraries which are imported by default.
  • The user libraries of a language are those made available by the user.

Words of explanation:

  • If the input language is concrete, then it will be made abstract in the easiest way available: the input language's native parser.
  • The abstract input language then gets translated into the common abstract language GOOL.
  • During this process, some specific input language primitives are recognized (e.g. Java Lists), meaning that they are represented by specific GOOL primitives (e.g. GOOL Lists), i.e. GOOL is aware of their specific nature. The others are just passed on: they are represented by generic GOOL primitives (e.g. some GOOL Class X).
  • The GOOL abstract language then gets translated into output concrete language.
  • During this process, some specific GOOL primitives (e.g. GOOL Lists) provoke code generation, meaning that they get implemented as specific output language primitives (e.g. C# Lists), i.e. the output language is guaranteed to handle those. The others are just passed on: they are implemented as generic output language primitives (e.g. some C# Class X). This generic, syntactic translation may still be useful (e.g. if the user later provides a C# implementation of Class X). Still, a comment is generated which warns when a primitive was not recognized or did not provoke code generation.
  • The generated concrete code can be compiled and run.

Who does what?

We explain which package does what. The order is chronological with respect to the above Figure. Within each package, each file comes with an explanation of its purpose.

  • gool: the root of the project, and where global settings are.
  • gool.imports.xxx: only there to inform the user which subset of the xxx input language is recognized.
  • gool.parser.xxx: to parse the xxx input language from concrete to abstract.
  • gool.recognizer.xxx: to translate the xxx input abstract language into its GOOL abstract language representation.
  • gool.ast: the description of the GOOL abstract language.
  • gool.ast.core: the core of the GOOL abstract language.
  • gool.ast.__________: the hard-coded libraries of the GOOL abstract language.
  • gool.library.______: the soft-coded libraries of the GOOL abstract language.
  • gool.ast.printer: pretty printing GOOL.
  • gool.generator.yyy: to translate the GOOL abstract language into yyy output concrete language.
  • gool.generator.common: the classes that must be extended to write a gool.generator.yyy.
  • gool.generator: useful across generators (batching, parameterization etc.)
  • gool.executor.yyy: native compilation and running of the yyy output concrete language.
  • gool.executor.common: the classes that must be extended to write a gool.executor.yyy.
  • gool.executor: useful across executors (running a command etc.)
  • gool.test: to test all that.

Chains of events

Tip: in Eclipse, hold Ctrl and click on a name to see its declaration.

Parsing

What happens, in the GOOL system, when concrete Java gets parsed into abstract GOOL?

  • Initially this is the job of GOOLCompiler.concreteJavaToAbstractGool(...)
  • which delegates it to JavaParser.parseGool(...)
  • which calls Sun's java parser thereby obtaining abstract java trees and
  • which launches JavaRecognizer.scan() on each abstract java tree
  • which calls a JavaRecognizer.visitClass(...) on each abstract java class
  • which creates an abstract GOOL class and fills it by doing an accept(...) on each element of the abstract java class
  • which calls a JavaRecognizer.visitSomething(...) on this element
  • which creates the corresponding abstract GOOL element, possibly filled-in by doing an accept(...) on its sub-elements, etc.
  • until the leaves of the abstract Java tree are reached.
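The chain above is a classic visitor traversal. The following is a minimal sketch of that shape, assuming toy tree classes: JavaClass, JavaMethod, GoolNode and ToyRecognizer are hypothetical stand-ins, not the trees of Sun's parser nor the actual JavaRecognizer.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: toy trees and visitor, not Sun's parser API nor GOOL classes.
class RecognizerSketch {

    interface Visitor<R> {
        R visitClass(JavaClass c);
        R visitMethod(JavaMethod m);
    }

    // Toy "abstract Java" tree: each node dispatches to its own visit method.
    interface JavaTree {
        <R> R accept(Visitor<R> v);
    }

    static class JavaClass implements JavaTree {
        final String name;
        final List<JavaTree> members = new ArrayList<>();
        JavaClass(String name) { this.name = name; }
        public <R> R accept(Visitor<R> v) { return v.visitClass(this); }
    }

    static class JavaMethod implements JavaTree {
        final String name;
        JavaMethod(String name) { this.name = name; }
        public <R> R accept(Visitor<R> v) { return v.visitMethod(this); }
    }

    // Toy "abstract GOOL" node.
    static class GoolNode {
        final String label;
        final List<GoolNode> children = new ArrayList<>();
        GoolNode(String label) { this.label = label; }
        @Override public String toString() {
            return children.isEmpty() ? label : label + children;
        }
    }

    static class ToyRecognizer implements Visitor<GoolNode> {
        // Creates the GOOL class, then fills it by accepting each member.
        public GoolNode visitClass(JavaClass c) {
            GoolNode goolClass = new GoolNode("ClassDef " + c.name);
            for (JavaTree member : c.members) {
                goolClass.children.add(member.accept(this));
            }
            return goolClass;
        }
        // A leaf of the toy tree: nothing further to accept.
        public GoolNode visitMethod(JavaMethod m) {
            return new GoolNode("Meth " + m.name);
        }
    }

    public static void main(String[] args) {
        JavaClass a = new JavaClass("A");
        a.members.add(new JavaMethod("main"));
        System.out.println(a.accept(new ToyRecognizer())); // ClassDef A[Meth main]
    }
}
```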

Generating

What happens, in the GOOL system, when abstract GOOL gets flattened into concrete target?

  • Initially this is the job of GOOLCompiler.abstractGool2Target(...)
  • which delegates it to GeneratorHelper.printClassDefs(...)
  • which works out the target language for each class, and delegates it to the right CodePrinter.print(...)
  • which creates the corresponding file, and fills it with ClassDef.getCode()
  • which, via CodePrinter.processTemplate(...), tells Velocity to fill in the template class.vm with itself.
  • Take generators.java.templates.class.vm for instance. Velocity fills it with the content of each element of the class (fields, methods, etc.) by calling .toString() upon them
  • which calls JavaCodeGenerator.getCode(...) upon itself
  • which generates the appropriate character string for the element in the concrete target language
  • which may require performing .toString() upon sub-elements
  • which calls JavaCodeGenerator.getCode(...) upon sub-elements, etc.
  • until the leaves of the abstract GOOL tree are reached.
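The toString() to getCode(...) hand-off can be sketched without Velocity. The names below (ToyGenerator, Node, Field, ClassNode, genericCode) are hypothetical and the real generator classes are richer. The sketch also shows why each node performs the conversion itself: Java picks among overloads at compile time, from the static type of the argument.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: toy nodes and generator, not the GOOL classes.
class GenerationSketch {

    // Stand-in for a target-specific generator: one getCode overload per node type.
    static class ToyGenerator {
        String getCode(Node n) { return "/* unknown node */"; }
        String getCode(Field f) { return f.type + " " + f.name + ";"; }
        String getCode(ClassNode c) {
            StringBuilder sb = new StringBuilder("class " + c.name + " { ");
            for (Field f : c.fields) {
                sb.append(f).append(' '); // append(Object) calls f.toString()
            }
            return sb.append('}').toString();
        }
    }

    static final ToyGenerator GENERATOR = new ToyGenerator();

    static abstract class Node {
        // Conversion attempted in the generic node: `this` has static type Node
        // here, so the compiler always selects getCode(Node).
        String genericCode() { return GENERATOR.getCode(this); }

        // Every concrete node is forced to provide its own conversion.
        @Override public abstract String toString();
    }

    static class Field extends Node {
        final String type;
        final String name;
        Field(String type, String name) { this.type = type; this.name = name; }
        // `this` has static type Field here, so getCode(Field) is selected.
        @Override public String toString() { return GENERATOR.getCode(this); }
    }

    static class ClassNode extends Node {
        final String name;
        final List<Field> fields;
        ClassNode(String name, List<Field> fields) { this.name = name; this.fields = fields; }
        @Override public String toString() { return GENERATOR.getCode(this); }
    }

    public static void main(String[] args) {
        ClassNode c = new ClassNode("A", Arrays.asList(new Field("int", "x")));
        System.out.println(c);               // class A { int x; }
        System.out.println(c.genericCode()); // /* unknown node */
    }
}
```

In this sketch the generator is a single static instance. Choosing it per platform is left out to keep the example short.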

Testing

  • Initially one launches a test like GoolTest.helloworld()
  • which wraps a print with TestHelper.surroundWithClassMain(...) and hands it to compareResultsDifferentPlatforms(...)
  • which creates a new GoolTestExecutor(input, expected) and for each of its platforms triggers GoolTestExecutor.compare(...)
  • which compares the expected result with that of GoolTest.compileAndRun(...)
  • which is just a cleaned up version of TestHelper.generateCompileRun(...)
  • which does parsing to GOOL and generating from GOOL as before with a GOOLCompiler.concreteJavaToConcretePlatform() and then seeks to execute the concrete target with ExecutorHelper.compileAndRun()
  • which first delegates compiling the concrete target with its native compiler to SpecificCompiler.compileToExecutable(...), and then delegates executing the compiled concrete target on the machine to SpecificCompiler.run(...).
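The last two steps, compile then run, followed by a comparison per platform, might be pictured as below. This is a sketch under strong assumptions: ToyCompiler, FakeCompiler and failingPlatforms are hypothetical, the method signatures are guesses for illustration only, and no real compiler is invoked.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: the interface and signatures are assumptions, not the GOOL API.
class TestChainSketch {

    interface ToyCompiler {
        String compileToExecutable(String sourceFile);
        String run(String executable);
    }

    // Two-step delegation: compile natively first, then run the result.
    static String compileAndRun(ToyCompiler compiler, String sourceFile) {
        String executable = compiler.compileToExecutable(sourceFile);
        return compiler.run(executable);
    }

    // Returns the names of the platforms whose output differs from the expected one.
    static List<String> failingPlatforms(Map<String, ToyCompiler> platforms,
                                         String sourceFile, String expected) {
        List<String> failing = new ArrayList<>();
        for (Map.Entry<String, ToyCompiler> entry : platforms.entrySet()) {
            if (!expected.equals(compileAndRun(entry.getValue(), sourceFile))) {
                failing.add(entry.getKey());
            }
        }
        return failing;
    }

    // Fake compiler that pretends to compile and always prints a fixed output.
    static class FakeCompiler implements ToyCompiler {
        private final String output;
        FakeCompiler(String output) { this.output = output; }
        public String compileToExecutable(String sourceFile) { return sourceFile + ".bin"; }
        public String run(String executable) { return output; }
    }

    public static void main(String[] args) {
        Map<String, ToyCompiler> platforms = new LinkedHashMap<>();
        platforms.put("java", new FakeCompiler("Hello"));
        platforms.put("csharp", new FakeCompiler("Hello "));
        System.out.println(failingPlatforms(platforms, "Test", "Hello")); // [csharp]
    }
}
```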

Important data structures

Abstract Java

It is documented on the web, e.g. at http://docs.oracle.com/javase/7/docs/api/javax/lang/model/package-summary.html .

Platforms

  • They specify the target language for an abstract GOOL class.
  • They are defined in gool.generator.y.YPlatform.
  • They can be considered to be an abstract GOOL PrimitiveType.
  • They have a name, a CodePrinter, a SpecificCompiler, and often an output directory.
  • The CodePrinter tells how to name files, and their general format, for that target language.
  • The SpecificCompiler tells how to compile and run the concrete target with its native tools.
  • They are listed in a global register Platform.registeredPlatforms.
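As a rough picture of this data structure, here is a sketch in which all types are toy stand-ins: ToyPlatform, ToyCodePrinter, ToySpecificCompiler and ToyClassDef are hypothetical, the register being a List is an assumption, and the compiler commands are illustrative strings only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: toy stand-ins, not the gool.generator classes.
class PlatformSketch {

    interface ToyCodePrinter { String fileNameFor(String className); }
    interface ToySpecificCompiler { String compileCommand(String fileName); }

    static class ToyPlatform {
        // Global register of every platform created so far (assumed to be a list).
        static final List<ToyPlatform> registeredPlatforms = new ArrayList<>();

        final String name;
        final ToyCodePrinter printer;
        final ToySpecificCompiler compiler;
        final String outputDir;

        ToyPlatform(String name, ToyCodePrinter printer,
                    ToySpecificCompiler compiler, String outputDir) {
            this.name = name;
            this.printer = printer;
            this.compiler = compiler;
            this.outputDir = outputDir;
            registeredPlatforms.add(this);
        }
    }

    // Each abstract class remembers the platform it should be generated for.
    static class ToyClassDef {
        final String name;
        final ToyPlatform platform;
        ToyClassDef(String name, ToyPlatform platform) { this.name = name; this.platform = platform; }
    }

    public static void main(String[] args) {
        ToyPlatform java = new ToyPlatform("JAVA", c -> c + ".java", f -> "javac " + f, "out/java");
        ToyPlatform cs = new ToyPlatform("CSHARP", c -> c + ".cs", f -> "mcs " + f, "out/cs");
        ToyClassDef a = new ToyClassDef("A", cs);
        System.out.println(a.platform.printer.fileNameFor(a.name)); // A.cs
        System.out.println(ToyPlatform.registeredPlatforms.size()); // 2
    }
}
```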

ClassDefs

  • They are the abstract GOOL classes.
  • They are defined in gool.ast.constructs.
  • They can be considered to be a Dependency.
  • They have a name and a target platform.
  • They have a package, a parent, interfaces, modifiers, constructors, fields, methods...
  • They have indications on whether they are themselves enums, interfaces...
  • They have dependencies.

Dependencies

  • They stand for the external classes which are used in the code of some abstract GOOL class.
  • They have a name.
  • During the parsing process, when JavaRecognizer finds a non-primitive type, or an import, it registers it as a Dependency of the class. Moreover, the classes being parsed are registered as dependencies of each other.
  • Often gool.generator.y.YGenerator maintains a register of dependencies.
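The registration rule can be sketched as follows, assuming a toy class definition that stores its dependencies as names. ToyClassDef and registerType are hypothetical, and the real Dependency objects carry more than a string.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch: dependencies reduced to names, not the GOOL Dependency class.
class DependencySketch {

    private static final Set<String> PRIMITIVES = new HashSet<>(Arrays.asList(
            "boolean", "byte", "char", "short", "int", "long", "float", "double"));

    static class ToyClassDef {
        final String name;
        // Insertion-ordered and duplicate-free.
        final Set<String> dependencies = new LinkedHashSet<>();
        ToyClassDef(String name) { this.name = name; }
    }

    // Called whenever the recognizer meets a type or an import in the class body:
    // primitive types are ignored, everything else becomes a dependency.
    static void registerType(ToyClassDef owner, String typeName) {
        if (!PRIMITIVES.contains(typeName)) {
            owner.dependencies.add(typeName);
        }
    }

    public static void main(String[] args) {
        ToyClassDef a = new ToyClassDef("A");
        registerType(a, "int");
        registerType(a, "java.util.List");
        registerType(a, "B");
        registerType(a, "B");
        System.out.println(a.dependencies); // [java.util.List, B]
    }
}
```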

How libraries are dealt with

In the beginning, the GOOL system did not make any distinction between the core and the libraries of a language. The GOOL abstract language represented its libraries (e.g. lists, maps, system out print calls) just like it represented its core primitives: each with a dedicated GOOL Abstract Syntax Tree node. This meant that in order to recognize and generate a new library (a class and its methods), you had to:

  • Create new GOOL Abstract Syntax Tree nodes dedicated to representing the library.
  • Implement their recognition by the different input language Recognizers.
  • Implement their generation by the different output language Generators.

This was the "hard-coded" solution. However, there is now an alternative, "soft-coded" solution, by means of the GOOL Library Manager, which is more modular and much more convenient for users who are not familiar with the whole code of the project. You can find a presentation of the GOOL Library Manager in Libraries support, and a description of its inner workings in The library manager.

Some nitty gritty details

  • Code generation is started by a toString() called upon the GOOL Abstract Syntax Tree. This is because code generation proceeds by filling gaps in some Velocity templates, and Velocity calls toString(). Such a toString() then gets converted into a getCode(this) by the GOOL Abstract Syntax Tree. At this point, this needs to have the precise type of that piece of the Abstract Syntax Tree, so that the right getCode(...) gets called, as getCode(...) is overloaded. To achieve this, getCode(this) must be performed at that level of the Abstract Syntax Tree. Indeed, if the conversion had been done in some generic GOOL Node, then the overloading of getCode(...) would not have worked, since overloads are resolved statically in Java. This is why, in GOOL Node, we force the child nodes to override toString(), although each override merely converts the call into a getCode(this).

  • The concept of a Platform specifies a gool.generator.CodePrinter, and hence a gool.generator.CodeGenerator, together with a gool.executor.SpecificCompiler for the Target language. In other words it tells you how to generate the concrete Target, and how to execute it. That way you can have different platforms of the same Target language, which you execute differently etc. Platforms are kept track of at two levels. First of all, they are remembered in some global static register. Second of all, each abstract GOOL class knows its Platform. This is in provision for multi-platform compilation, where different pieces of the abstract GOOL tree compile to different Platforms.

  • In the JavaRecognizer there are some strange Otd classes. These are classes that return instances of types such as TypeList, TypeMap, ... Why go through this intermediate step, rather than replace the instantiations new Otd() by new TypeList(), new TypeMap(), ...? These instances of Otd are used as values in a map. The purpose of this map is to be called with a key corresponding to a type (e.g. the string "List") and to yield back as value a fresh instance of the type (e.g. of TypeList) every time it gets called. Thus, returning an Otd wrapper and doing a getType() upon it is a convenient way to do the job. It could have been handled through a switch statement, though.
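The Otd idea amounts to storing small factories in a map, so that each lookup yields a new object rather than a shared one. Below is a minimal sketch assuming empty stand-in types: IType, TypeList and TypeMap are toy classes declared locally, and freshType is a hypothetical helper, not a method of JavaRecognizer.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: local stand-in types, not the GOOL type classes.
class OtdSketch {

    interface IType {}
    static class TypeList implements IType {}
    static class TypeMap implements IType {}

    // The wrapper: its only job is to build a fresh type on demand.
    interface Otd {
        IType getType();
    }

    private static final Map<String, Otd> TYPES = new HashMap<>();
    static {
        // Storing new TypeList() directly would hand out the same instance every time.
        TYPES.put("List", new Otd() {
            public IType getType() { return new TypeList(); }
        });
        TYPES.put("Map", new Otd() {
            public IType getType() { return new TypeMap(); }
        });
    }

    // Returns a fresh instance for a known key, or null for an unknown one.
    static IType freshType(String key) {
        Otd otd = TYPES.get(key);
        return otd == null ? null : otd.getType();
    }

    public static void main(String[] args) {
        IType first = freshType("List");
        IType second = freshType("List");
        System.out.println(first instanceof TypeList); // true
        System.out.println(first == second);           // false
    }
}
```

With Java 8 or later, the same map could hold java.util.function.Supplier values built from constructor references such as TypeList::new.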