The purpose

Ioan CHIRIAC edited this page Jan 6, 2017 · 1 revision

The parser MUST extract every docblock and make it available in the AST.

A docblock is a META definition of an element, so it MUST be attached to the corresponding node and should not be a standalone element:

<?php
/**
 * @return boolean
 */
function foo() {
  /**
   * Some extra documentation
   */
  return true;
}

In the example above, the docblock placed before the function gives extra information about it, such as the type of its return value; the docblock inside the function body does not.

Attached to AST nodes

We will analyse docblocks for the following elements:

  • function
  • class
  • interface
  • trait
  • assign statements

If a docblock is placed before any other kind of element, it will be ignored.

How the docBlocks are attached

To stay backward compatible and to avoid extra information (and extra processing) when it is not required, the grammar parser will have an option to enable or disable the parsing of these docBlocks.

var reader = require('php-parser');
reader.parser.docBlocks = true;
var ast = reader.parseEval('/** @var int */ $x = 1 + 2;');
console.log(ast);

The docBlocks property on the parser object will be disabled by default. When it is disabled, docBlocks are simply ignored.

When docBlocks parsing is enabled, each docblock will wrap the AST node it documents with annotation information.

['program', [
  ['doc', [
    ['var', 'int']
  ], ['assign',
    ['var', ...],
    ['op', ...]
  ]]
]]

The DOC node will contain at offset 1 an array of children parsed from the docblock contents, and at offset 2 the wrapped AST node that the docblock applies to.
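
As a minimal sketch of how a consumer might read this shape, the helper below walks the children of a program node and pairs each doc wrapper with the node it documents. The name `collectDocumented` is hypothetical and not part of php-parser, and the sample AST is written by hand from the structure shown above rather than produced by the parser.

```javascript
// Hypothetical helper, not part of php-parser: lists every documented
// top-level node of a 'program' AST in the array format described above.
function collectDocumented(program) {
  if (!Array.isArray(program) || program[0] !== 'program') {
    throw new TypeError('expected a program node');
  }
  var found = [];
  program[1].forEach(function (child) {
    if (Array.isArray(child) && child[0] === 'doc') {
      // offset 1: parsed docblock children, offset 2: the documented node
      found.push({ annotations: child[1], node: child[2] });
    }
  });
  return found;
}

// Hand-written sample mirroring the structure shown above
// (the operands of the assignment are left empty for brevity).
var sampleProgram = ['program', [
  ['doc', [
    ['var', 'int']
  ], ['assign', ['var'], ['op']]]
]];

var documented = collectDocumented(sampleProgram);
console.log(documented[0].annotations); // the parsed docblock children
console.log(documented[0].node[0]);     // kind of the documented node
```

Undocumented children are skipped, so the same loop should work whether or not the docBlocks option was enabled.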

The Syntax

The valid annotation syntaxes are:

<?php 
  /**
   * @params boolean $name Some information text
   * @return array<string>
   * @return map<string, SomeClass>
   * @author Ioan CHIRIAC <[email protected]>
   * @throws Exception
   * @deprecated
   * @table('tableName', true)
   * @table(
   *   name='tableName', 
   *   primary=true
   * )
   * @annotation1 @annotation2
   * @Target(["METHOD", "PROPERTY"])
   * @Attributes([
   *   @Attribute("stringProperty", type = "string"),
   *   @Attribute("annotProperty",  type = "SomeAnnotationClass"),
   * ])
   * @json({
   *   "key": "value",
   *   "object": { "inner": true },
   *   "list": [1, 2, 3]
   * })
   * <node>
   * Some inner multi line content
   * </node>
   */

Doctrine support

Doctrine made an unfortunate choice by using { ... } for arrays, because that removes the possibility of using JSON-based structures inside annotations. For this reason, Doctrine support will not be offered by default; an option may be made available to parse Doctrine annotations (which would also disable the JSON parsing features).

<?php
 /**
* @Annotation
* @Target({"METHOD","PROPERTY"})
* @Attributes({
*   @Attribute("stringProperty", type = "string"),
*   @Attribute("annotProperty",  type = "SomeAnnotationClass"),
* })
*/
class Foo { }

Handling parse errors

When a parse error occurs, there are two options: ignore it or trigger an error. Doc blocks are used for annotations, but also for descriptions, so we cannot trigger an error on a badly formed comment. However, if an '@' is found in the tokens, the parser will automatically expect a valid syntax.

If the syntax is valid, an AST node will be added; if it is not, the annotation will be treated as text.

<?php 
  /**
   * Some description text
   * and an [email protected] ...
   * @enabled  @public
   * but @flagKo
   * @flag:ko
   */
  $var = 1;

The resulting AST should look like this:

<?php
  ['doc', [
    ['text', 'Some description text'],
    ['text', 'and an [email protected] ...'],
    ['property', 'enabled', true],
    ['property', 'public', true],
    ['text', 'but @flagKo'],
    ['text', '@flag:ko']
  ], ...]
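
The fallback rule can be illustrated with a deliberately simplified sketch. It assumes the leading `*` decoration has already been stripped and only recognises lines made entirely of bare flags such as `@enabled @public`; the real grammar accepts many more forms. `classifyLine` and `classifyLines` are hypothetical names, not parser functions.

```javascript
// Simplified sketch, not the parser's real grammar: only bare flags such as
// "@enabled @public" are recognised; everything else falls back to text.
var FLAGS_ONLY = /^@[A-Za-z_][A-Za-z0-9_]*(\s+@[A-Za-z_][A-Za-z0-9_]*)*$/;

function classifyLine(line) {
  var trimmed = line.trim();
  if (trimmed === '') {
    return []; // empty lines produce no node
  }
  if (FLAGS_ONLY.test(trimmed)) {
    // Valid syntax: one property node per flag
    return trimmed.split(/\s+/).map(function (flag) {
      return ['property', flag.slice(1), true];
    });
  }
  // Invalid or absent annotation syntax: keep the line as plain text
  return [['text', trimmed]];
}

function classifyLines(lines) {
  return lines.reduce(function (nodes, line) {
    return nodes.concat(classifyLine(line));
  }, []);
}

var nodes = classifyLines([
  'Some description text',
  '@enabled  @public',
  'but @flagKo',
  '@flag:ko'
]);
console.log(JSON.stringify(nodes));
```

No line ever raises an error here: a line that fails the annotation pattern is simply kept as a text node, which is the behaviour described above.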

AST nodes

The doc block will contain the following nodes:

  • text : offset 1 will contain the text of one line
<?php
  /** 
   * Hello World
   * Second line
   */
  ['doc', [
    ['text', 'Hello World'],
    ['text', 'Second line']
  ], ...]

NOTE: Empty lines will be ignored

  • property: offset 1 will contain the property name, and each following word will be stored at its own offset
<?php
  /**
   * @deprecated
   * @throws Exception
   * @return boolean Returns false if can't process
   */
  ['doc', [
    ['property', 'deprecated'],
    ['property', 'throws', 'Exception'],
    ['property', 'return', 'boolean', 'Returns', 'false', 'if' ...]
  ], ...]
  • tag: offset 1 will contain the tag name, offset 2 will contain the list of its properties, and offset 3 the inner contents of the node
<?php
  /**
   * <code lang="php" bugfix>
   * if (true) return false;
   * </code>
   */
  ['doc', [
    [
      'tag', 
      'code', 
      [
        ['lang', 'php'], ['bugfix']
      ], 
      'if (true) return false;'
    ],
  ], ...]
  • method : offset 1 will contain the function name, and offset 2 will contain an array with each argument
<?php
  /**
   * @name(option1, option2)
   * @multi(
   *  line1,
   *  line2
   * )
   */
  ['doc', [
    ['method', 'name', ['option1', 'option2']],
    ['method', 'multi', ['line1', 'line2']] 
  ], ...]
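
To make the property layout concrete, here is a small sketch that builds a property node from a single annotation line. It assumes the leading `*` decoration has already been removed, and it keeps every word as a raw string without converting typed elements. `parsePropertyLine` is a hypothetical name used only for illustration.

```javascript
// Hypothetical helper: builds a 'property' node from one annotation line.
// Assumes the leading " * " decoration has already been stripped.
function parsePropertyLine(line) {
  var match = /^@([A-Za-z_][A-Za-z0-9_]*)(?:\s+(.*))?$/.exec(line.trim());
  if (!match) {
    return null; // not a property line (plain text, method call, ...)
  }
  var words = match[2] ? match[2].trim().split(/\s+/) : [];
  // offset 0: node kind, offset 1: property name, then one offset per word
  return ['property', match[1]].concat(words);
}

console.log(parsePropertyLine('@deprecated'));
console.log(parsePropertyLine('@throws Exception'));
console.log(parsePropertyLine("@return boolean Returns false if can't process"));
```

A line such as `@name(option1, option2)` returns null in this sketch, because it uses the method form rather than the property form.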

AST Elements

To keep things simple, identifiers are stored as strings by default, but doc blocks can also contain more complex structures:

  • string : a string is a quoted text
<?php
  /**
   * @something("line1\nline2")
   */
  ['doc', [
    ['method', 'something', [
      ['string', 'line1\nline2']
    ]]
  ], ...]
  • number: a numeric value
<?php
  /**
   * @something(123)
   */
  ['doc', [
    ['method', 'something', [
      ['number', 123]
    ]]
  ], ...]
  • generic: a generic type definition such as array<string>
<?php
  /**
   * @something(array<string,object>)
   */
  ['doc', [
    ['method', 'something', [
      ['generic', 'array', ['string', 'object']]
    ]]
  ], ...]
  • assign : property=value
<?php
  /**
   * @Attribute("annotProperty",  type = "SomeAnnotationClass")
   */
  ['doc', [
    ['method', 'Attribute', [
      ['string', 'annotProperty'],
      ['assign', 'type', ['string', 'SomeAnnotationClass']]
    ]]
  ], ...]
  • list : [1, 2, 3]
<?php
  /**
   * @Target(["METHOD", "PROPERTY"])
   */
  ['doc', [
    ['method', 'Target', [
      ['list', [
        ['string', 'METHOD'],
        ['string', 'PROPERTY']
      ]]
    ]]
  ], ...]
  • object : a JSON-like definition { property: value }
<?php
  /**
   * @json({
   *   "key": "value",
   *   "object": { "inner": true },
   *   "list": [1, 2, 3]
   * })
   */
  ['doc', [
    ['method', 'json', [
      ['object', {
        'key': ['string', 'value'],
        'object': ['object', {
           'inner': ['const', 'true']
         }],
        'list': ['list', [
          ['number', 1], ['number', 2], ['number', 3]
        ]]
      }]
    ]]
  ], ...]
  • constant : a constant value like true, null ...
<?php
  /**
   * @flag true
   */
  ['doc', [
    ['property', 'flag', ['const', 'true']]
  ], ...]

List of constants (not case sensitive) : true, false, null
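
One way a consumer could turn these element nodes back into native values is sketched below. `toValue` is a hypothetical helper, not part of the parser. It assumes the node shapes listed above and returns generic nodes unchanged, since they have no obvious JavaScript equivalent.

```javascript
// Hypothetical helper: converts an element node into a plain JavaScript value.
function toValue(node) {
  if (!Array.isArray(node)) {
    return node; // identifiers are stored as plain strings by default
  }
  switch (node[0]) {
    case 'string':
    case 'number':
      return node[1];
    case 'const':
      // constants are not case sensitive: true, false, null
      switch (String(node[1]).toLowerCase()) {
        case 'true': return true;
        case 'false': return false;
        case 'null': return null;
      }
      throw new Error('unknown constant: ' + node[1]);
    case 'list':
      return node[1].map(toValue);
    case 'object':
      var result = {};
      Object.keys(node[1]).forEach(function (key) {
        result[key] = toValue(node[1][key]);
      });
      return result;
    case 'assign':
      // property=value becomes a single-key object
      var pair = {};
      pair[node[1]] = toValue(node[2]);
      return pair;
    case 'generic':
      return node; // kept as-is: no native equivalent
    default:
      throw new Error('unsupported element: ' + node[0]);
  }
}

// Hand-written node matching the @json(...) example above
var jsonArgument = ['object', {
  'key': ['string', 'value'],
  'object': ['object', { 'inner': ['const', 'true'] }],
  'list': ['list', [['number', 1], ['number', 2], ['number', 3]]]
}];
console.log(JSON.stringify(toValue(jsonArgument)));
```

The constant lookup lowercases the stored text first, so `TRUE`, `True` and `true` all resolve to the same value.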

AST Structure

As with node positions, annotation nodes are optional, so they decorate each documented node:

Offset 2 of the doc node is the documented AST node:

Sample AST with documentation enabled :

['doc', [
  ['text', 'Function description']
], [
  'function', ...
]]

Sample AST with documentation and position enabled :

['position', 
  [3, 2, 20],
  [5, 2, 50],
  ['doc', [
    ['text', 'Function description']
  ], [
    'function', ...
  ]]
]

Note that the position corresponds to the position of the function, not to that of the documentation, which is irrelevant to code analysis.
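
Because both decorators are optional, code that consumes the AST may want to peel them off before inspecting the node itself. The sketch below assumes the nesting order shown above (position outside, doc inside). `unwrapNode` is a hypothetical name, and the sample node is written by hand.

```javascript
// Hypothetical helper: peels the optional 'position' and 'doc' decorators
// off a node and returns what was found.
function unwrapNode(node) {
  var result = { start: null, end: null, annotations: null, node: node };
  if (Array.isArray(result.node) && result.node[0] === 'position') {
    // ['position', start, end, innerNode]
    result.start = result.node[1];
    result.end = result.node[2];
    result.node = result.node[3];
  }
  if (Array.isArray(result.node) && result.node[0] === 'doc') {
    // ['doc', annotations, documentedNode]
    result.annotations = result.node[1];
    result.node = result.node[2];
  }
  return result;
}

// Hand-written sample; the function node is reduced to its kind only.
var decorated = ['position',
  [3, 2, 20],
  [5, 2, 50],
  ['doc', [
    ['text', 'Function description']
  ], ['function']]
];

var unwrapped = unwrapNode(decorated);
console.log(unwrapped.node[0]);     // kind of the documented node
console.log(unwrapped.annotations); // docblock children, or null if absent
```

A node without either decorator passes through untouched, so the same call should work for every combination of the two options.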

Readings :