Skip to content
This repository has been archived by the owner on Jul 20, 2018. It is now read-only.

Migrate to a real PHP parser #195

Open
ockham opened this issue Nov 10, 2014 · 5 comments
Open

Migrate to a real PHP parser #195

ockham opened this issue Nov 10, 2014 · 5 comments

Comments

@ockham
Copy link
Contributor

ockham commented Nov 10, 2014

(Note: I'm a Code Wrangler in trial starting on 2014-11-10, and this is the project I've been assigned by @nb.)

This issue is supposed to serve as a place for discussion and as a roadmap, which I'll update based on my progress and the comments I get. So please share your thoughts on this, your input is valuable to me as I try to lay this out!

Quoting @nb:

The end goals are two:

  • Make writing rules easier
  • Make the parsing layer more stable and maintainable

I'm currently acquainting myself with the code. In order to replace parts of it, I will have to write tests (#3) to cover all checks in order to make sure I'm not breaking things along the way without even noticing. OTOH, some checks (especially some regular expressions) are currently just insufficient, either missing dangerous stuff (#194), or producing false positives (#18).
Both can probably be fixed by using an actual parser, or at least a tokenizer.

Preliminary research on PHP parsers written in PHP yields this SO question. The most promising candidate found there is https://github.com/nikic/PHP-Parser, which seems actively maintained and licensed under the 3-clause BSD license (which is GPL compatible). The stable version only supports PHP >= 5.3. Question: Is this sufficient, or do we need to support 5.2, too?

@nickdaugherty
Copy link
Contributor

PHP-Parser does look promising, and more user-friendly than parsing the token stream directly.

Only supporting PHP >= 5.3 is perfectly fine.

@sboisvert
Copy link
Contributor

If we go with the "matching wpcom" we can go up to 5.5, Not that I think many things go to 5.5 but I love some of the 5.4 features (okay mostly short array syntax)

@joshbetz
Copy link
Contributor

If we go with the "matching wpcom" we can go up to 5.5

There are still some 5.4 servers.

@joshbetz
Copy link
Contributor

some checks (especially some regular expressions) are currently just insufficient, either missing dangerous stuff, or producing false positives.

Because of the way we use the scanner, false positives are really bad. False negatives (missing stuff) isn't great, but we still review all code before deploying to WordPress.com, so it's more acceptable. Obviously more accurate tests are always welcome.

Sounds like an awesome project!

@ockham ockham mentioned this issue Nov 14, 2014
@ockham ockham mentioned this issue Dec 2, 2014
44 tasks
@fklein-lu
Copy link
Contributor

@ockham: Can we consider this as closed?

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants