Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Issue with SConsCPPConditionalScanner misinterpreting ifdefs suffixed with L or UL [Alternate Implementation] #4629

Merged
merged 3 commits into from
Nov 11, 2024

Conversation

jcbrill
Copy link
Contributor

@jcbrill jcbrill commented Nov 10, 2024

This PR is an alternate implementation of PR #4624 based on the discussion in #4623 and #4624.

PR #4624 fails 9 of the tests included in this PR.

This PR also fixes a bug with evaluating hex integer literals.

For example:

#define HEX0 0x0
#if HEX0
#include <file411-yes>   # <=== actual
#else
#include <file411-no>    # <=== expected
#endif

This reason is the int conversion fails and is using string '0x0' which evaluates true (non-empty string) as opposed to an integer literal of 0. The integer conversion was implemented as int(expansion) rather than int(expansion, 0).

This PR replaces the integer conversion with a method to evaluate a constant expression.

Changes:

  • Preserve non-integer literals that contain valid integer specifications
  • Add binary integer specifications
  • Add octal integer specification
  • Zero (0) is considered an octal number
  • Add negative lookbehind/lookahead for number specifications (text is not word/token based)
  • Add method to evaluate constant expression for define constant expressions
  • Replace int conversion with constant evaluation expression
    • int conversion failed for hex numbers due to default base 10 [int(s)] vs unknown base [int(s, 0)]
  • Two fixes to the existing test suite were applied
  • Additional tests were added

Contributor Checklist:

  • I have created a new test or updated the unit tests to cover the new/changed functionality.
  • I have updated CHANGES.txt and RELEASE.txt (and read the README.rst).
  • I have updated the appropriate documentation

Changes:
- Preserve non-integer literals that contain valid integer specifications.
- Add binary integer specifications
- Add octal integer specification
- Zero (0) is considered an octal number
- Add negative lookbehind/lookahead for number specifications (text is not word/token based)
- Add method to evaluate constant expression for define constant expressions
- Replace int conversion with constant evaluation expression
  - int conversion failed for hex numbers due to default base 10 [int(s)] vs unknown base [int(s, 0)]
- Add additional tests
@bdbaddog
Copy link
Contributor

Add a blurb to release and changes.txt and we can merge this.

@bdbaddog
Copy link
Contributor

Wouldn't it be sufficient though to test the regex's alone without parsing a file? Simplify?

@jcbrill
Copy link
Contributor Author

jcbrill commented Nov 10, 2024

Wouldn't it be sufficient though to test the regex's alone without parsing a file? Simplify?

Modifying the regex revealed underlying issues with the implementation and the way the tests were written. The example in the description above has nothing to do with the regexes (i.e., there is no suffix) but is a manifestation of a failure to convert a hex string to integer without specifying the optional base.

There are a number of issues and potential issues with the implementation that I need to write up. I'm guessing there is a sizable gap between how one thinks the code is working and how it actually works.

The expression evaluation for #if is not recursive so only the first level of replacement is performed which may then yield another string that should be expanded. Depending on the way the test is written, that could result in the string effectively being considered "True".

@jcbrill
Copy link
Contributor Author

jcbrill commented Nov 10, 2024

Many of the tests are for the eval_constant_expression() replacement for int().

@bdbaddog
Copy link
Contributor

Wouldn't it be sufficient though to test the regex's alone without parsing a file? Simplify?

Modifying the regex revealed underlying issues with the implementation and the way the tests were written. The example in the description above has nothing to do with the regexes (i.e., there is no suffix) but is a manifestation of a failure to convert a hex string to integer without specifying the optional base.

There are a number of issues and potential issues with the implementation that I need to write up. I'm guessing there is a sizable gap between how one thinks the code is working and how it actually works.

The expression evaluation for #if is not recursive so only the first level of replacement is performed which may then yield another string that should be expanded. Depending on the way the test is written, that could result in the string effectively being considered "True".

Since none of these have been detected (or at least not reported) thus far in the wild. It's probably safe to assume that it's existing functionality (when used) has been sufficient for most.

That said, improvements are always welcome, and as usual @jcbrill 's contributions are well thought out and implemented!

Add and CHANGES/RELEASE and we'll get this one merged.

@jcbrill
Copy link
Contributor Author

jcbrill commented Nov 11, 2024

Quick-and-dirty notes concerning implementation so they are not forgotten.

Known/Potential issues and TODO items for cpp.py:

  • The regex substitution for integer literals strips the optional suffix. It is possible that a macro argument is a valid integer literal but the macro implementation is using the argument as token pasting/concatenation. In this case, stripping the suffix is invalid. This is likely either very difficult or possibly impossible to handle based on the current implementation.
  • The cpp namespace dictionary may need to add symbols true and false to be compatible with the c/cpp spec (i.e., if true and false are not in the namespace dictionary, add them).
  • In some contexts, and undefined preprocessor symbol is treated as 0 where in the SCons implementation, an undefined symbol causes eval() to fail. A UserDict overriding __getitem__ might be necessary.
  • For a defined symbol whose value is a literal integer, the literal integer string is converted to an integer in the namespace dictionary. If the same symbol is then used with token pasting/concatenation, the evaluation most likely fails due trying to "add" a string and an integer.
  • Need to verify if the "stringify" operator (i.e., #) is supported and if not, is there a reasonable implementation.
  • Unlike include resolution, the expression evaluation for #IF and #IFDEF is not recursive at present (i.e., only one expansion is performed).
  • Expansion issues (see above) can cause a string to be "evaluated" as the result of the expansion causing the test to be true. For example, if '0' is returned, the result would be true; where if 0 is returned the result would be false. There may examples of this behavior in the current test suite.
  • Not sure the existing recursive implementation handles macro calls with macros as arguments or parenthesized arguments correctly. Needs testing.

@bdbaddog
Copy link
Contributor

Looks good. Thanks!

@bdbaddog bdbaddog merged commit b4d72c7 into SCons:master Nov 11, 2024
7 of 8 checks passed
@jcbrill jcbrill deleted the jbrill-gh4623-int branch November 11, 2024 10:17
@relfock
Copy link

relfock commented Nov 11, 2024

Thanks for the fix @jcbrill :)

@mwichmann mwichmann added this to the NextRelease milestone Nov 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: Complete
Development

Successfully merging this pull request may close these issues.

4 participants