Fix Issue with SConsCPPConditionalScanner misinterpreting ifdefs suffixed with L or UL [Alternate Implementation] #4629

jcbrill · 2024-11-10T16:45:02Z

This PR is an alternate implementation of PR #4624 based on the discussion in #4623 and #4624.

PR #4624 fails 9 of the tests included in this PR.

This PR also fixes a bug with evaluating hex integer literals.

For example:

#define HEX0 0x0
#if HEX0
#include <file411-yes>   # <=== actual
#else
#include <file411-no>    # <=== expected
#endif

This reason is the int conversion fails and is using string '0x0' which evaluates true (non-empty string) as opposed to an integer literal of 0. The integer conversion was implemented as int(expansion) rather than int(expansion, 0).

This PR replaces the integer conversion with a method to evaluate a constant expression.

Changes:

Preserve non-integer literals that contain valid integer specifications
Add binary integer specifications
Add octal integer specification
Zero (0) is considered an octal number
Add negative lookbehind/lookahead for number specifications (text is not word/token based)
Add method to evaluate constant expression for define constant expressions
Replace int conversion with constant evaluation expression
- int conversion failed for hex numbers due to default base 10 [int(s)] vs unknown base [int(s, 0)]
Two fixes to the existing test suite were applied
Additional tests were added

Contributor Checklist:

I have created a new test or updated the unit tests to cover the new/changed functionality.
I have updated CHANGES.txt and RELEASE.txt (and read the README.rst).
I have updated the appropriate documentation

Changes: - Preserve non-integer literals that contain valid integer specifications. - Add binary integer specifications - Add octal integer specification - Zero (0) is considered an octal number - Add negative lookbehind/lookahead for number specifications (text is not word/token based) - Add method to evaluate constant expression for define constant expressions - Replace int conversion with constant evaluation expression - int conversion failed for hex numbers due to default base 10 [int(s)] vs unknown base [int(s, 0)] - Add additional tests

bdbaddog · 2024-11-10T23:20:19Z

Add a blurb to release and changes.txt and we can merge this.

bdbaddog · 2024-11-10T23:21:41Z

Wouldn't it be sufficient though to test the regex's alone without parsing a file? Simplify?

jcbrill · 2024-11-10T23:43:19Z

Wouldn't it be sufficient though to test the regex's alone without parsing a file? Simplify?

Modifying the regex revealed underlying issues with the implementation and the way the tests were written. The example in the description above has nothing to do with the regexes (i.e., there is no suffix) but is a manifestation of a failure to convert a hex string to integer without specifying the optional base.

There are a number of issues and potential issues with the implementation that I need to write up. I'm guessing there is a sizable gap between how one thinks the code is working and how it actually works.

The expression evaluation for #if is not recursive so only the first level of replacement is performed which may then yield another string that should be expanded. Depending on the way the test is written, that could result in the string effectively being considered "True".

jcbrill · 2024-11-10T23:47:53Z

Many of the tests are for the eval_constant_expression() replacement for int().

bdbaddog · 2024-11-10T23:58:11Z

Wouldn't it be sufficient though to test the regex's alone without parsing a file? Simplify?

Modifying the regex revealed underlying issues with the implementation and the way the tests were written. The example in the description above has nothing to do with the regexes (i.e., there is no suffix) but is a manifestation of a failure to convert a hex string to integer without specifying the optional base.

There are a number of issues and potential issues with the implementation that I need to write up. I'm guessing there is a sizable gap between how one thinks the code is working and how it actually works.

The expression evaluation for #if is not recursive so only the first level of replacement is performed which may then yield another string that should be expanded. Depending on the way the test is written, that could result in the string effectively being considered "True".

Since none of these have been detected (or at least not reported) thus far in the wild. It's probably safe to assume that it's existing functionality (when used) has been sufficient for most.

That said, improvements are always welcome, and as usual @jcbrill 's contributions are well thought out and implemented!

Add and CHANGES/RELEASE and we'll get this one merged.

jcbrill · 2024-11-11T01:41:23Z

Quick-and-dirty notes concerning implementation so they are not forgotten.

Known/Potential issues and TODO items for cpp.py:

The regex substitution for integer literals strips the optional suffix. It is possible that a macro argument is a valid integer literal but the macro implementation is using the argument as token pasting/concatenation. In this case, stripping the suffix is invalid. This is likely either very difficult or possibly impossible to handle based on the current implementation.
The cpp namespace dictionary may need to add symbols true and false to be compatible with the c/cpp spec (i.e., if true and false are not in the namespace dictionary, add them).
In some contexts, and undefined preprocessor symbol is treated as 0 where in the SCons implementation, an undefined symbol causes eval() to fail. A UserDict overriding __getitem__ might be necessary.
For a defined symbol whose value is a literal integer, the literal integer string is converted to an integer in the namespace dictionary. If the same symbol is then used with token pasting/concatenation, the evaluation most likely fails due trying to "add" a string and an integer.
Need to verify if the "stringify" operator (i.e., #) is supported and if not, is there a reasonable implementation.
Unlike include resolution, the expression evaluation for #IF and #IFDEF is not recursive at present (i.e., only one expansion is performed).
Expansion issues (see above) can cause a string to be "evaluated" as the result of the expansion causing the test to be true. For example, if '0' is returned, the result would be true; where if 0 is returned the result would be false. There may examples of this behavior in the current test suite.
Not sure the existing recursive implementation handles macro calls with macros as arguments or parenthesized arguments correctly. Needs testing.

bdbaddog · 2024-11-11T05:04:31Z

Looks good. Thanks!

relfock · 2024-11-11T14:31:41Z

Thanks for the fix @jcbrill :)

bdbaddog mentioned this pull request Nov 10, 2024

Fix Issue with SConsCPPConditionalScanner misinterpreting ifdefs suffixed with L or UL #4624

Closed

3 tasks

jcbrill added 2 commits November 10, 2024 19:49

Merge branch 'master' into jbrill-gh4623-int

80ab029

Update CHANGES.txt and RELEASE.txt

e96cf68

jcbrill mentioned this pull request Nov 11, 2024

SConsCPPConditionalScanner fails evaluate nested defines #4623

Closed

bdbaddog merged commit b4d72c7 into SCons:master Nov 11, 2024
7 of 8 checks passed

jcbrill deleted the jbrill-gh4623-int branch November 11, 2024 10:17

mwichmann added the scanner label Nov 11, 2024

mwichmann added this to the NextRelease milestone Nov 11, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix Issue with SConsCPPConditionalScanner misinterpreting ifdefs suffixed with L or UL [Alternate Implementation] #4629

Fix Issue with SConsCPPConditionalScanner misinterpreting ifdefs suffixed with L or UL [Alternate Implementation] #4629

jcbrill commented Nov 10, 2024 •

edited by mwichmann

Loading

bdbaddog commented Nov 10, 2024

bdbaddog commented Nov 10, 2024

jcbrill commented Nov 10, 2024

jcbrill commented Nov 10, 2024

bdbaddog commented Nov 10, 2024

jcbrill commented Nov 11, 2024 •

edited

Loading

bdbaddog commented Nov 11, 2024

relfock commented Nov 11, 2024

Fix Issue with SConsCPPConditionalScanner misinterpreting ifdefs suffixed with L or UL [Alternate Implementation] #4629

Fix Issue with SConsCPPConditionalScanner misinterpreting ifdefs suffixed with L or UL [Alternate Implementation] #4629

Conversation

jcbrill commented Nov 10, 2024 • edited by mwichmann Loading

Contributor Checklist:

bdbaddog commented Nov 10, 2024

bdbaddog commented Nov 10, 2024

jcbrill commented Nov 10, 2024

jcbrill commented Nov 10, 2024

bdbaddog commented Nov 10, 2024

jcbrill commented Nov 11, 2024 • edited Loading

bdbaddog commented Nov 11, 2024

relfock commented Nov 11, 2024

jcbrill commented Nov 10, 2024 •

edited by mwichmann

Loading

jcbrill commented Nov 11, 2024 •

edited

Loading