Ensure correctness of formatted output conversion #169

DrXiao · 2024-12-01T12:29:00Z

Consider the following code printing the several formatted integers:

void print_conversion(int num) {
    printf("%#032x\n%#32x\n%032x\n%32x\n%#x\n%x\n", num, num, num, num, num, num);
    printf("%#012x\n%#12x\n%012x\n%12x\n", num, num, num, num);
    printf("%#032o\n%#32o\n%032o\n%32o\n%#o\n%o\n", num, num, num, num, num, num);
    printf("%#012o\n%#12o\n%012o\n%12o\n", num, num, num, num);
    printf("%032d\n%32d\n%d\n", num, num, num);
    printf("%012d\n%12d\n", num, num);
}
int main() {
    int a = 0xFF00CDE1, b = 0xFFFFF204, c = 65548, d = 0;
    print_conversion(a);
    print_conversion(b);
    print_conversion(c);
    print_conversion(d);
    return 0;
}

I found some incorrect results for certain outputs when using specifier, field width and alternate form:

1. Incorrect conversion from integers to hexadecimal strings

Consider a 32-bit integer -16724511 (0xFF00CDE1 in hexadecimal), if using printf() with the formatted string "%#12x", it outputs ffffff00cde1, and the result has redundent f characters. However, because the integer is just 4 bytes, the result should be ff00cde1 containing 4 whitespace characters.

2. Incorrect conversion from integers to octal strings

As the previous point, consider the negative integer -3580 (0xFFFFF204 in hexadecimal), it produces 77777771004 when using "%o" to print this integer.

Because the function accepts a 32-bit integer, the most significant digit of the octal should be obtained by the leftmost 2 bits only, but the original implementation doesn't consider this point and produce incorrect results.

Thus, the correct result is 37777771004, which obtains 3 from the leftmost 2 bits of the integer.

3. Prepended characters may be placed in incorrect positions

When using field width and flag characters, it may output unexpected strings for certain cases.
printf("%#12x\n", 0xff00cde1) --> "0x ff00cde1\n"
printf("%#12o\n", 0x1000c) --> "0 200014\n"
printf("%12d\n", -3580) --> "0000000-3580\n"

Conclusion

Through the given test code, the current built-in C library has been found to have the above issues, so this pull request resolves them and ensures correctness when using formatted output conversion.

ChAoSUnItY · 2024-12-01T15:35:21Z

lib/c.c

+     * Finally, the remaining 2 bits are processed after
+     * the loop.
+     * */
+    for (int i = 0; i < (sizeof(int) << 3) / 3; i++) {


Perhaps you can lift the rhs expression of condition expression out of loop to squeeze last bit of performance?

ChinYikMing · 2024-12-01T22:20:59Z

tests/driver.sh

+           0"
+
+try_output 0 "$fmt_ans" << EOF
+void print_conversion(int num) {


nit: use printf_conversion for consistency.

The original implementation of __str_base16(), which converts integers to hexadecimal strings, causes errors when processing negative integers. It produces redundant 'f' characters at the beginning of the converted string due to sign extension and results in unexpected output when calling printf() or sprintf(). Therefore, this commit fixes it and ensure correct conversion.

The implementation of __str_base8() produces incorrect conversions when handling negative integers. For example, a 32-bit negative integer '0xfffff204', it will produce '7' at the most significant digit due to sign extension. However, it should produce '3' by using the leftmost 2 bits only. Therefore, this commit modifies the conversion process to fix the mentioned problem.

The implementation of __format() produces incorrect results when using field width and flag characters. For example, consider the following printf calls, where the prepended characters are placed in incorrect positions. printf("%#12x\n", 0xff00cde1) --> "0x ff00cde1\n" printf("%#12o\n", 0x1000c) --> "0 200014\n" printf("%12d\n", -3580) --> "0000000-3580\n" Thus, this commit improves __format() to ensure correctness and to make the results consistent with GCC, and adds a test case to validate the changes.

jserv · 2024-12-02T13:18:57Z

Thank @DrXiao for contributing!

jserv requested review from ChinYikMing, ChAoSUnItY and vacantron December 1, 2024 12:50

ChAoSUnItY reviewed Dec 1, 2024

View reviewed changes

ChinYikMing reviewed Dec 2, 2024

View reviewed changes

DrXiao added 3 commits December 2, 2024 21:03

DrXiao force-pushed the fix-libc branch from 76affb2 to b310322 Compare December 2, 2024 13:10

jserv merged commit 18e005d into sysprog21:master Dec 2, 2024
4 checks passed

DrXiao deleted the fix-libc branch December 2, 2024 13:20

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Ensure correctness of formatted output conversion #169

Ensure correctness of formatted output conversion #169

DrXiao commented Dec 1, 2024

ChAoSUnItY Dec 1, 2024

ChinYikMing Dec 1, 2024

jserv commented Dec 2, 2024

Ensure correctness of formatted output conversion #169

Ensure correctness of formatted output conversion #169

Conversation

DrXiao commented Dec 1, 2024

1. Incorrect conversion from integers to hexadecimal strings

2. Incorrect conversion from integers to octal strings

3. Prepended characters may be placed in incorrect positions

Conclusion

ChAoSUnItY Dec 1, 2024

Choose a reason for hiding this comment

ChinYikMing Dec 1, 2024

Choose a reason for hiding this comment

jserv commented Dec 2, 2024