[uclibc-ng-devel] Re: [PATCH] fix possible overflow in pointer arithmetics strnlen()

7 Jan 2025

On Tue, 7 Jan 2025 at 18:07, Frank Mehnert
frank.mehnert@kernkonzept.com wrote:
...
On Tuesday, 7 January 2025 17:38:52 CET Kjetil Oftedal wrote:
...
On Tue, 7 Jan 2025 at 17:09, Frank Mehnert
frank.mehnert@kernkonzept.com wrote:
...
On Tuesday, 7 January 2025 16:45:03 CET Kjetil Oftedal wrote:
...
On Tue, 7 Jan 2025 at 16:19, Frank Mehnert
frank.mehnert@kernkonzept.com wrote:
...
On Tuesday, 7 January 2025 15:55:28 CET Kjetil Oftedal wrote:
...
On Tue, 7 Jan 2025 at 15:33, Frank Mehnert
frank.mehnert@kernkonzept.com wrote:
> [...]
>
> With offset=1 and x=0xffffffff'ffffffff, y will be 0, so y does not belong to
> the same object as x. There is no other value of x which could lead to a
> result y<x with offset=1!
>
> The comparison tests the wrap around and adapts end_ptr in that case.
>
> Again: If there was a wrap around, then the pointer arithmetic operation
> was wrong (undefined behavior). Hence, we can remove the test plus the code
> which is executed if the test succeeds.
[...]
I see the point. I just don't agree with the conclusion.
As I see it, the LLVM group is claiming that if there is any possibility of UB,
even if it depends on input arguments, then it is always UB, and the compiler
can remove code as it sees fit. (Which will be a nightmare for security code.)
I admit I thought so too some time ago, but in my opinion you are drawing the
wrong conclusion:
There is a test in the code.
The result of the test can _only_ be true if the code before the test
triggered undefined behavior.
In other words, if the test succeeded, then we can be sure that the test
compared two pointers belonging to different objects -- which is undefined
behavior.
Therefore, the compiler is fine removing the test because the compiler does
not support the case where the test result is true.
If the test result is false, everything is fine: both pointers may or may not
point into the same object, but the compiler assumes that they do. But: If
the test result is false, end_ptr does not need an adjustment, therefore that
line can be optimized out.
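For illustration only, here is a minimal sketch of two ways to keep the intended behavior without relying on a pointer that has already overflowed. It is not the uClibc-ng code or the patch under discussion; the names clamped_end and bounded_len are hypothetical. The first helper does the wrap-around test on unsigned integers, where wrapping is well defined, so the test cannot be assumed away. The second never forms str + maxlen at all.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: end address of a scan of at most maxlen bytes that
   starts at address start. Unsigned arithmetic wraps modulo 2^N without
   undefined behavior, so the comparison below is a legitimate wrap test. */
uintptr_t clamped_end(uintptr_t start, size_t maxlen)
{
    uintptr_t end = start + (uintptr_t)maxlen;
    if (end < start)            /* true only if the addition wrapped */
        end = UINTPTR_MAX;      /* clamp to the highest address */
    return end;
}

/* Hypothetical strnlen-like sketch: counts with an index, so no pointer
   beyond the terminating NUL (or beyond str + maxlen) is ever computed. */
size_t bounded_len(const char *str, size_t maxlen)
{
    assert(str != NULL);
    size_t n = 0;
    while (n < maxlen && str[n] != '\0')
        n++;
    return n;
}
```

With these, bounded_len("hello", SIZE_MAX) stops at the terminating NUL, and clamped_end clamps rather than wraps when start is near the top of the address space.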
[...]
Let us flip the variables around and review:
E.g.
if ( str < end_ptr )
instead of
if ( end_ptr < str )
Should the compiler then always evaluate the clause to true?
end_ptr might still be lower than str due to overflow in the arithmetic.
Not sure what you mean by "then always evaluate the clause to true".
Are you asking if the compiler may assume that (str < end_ptr) is always
true because only this is "defined behavior"?
No, the compiler cannot assume that. But this is no contradiction to what
I said before.
The compiler does the following steps for the original problem,
(end_ptr < str):

1. Generate code for the test.
2. The test result can either be true or false. The compiler generates
   code for both cases.
3. No code needs to be generated for the case "false".
4. Generate code for the case "true".
5. Optimize the code. The compiler observes that the "true" case can
   only happen if the previous calculation triggered undefined behavior
   (str and end_ptr point to different objects).
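The steps above can be sketched as a pair of functions. This is an illustration under assumptions, not the uClibc-ng source and not actual compiler output: end_with_check and end_without_check are hypothetical names, and whether a particular compiler performs this transformation is not claimed here. The point is only that the two functions return the same result for every input whose pointer arithmetic is defined, which is why an optimizer is permitted to turn the first into the second.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: the pattern under discussion. The branch can only be taken
   if str + maxlen already left the object, i.e. after undefined behavior. */
const char *end_with_check(const char *str, size_t maxlen)
{
    assert(str != NULL);
    const char *end_ptr = str + maxlen;
    if (end_ptr < str)
        end_ptr = (const char *)UINTPTR_MAX;
    return end_ptr;
}

/* Hypothetical: what remains if the "true" case is considered unreachable. */
const char *end_without_check(const char *str, size_t maxlen)
{
    assert(str != NULL);
    return str + maxlen;
}
```

For a 16-byte buffer buf and any maxlen from 0 to 16, both functions return buf + maxlen.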

In your case (str < end_ptr), the compiler still needs to generate code for
the test because the compiler must assume that both pointers (str and end_ptr)
belong to the same object, therefore the test is valid.
In the original case (end_ptr < str), the compiler knows for sure that both
pointers belong to different objects!
[...]
Is this true for all cases for strings?
...
In the original case (end_ptr < str), the compiler knows for sure that
both pointers belong to different objects!

const char* str = "Hello World!";
const char* sub_str = strstr(str, "Wor");
strnlen(sub_str, X);
For some values of X end_ptr is still pointing to a valid array entry
for the original "Hello World" string,
even if it is not within the "World!" substring. Is it then not
pointing to a valid array of objects?
(E.g. if X is the unsigned representation of -1, -2, ..., -6)
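The arithmetic behind this example can be checked in the integer domain, where wrap-around is well defined. The sketch below is an assumption-laden illustration: wrapped_address is a hypothetical helper, and it assumes a flat address space in which size_t and uintptr_t have the same width. It says nothing about whether the corresponding pointer addition is defined in C, which is exactly the open question here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: the address that base + x would denote if the
   addition were carried out in wrapping unsigned arithmetic. */
uintptr_t wrapped_address(const char *base, size_t x)
{
    assert(base != NULL);
    return (uintptr_t)base + (uintptr_t)x;
}
```

With str = "Hello World!" and sub_str = strstr(str, "Wor"), sub_str sits 6 bytes into str, so under the assumptions above wrapped_address(sub_str, (size_t)0 - 6) is the address of str itself.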
OK, I think you have a point.
Basically this boils down to the question of whether adding a huge offset to a
pointer is the same as subtracting a small offset from a pointer. So far I
couldn't find any C++ rule related to this problem.
Kind regards,
Frank
Dr.-Ing. Frank Mehnert, frank.mehnert@kernkonzept.com, +49-351-41 883 224
Kernkonzept GmbH.  Sitz: Dresden.  Amtsgericht Dresden, HRB 31129.
Geschäftsführer: Dr.-Ing. Michael Hohmuth
Hi,
I think it is purely academic at this point though :)
The change is probably fine. I am just a bit annoyed that clang forces
changes to the code in a lot of projects, instead of clang accepting
de facto standard patterns.
Best regards,
Kjetil Oftedal
