On Tue, 7 Jan 2025 at 18:07, Frank Mehnert frank.mehnert@kernkonzept.com wrote:
On Tuesday, 7 January 2025 17:38:52 CET Kjetil Oftedal wrote:
On Tue, 7 Jan 2025 at 17:09, Frank Mehnert frank.mehnert@kernkonzept.com wrote:
On Tuesday, 7 January 2025 16:45:03 CET Kjetil Oftedal wrote:
On Tue, 7 Jan 2025 at 16:19, Frank Mehnert frank.mehnert@kernkonzept.com wrote:
On Tuesday, 7 January 2025 15:55:28 CET Kjetil Oftedal wrote:
On Tue, 7 Jan 2025 at 15:33, Frank Mehnert frank.mehnert@kernkonzept.com wrote:
> [...]
>
> With offset=1 and x=0xffffffff'ffffffff, y will be 0, so y does not belong to
> the same object as x. There is no other value of x which could lead to a
> result y<x with offset=1!
>
> The comparison tests the wrap around and adapts end_ptr in that case.
>
> Again: If there was a wrap around, then the pointer arithmetic operation
> was wrong (undefined behavior). Hence, we can remove the test plus the code
> which is executed if the test succeeds.
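The wrap-around claim in the quoted text can be checked on plain address values. The following is only a sketch: it models the addition with unsigned integers (std::uintptr_t), where wrapping is well defined, rather than with real pointers, where the same overflow would be undefined behavior. The helper names address_add_wraps and wrap_demo are made up for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: models "y = x + offset" on address *values* using
// unsigned integers, where wrap-around is well defined (modulo 2^N).
// Returns true if the sum wrapped, i.e. the result is smaller than x.
bool address_add_wraps(std::uintptr_t x, std::uintptr_t offset) {
    std::uintptr_t y = x + offset;  // modular arithmetic, no UB
    return y < x;
}

// With offset == 1, only the all-ones value of x produces y < x.
void wrap_demo() {
    const std::uintptr_t max = std::numeric_limits<std::uintptr_t>::max();
    assert(address_add_wraps(max, 1));       // wraps to 0
    assert(!address_add_wraps(max - 1, 1));  // result is max, no wrap
    assert(!address_add_wraps(0, 1));        // result is 1, no wrap
}
```

With real pointers the compiler is allowed to assume the wrapping case never happens, which is exactly the point under discussion.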
[...]
I see the point. I just don't agree with the conclusion. As I see it the LLVM group is claiming that if there is any possibility for UB, even if it is dependent on input arguments, then it is always UB, and the compiler can remove code as it sees fit. (Which will be a nightmare for security code)
I admit I thought so some time ago, but in my opinion you draw the wrong conclusion:
There is a test in the code.
The result of the test can _only_ be true if the code before the test triggered undefined behavior.
In other words, if the test succeeded, then we can be sure that the test compared two pointers belonging to different objects -- which is undefined behavior.
Therefore, the compiler is fine removing the test because the compiler does not support the case where the test result is true.
If the test result is false, everything is fine: both pointers may or may not point to different objects, but the compiler assumes that they belong to the same object. But: If the test result is false, end_ptr does not need an adjustment, therefore that line can be optimized out.
[...]
Let us flip the variables around and review: E.g. if ( str < end_ptr ) instead of if ( end_ptr < str )
Should the compiler then always evaluate the clause to true? end_ptr might still be lower than str due to overflow in the arithmetic.
Not sure what you mean by "then always evaluate the clause to true". Are you asking if the compiler may assume that (str < end_ptr) is always true because only this is "defined behavior"?
No, the compiler cannot assume that. But this is no contradiction to what I said before.
The compiler does the following steps for the original problem, (end_ptr < str):
- Generate code for the test.
- The test result can either be true or false. The compiler generates code for both cases.
- No code needs to be generated for the case "false".
- Generate code for the case "true".
- Optimize the code. The compiler observes that the "true" case can only happen if the previous calculation triggered undefined behavior (str and end_ptr point to different objects).
In your case (str < end_ptr), the compiler still needs to generate code for the test because the compiler must assume that both pointers (str and end_ptr) belong to the same object, therefore the test is valid.
In the original case (end_ptr < str), the compiler knows for sure that both pointers belong to different objects!
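For illustration only: one way to sidestep the whole question is to never form str + maxlen in the first place. The sketch below is not the code from the patch under discussion. bounded_len is a hypothetical strnlen-like helper that counts with an index, so there is no end_ptr and no wrap-around test for the compiler to remove.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical strnlen-like helper (an assumption, not the patched code).
// It counts with an index instead of computing an end pointer, so no
// out-of-bounds pointer is ever created, whatever value maxlen has.
std::size_t bounded_len(const char* str, std::size_t maxlen) {
    std::size_t n = 0;
    // str[n] is only read while n has not passed the terminating '\0'.
    while (n < maxlen && str[n] != '\0')
        ++n;
    return n;
}

void bounded_len_demo() {
    assert(bounded_len("Hello", 3) == 3);         // limited by maxlen
    assert(bounded_len("Hello", SIZE_MAX) == 5);  // huge maxlen is harmless
    assert(bounded_len("", 10) == 0);             // empty string
}
```

The trade-off is one extra counter comparison per character instead of a single pointer comparison, which a compiler can often optimize equally well.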
[...]
> In the original case (end_ptr < str), the compiler knows for sure that both pointers belong to different objects!
Is this true for all cases for strings?
const char* str = "Hello World!";
const char* sub_str = strstr(str, "Wor");
strnlen(sub_str, X);
For some values of X, end_ptr still points to a valid array element of the original "Hello World!" string, even if it is not within the "World!" substring. Is it then not pointing into a valid array of objects? (E.g. if X is the unsigned representation of -1, -2, ..., -6)
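To make the numbers in this example concrete, here is a small sketch that uses only well-defined operations. It computes where the substring starts and what value a size_t parameter receives for a negative argument. It deliberately does not evaluate sub_str + X for those values, since whether that is defined is the open question. The helper names substring_offset and as_unsigned are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Offset of the "Wor" substring inside "Hello World!", computed with
// in-bounds pointer subtraction only.
std::ptrdiff_t substring_offset() {
    const char* str = "Hello World!";
    const char* sub_str = std::strstr(str, "Wor");
    assert(sub_str != nullptr);  // the needle is present in this literal
    return sub_str - str;
}

// The value a size_t parameter receives when a negative number is passed.
// Signed-to-unsigned conversion is well defined (modulo 2^N).
std::size_t as_unsigned(long long k) {
    return static_cast<std::size_t>(k);
}
```

Here substring_offset() yields 6, and as_unsigned(-6) yields SIZE_MAX - 5. So on a flat address space, stepping forward by that X and stepping back by 6 would name the same address, the start of the original string.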
OK, I think you have a point.
Basically this boils down to the question of whether adding a huge offset to a pointer is the same as subtracting a small offset from a pointer. So far I couldn't find any C++ rule related to this problem.
Kind regards,
Frank
Dr.-Ing. Frank Mehnert, frank.mehnert@kernkonzept.com, +49-351-41 883 224
Kernkonzept GmbH. Registered office: Dresden. Local court: Amtsgericht Dresden, HRB 31129. Managing director: Dr.-Ing. Michael Hohmuth
Hi,
I think it is purely academic at this point though :)
The change is probably fine. I'm just a bit annoyed that clang forces changes to the code in a lot of projects, instead of clang accepting de facto standard patterns.
Best regards, Kjetil Oftedal