On Tue, 7 Jan 2025 at 15:33, Frank Mehnert frank.mehnert@kernkonzept.com wrote:
On Tuesday, 7 January 2025 14:54:54 CET Kjetil Oftedal wrote:
On Tue, 7 Jan 2025 at 14:39, Frank Mehnert frank.mehnert@kernkonzept.com wrote:
Hi Kjetil,
On Tuesday, 7 January 2025 14:00:16 CET Kjetil Oftedal wrote:
How can the compiler in this case assume that endptr and str do not point to the same array object?
ISO C99 6.5.8 Relational operators, paragraph 5: ... "If the expression P points to an element of an array object and the expression Q points to the last element of the same array object, the pointer expression Q+1 compares greater than P. In all other cases, the behavior is undefined."
How does the compiler know at compile time that endptr, which is equivalent to &str[maxlen], is not within the range of the string declared by str? Or, in the terms of the standard quoted above, how does it know that endptr is not Q+1?
[...]
The compiler does not need to know!
Let's consider the pointer 'x', the offset 'offset' and
y = x + offset
with y < x (because offset is huge).
So if y < x, i.e. (x + offset) < x, then y points to a different object by definition. Hence, the compiler is allowed to remove this test.
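To make the pattern concrete: the code under discussion is not quoted in this thread, so the snippet in the comment below is only an assumed reconstruction from the names endptr, str and maxlen. The function bounded_len is a hypothetical alternative, sketched here to show one way of never forming str + maxlen at all, so that there is no wraparound test for the compiler to drop.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Assumed shape of the debated pattern (reconstruction, not quoted code):
 *
 *     const char *endptr = str + maxlen;
 *     if (endptr < str)            // wraparound test
 *         endptr = (const char *)-1;
 *
 * If str + maxlen leaves the object, the addition itself is undefined,
 * which is why the test may be optimized away.
 */

/* Hypothetical helper: length of str, capped at maxlen. */
size_t bounded_len(const char *str, size_t maxlen)
{
    assert(str != NULL);

    size_t n = 0;
    /* Index-based loop: only bytes up to the terminator are touched,
     * and no pointer past the string is ever computed. */
    while (n < maxlen && str[n] != '\0')
        n++;
    return n;
}
```

With this shape, even maxlen == SIZE_MAX is harmless, because the loop stops at the terminator before any out-of-range address is formed.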
Frank
Dr.-Ing. Frank Mehnert, frank.mehnert@kernkonzept.com, +49-351-41 883 224
Kernkonzept GmbH. Sitz: Dresden. Amtsgericht Dresden, HRB 31129. Geschäftsführer: Dr.-Ing. Michael Hohmuth
Correspondingly, y = x + offset with y < x (offset is small, e.g. 1).
So if y < (x + offset) and offset is 1, then the Q+1 rule holds, and the compiler is not allowed to remove this test as it is well defined.
This cannot be assumed at compile time, thus the test needs to be done to respect the requirements of the C standards.
I think you are mixing things up: the error here is the pointer arithmetic, not the comparison.
With offset=1 and x=0xffffffff'ffffffff, y will be 0, so y does not belong to the same object as x. There is no other value of x which could lead to a result y<x with offset=1!
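The arithmetic in the offset=1 case can be checked with plain unsigned integers, where wraparound is well defined (unlike pointer arithmetic). This is a small sketch assuming 64-bit addresses; the names wraps and wrap_examples are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if x + offset wraps around in 64-bit unsigned arithmetic.
 * Unsigned overflow is defined as reduction modulo 2^64, so this
 * comparison is meaningful for integers. */
bool wraps(uint64_t x, uint64_t offset)
{
    uint64_t y = x + offset;
    return y < x;
}

/* Usage example mirroring the values in the message above. */
void wrap_examples(void)
{
    assert(wraps(UINT64_MAX, 1));          /* 0xffffffff'ffffffff + 1 wraps to 0 */
    assert(!wraps(UINT64_MAX - 1, 1));     /* any smaller x does not wrap */
    assert(wraps(UINT64_MAX, UINT64_MAX)); /* huge offsets wrap as well */
}
```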
The comparison tests for the wraparound and adapts end_ptr in that case.
Again: if there was a wraparound, then the pointer arithmetic operation was already wrong (undefined behavior). Hence, the compiler can remove the test plus the code which is executed if the test succeeds.
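If a wraparound check is really wanted, one option is to perform it on integers before doing any pointer arithmetic. This is a sketch only, assuming a flat address space in which converting a pointer to uintptr_t yields its address (that conversion is implementation-defined); range_would_wrap is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: would the byte range [base, base + len) run past
 * the top of a flat address space? Only integers are added and compared,
 * so the check does not depend on undefined pointer arithmetic. */
bool range_would_wrap(const void *base, size_t len)
{
    assert(base != NULL);

    uintptr_t start = (uintptr_t)base; /* implementation-defined conversion */
    return len > UINTPTR_MAX - start;  /* unsigned arithmetic, cannot overflow */
}
```

Note that this only detects wrapping past the top of the address space. It does not make str + maxlen defined when the result lies beyond the object that str points into.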
Frank
Hi,
I see the point. I just don't agree with the conclusion. As I see it, the LLVM group is claiming that if there is any possibility of UB, even if it depends on input arguments, then it is always UB, and the compiler can remove code as it sees fit. (Which will be a nightmare for security code.)
And this is what I disagree with. If the code has defined behaviour under the C standard for some input arguments, then the compiler must treat the code as well behaved and emit code accordingly.
Suppose I have some special machine where I can use every single byte of the address space. (On a 32-bit machine, that could give the code the ability to declare a 4 GB contiguous string/char array.) In this case, should not str < end_of_str and str+X < end_of_str always be true, even with overflow wraparounds?
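The full-address-space scenario can be explored with a toy model. The sketch below assumes a made-up 8-bit address space (256 bytes) and computes addresses modulo 256; toy_addr, toy_end and toy_start_below_end are hypothetical names, and the model says nothing about what a real compiler does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint8_t toy_addr; /* hypothetical 8-bit address space: 256 bytes */

/* One-past-the-end address of an array of len bytes at start, modulo 256. */
toy_addr toy_end(toy_addr start, unsigned len)
{
    assert(len <= 256); /* an array cannot exceed the address space */
    return (toy_addr)(start + len);
}

/* Does start compare below the one-past-the-end address? */
bool toy_start_below_end(toy_addr start, unsigned len)
{
    return start < toy_end(start, len);
}
```

In this model an array that includes the last byte of the address space has a one-past-the-end address that wraps to 0, so the start no longer compares below the end; for example, toy_start_below_end(0, 256) is false while toy_start_below_end(0, 255) is true.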
Best regards, Kjetil Oftedal