Good catch, thanks Peter.
From 880ae1a543f6518f37a32b3af5a734a7b72f9b39 Mon Sep 17 00:00:00 2001
From: Tom Bannink <tombannink(a)gmail.com>
Date: Tue, 12 Apr 2022 11:15:41 +0200
Subject: [PATCH] Fix bug in ARM memset implementation
The ARM implementation of memset has a bug when the fill-value is negative or outside the
[0, 255] range. To reproduce:
char array[256];
memset(array, -5, 256);
This is supposed to fill the array with int8 values -5, -5, -5, ... . On ARM, this does
not work because the implementation assumes the high bytes of the fill-value argument are
already zero. However in this test case they are filled with 1-bits. The aarch64 and
x86_64
implementations do not have this problem: they first convert the fill-value to an
unsigned
byte following the specification of memset.
With GCC one can use `memset(ptr, (-5 & 0xFF), size)` as a workaround, but for clang
users that does not work: clang optimizes the `& 0xFF` away because it assumes that
memset will do it.
---
libc/string/arm/memset.S | 2 ++
1 file changed, 2 insertions(+)
diff --git a/libc/string/arm/memset.S b/libc/string/arm/memset.S
index 412270f50..29c583f16 100644
--- a/libc/string/arm/memset.S
+++ b/libc/string/arm/memset.S
@@ -32,6 +32,7 @@ memset:
cmp r2, #8 @ at least 8 bytes to do?
bcc 2f
+ and r1, r1, #0xFF
lsl r3, r1, #8
orr r1, r3
lsl r3, r1, #16
@@ -68,6 +69,7 @@ memset:
mov a4, a1
cmp a3, $8 @ at least 8 bytes to do?
blo 2f
+ and a2, a2, #0xFF
orr a2, a2, a2, lsl $8
orr a2, a2, a2, lsl $16
1:
--
2.35.1