[PATCH 0/5] add static PIE support for xtensa

Hello, this series does a couple cleanups in the static PIE code and xtensa code and enables static PIE support for the xtensa architecture. Max Filippov (5): static pie: fix building static PDE static pie: replicate PERFORM_BOOTSTRAP_GOT conditions from dl-startup.c xtensa: make GOT protection adjustment conditional xtensa: drop ARCH_NEEDS_BOOTSTRAP_RELOCS xtensa: add static pie support extra/Configs/Config.in | 4 +++- ldso/ldso/xtensa/dl-startup.h | 19 ++++++++++++--- ldso/ldso/xtensa/dl-sysdep.h | 3 --- libc/misc/internals/reloc_static_pie.c | 9 +++++--- libc/sysdeps/linux/xtensa/crt1.S | 27 ++++++++++++++++++++++ libpthread/nptl/sysdeps/generic/libc-tls.c | 7 ++++-- 6 files changed, 57 insertions(+), 12 deletions(-) -- 2.30.2

When uclibc is built with static PIE support the _dl_load_base variable shared between the libc-tls.c and reloc_static_pie.c creates the dependency that requires linking reloc_static_pie.o into static position-dependent executables resulting in the following build errors: gcc -static test.c -o test ...ld: ...usr/lib/libc.a(reloc_static_pie.os):(.text+0x0): undefined reference to `_DYNAMIC' Move _dl_load_base definition to libc-tls.c to resolve this dependency and fix static PDE build. Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> --- libc/misc/internals/reloc_static_pie.c | 4 ++-- libpthread/nptl/sysdeps/generic/libc-tls.c | 7 +++++-- 2 files changed, 7 insertions(+), 4 deletions(-) diff --git a/libc/misc/internals/reloc_static_pie.c b/libc/misc/internals/reloc_static_pie.c index c0027de6fe06..ce42cb9b30df 100644 --- a/libc/misc/internals/reloc_static_pie.c +++ b/libc/misc/internals/reloc_static_pie.c @@ -25,7 +25,7 @@ #include <dl-startup.h> #endif -ElfW(Addr) _dl_load_base = NULL; +extern ElfW(Addr) _dl_load_base; void reloc_static_pie (ElfW(Addr) load_addr); @@ -107,4 +107,4 @@ reloc_static_pie(ElfW(Addr) load_addr) #endif } _dl_load_base = load_addr; -} \ No newline at end of file +} diff --git a/libpthread/nptl/sysdeps/generic/libc-tls.c b/libpthread/nptl/sysdeps/generic/libc-tls.c index 54f3cb0c72b5..7cfe9ac1a85c 100644 --- a/libpthread/nptl/sysdeps/generic/libc-tls.c +++ b/libpthread/nptl/sysdeps/generic/libc-tls.c @@ -117,6 +117,10 @@ init_static_tls (size_t memsz, size_t align) GL(dl_tls_static_nelem) = GL(dl_tls_max_dtv_idx); } +#if !defined(__FDPIC__) && !defined(SHARED) && defined(STATIC_PIE) +ElfW(Addr) _dl_load_base; +#endif + void __libc_setup_tls (size_t tcbsize, size_t tcbalign); void __libc_setup_tls (size_t tcbsize, size_t tcbalign) @@ -143,8 +147,7 @@ __libc_setup_tls (size_t tcbsize, size_t tcbalign) #else initimage = (void *) phdr->p_vaddr; #if !defined(SHARED) && defined(STATIC_PIE) - extern ElfW(Addr) _dl_load_base; - initimage += _dl_load_base; + initimage += _dl_load_base; #endif #endif align = phdr->p_align; -- 2.30.2

Code in reloc_static_pie.c is largely based on DL_START function from dl-startup.c, but it's missing the condition that disables the rest of relocation when PERFORM_BOOTSTRAP_GOT macro is defined. As a result when building reloc_static_pie.c for xtensa it applies relative relocations twice. Replicate PERFORM_BOOTSTRAP_GOT conditions from dl-startup.c in reloc_static_pie.c to allow using PERFORM_BOOTSTRAP_GOT macro as is. Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> --- libc/misc/internals/reloc_static_pie.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/libc/misc/internals/reloc_static_pie.c b/libc/misc/internals/reloc_static_pie.c index ce42cb9b30df..9f5366cc6ab7 100644 --- a/libc/misc/internals/reloc_static_pie.c +++ b/libc/misc/internals/reloc_static_pie.c @@ -54,6 +54,8 @@ reloc_static_pie(ElfW(Addr) load_addr) #endif +#if !defined(PERFORM_BOOTSTRAP_GOT) || defined(__avr32__) || defined(__mips__) + #if defined(ELF_MACHINE_PLTREL_OVERLAP) # define INDX_MAX 1 #else @@ -106,5 +108,6 @@ reloc_static_pie(ElfW(Addr) load_addr) } #endif } +#endif _dl_load_base = load_addr; } -- 2.30.2

Xtensa PERFORM_BOOTSTRAP_GOT macro uses mprotect to make bits of GOT writable, but noMMU linux kernel returns ENOSYS to mprotect syscalls, and syscall wrapper tries to update errno with the error code. This happens well before the relocations are done and results in writes to unrelated locations, memory corruption or protection violations. Split GOT protection update from PERFORM_BOOTSTRAP_GOT and only do it when building configuration with MMU support. Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> --- ldso/ldso/xtensa/dl-startup.h | 17 ++++++++++++++--- 1 file changed, 14 insertions(+), 3 deletions(-) diff --git a/ldso/ldso/xtensa/dl-startup.h b/ldso/ldso/xtensa/dl-startup.h index db223feadc90..19955ffc77a5 100644 --- a/ldso/ldso/xtensa/dl-startup.h +++ b/ldso/ldso/xtensa/dl-startup.h @@ -88,12 +88,11 @@ __asm__ ( /* Function calls are not safe until the GOT relocations have been done. */ #define NO_FUNCS_BEFORE_BOOTSTRAP -#define PERFORM_BOOTSTRAP_GOT(tpnt) \ +#if defined(__ARCH_USE_MMU__) +#define PERFORM_BOOTSTRAP_GOT_ADJUST_PROTECTION(tpnt) \ do { \ xtensa_got_location *got_loc; \ unsigned long l_addr = tpnt->loadaddr; \ - Elf32_Word relative_count; \ - unsigned long rel_addr; \ Elf32_Addr prev_got_start = 0, prev_got_end = 0; \ int x; \ \ @@ -125,7 +124,19 @@ do { \ prev_got_end - prev_got_start, \ PROT_READ | PROT_WRITE | PROT_EXEC); \ } \ +} while (0) +#else +#define PERFORM_BOOTSTRAP_GOT_ADJUST_PROTECTION(tpnt) \ +do { \ +} while (0) +#endif + +#define PERFORM_BOOTSTRAP_GOT(tpnt) \ +do { \ + Elf32_Word relative_count; \ + unsigned long rel_addr; \ \ + PERFORM_BOOTSTRAP_GOT_ADJUST_PROTECTION(tpnt); \ /* The following is a stripped down version of the code following \ the invocation of PERFORM_BOOTSTRAP_GOT in dl-startup.c. That \ code is skipped when PERFORM_BOOTSTRAP_GOT is defined, so it has \ -- 2.30.2

Xtensa does not define PERFORM_BOOTSTRAP_RELOC so it doesn't need ARCH_NEEDS_BOOTSTRAP_RELOCS definition. Remove it. Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> --- ldso/ldso/xtensa/dl-sysdep.h | 3 --- 1 file changed, 3 deletions(-) diff --git a/ldso/ldso/xtensa/dl-sysdep.h b/ldso/ldso/xtensa/dl-sysdep.h index d308237d39ed..6b908989a8f1 100644 --- a/ldso/ldso/xtensa/dl-sysdep.h +++ b/ldso/ldso/xtensa/dl-sysdep.h @@ -94,9 +94,6 @@ typedef struct xtensa_got_location_struct { /* Used for error messages. */ #define ELF_TARGET "Xtensa" -/* Need bootstrap relocations */ -#define ARCH_NEEDS_BOOTSTRAP_RELOCS - struct elf_resolve; extern unsigned long _dl_linux_resolver (struct elf_resolve *, int); -- 2.30.2

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> --- extra/Configs/Config.in | 4 +++- ldso/ldso/xtensa/dl-startup.h | 2 ++ libc/misc/internals/reloc_static_pie.c | 2 +- libc/sysdeps/linux/xtensa/crt1.S | 27 ++++++++++++++++++++++++++ 4 files changed, 33 insertions(+), 2 deletions(-) diff --git a/extra/Configs/Config.in b/extra/Configs/Config.in index 43c04fd0a271..dd1beaadcee3 100644 --- a/extra/Configs/Config.in +++ b/extra/Configs/Config.in @@ -324,7 +324,9 @@ config DOPIC config STATIC_PIE bool "Add support for Static Position Independent Executables (PIE)" default n - depends on DOPIC && !UCLIBC_FORMAT_FDPIC_ELF && (TARGET_arm || TARGET_i386 || TARGET_x86_64 || TARGET_aarch64 || TARGET_mips) + depends on DOPIC && !UCLIBC_FORMAT_FDPIC_ELF && \ + (TARGET_arm || TARGET_i386 || TARGET_x86_64 || TARGET_aarch64 || \ + TARGET_mips || TARGET_xtensa) config ARCH_HAS_NO_SHARED bool diff --git a/ldso/ldso/xtensa/dl-startup.h b/ldso/ldso/xtensa/dl-startup.h index 19955ffc77a5..1babaccea7e8 100644 --- a/ldso/ldso/xtensa/dl-startup.h +++ b/ldso/ldso/xtensa/dl-startup.h @@ -7,6 +7,7 @@ * Parts taken from glibc/sysdeps/xtensa/dl-machine.h. */ +#ifndef L_rcrt1 __asm__ ( " .text\n" " .align 4\n" @@ -81,6 +82,7 @@ __asm__ ( " addi a5, a5, 8\n" " bnez a6, 3b\n" " j .Lfixup_stack_ret"); +#endif /* Get a pointer to the argv value. */ #define GET_ARGV(ARGVP, ARGS) ARGVP = (((unsigned long *) ARGS) + 1) diff --git a/libc/misc/internals/reloc_static_pie.c b/libc/misc/internals/reloc_static_pie.c index 9f5366cc6ab7..81b235a51cb3 100644 --- a/libc/misc/internals/reloc_static_pie.c +++ b/libc/misc/internals/reloc_static_pie.c @@ -21,7 +21,7 @@ #include <dl-elf.h> #include <ldso.h> -#ifdef __mips__ +#if defined(__mips__) || defined(__xtensa__) #include <dl-startup.h> #endif diff --git a/libc/sysdeps/linux/xtensa/crt1.S b/libc/sysdeps/linux/xtensa/crt1.S index efbe264c03f0..3fa14ae583a9 100644 --- a/libc/sysdeps/linux/xtensa/crt1.S +++ b/libc/sysdeps/linux/xtensa/crt1.S @@ -76,9 +76,26 @@ .global _start .type _start, @function _start: +#ifdef L_rcrt1 + .begin no-transform + call0 1f +.Lret_addr: + .end no-transform + .align 4 +1: +#endif #if defined(__XTENSA_WINDOWED_ABI__) +#ifdef L_rcrt1 + movi a6, .Lret_addr + sub a6, a0, a6 + movi a0, 0 + movi a4, reloc_static_pie + add a4, a4, a6 + callx4 a4 +#else /* Clear a0 to obviously mark the outermost frame. */ movi a0, 0 +#endif /* Load up the user's main function. */ movi a6, main @@ -106,8 +123,18 @@ _start: movi a4, __uClibc_main callx4 a4 #elif defined(__XTENSA_CALL0_ABI__) +#ifdef L_rcrt1 + mov a12, a2 + movi a2, .Lret_addr + sub a2, a0, a2 + movi a0, reloc_static_pie + add a0, a0, a2 + callx0 a0 + mov a7, a12 +#else /* Setup the shared library termination function. */ mov a7, a2 +#endif /* Load up the user's main function. */ movi a2, main -- 2.30.2

There is a lot to unpack here, so please forgive me while I try to format this in a readable and concise way. [Patch 1/5] I thought this was already submitted and merged in? It looks identical to "[PATCH v2] static pie: building static PDE", or am I mistaken? [Patch 2/5] I'm going to be honest with you, I left that check out because it only affects xtensia, so it's funny to me that it would bite me this quickly. That said, reading the code, I'm not sure why xtensia decided to call elf_machine_relative in PERFORM_BOOTSTRAP_GOT and not use the default code path with ELF_MACHINE_PLTREL_OVERLAP set to ensure only 1 iteration. I am not very familiar with xtensia, so there may be some nuance that I'm not aware of, but I would like your opinion on if we should change xtensia to conform to how every other architecture seems to perform that relocation. If we are going to keep the code the way it is, I would like to know everyone's opinion on whether or not we should change that preprocessor directive to be explicit on which architectures shouldn't do that section of code? Currently it's in an allow list format with seemingly only xtensia having a false condition, but that isn't very clear to me from an initial read. Obviously that change would belong to a separate patch and shouldn't hold this up. [Patch 3/5] looks good to me. [Patch 4/5] I think this removal further implies that we should move the elf_machine_relative call out of PERFORM_BOOTSTRAP_GOT as I suggested above. [Patch 5/5] I am getting segfaults while still in __uClibc_main when using buildroot's qemu_xtensa_lx60_defconfig and qemu-xtensa. I'm not sure the exact cause, but I would guess that some functions aren't being relocated correctly or the TLS wasn't updated correctly. It could also possibly be a problem with the quick gcc patch I wrote to compile with -static-pie. If you have a gcc patch that you think will fix this as well, I would be glad to test it. Here is the stack trace up to where it calls 0x00: #0 0x3efe58e9 in __h_errno_location () at libpthread/nptl/herrno.c:33 #1 0x3efe12aa in __uClibc_main (main=0x3efddfbc <main>, argc=0x1, argv=0x3efdb314, app_init=0x3efdde14 <_init>, app_fini=0x3eff5864 <_fini>, rtld_fini=0x0, stack_end=0x3efdb310) at libc/misc/internals/__uClibc_main.c:514 #2 0x3efdde6c in _start () On Fri, Sep 9, 2022 at 2:30 PM Max Filippov <jcmvbkbc@gmail.com> wrote:
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> --- extra/Configs/Config.in | 4 +++- ldso/ldso/xtensa/dl-startup.h | 2 ++ libc/misc/internals/reloc_static_pie.c | 2 +- libc/sysdeps/linux/xtensa/crt1.S | 27 ++++++++++++++++++++++++++ 4 files changed, 33 insertions(+), 2 deletions(-)
diff --git a/extra/Configs/Config.in b/extra/Configs/Config.in index 43c04fd0a271..dd1beaadcee3 100644 --- a/extra/Configs/Config.in +++ b/extra/Configs/Config.in @@ -324,7 +324,9 @@ config DOPIC config STATIC_PIE bool "Add support for Static Position Independent Executables (PIE)" default n - depends on DOPIC && !UCLIBC_FORMAT_FDPIC_ELF && (TARGET_arm || TARGET_i386 || TARGET_x86_64 || TARGET_aarch64 || TARGET_mips) + depends on DOPIC && !UCLIBC_FORMAT_FDPIC_ELF && \ + (TARGET_arm || TARGET_i386 || TARGET_x86_64 || TARGET_aarch64 || \ + TARGET_mips || TARGET_xtensa)
config ARCH_HAS_NO_SHARED bool diff --git a/ldso/ldso/xtensa/dl-startup.h b/ldso/ldso/xtensa/dl-startup.h index 19955ffc77a5..1babaccea7e8 100644 --- a/ldso/ldso/xtensa/dl-startup.h +++ b/ldso/ldso/xtensa/dl-startup.h @@ -7,6 +7,7 @@ * Parts taken from glibc/sysdeps/xtensa/dl-machine.h. */
+#ifndef L_rcrt1 __asm__ ( " .text\n" " .align 4\n" @@ -81,6 +82,7 @@ __asm__ ( " addi a5, a5, 8\n" " bnez a6, 3b\n" " j .Lfixup_stack_ret"); +#endif
/* Get a pointer to the argv value. */ #define GET_ARGV(ARGVP, ARGS) ARGVP = (((unsigned long *) ARGS) + 1) diff --git a/libc/misc/internals/reloc_static_pie.c b/libc/misc/internals/reloc_static_pie.c index 9f5366cc6ab7..81b235a51cb3 100644 --- a/libc/misc/internals/reloc_static_pie.c +++ b/libc/misc/internals/reloc_static_pie.c @@ -21,7 +21,7 @@ #include <dl-elf.h>
#include <ldso.h> -#ifdef __mips__ +#if defined(__mips__) || defined(__xtensa__) #include <dl-startup.h> #endif
diff --git a/libc/sysdeps/linux/xtensa/crt1.S b/libc/sysdeps/linux/xtensa/crt1.S index efbe264c03f0..3fa14ae583a9 100644 --- a/libc/sysdeps/linux/xtensa/crt1.S +++ b/libc/sysdeps/linux/xtensa/crt1.S @@ -76,9 +76,26 @@ .global _start .type _start, @function _start: +#ifdef L_rcrt1 + .begin no-transform + call0 1f +.Lret_addr: + .end no-transform + .align 4 +1: +#endif #if defined(__XTENSA_WINDOWED_ABI__) +#ifdef L_rcrt1 + movi a6, .Lret_addr + sub a6, a0, a6 + movi a0, 0 + movi a4, reloc_static_pie + add a4, a4, a6 + callx4 a4 +#else /* Clear a0 to obviously mark the outermost frame. */ movi a0, 0 +#endif
/* Load up the user's main function. */ movi a6, main @@ -106,8 +123,18 @@ _start: movi a4, __uClibc_main callx4 a4 #elif defined(__XTENSA_CALL0_ABI__) +#ifdef L_rcrt1 + mov a12, a2 + movi a2, .Lret_addr + sub a2, a0, a2 + movi a0, reloc_static_pie + add a0, a0, a2 + callx0 a0 + mov a7, a12 +#else /* Setup the shared library termination function. */ mov a7, a2 +#endif
/* Load up the user's main function. */ movi a2, main -- 2.30.2

Hi linted, thank you for taking a look. On Wed, Sep 14, 2022 at 12:23 PM linted <linted90@gmail.com> wrote:
[Patch 1/5] I thought this was already submitted and merged in? It looks identical to "[PATCH v2] static pie: building static PDE", or am I mistaken?
Yes, it's the exact same patch. I'm looking at this uClibc-ng repository: https://cgit.uclibc-ng.org/cgi/cgit/uclibc-ng.git/ and it's not there, so I included it for completeness.
[Patch 2/5] I'm going to be honest with you, I left that check out because it only affects xtensia, so it's funny to me that it would bite me this quickly. That said, reading the code, I'm not sure why xtensia decided to call elf_machine_relative in PERFORM_BOOTSTRAP_GOT and not use the default code path with ELF_MACHINE_PLTREL_OVERLAP set to ensure only 1 iteration. I am not very familiar with xtensia, so there may be some nuance that I'm not aware of, but I would like your opinion on if we should change xtensia to conform to how every other architecture seems to perform that relocation.
Yes, I stumbled here too and the same thought crossed my mind. I guess I should do another pass, try to understand why it's done that way and clean it up.
[Patch 3/5] looks good to me.
[Patch 4/5] I think this removal further implies that we should move the elf_machine_relative call out of PERFORM_BOOTSTRAP_GOT as I suggested above.
[Patch 5/5] I am getting segfaults while still in __uClibc_main when using buildroot's qemu_xtensa_lx60_defconfig and qemu-xtensa. I'm not sure the exact cause, but I would guess that some functions aren't being relocated correctly or the TLS wasn't updated correctly. It could also possibly be a problem with the quick gcc patch I wrote to compile with -static-pie. If you have a gcc patch that you think will fix this as well, I would be glad to test it.
You're right, the following recent binutils and gcc changes are required for generation of correct relocations and recognition of the -static-pie option: https://sourceware.org/git/?p=binutils-gdb.git;a=commitdiff;h=658ba81aef5e85... https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=9e0c2696724d4d004ea189a69f... -- Thanks. -- Max

On 9/14/2022 1:23 PM, linted wrote:
[Patch 5/5] I am getting segfaults while still in __uClibc_main when using buildroot's qemu_xtensa_lx60_defconfig and qemu-xtensa. I'm not sure the exact cause, but I would guess that some functions aren't being relocated correctly or the TLS wasn't updated correctly. It could also possibly be a problem with the quick gcc patch I wrote to compile with -static-pie. If you have a gcc patch that you think will fix this as well, I would be glad to test it. I've had -static-pie binaries segfault with qemu while they ran fine on actual hardware. I was using Debian 10 so maybe that's because it's an old version of qemu, and maybe newer versions can handle this better?
Lance

[Patch 5/5] I am getting segfaults while still in __uClibc_main when using buildroot's qemu_xtensa_lx60_defconfig and qemu-xtensa. I'm not sure the exact cause, but I would guess that some functions aren't being relocated correctly or the TLS wasn't updated correctly. It could also possibly be a problem with the quick gcc patch I wrote to compile with -static-pie. If you have a gcc patch that you think will fix this as well, I would be glad to test it.
You're right, the following recent binutils and gcc changes are required for generation of correct relocations and recognition of the -static-pie option:
https://sourceware.org/git/?p=binutils-gdb.git;a=commitdiff;h=658ba81aef5e85... <https://sourceware.org/git/?p=binutils-gdb.git;a=commitdiff;h=658ba81aef5e85a08d67eb211a43c6db775a36b3>
https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=9e0c2696724d4d004ea189a69f... <https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=9e0c2696724d4d004ea189a69f15781c7baa68e1> The binutils patch fixed the problems! I was able to make a xtensa toolchain which worked with my test programs for C and C++. On Wed, Sep 14, 2022 at 4:15 PM Lance Fredrickson <lancethepants@gmail.com> wrote:
On 9/14/2022 1:23 PM, linted wrote:
[Patch 5/5] I am getting segfaults while still in __uClibc_main when using buildroot's qemu_xtensa_lx60_defconfig and qemu-xtensa. I'm not sure the exact cause, but I would guess that some functions aren't being relocated correctly or the TLS wasn't updated correctly. It could also possibly be a problem with the quick gcc patch I wrote to compile with -static-pie. If you have a gcc patch that you think will fix this as well, I would be glad to test it. I've had -static-pie binaries segfault with qemu while they ran fine on actual hardware. I was using Debian 10 so maybe that's because it's an old version of qemu, and maybe newer versions can handle this better?
Lance _______________________________________________ devel mailing list -- devel@uclibc-ng.org To unsubscribe send an email to devel-leave@uclibc-ng.org
participants (3)
-
Lance Fredrickson
-
linted
-
Max Filippov