Hello,
I think I've run into a rather odd bug on a big-endian MIPS platform while trying to hand-assemble a MIPS-II ISA netboot image built from a uclibc-ng chroot. In my netboot, I need to include xfsprogs, but this has a dependency on the 'valloc' function call. So in uclibc-ng, I enabled CONFIG_UCLIBC_SUSV2_LEGACY to enable that function, and rebuilt uclibc-ng. This fixes the xfsprogs build, but it very subtly breaks busybox's ash shell.
After rebuilding uclibc-ng, then rebuilding busybox statically/multicall, if you run /bin/ash with a malformed argument or give it a script to execute that doesn't have the execute bit set, you get a SIGSEGV:
Fudging up the argument syntax to /bin/ash:
octane / # /bin/ash "-c"
/bin/ash: -c requires an argument
Segmentation fault

Via a non-executable script "x.sh", we start with this sample:
octane / # cat ./x.sh
#!/bin/ash
echo "foo!"

If "x.sh" has the executable bit set, we're all good:
octane / # ls -l ./x.sh
-rwxr-xr-x 1 root root 24 Oct 12 01:57 ./x.sh
octane / # /bin/ash -c ./x.sh
foo!

But if we turn off the executable bit...
octane / # chmod -x ./x.sh
octane / # ls -l ./x.sh
-rw-r--r-- 1 root root 24 Oct 12 01:57 ./x.sh
octane / # /bin/ash -c ./x.sh
/bin/ash: ./x.sh: Permission denied
Segmentation fault
The only backtrace I can get out of it after rebuilding uclibc-ng and busybox with debugging is this (generated via the fudged argument example):
Program received signal SIGSEGV, Segmentation fault.
0x00000000 in ?? ()
(gdb) bt
#0 0x00000000 in ?? ()
#1 0x00452278 in __GI__longjmp_unwind (env=0x7ffeed58, val=1) at libpthread/nptl/sysdeps/unix/sysv/linux/jmp-unwind.c:30
#2 0x004061e4 in __libc_longjmp (env=0x7ffeed58, val=1) at libc/sysdeps/linux/common/longjmp.c:29
#3 0x0050185c in raise_exception (e=1) at shell/ash.c:448
#4 0x00501f00 in ash_vmsg_and_raise (cond=1, msg=0x60294d <bb_msg_requires_arg> "%s requires an argument", ap=0x7ffeece4) at shell/ash.c:1232
#5 0x00501f4c in ash_msg_and_raise_error (msg=0x60294d <bb_msg_requires_arg> "%s requires an argument") at shell/ash.c:1243
#6 0x0051a918 in procargs (argv=0x7ffeeff4) at shell/ash.c:13009
#7 0x0051afe4 in ash_main (argc=2, argv=0x7ffeeff4) at shell/ash.c:13158
#8 0x0047e320 in run_applet_no_and_exit (applet_no=9, argv=0x7ffeeff4) at libbb/appletlib.c:774
#9 0x0047e370 in run_applet_and_exit (name=0x7ffef130 "ash", argv=0x7ffeeff4) at libbb/appletlib.c:781
#10 0x0047e484 in main (argc=2, argv=0x7ffeeff4) at libbb/appletlib.c:838
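For readers following the backtrace: frames #2 through #6 show ash reporting the error and then unwinding through longjmp. As a rough illustration only, here is a minimal sketch of that style of error handling. All names (error_handler, pending_error, raise_error, parse_args, run_shell) are hypothetical and are not taken from busybox; the real ash code is considerably more involved.

```c
#include <setjmp.h>
#include <stdio.h>

/* Hypothetical names throughout; this is not busybox's ash code. */
static jmp_buf error_handler;        /* jump target set up by run_shell() */
static volatile int pending_error;   /* error code carried across the jump */

/* Record the error code and unwind back to the setjmp() point. */
static void raise_error(int code)
{
    pending_error = code;
    longjmp(error_handler, 1);       /* does not return */
}

/* Stand-in for argument parsing: complain and raise if "-c" has no argument. */
static void parse_args(int argc)
{
    if (argc < 2) {
        fprintf(stderr, "-c requires an argument\n");
        raise_error(2);
    }
}

/* Returns 0 on success, or the code passed to raise_error(). */
int run_shell(int argc)
{
    if (setjmp(error_handler) != 0)
        return pending_error;        /* reached via longjmp() */
    parse_args(argc);
    return 0;
}
```

In this sketch, run_shell(1) prints the message and returns 2, while run_shell(2) returns 0. In the report above, the message is printed as expected and the crash happens inside the libc's longjmp path rather than in the shell logic itself.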
Line #30 in jmp-unwind.c leads me to a really old uclibc bug, #3919: https://bugs.busybox.net/show_bug.cgi?id=3919
But further investigation reveals that the null_not_ptr() check introduced by the patch in that bug is already present in uclibc-ng in the patched spots, plus a few new locations. So either I've run into a new area of the code that needs a similar change, or I'm chasing the wrong rabbit down the wrong hole and the bug lies elsewhere, e.g., in busybox (hinted at by the script case only crashing once the execute bit is removed).
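A jump to address 0x00000000 from inside __GI__longjmp_unwind is consistent with a call through a symbol or pointer that resolved to NULL. One common way this can happen in static builds is an undefined weak symbol being called without a working NULL check. Whether that is the actual cause here is an assumption, not something the thread confirms. The sketch below shows only the general guard pattern, using GCC's weak attribute on an ELF target; optional_cleanup_hook and call_cleanup_if_present are hypothetical names, and this is neither the uclibc-ng source nor its null_not_ptr() helper.

```c
#include <stddef.h>

/*
 * Hypothetical optional hook. Declared weak, so if nothing in the final
 * link defines it, its address resolves to NULL instead of producing a
 * link error (GCC/Clang on ELF targets).
 */
extern void optional_cleanup_hook(int reason) __attribute__((weak));

/*
 * Call the hook only if it was actually linked in.
 * Returns 1 if the hook was called, 0 if it was absent and skipped.
 */
int call_cleanup_if_present(int reason)
{
    if (optional_cleanup_hook != NULL) {
        optional_cleanup_hook(reason);
        return 1;
    }
    return 0;   /* calling it here would jump to address 0 */
}
```

If the guard is missing, or is ineffective for some reason, the unconditional call becomes a jump to address 0, which matches the shape of frame #0 in the backtrace.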
I am aware that CONFIG_UCLIBC_SUSV2_LEGACY introduces an ABI incompatibility, but the remainder of the chroot userland used to build the netboot appears to operate normally. Only /bin/ash seems to have an issue.
No idea if instead, I need to get xfsprogs off of using valloc (given XFS' age as a filesystem, it might need this anyways). I can pursue that avenue if needed, but I think I've stumbled onto a really obscure bug here that still may need looking into.
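On the option of moving a caller off valloc: valloc(size) returns memory aligned to the page size, and the same result can usually be had from posix_memalign(). A minimal sketch, assuming the target libc provides posix_memalign() and sysconf(_SC_PAGESIZE); page_aligned_alloc is a hypothetical helper name and is not part of xfsprogs or uclibc-ng.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* expose posix_memalign() under strict -std modes */
#endif

#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical drop-in for valloc(): allocate `size` bytes aligned to the
 * system page size. Returns NULL on failure; release the result with free().
 */
void *page_aligned_alloc(size_t size)
{
    long page = sysconf(_SC_PAGESIZE);
    void *ptr = NULL;

    if (page <= 0)
        return NULL;                       /* page size unavailable */
    if (posix_memalign(&ptr, (size_t)page, size) != 0)
        return NULL;                       /* allocation failed */
    return ptr;
}
```

A caller would replace valloc(n) with page_aligned_alloc(n) and keep freeing with free(). Whether xfsprogs can be patched this way without other changes has not been checked here.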
Anything else I can provide to help chase this down?
On 10/12/2016 03:53, Joshua Kinard wrote:
Hello,
I think I've run into a rather odd bug on a big-endian MIPS platform while trying to hand-assemble a MIPS-II ISA netboot image built from a uclibc-ng chroot.
[snip]
PS, I forgot to add, this is using uclibc-ng-1.0.18 and busybox-1.24.2.
Hi Joshua, Joshua Kinard wrote,
[snip]
After sorting out my last bootup problems (missing N32/O32 binary support in the kernel), I can confirm that the bug is fixed in uClibc-ng 1.0.19:

root@openadk:/root # ash -c /tmp/c.sh
ash: /tmp/c.sh: Permission denied
root@openadk:/root # chmod 755 .
root@openadk:/root # chmod 755 /tmp/c.sh
root@openadk:/root # ash -c /tmp/c.sh
foo!
root@openadk:/root # ash -c
ash: -c requires an argument
root@openadk:/root # ls /lib

Please update to 1.0.19, thanks
Waldemar
On 10/23/2016 23:55, Waldemar Brodkorb wrote:
[snip]
Sorry for the delay, got tied up with things.
I'd already switched the busybox build to a shared library from a static one, which worked around the issue for me, but I am building 1.0.19 now. I'll let you know if any additional issues crop up.
And for the record, on your last e-mail, an RM52XX O2 needs -march=rm5200 to gcc. Stock -mips4 or -march=r5000 won't hurt, either.
Now to just figure out the libtirpc bit...
Hi Joshua, Joshua Kinard wrote,
[snip]
And for the record, on your last e-mail, an RM52XX O2 needs -march=rm5200 to gcc. Stock -mips4 or -march=r5000 won't hurt, either.
Okay.
Now to just figure out the libtirpc bit...
Buildroot and OpenADK are always good sources of fixes for cross-compile issues. Take a look at their patches.
Another possibility would be to try the internal IPv4-only RPC implementation in uClibc-ng, but I haven't used it for a long time.
Not sure if rpcbind works with it. I think the last time I used it was with good old portmap.
best regards
Waldemar
On 10/31/2016 03:13, Waldemar Brodkorb wrote:
[snip]
I can get NFS to work with just portmap running, but it throws an odd, yet non-fatal, error when mounting the remote share (don't have the actual error text available at the moment). I was hoping that having rpcbind running as well as portmap would eliminate that. However, current rpcbind needs libtirpc to build, and that fails on uclibc-ng because of a missing header file "rpcsvc/yp_prot.h".
Buildroot posted a patch to fix this in libtirpc itself; I just haven't had the free time to test it yet: http://lists.busybox.net/pipermail/buildroot/2015-July/133890.html
Can't use the internal (minimal?) RPC mechanism, either. I had to enable the full RPC stack in order for portmap to actually build.