On 10/13/2016 15:23, Waldemar Brodkorb wrote:
Hi Joshua, Joshua Kinard wrote,
On 10/12/2016 04:27, Waldemar Brodkorb wrote:
hi,
i am visiting ELCE, be back on monday. can you try 1.0.17 please. 1.0.18 introduced some interesting regressions i have not covered in my tests. i have solved most of them, but not all are pushed yet.
looks related to the patch i sent out to lance last week on the list. best regards Waldemar
Sent from my iPhone
On 12.10.2016 at 09:59, Joshua Kinard kumba@gentoo.org wrote:
On 10/12/2016 03:53, Joshua Kinard wrote: Hello,
I think I've run into a rather odd bug on a big-endian MIPS platform while trying to hand-assemble a MIPS-II ISA netboot image built from a uclibc-ng chroot.
[snip]
PS, I forgot to add, this is using uclibc-ng-1.0.18 and busybox-1.24.2.
(Resending to the actual list, sorry Waldemar!)
Unfortunately, my base root for building the netboot is built on 1.0.18 at the moment. It'd take about two days to do a full rebuild.
So you are natively compiling the netboot system? Are you using statically linked binaries? If not, you could maybe just replace the shared library and ld.so.
I am doing native compiles on an SGI Octane (for which I currently maintain an out-of-tree patchset). I was only using static linking with Busybox, which is why ash was producing the flaw. I tried implementing the fix described in old uClibc Bug #3919, but that had no effect and the SIGSEGV is still reproducible.
For now, I've simply switched Busybox to shared linking to work around the problem, which should be fine for the netboot, since all of its utilities are built from the same chroot. I'm now trying to work up a fix for compiling rpcbind, since a dependent library, libtirpc, requires a nonexistent header, "rpcsvc/yp_prot.h", but there's a patch on the OpenWRT ML that might fix this.
Does uclibc-ng have a working Bugzilla yet? Might seem prudent to copy the details of Bug #3919 from old uClibc since it might be the same bug or related.
That said, I think I might have an idea. The bit of code cited in Bug #3919 for old uclibc only defines and uses null_not_ptr in __uClibc_main.c, but it looks like the code in jmp-unwind.c does not. So I am going to try moving the null_not_ptr definition to a header somewhere, mark it non-static (maybe inline?), then try using it on the __pthread_cleanup_upto test and see if that might resolve the issue.
Sound sane?
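A minimal sketch of the general technique described above, assuming GCC-style inline asm and weak symbols. The names optional_hook, ptr_is_set, and call_hook_if_present are hypothetical and are not uClibc identifiers; this is not the actual code from __uClibc_main.c or jmp-unwind.c, only an illustration of hiding a weak symbol's address from the optimizer before testing it against NULL:

```c
#include <stddef.h>

/* Hypothetical optional hook: declared weak, so its address is NULL
 * when nothing in the final link provides a definition. */
extern void optional_hook(void) __attribute__((weak));

/* Pass the pointer through an empty asm statement so the compiler
 * cannot assume anything about its value and fold the NULL test away.
 * Being static inline, this could live in a shared internal header. */
static inline int ptr_is_set(const void *p)
{
    const void *q;
    __asm__ ("" : "=r" (q) : "0" (p));
    return q != NULL;
}

/* Call the hook only if it was actually linked in.
 * Returns 1 if the hook ran, 0 if it was absent. */
int call_hook_if_present(void)
{
    if (ptr_is_set((const void *)optional_hook)) {
        optional_hook();
        return 1;
    }
    return 0;
}
```

Whether the same guard is what the __pthread_cleanup_upto test needs in a static link is an assumption that would have to be confirmed by testing.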
I pushed the other open regression fixes. Maybe you could try the latest git master. On what hardware could I reproduce the issue? (I have some old SGI MIPS devices in my lab.)
I am running Gentoo for my builds, so testing master isn't easy for me at the moment, since to be sure of things, I'd have to run a full rebuild and that would take a day or two due to gcc's compile time (~16 hours on a dual 600MHz R14000 CPU).
What kind of SGI gear do you have available and what CPUs are in them? I can vouch that SGI O2 (IP32) with R5K and RM7K CPUs (not R10K/R12K), SGI Octane (IP30), and, marginally, SGI Origin 2000/Onyx2 (IP27) should all at least work with current Linux, although IP27 and IP30 will require an external set of patches I have (and IP27 may lock up at random).
The older SGI Indy and Indigo2 series w/ R4K/R5K CPUs should also still work, but I have not tested those recently due to a bad RTC chip in my Indy. Other Indigo2 variants may or may not work depending on CPU.