Hello there!
After successfully building my OpenADK image and launching into an mksh shell, I noticed that some programs behaved very, very weirdly, especially in terms of time.
For instance, take this `ps` (coreutils) output:
    # ps -ef
    UID   PID  PPID  C  STIME  TTY    TIME                 CMD
    root  1    0     99 Mar02  pts/0  1158050441-07:00:15  /bin/mksh -il
    root  40   1     99 Mar02  pts/0  1158050441-07:00:15  ps -ef
It never changes, ever. However, `date` reports the time correctly:
    root@d6571a5cd187:/ # date "+%s"
    1709916714
But Git and even GCC all hang and behave weirdly. Git can't even fetch a basic repo, because it believes the time in the index to be totally wrong.
Do you have an idea what this is caused by?
And yes, I did change my timezone with the TZ environment variable, too:
    root@d6571a5cd187:/ # export TZ="Europe/Berlin"
    root@d6571a5cd187:/ # date "+%s"
    1709916972
    root@d6571a5cd187:/ # ps -ef
    UID   PID  PPID  C  STIME  TTY    TIME                 CMD
    root  1    0     99 Mar02  pts/0  1158050441-07:00:15  /bin/mksh -il
    root  42   1     99 Mar02  pts/0  1158050441-07:00:15  ps -ef

The container runs with --privileged, so I thought to just change the time:

    root@d6571a5cd187:/ # ntpclient -h ptbtime1.ptb.de -s
    45357 61036.066 22575.0 24.0 -16383.9 15.3 4056424
    root@d6571a5cd187:/ # ps -ef
    UID   PID  PPID  C  STIME  TTY    TIME                 CMD
    root  1    0     99 Mar02  pts/0  1158050441-07:00:15  /bin/mksh -il
    root  44   1     99 Mar02  pts/0  1158050441-07:00:15  ps -ef
My only lead is that part of the time is wrong...
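To get a feel for how far off the TIME column is, the value can be decoded back into seconds. This is only a minimal sketch: it assumes the field uses the `[DD-]HH:MM:SS` layout that procps-style `ps` prints for cumulative CPU time, and the helper name `ps_time_to_seconds` is made up for illustration.

```python
def ps_time_to_seconds(field: str) -> int:
    """Convert a ps TIME field of the assumed form [DD-]HH:MM:SS into seconds."""
    days = 0
    if "-" in field:
        # Everything before the dash is the day count.
        day_part, field = field.split("-", 1)
        days = int(day_part)
    hours, minutes, seconds = (int(part) for part in field.split(":"))
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


if __name__ == "__main__":
    total = ps_time_to_seconds("1158050441-07:00:15")
    years = total // (365 * 24 * 3600)
    print(f"{total} seconds of CPU time, roughly {years} years")
```

The result is in the range of millions of years of CPU time for a shell that was started a few days ago. That is a figure no amount of wall-clock or timezone adjustment would produce, which would fit with `date` being correct while `ps` is not.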
I attached my OpenADK and Kernel config, and even extracted the generated uClibc-ng config from the build too (well, from the toolchain build).
Also, I did try to just mount the /etc files into the container as well, but that did not help apparently.
Any idea what's going wrong here or how I could debug this? Thanks!
Kind regards, Ingwie
Hi Kevin,

Kevin Ingwersen wrote,
> The container runs with --privileged, so I thought to just change the time:
What container technology do you use? Is your base system still a Debian system?
> Any idea what's going wrong here or how I could debug this? Thanks!
You could install strace and see what might be wrong. We recently added time64 support for 32-bit architectures, but this should not influence any 64-bit architecture like riscv64.
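For context on the time64 remark: with a 32-bit signed `time_t`, the largest representable timestamp is 2^31 - 1 seconds after the epoch. The following is a small sketch of that limit in plain arithmetic. It is not uClibc-ng code, and the function names are only illustrative.

```python
from datetime import datetime, timezone

TIME32_MAX = 2**31 - 1   # largest value of a signed 32-bit time_t
TIME32_MIN = -(2**31)    # smallest value of a signed 32-bit time_t


def fits_in_time32(epoch_seconds: int) -> bool:
    """Return True if the timestamp is representable in a signed 32-bit time_t."""
    return TIME32_MIN <= epoch_seconds <= TIME32_MAX


def last_time32_moment() -> datetime:
    """Return the last UTC instant a signed 32-bit time_t can represent."""
    return datetime.fromtimestamp(TIME32_MAX, tz=timezone.utc)


if __name__ == "__main__":
    # The epoch value reported by `date "+%s"` in the original report.
    print(fits_in_time32(1709916714))
    print(last_time32_moment().isoformat())
```

The timestamp from the report still fits comfortably into 32 bits, so a plain 2038-style overflow would not explain the numbers `ps` shows.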
I will try to reproduce it in QEMU.
best regards Waldemar
Hello Waldemar,
Waldemar Brodkorb wbx@openadk.org wrote on Friday, 8 March 2024 at 18:32:
> What container technology do you use? Is your base system still a Debian system?
I use Podman/crun, as I couldn't get the normal docker/containerd to run. As far as I am aware, this effectively just creates a namespace and configures cgroups to run the processes in. That said, I am not too familiar with the full internals of either Podman or Docker.
> You could install strace and see what might be wrong.
Ah yes, strace! Thanks for the reminder, I will add that and see what happens.
> I try to reproduce it in Qemu.
Looking forward to it! The chip I use, JH7110, has these extensions enabled:
    # /nvme/opt/cpu_features/out/list_cpu_features
    arch              : risc-v
    vendor            : sifive
    microarchitecture : u74-mc
    flags             : A,C,D,F,M,RV64I,Zicsr,Zifencei

(via Google's cpu_features)

This might aid in setting up the QEMU box.
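In case it helps when matching the emulated CPU to the board, the flag list above can be folded into a single ISA string of the kind GCC accepts for `-march`. This is a rough sketch. The ordering table covers only the common single-letter extensions, sorting the multi-letter `Z*` names alphabetically is an assumption that happens to suit this list, and the function name is made up.

```python
# Canonical relative order of the common single-letter extensions (subset only).
SINGLE_LETTER_ORDER = "imafdqc"


def march_from_flags(flags):
    """Build an ISA string such as 'rv64imac' from cpu_features-style flag names."""
    base = None
    singles = set()
    multi = []
    for flag in flags:
        name = flag.strip().lower()
        if name.startswith("rv"):
            # e.g. "rv64i": the first four characters are the base width.
            base = name[:4]
            singles.update(name[4:])
        elif len(name) == 1:
            singles.add(name)
        else:
            multi.append(name)
    if base is None:
        raise ValueError("no RV32/RV64 base flag found")
    unknown = singles - set(SINGLE_LETTER_ORDER)
    if unknown:
        raise ValueError(f"extensions not covered by this sketch: {sorted(unknown)}")
    letters = "".join(ch for ch in SINGLE_LETTER_ORDER if ch in singles)
    return "_".join([base + letters] + sorted(multi))


if __name__ == "__main__":
    print(march_from_flags("A,C,D,F,M,RV64I,Zicsr,Zifencei".split(",")))
```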
Kind regards and thanks for looking into this, Kevin Ingwersen
Hello Waldemar,
While adding strace to my rootfs, I also decided to add the uClibc-ng test suite and let that run. I noticed a whole host of failing tests, which are likely the result of misconfigured floating-point operations ... I think. Honestly, this is quite bizarre.
For one, tst-barrier3 hangs completely and ends up being killed after the test's 60-second timeout, as seen under strace:
    execve("./tst-barrier3", ["./tst-barrier3"], 0x3fc2524b30 /* 15 vars */) = 0
    readlinkat(AT_FDCWD, "/proc/self/exe", "/usr/lib/uclibc-ng-test/test/npt"..., 4096) = 46
    mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x3fa22bd000
    newfstatat(AT_FDCWD, "/etc/ld.so.cache", 0x3fe198c280, 0) = -1 ENOENT (No such file or directory)
    openat(AT_FDCWD, "/usr/lib/libc.so.0", O_RDONLY) = -1 ENOENT (No such file or directory)
    openat(AT_FDCWD, "/lib/libc.so.0", O_RDONLY) = 3
    fstat(3, {st_mode=S_IFREG|0755, st_size=750184, ...}) = 0
    mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x3fa22bc000
    read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0\363\0\1\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
    mmap(NULL, 790528, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x3fa21fb000
    mmap(0x3fa21fb000, 742396, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 3, 0) = 0x3fa21fb000
    mmap(0x3fa22b1000, 6980, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0xb5000) = 0x3fa22b1000
    mmap(0x3fa22b3000, 35608, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x3fa22b3000
    close(3) = 0
    munmap(0x3fa22bc000, 4096) = 0
    newfstatat(AT_FDCWD, "/lib/ld-uClibc.so.0", {st_mode=S_IFREG|0755, st_size=30080, ...}, 0) = 0
    mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x3fa22bc000
    openat(AT_FDCWD, "/dev/urandom", O_RDONLY) = 3
    read(3, "e;\232\252\220}u\350", 8) = 8
    close(3) = 0
    mprotect(0x13000, 4096, PROT_READ) = 0
    mprotect(0x3fa22b1000, 4096, PROT_READ) = 0
    mprotect(0x3fa22c9000, 4096, PROT_READ) = 0
    set_tid_address(0x3fa22bc0d0) = 1847
    set_robust_list(0x3fa22bc0e0, 24) = 0
    rt_sigaction(SIGRTMIN, {sa_handler=0x3fa2264766, sa_mask=[ILL], sa_flags=SA_SIGINFO}, NULL, 8) = 0
    rt_sigaction(SIGRT_1, {sa_handler=0x3fa22647f2, sa_mask=[ILL], sa_flags=SA_RESTART|SA_SIGINFO}, NULL, 8) = 0
    rt_sigprocmask(SIG_UNBLOCK, [RTMIN RT_1], NULL, 8) = 0
    prlimit64(0, RLIMIT_STACK, NULL, {rlim_cur=8192*1024, rlim_max=RLIM64_INFINITY}) = 0
    ioctl(0, TCGETS, {c_iflag=ICRNL|IXON, c_oflag=NL0|CR0|TAB0|BS0|VT0|FF0|OPOST|ONLCR, c_cflag=B38400|CS8|CREAD, c_lflag=ISIG|ICANON|ECHO|ECHOE|ECHOK|IEXTEN|ECHOCTL|ECHOKE, ...}) = 0
    ioctl(1, TCGETS, {c_iflag=ICRNL|IXON, c_oflag=NL0|CR0|TAB0|BS0|VT0|FF0|OPOST|ONLCR, c_cflag=B38400|CS8|CREAD, c_lflag=ISIG|ICANON|ECHO|ECHOE|ECHOK|IEXTEN|ECHOCTL|ECHOKE, ...}) = 0
    brk(NULL) = 0x15000
    brk(0x16000) = 0x16000
    clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=NULL) = 1848
    rt_sigaction(SIGALRM, {sa_handler=0x11974, sa_mask=[], sa_flags=SA_RESTART}, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
    setitimer(ITIMER_REAL, {it_interval={tv_sec=0, tv_usec=0}, it_value={tv_sec=60, tv_usec=0}}, {it_interval={tv_sec=0, tv_usec=0}, it_value={tv_sec=0, tv_usec=0}}) = 0
    rt_sigaction(SIGINT, {sa_handler=0x11974, sa_mask=[], sa_flags=SA_RESTART}, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
    wait4(1848, 0x3fe198d978, 0, NULL) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
    --- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} ---
    kill(-1848, SIGKILL) = 0
    kill(1848, SIGKILL) = 0
    wait4(1848, 0x3fe198d4c4, WNOHANG|WSTOPPED, NULL) = 0
    clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=100000000}, NULL) = ? ERESTART_RESTARTBLOCK (Interrupted by signal)
    --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_KILLED, si_pid=1848, si_uid=0, si_status=SIGKILL, si_utime=0, si_stime=5 /* 0.05 s */} ---
    restart_syscall(<... resuming interrupted clock_nanosleep ...>) = 0
    wait4(1848, [{WIFSIGNALED(s) && WTERMSIG(s) == SIGKILL}], WNOHANG|WSTOPPED, NULL) = 1848
    write(2, "Timed out: killed the child proc"..., 36Timed out: killed the child process
    ) = 36
    exit_group(1) = ?
    +++ exited with 1 +++
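Reading the tail of the trace: the parent clones a child, arms a 60-second timer with setitimer, blocks in wait4, and kills the child once SIGALRM arrives. So the hang is in the child, and the parent is only the watchdog. A minimal sketch of that pattern follows. It is an illustration in Python rather than the uClibc-ng test harness itself, `run_with_watchdog` is a made-up name, and unlike the trace it signals only the child rather than the whole process group as well.

```python
import os
import signal
import time


def run_with_watchdog(child_fn, timeout_s):
    """Fork, run child_fn in the child, and SIGKILL it after timeout_s seconds.

    Returns True if the child had to be killed, False if it exited on its own.
    """
    pid = os.fork()
    if pid == 0:
        # Child: run the test body, then leave without running parent cleanup.
        try:
            child_fn()
        finally:
            os._exit(0)

    def on_alarm(signum, frame):
        # Timer fired: kill the child, as the parent in the trace does.
        try:
            os.kill(pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # child already gone

    old_handler = signal.signal(signal.SIGALRM, on_alarm)
    signal.setitimer(signal.ITIMER_REAL, timeout_s)
    try:
        # waitpid is retried after the handler returns, so this yields the
        # status of the killed child once the timer has fired.
        _, status = os.waitpid(pid, 0)
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)  # disarm the timer
        signal.signal(signal.SIGALRM, old_handler)
    return os.WIFSIGNALED(status) and os.WTERMSIG(status) == signal.SIGKILL


if __name__ == "__main__":
    # A child that sleeps far longer than the timeout stands in for the hung test.
    print(run_with_watchdog(lambda: time.sleep(30), 0.5))
```

Since strace here follows only the parent, attaching to the child as well (strace has a `-f` option for following forks) would be needed to see where tst-barrier3 actually blocks.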
And the other weird ones were:
    FAIL test-ifloat got 1 expected 0
    testing float (inline functions)
    Failure: Test: cos (M_PI_6l * 4.0) == -0.5
    Result:
     is:          -5.00000059604644775391e-01  -0.50000005960464477539
     should be:   -5.00000000000000000000e-01  -0.5
     difference:   5.96046447753906250000e-08   5.9604644775390625e-08
     ulp       :   1.0000
     max.ulp   :   0.0000
    Maximal error of `cos' is : 1 ulp
     accepted: 0 ulp
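The "1 ulp" figure in that report can be checked by hand. The sketch below counts how many single-precision values lie between the result and the expected value by comparing their bit patterns. It is an independent check written for illustration, not how the uClibc-ng test suite computes its numbers, and the function names are made up.

```python
import struct


def float32_ordinal(x: float) -> int:
    """Map x, rounded to float32, to an integer that preserves float ordering."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Floats are sign-magnitude: mirror negative values below zero.
    if bits & 0x80000000:
        return -(bits & 0x7FFFFFFF)
    return bits


def ulp_distance(a: float, b: float) -> int:
    """Number of float32 steps between a and b."""
    return abs(float32_ordinal(a) - float32_ordinal(b))


if __name__ == "__main__":
    got = -5.00000059604644775391e-01   # "is" from the test report
    want = -0.5                         # "should be" from the test report
    print(ulp_distance(got, want))
```

The reported difference, 5.96e-08, is 2^-24, which is exactly the spacing of single-precision values just above 0.5 in magnitude. The result is therefore off by a single rounding step, and it fails only because the test accepts 0 ulp.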
There's a lot more like that - but I have a hunch that my toolchain got misconfigured - so, I will look into that again in more detail.
I will attach the log to this mail; whilst it seems the process got stuck, it is still a nice indicator. And, as for git, it just gets stuck. No additional info is produced by strace; it just hangs...
Hopefully this helps!
Kind regards, Ingwie
Hi Kevin,
See here for some test cases that are known to fail: https://downloads.uclibc-ng.org/reports/1.0.46/REPORT.riscv64.libc.uClibc-ng...
22 test failures. And that is without locale.
The riscv64 port is not perfect yet. It is only tested with rv64imac. Init from busybox segfaults with rv64imadc.
I can reproduce your problems with gcc and git inside QEMU. I need to dig deeper.
best regards Waldemar
Hello Waldemar,
Great to hear that this isn't just an issue on my end, phew! :) If there is anything I can test with my JH7110/VisionFive2, do let me know. In the meantime, I will, in parallel, construct a musl-based system instead; that seems to be working quite well. I would prefer to use uClibc in the long term, though, since I have now spent a good amount of time reading the code while trying to find hints at the root cause of the issues. No luck so far. That said, the code is well structured and easily readable - good job!
Kind regards, Ingwie
PS. I just noticed Protonmail continuously used the wrong default email address... Sorry to be bothering the bouncer so much!
Waldemar Brodkorb wbx@openadk.org wrote on Saturday, 9 March 2024 at 16:02:
> I can reproduce your problems with gcc and git inside Qemu. Need to dig deeper.
Hi Kevin,

Ingwie wrote,
> In the meantime, I will, in parallel, construct a musl-based system instead; that seems to be working quite well.
Yeah, I rechecked with aarch64, and there gcc and git are working. The riscv64 TLS support seems to be broken. musl is a good choice.
> PS. I just noticed Protonmail continuously used the wrong default email address... Sorry to be bothering the bouncer so much!
Thanks.
best regards Waldemar
Hi Kevin,

Ingwie wrote,
> Though I would prefer to use uclibc in the long term, since I have now spent a good amount of time reading the code while trying to find hints of the root of the issues.
You can try uClibc-ng git master whenever you want. We (sorear and I) fixed all of the riscv64 issues! gcc and git now work in QEMU.
Most of the hard work to make riscv64 stable was done by sorear.
Thanks to him!
best regards Waldemar