Bring Latest Linux Kernel to MINI2440


Porting the Linux kernel to a board is much easier now than it was in the 2.6 era. I want to update my school projects, and part of that work is a kernel upgrade. The uImage created by mkimage is now labelled legacy, so the new FIT (Flattened Image Tree) image format will be adopted as well.

Minimal Support for MINI2440

Since version 2.6.31 the mini2440 has been supported by the mainline kernel. To refresh my knowledge of the kernel by learning what has changed since 2.6, I decided not to use that support and started from the board support for smdk2440 instead. Even so, this approach is pretty simple and does not feel like real porting work.

The first step is to copy the board support of smdk2440. Two files are involved:

arch/arm/mach-s3c24xx/Kconfig
arch/arm/mach-s3c24xx/mach-smdk2440.c

Add an entry to Kconfig:

+config MACH_MINI2440
+       bool "MINI2440 board"
+       select S3C2440_XTAL_12000000
+

Then rename mach-smdk2440.c to mach-mini2440.c, replace every smdk2440 with mini2440, and pay attention to the MACHINE_START part:

-MACHINE_START(S3C2440, "SMDK2440")
+MACHINE_START(MINI2440, "MINI2440")
        .atag_offset    = 0x100,

        .init_irq       = s3c2440_init_irq,
-       .map_io         = smdk2440_map_io,
-       .init_machine   = smdk2440_machine_init,
-       .init_time      = smdk2440_init_time,
+       .map_io         = mini2440_map_io,
+       .init_machine   = mini2440_machine_init,
+       .init_time      = mini2440_init_time,
 MACHINE_END

If the machine ID does not match the one passed by U-Boot in r1 (r2 holds the ATAGS pointer), the kernel will hang at Starting kernel …. Enabling EARLY_PRINTK shows what is going on:

Error: invalid dtb and unrecognized/unsupported machine ID
  r1=0x000007cf, r2=0x30000100
  r2[]=05 00 00 00 01 00 41 54 00 00 00 00 00 00 00 00
Available machine support:

ID (hex)	NAME
0000016a	MINI2440

Please check your kernel config and/or bootloader.
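The register dump in this error can be decoded by hand. The sketch below is a host-side helper of my own (the function and variable names are not from the kernel): it unpacks the first eight bytes printed as r2[] as an ATAG header and compares r1 with the single ID in the "Available machine support" table. It assumes the ATAG header layout of a word count followed by a tag, both little endian, with ATAG_CORE equal to 0x54410001.

```python
import struct

ATAG_CORE = 0x54410001  # tag value expected at the start of an ATAG list


def parse_atag_header(raw: bytes):
    """Return (size_in_words, tag) from the first 8 bytes of an ATAG."""
    size, tag = struct.unpack_from("<II", raw, 0)
    return size, tag


# Bytes printed by the kernel as r2[] in the error message.
r2_bytes = bytes.fromhex("05 00 00 00 01 00 41 54 00 00 00 00 00 00 00 00")
size, tag = parse_atag_header(r2_bytes)

r1 = 0x000007CF                   # machine ID handed over by U-Boot
supported = {0x016A: "MINI2440"}  # the "Available machine support" table

print(f"ATAG size={size} words, tag={tag:#010x}, is ATAG_CORE={tag == ATAG_CORE}")
print(f"r1={r1} ({r1:#x}), known to this kernel: {r1 in supported}")
```

So r2 points at a valid ATAG list, and the boot stops only because the ID in r1 (1999) is not the one compiled into the kernel (0x16a, i.e. 362).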

The ID listed here, 0x16a, belongs to ARCH_S3C2440, whereas U-Boot passed 0x7cf in r1. The ID comes from the first argument of the MACHINE_START macro, which is defined in arch/arm/include/asm/mach/arch.h:

/*
 * Set of macros to define architecture features.  This is built into
 * a table by the linker.
 */
#define MACHINE_START(_type,_name)			\
static const struct machine_desc __mach_desc_##_type	\
 __used							\
 __attribute__((__section__(".arch.info.init"))) = {	\
	.nr		= MACH_TYPE_##_type,		\
	.name		= _name,

#define MACHINE_END				\
};

That is pretty much all that is needed at the code level to bring up the mini2440. The remaining part is configuration; enable at least the options below:

+CONFIG_ARCH_S3C24XX=y
+CONFIG_CPU_S3C2440=y
+CONFIG_MACH_MINI2440=y
+CONFIG_AEABI=y
+CONFIG_SERIAL_SAMSUNG=y
+CONFIG_SERIAL_SAMSUNG_CONSOLE=y
+CONFIG_BLK_DEV_INITRD=y
+CONFIG_TMPFS=y

BLK_DEV_INITRD is needed because we use a ramdisk as the rootfs.

The repository is on GitLab.

Before building the kernel image, let’s get the ramdisk image ready. For simplicity, we first build the ramdisk into the kernel image; preparing the ramdisk is covered in the next section.

Make RAMDisk Image

This step is only required for board bring-up. In a real project the root file system would be created by Buildroot or the Yocto Project; here we use a previously prepared rootfs to keep it simple:

git clone https://gitlab.com/fudongbai/rootfs.git
cd rootfs
sudo mknod -m 660 dev/null c 1 3
sudo mknod -m 660 dev/console c 5 1
find . | cpio --quiet -H newc -o | gzip -9 -n > ../ramdisk.cpio.img

Build Kernel Image

Clone the kernel source from GitLab:

git clone https://gitlab.com/fudongbai/linux-5.9.11.git

To build the ramdisk into the uImage, first copy the ramdisk file to the root of the kernel source tree and add the line below to the defconfig:

CONFIG_INITRAMFS_SOURCE="ramdisk.cpio.img"

Then configure and build:

ARCH=arm make mini2440_dbg_defconfig
ARCH=arm make CROSS_COMPILE=arm-linux-gnueabi- uImage

The final image will be in the arch/arm/boot directory.

U-Boot

To use a ramdisk as the root file system, the bootargs must be specified correctly. As we are using a legacy image, support for that format should also be enabled in the default configuration:

diff --git a/configs/mini2440_defconfig b/configs/mini2440_defconfig
+CONFIG_USE_BOOTARGS=y
+CONFIG_BOOTARGS="console=ttySAC0,115200 root=/dev/ram0 rdinit=linuxrc"

diff --git a/include/configs/mini2440.h b/include/configs/mini2440.h
+#define CONFIG_BOOTFILE		"uImage"

+#define	CONFIG_EXTRA_ENV_SETTINGS								\
+	"ethaddr=" __stringify(CONFIG_ETHADDR) "\0"					\
+	"tftpboot="													\
+		"tftpboot $loadaddr $bootfile; bootm $loadaddr\0"

Boot uImage with separate ramdisk

This is entirely optional; I do it because I want to keep the Linux kernel as clean as possible. To achieve this, take the ramdisk out of the kernel by removing the option below:

-CONFIG_INITRAMFS_SOURCE="ramdisk.cpio.img"

And change EXTRA_ENV_SETTINGS in U-Boot:

-#define        CONFIG_EXTRA_ENV_SETTINGS                                                               \
-       "ethaddr=" __stringify(CONFIG_ETHADDR) "\0"                                     \
-       "tftpboot="                                                                                                     \
-               "tftpboot $loadaddr $bootfile; bootm $loadaddr\0"
+#define        CONFIG_EXTRA_ENV_SETTINGS                                               \
+       "ethaddr=" __stringify(CONFIG_ETHADDR) "\0"                     \
+       "ramdisk=ramdisk.cpio.img.uboot\0"                                      \
+       "kernel_addr=0x31000000\0"                                                      \
+       "ramdisk_addr=0x33000000\0"                                                     \
+       "tftpboot="                                                                                     \
+               "tftpboot $kernel_addr $bootfile; "                             \
+               "tftpboot $ramdisk_addr $ramdisk; "                             \
+               "bootm $kernel_addr $ramdisk_addr\0"

Note the variables passed to the bootm command: do NOT use $loadaddr here, because its value changes to the most recent load address. In this case the final $loadaddr is 33000000, which makes U-Boot complain that it cannot find a kernel image:

## Current stack ends at 0x33bbcac0 ## Booting kernel from Legacy Image at 33000000 ...
   Image Name:   Root Filesystem
   Image Type:   ARM Linux RAMDisk Image (uncompressed)
   Data Size:    2691520 Bytes = 2.6 MiB
   Load Address: 33000000
   Entry Point:  33000000
   Verifying Checksum ... OK
Wrong Image Type for bootm command
ERROR: can't get kernel image!

Notice the address in the log above; the full log can be found in the troubleshooting section.

I found the root cause of this issue while reading the docs in doc/uImage.FIT.

Now create a legacy image of the ramdisk for U-Boot with mkimage:

mkimage -A arm -O linux -T ramdisk -C none \
        -a 0x33000000 \
        -n "Root Filesystem for MINI2440" \
        -d ramdisk.cpio.img ramdisk.cpio.img.uboot

Copy all the images to /tftpboot/ and we are ready to run.
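Before booting, it can be worth checking that the chosen load addresses leave enough room for each image. The following is a rough host-side sketch with names of my own choosing; the sizes are the ones reported by the TFTP transfers and bootm in the troubleshooting logs, and the DRAM window assumes 64 MiB starting at 0x30000000. It does not model where U-Boot relocates itself or the ramdisk.

```python
DRAM_START = 0x30000000
DRAM_END = DRAM_START + 64 * 1024 * 1024


def find_conflicts(regions):
    """Return a list of problems for a dict of name -> (start, size)."""
    problems = []
    items = sorted(regions.items(), key=lambda kv: kv[1][0])
    # Every region has to sit inside DRAM.
    for name, (start, size) in items:
        if start < DRAM_START or start + size > DRAM_END:
            problems.append(f"{name} is outside DRAM")
    # After sorting by start address, compare each region with its neighbour.
    for (a, (a_start, a_size)), (b, (b_start, _)) in zip(items, items[1:]):
        if a_start + a_size > b_start:
            problems.append(f"{a} overlaps {b}")
    return problems


regions = {
    "kernel at its load address": (0x30008000, 1584864),
    "uImage as downloaded": (0x31000000, 1584928),
    "ramdisk as downloaded": (0x33000000, 2691584),
}
print(find_conflicts(regions) or "no conflicts")
```

With the addresses used in this post the three regions stay clear of each other.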

Boot FIT Image

No kernel changes are needed to boot a FIT image; all the work is done in U-Boot. There is another post covering this in much more detail.

Troubleshooting

Wrong Image Type for bootm command

When $loadaddr is passed to bootm, U-Boot cannot boot the kernel:

U-Boot SPL 2020.10-g50e3361-dirty (Dec 28 2020 - 10:23:28 +0800)
Trying to boot from RAMC�

U-Boot 2020.10-g50e3361-dirty (Dec 28 2020 - 10:23:28 +0800)

CPUID: 32440001
FCLK:      405 MHz
HCLK:  101.250 MHz
PCLK:   50.625 MHz
DRAM:  64 MiB
In:    serial
Out:   serial
Err:   serial
Net:   dm9000
Hit any key to stop autoboot:  0
dm9000 i/o: 0x20000000, id: 0x90000a46
DM9000: running in 16 bit mode
MAC: 0c:96:e6:15:09:12
could not establish link
Using dm9000 device
TFTP from server 192.168.0.3; our IP address is 192.168.0.89
Filename 'uImage'.
## Current stack ends at 0x33bbca10 Load address: 0x30800000
Loading: #T ################################################################
	 #################################################################
	 #################################################################
	 #################################################################
	 ##################################################
	 262.7 KiB/s
done
Bytes transferred = 1584928 (182f20 hex)
dm9000 i/o: 0x20000000, id: 0x90000a46
DM9000: running in 16 bit mode
MAC: 0c:96:e6:15:09:12
could not establish link
Using dm9000 device
TFTP from server 192.168.0.3; our IP address is 192.168.0.89
Filename 'ramdisk.cpio.img.uboot'.
## Current stack ends at 0x33bbca10 Load address: 0x33000000
Loading: #T ################################################################
	 #################################################################
	 #################################################################
	 #################################################################
	 #################################################################
	 #################################################################
	 #################################################################
	 #################################################################
	 ######
	 404.3 KiB/s
done
Bytes transferred = 2691584 (291200 hex)
## Current stack ends at 0x33bbcac0 ## Booting kernel from Legacy Image at 33000000 ...
   Image Name:   Root Filesystem
   Image Type:   ARM Linux RAMDisk Image (uncompressed)
   Data Size:    2691520 Bytes = 2.6 MiB
   Load Address: 33000000
   Entry Point:  33000000
   Verifying Checksum ... OK
Wrong Image Type for bootm command
ERROR: can't get kernel image!
MINI2440 #

Fixed by not using $loadaddr in the tftpboot command, as it gets overridden by later commands.

Wrong Ramdisk Image Format

MINI2440 #
U-Boot SPL 2020.10-g50e3361-dirty (Dec 28 2020 - 11:03:15 +0800)
Trying to boot from RAMC�

U-Boot 2020.10-g50e3361-dirty (Dec 28 2020 - 11:03:15 +0800)

CPUID: 32440001
FCLK:      405 MHz
HCLK:  101.250 MHz
PCLK:   50.625 MHz
DRAM:  64 MiB
In:    serial
Out:   serial
Err:   serial
Net:   dm9000
Hit any key to stop autoboot:  0
dm9000 i/o: 0x20000000, id: 0x90000a46
DM9000: running in 16 bit mode
MAC: 0c:96:e6:15:09:12
could not establish link
Using dm9000 device
TFTP from server 192.168.0.3; our IP address is 192.168.0.89
Filename 'uImage'.
## Current stack ends at 0x33bbca10 Load address: 0x31000000
Loading: #T ################################################################
	 #################################################################
	 #################################################################
	 #################################################################
	 ##################################################
	 262.7 KiB/s
done
Bytes transferred = 1584928 (182f20 hex)
dm9000 i/o: 0x20000000, id: 0x90000a46
DM9000: running in 16 bit mode
MAC: 0c:96:e6:15:09:12
could not establish link
Using dm9000 device
TFTP from server 192.168.0.3; our IP address is 192.168.0.89
Filename 'ramdisk.cpio.img'.
## Current stack ends at 0x33bbca10 Load address: 0x33000000
Loading: #T ################################################################
	 #################################################################
	 #################################################################
	 #################################################################
	 #################################################################
	 #################################################################
	 #################################################################
	 #################################################################
	 ######
	 405.3 KiB/s
done
Bytes transferred = 2691520 (2911c0 hex)
## Current stack ends at 0x33bbcac0 ## Booting kernel from Legacy Image at 31000000 ...
   Image Name:   Linux-5.9.11-gec59da56b36b
   Image Type:   ARM Linux Kernel Image (uncompressed)
   Data Size:    1584864 Bytes = 1.5 MiB
   Load Address: 30008000
   Entry Point:  30008000
   Verifying Checksum ... OK
Wrong Ramdisk Image Format
Ramdisk image is corrupt or invalid
MINI2440 #

For the ramdisk to be recognized by U-Boot it needs a header; add one with mkimage:

mkimage -A arm -O linux -T ramdisk -C none \
        -a 0x33000000 \
        -n "Root Filesystem for MINI2440" \
        -d ramdisk.cpio.img ramdisk.cpio.img.uboot
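To see what that header contains, here is a small host-side sketch that parses the 64-byte legacy image header. It is not U-Boot code: the function names are mine, and the layout (seven big-endian 32-bit fields, four one-byte fields, a 32-byte name) and the type values 2 for kernel and 3 for ramdisk are written from my reading of U-Boot's include/image.h, so treat them as assumptions to verify. `make_header` only exists to build a synthetic header for the demonstration.

```python
import struct
import zlib

IH_MAGIC = 0x27051956
IH_TYPE_NAMES = {2: "kernel", 3: "ramdisk"}  # small subset of the IH_TYPE_* values
HEADER = struct.Struct(">7I4B32s")            # 64 bytes, big endian


def parse_legacy_header(blob: bytes) -> dict:
    """Decode a legacy image header and verify its header CRC."""
    (magic, hcrc, _time, size, load, entry, _dcrc,
     _os, _arch, img_type, _comp, name) = HEADER.unpack_from(blob, 0)
    if magic != IH_MAGIC:
        raise ValueError("not a legacy U-Boot image (bad magic)")
    # The header CRC is calculated with the CRC field itself set to zero.
    zeroed = blob[:4] + b"\0\0\0\0" + blob[8:HEADER.size]
    return {
        "type": IH_TYPE_NAMES.get(img_type, f"other({img_type})"),
        "size": size,
        "load": load,
        "entry": entry,
        "name": name.rstrip(b"\0").decode("ascii", "replace"),
        "header_ok": (zlib.crc32(zeroed) & 0xFFFFFFFF) == hcrc,
    }


def make_header(img_type: int, size: int, load: int, name: str) -> bytes:
    """Build a synthetic header (OS=5 Linux, arch=2 ARM, no compression)."""
    fields = [IH_MAGIC, 0, 0, size, load, load, 0, 5, 2, img_type, 0, name.encode()]
    raw = HEADER.pack(*fields)
    fields[1] = zlib.crc32(raw) & 0xFFFFFFFF
    return HEADER.pack(*fields)


header = make_header(3, 2691520, 0x33000000, "Root Filesystem")
print(len(header), parse_legacy_header(header))
```

The 64-byte header also accounts for the size difference between the two TFTP transfers in the logs: 2691520 bytes for ramdisk.cpio.img and 2691584 bytes for ramdisk.cpio.img.uboot.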

No kernel message output in serial console, but already in __log_buf

After issuing the bootm command, the kernel hangs at Starting kernel …

## Current stack ends at 0x33bbcac0 ## Booting kernel from Legacy Image at 31000000 ...
   Image Name:   Linux-5.9.11-gec59da56b36b-dirty
   Image Type:   ARM Linux Kernel Image (uncompressed)
   Data Size:    1581576 Bytes = 1.5 MiB
   Load Address: 30008000
   Entry Point:  30008000
   Verifying Checksum ... OK
## Loading init Ramdisk from Legacy Image at 33000000 ...
   Image Name:   Root Filesystem
   Image Type:   ARM Linux RAMDisk Image (uncompressed)
   Data Size:    2691520 Bytes = 2.6 MiB
   Load Address: 33000000
   Entry Point:  33000000
   Verifying Checksum ... OK
   Loading Kernel Image
   Loading Ramdisk to 3392a000, end 33bbb1c0 ... OK
using: ATAGS
## Transferring control to Linux (at address 30008000)...

Starting kernel ...

I found a way to dump the printk buffer on the Embedded Linux Wiki: first look up the buffer address in System.map with grep (or rg), then convert it to a physical address:

rg __log_buf System.map
16885:c0319c28 b __log_buf

For physical address conversion, refer to Debugging Mini2440 with OpenOCD.
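For a lowmem kernel address the conversion is a fixed offset. A minimal sketch, assuming the usual 3G/1G split (PAGE_OFFSET of 0xc0000000) and SDRAM starting at 0x30000000 on this board; both constants are assumptions that should be checked against the kernel config:

```python
PAGE_OFFSET = 0xC0000000  # kernel virtual base (assumed 3G/1G split)
PHYS_OFFSET = 0x30000000  # start of SDRAM on the board (assumed)


def virt_to_phys(vaddr: int) -> int:
    """Translate a kernel lowmem virtual address to a physical address."""
    if vaddr < PAGE_OFFSET:
        raise ValueError("not a kernel lowmem address")
    return vaddr - PAGE_OFFSET + PHYS_OFFSET


print(hex(virt_to_phys(0xC0319C28)))  # __log_buf from System.map
```

The same arithmetic maps the kernel entry point 0xc0008000 to the load address 30008000 that U-Boot reports.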

After reset, dump memory content in U-Boot:

MINI2440 # md 30319c28
30319c28: 00000000 00000000 00210038 c2000000    ........8.!.....
30319c38: 746f6f42 20676e69 756e694c 6e6f2078    Booting Linux on
30319c48: 79687020 61636973 5043206c 78302055     physical CPU 0x
30319c58: 00000030 00000000 00000000 00000000    0...............
30319c68: 00c900e0 a2000000 756e694c 65762078    ........Linux ve
30319c78: 6f697372 2e35206e 31312e39 6365672d    rsion 5.9.11-gec
30319c88: 61643935 33623635 642d6236 79747269    59da56b36b-dirty
30319c98: 64662820 40696162 61626466 65642d69     (fdbai@fdbai-de
30319ca8: 6f746b73 28202970 2d6d7261 756e696c    sktop) (arm-linu
30319cb8: 6e672d78 62616575 63672d69 55282063    x-gnueabi-gcc (U
30319cc8: 746e7562 694c2f75 6f72616e 352e3720    buntu/Linaro 7.5
30319cd8: 332d302e 6e756275 7e317574 302e3831    .0-3ubuntu1~18.0
30319ce8: 37202934 302e352e 4e47202c 646c2055    4) 7.5.0, GNU ld
30319cf8: 4e472820 69422055 6974756e 6620736c     (GNU Binutils f
30319d08: 5520726f 746e7562 32202975 2930332e    or Ubuntu) 2.30)
30319d18: 32362320 6e6f4d20 63654420 20383220     #62 Mon Dec 28
...
MINI2440 #
3031b428: 2d303134 00746477 a07b3ebe 00000000    410-wdt..>{.....
3031b438: 002b0040 62000000 6e726157 3a676e69    @.+....bWarning:
3031b448: 616e7520 20656c62 6f206f74 206e6570     unable to open
3031b458: 69206e61 6974696e 63206c61 6f736e6f    an initial conso
3031b468: 002e656c 00000000 a0a36861 00000000    le......ah......
3031b478: 00220038 c2000000 65657246 20676e69    8.".....Freeing
3031b488: 73756e75 6b206465 656e7265 656d206c    unused kernel me
3031b498: 79726f6d 3431203a 00004b34 00000000    mory: 144K......
3031b4a8: a0a46702 00000000 00370048 82000000    .g......H.7.....
3031b4b8: 6e72654b 6d206c65 726f6d65 72702079    Kernel memory pr
3031b4c8: 6365746f 6e6f6974 746f6e20 6c657320    otection not sel
3031b4d8: 65746365 79622064 72656b20 206c656e    ected by kernel
3031b4e8: 666e6f63 002e6769 a0a60f64 00000000    config..d.......
3031b4f8: 001b0030 c2000000 206e7552 756e696c    0.......Run linu
3031b508: 20637278 69207361 2074696e 636f7270    xrc as init proc
3031b518: 00737365 00000000 a0a6dfba 00000000    ess.............
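The md command prints 32-bit words, so each group of four characters appears reversed inside the hex column on this little-endian CPU. If the ASCII column is missing (for example when the words were copied from another tool), the text can be recovered with a few lines of Python; the helper name below is my own:

```python
import struct


def md_words_to_text(line: str) -> str:
    """Decode the four words of one `md` output line into printable text."""
    _, rest = line.split(":", 1)
    words = rest.split()[:4]
    # Each word is stored little endian in memory.
    raw = b"".join(struct.pack("<I", int(word, 16)) for word in words)
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


print(md_words_to_text("30319c38: 746f6f42 20676e69 756e694c 6e6f2078"))
```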

There is a message in the printk buffer saying:

Run linuxrc as init process

This marks the completion of the kernel boot process, so the most likely cause is that the console driver is not enabled. Reviewing the config file confirmed it, and the console works after adding the option below:

+CONFIG_SERIAL_SAMSUNG_CONSOLE=y

Another way to dump the memory content, without a reset, is to define a gdb macro:

define xxd
    dump binary memory dump.bin $arg0 ((void *)$arg0)+$arg1
    shell xxd dump.bin
end

With this macro there is no need to convert to a physical address:

>>> xxd 0xc03adbe4 256
00000000: 0000 0000 0000 0000 3400 2100 0000 00c2  ........4.!.....
00000010: 426f 6f74 696e 6720 4c69 6e75 7820 6f6e  Booting Linux on
00000020: 2070 6879 7369 6361 6c20 4350 5520 3078   physical CPU 0x
00000030: 3000 0000 0000 0000 0000 0000 dc00 ca00  0...............
00000040: 0000 00a2 4c69 6e75 7820 7665 7273 696f  ....Linux versio
00000050: 6e20 352e 392e 3131 2d67 6237 6635 6636  n 5.9.11-gb7f5f6
00000060: 3431 6632 3338 2d64 6972 7479 2028 6664  41f238-dirty (fd
00000070: 6261 6940 6664 6261 692d 6465 736b 746f  bai@fdbai-deskto
00000080: 7029 2028 6172 6d2d 6c69 6e75 782d 676e  p) (arm-linux-gn
00000090: 7565 6162 692d 6763 6320 2855 6275 6e74  ueabi-gcc (Ubunt
000000a0: 752f 4c69 6e61 726f 2037 2e35 2e30 2d33  u/Linaro 7.5.0-3
000000b0: 7562 756e 7475 317e 3138 2e30 3429 2037  ubuntu1~18.04) 7
000000c0: 2e35 2e30 2c20 474e 5520 6c64 2028 474e  .5.0, GNU ld (GN
000000d0: 5520 4269 6e75 7469 6c73 2066 6f72 2055  U Binutils for U
000000e0: 6275 6e74 7529 2032 2e33 3029 2023 3130  buntu) 2.30) #10
000000f0: 3120 5468 7520 4465 6320 3331 2031 313a  1 Thu Dec 31 11:

Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004

Running linuxrc as the init process results in this panic with exit code 4:

[    3.117860] Run linuxrc as init process
[    3.126997] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004
[    3.129115] CPU: 0 PID: 1 Comm: linuxrc Not tainted 5.9.11-gec59da56b36b-dirty #64
[    3.136493] Hardware name: MINI2440
[    3.139902] Backtrace:
[    3.142416] [<c000cb18>] (dump_backtrace) from [<c000cde4>] (show_stack+0x18/0x1c)
[    3.149778]  r7:ffffe000 r6:c02a4248 r5:00000000 r4:c02ae4f8
[    3.155342] [<c000cdcc>] (show_stack) from [<c0251840>] (dump_stack+0x28/0x30)
[    3.162406] [<c0251818>] (dump_stack) from [<c0250414>] (panic+0xe8/0x2e8)
[    3.169102]  r4:c03144e8
[    3.171661] [<c025032c>] (panic) from [<c0019ac0>] (do_exit+0x790/0x9b0)
[    3.113462]  r3:00000001 r2:60000013 r1:00000004 r0:c02a4248
[    3.118966]  r7:ffffe000
[    3.121525] [<c0019330>] (do_exit) from [<c0019d60>] (do_group_exit+0x44/0xbc)
[    3.128524]  r7:c3438000
[    3.131109] [<c0019d1c>] (do_group_exit) from [<c0024af0>] (get_signal+0x124/0x7c0)
[    3.138509]  r4:0830009f
[    3.141060] [<c00249cc>] (get_signal) from [<c000c324>] (do_work_pending+0x1b4/0x558)
[    3.148739]  r10:b6f777d8 r9:c02f300c r8:c342ff48 r7:c02f3008 r6:c342ffb0 r5:b6f777d4
[    3.156342]  r4:ffffe000
[    3.158877] [<c000c170>] (do_work_pending) from [<c00082c8>] (slow_work_pending+0xc/0x20)
[    3.166844] Exception stack(0xc342ffb0 to 0xc342fff8)
[    3.171827] ffa0:                                     00000000 b6f87000 fffffe00 fffffe00
[    3.115146] ffc0: b6f87000 b6f86e00 00008034 0000002d 00000007 00000000 00000000 00000007
[    3.123153] ffe0: 00000000 beff8c1c b6f76d3c b6f777d8 60000010 00000000
[    3.129663]  r10:00000000 r9:c342e000 r8:c0008464 r7:00900000 r6:00008034 r5:b6f86e00
[    3.137269]  r4:b6f87000
[    3.139771] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004 ]---
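The exitcode in the panic message uses the same encoding as a wait status: the low seven bits hold the terminating signal, and the exit status sits in the next byte. A small decoding sketch (the helper name is mine, and the signal name lookup relies on the numbering of the machine running the script, which matches Linux on ARM for signal 4):

```python
import signal


def describe_exitcode(code: int) -> str:
    """Decode the exitcode value printed by 'Attempted to kill init!'."""
    sig = code & 0x7F
    if sig:
        return f"killed by signal {sig} ({signal.Signals(sig).name})"
    return f"exited with status {(code >> 8) & 0xFF}"


print(describe_exitcode(0x00000004))
```

So init did not exit on its own; it was killed by SIGILL, which is useful to keep in mind when following the signal path below.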

First, we set a temporary breakpoint at slow_work_pending to see what happens there:

 103  slow_work_pending:
!104      mov    r0, sp                @ 'regs'
 105      mov    r2, why                @ 'syscall'
 106      bl    do_work_pending
slow_work_pending                    (arch/arm/kernel/entry-common.S)
  ==> do_work_pending                (arch/arm/kernel/signal.c)
        arg: thread_flags = 1
             syscall = 0xc0008464
  ==> do_signal                      (arch/arm/kernel/signal.c)
  ==> get_signal                     (kernel/signal.c)
        signal = current->signal
        dequeue_synchronous_signal = 4
        do_group_exit(ksig->info.si_signo)
  ==> do_group_exit(int exit_code)
        arg: exit_code = 4
        sig->group_exit_code = 4
        flags = 4
  ==> do_exit(long code)
        code = 4
  ==> panic

This issue seems to be syscall related.

Following the code path from slow_work_pending, we find that do_signal is only called when the TIF_SIGPENDING bit is set in thread_flags. TIF_SIGPENDING is defined in arch/arm/include/asm/thread_info.h:

#define TIF_SIGPENDING		0	/* signal pending */
#define _TIF_SIGPENDING		(1 << TIF_SIGPENDING)

thread_flags is passed to do_work_pending in r1, and the bit was already set before entering slow_work_pending, so we need to find out where it gets set. From the stack below, we can set a breakpoint at ret_fast_syscall and track upwards:

[0] from 0xc00082bc in ret_fast_syscall at arch/arm/kernel/entry-common.S:104
[1] from 0xc003a2f4 in wake_up_state+20 at kernel/sched/core.c:3059

Stepping into ret_fast_syscall, we see that r1 is loaded with:

 55      ldr    r1, [tsk, #TI_FLAGS]        @ re-check for syscall tracing

TI_FLAGS is the offset of flags within struct thread_info, defined in arch/arm/kernel/asm-offsets.c:

DEFINE(TI_FLAGS,		offsetof(struct thread_info, flags));

tsk is defined in arch/arm/kernel/entry-header.S:

tsk	.req	r9		@ current thread_info

The instruction above is assembled as:

 0xc0008270  ? ldr    r1, [r9]                       @ r9=0xc342e000

ret_fast_syscall is reached from vector_swi. Setting a breakpoint at vector_swi and inspecting the content at 0xc342e000, we can see that the bit in flags is not yet set when it breaks:

>>> p *(struct thread_info *)0xc342e000
$1 = {
  flags = 0,
  preempt_count = 0,
  addr_limit = 3204448256,
  task = 0xc3430000,
  cpu = 0,
  cpu_domain = 83,
  cpu_context = {
    r4 = 3224952512,
    r5 = 3275948032,
    r6 = 3224915976,
    r7 = 3276849152,
    r8 = 3224994948,
    r9 = 0,
    sl = 0,
    fp = 3275947500,
    sp = 3275947444,
    pc = 3224162360,
    extra = {[0] = 0, [1] = 0}
  },
  syscall = 0,
  ...

This means the bit is set during or after vector_swi. To find out when, set a watchpoint on that address:

watch *0xc342e000

NOTE:
DO NOT set watchpoint before entering vector_swi.

We see the following stack when flags is set:

─── Stack ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
[0] from 0xc0021b34 in signal_wake_up_state+44 at kernel/signal.c:770
[1] from 0xc0021c74 in signal_wake_up+20 at ./include/linux/sched/signal.h:409
[2] from 0xc0021c74 in complete_signal+264 at kernel/signal.c:1062
[3] from 0xc00226c0 in __send_signal+396 at kernel/signal.c:1181
[4] from 0xc002359c in send_signal+104 at kernel/signal.c:1242
[5] from 0xc0024510 in force_sig_info_to_task+208 at kernel/signal.c:1334
[6] from 0xc0024988 in force_sig_fault_to_task+76 at kernel/signal.c:1673
[7] from 0xc00249c8 in force_sig_fault+32 at kernel/signal.c:1680
[8] from 0xc000d0dc in arm_notify_die+88 at arch/arm/kernel/traps.c:381
[9] from 0xc000d504 in bad_syscall+80 at arch/arm/kernel/traps.c:555

This is interesting. At the very beginning I assumed that the issue might be syscall related, and that looks true at this point, so let’s prove it. A backtrace tells us that bad_syscall was called by arm_syscall, which is reached from vector_swi.

If DEBUG_USER is configured and the UDBG_SYSCALL bit is enabled, bad_syscall prints more information. Enable it by appending the following to bootargs:

user_debug=2

The bit definitions can be found in arch/arm/include/asm/system_misc.h; they are listed here for later reference:

#define UDBG_UNDEFINED	(1 << 0)
#define UDBG_SYSCALL	(1 << 1)
#define UDBG_BADABORT	(1 << 2)
#define UDBG_SEGV	(1 << 3)
#define UDBG_BUS	(1 << 4)
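The value given to user_debug is simply the OR of these bits. As a quick check of which messages a given value turns on, here is a small sketch; the dictionary mirrors the definitions above and the helper name is my own:

```python
UDBG_BITS = {
    "UDBG_UNDEFINED": 1 << 0,
    "UDBG_SYSCALL": 1 << 1,
    "UDBG_BADABORT": 1 << 2,
    "UDBG_SEGV": 1 << 3,
    "UDBG_BUS": 1 << 4,
}


def decode_user_debug(value: int) -> list:
    """List the UDBG_* flags enabled by a user_debug= value."""
    return [name for name, bit in UDBG_BITS.items() if value & bit]


print(decode_user_debug(2))
print(decode_user_debug(31))
```

user_debug=2 therefore enables only the syscall diagnostics, while user_debug=31 would enable all five.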

Rebuilding and running the new kernel produces the message below:

[    3.245135] [1] linuxrc: obsolete system call 00000000.
[    3.245330] Code: e08f1001 e1a0c000 e3a0702d ef000000 (e3700a01)

Syscall 0 appears to have been made by linuxrc, so the number passed to arm_syscall is incorrect.

The syscall is dispatched in vector_swi by invoke_syscall, which is defined in arch/arm/kernel/entry-header.S as follows:

	.macro	invoke_syscall, table, nr, tmp, ret
	cmp	\nr, #NR_syscalls		@ check upper syscall limit
	badr	lr, \ret			@ return address
	ldrcc	pc, [\table, \nr, lsl #2]	@ call sys_* routine
										@ load pc from contents of r8 + r7 *4
	.endm

This is a simplified version of invoke_syscall with the reload argument left out, because it is not used here. Let’s take a look at the parameters:

	invoke_syscall tbl, scno, r10, __ret_fast_syscall
  • tbl: alias of r8 (0xc0008464)
  • scno: alias of r7 (0x00900000)
  • r10: ignored
  • __ret_fast_syscall: return address (0xc0008260)

Set a breakpoint at vector_swi; when it is hit, the syscall number in r7 is 0x2d. The syscall number passed from user space is different from the one passed to invoke_syscall.

There are two ABIs (Application Binary Interfaces) for the ARM architecture, EABI and the legacy ABI (OABI), whose syscall numbers start from zero and 0x900000 respectively. Stepping through vector_swi and watching the parameters passed to invoke_syscall shows that the syscall number is wrong. This is evidently an ABI mismatch between the Linux kernel and linuxrc: the kernel uses the legacy ABI while the user space program uses EABI, so scno is interpreted incorrectly.
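The mismatch can be reproduced with a little arithmetic. The sketch below is a simplified model based on my reading of vector_swi, not kernel code: an OABI-only kernel takes the number from the 24-bit immediate of the svc instruction and XORs it with 0x900000, while an EABI kernel takes it from r7. The function names are mine; the instruction word ef000000 comes from the Code: line printed by bad_syscall.

```python
OABI_SYSCALL_BASE = 0x900000  # base of legacy ABI syscall numbers
SVC_IMM_MASK = 0x00FFFFFF     # low 24 bits of an ARM svc/swi instruction


def scno_seen_by_oabi_kernel(svc_insn: int) -> int:
    """Legacy ABI: number is the svc immediate with the base XORed away."""
    return (svc_insn & SVC_IMM_MASK) ^ OABI_SYSCALL_BASE


def scno_seen_by_eabi_kernel(r7: int) -> int:
    """EABI: the instruction is always `svc 0`, the number is in r7."""
    return r7


svc_insn = 0xEF000000  # `svc 0` issued by the EABI user space program
r7 = 0x2D              # syscall number the program actually asked for

print(hex(scno_seen_by_oabi_kernel(svc_insn)))  # what this kernel dispatches on
print(hex(scno_seen_by_eabi_kernel(r7)))        # what user space meant
```

The first result is the 0x00900000 seen in scno above, far beyond the syscall table, so the call ends up in arm_syscall and bad_syscall. A genuine legacy ABI program would have issued swi 0x90002d, which the same formula turns back into 0x2d.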

We can confirm this in ELF header with readelf:

arm-linux-gnueabi-readelf -h vmlinux
ELF Header:
  Magic:   7f 45 4c 46 01 01 01 61 00 00 00 00 00 00 00 00
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            ARM
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           ARM
  Version:                           0x1
  Entry point address:               0xc0008000
  Start of program headers:          52 (bytes into file)
  Start of section headers:          51675948 (bytes into file)
  Flags:                             0x600, GNU EABI, software FP, VFP
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         6
  Size of section headers:           40 (bytes)
  Number of section headers:         34
  Section header string table index: 33


arm-linux-gnueabi-readelf -h ~/rootfs/bin/busybox
ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           ARM
  Version:                           0x1
  Entry point address:               0xd51c
  Start of program headers:          52 (bytes into file)
  Start of section headers:          817516 (bytes into file)
  Flags:                             0x5000002, Version5 EABI, <unknown>
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         7
  Size of section headers:           40 (bytes)
  Number of section headers:         25
  Section header string table index: 24

To fix this, enable CONFIG_AEABI in the kernel configuration.
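The EABI version that readelf prints is stored in the top byte of the e_flags field. A short sketch of extracting it, assuming the ELF32 header layout (e_flags at byte offset 36) and the EF_ARM_EABIMASK value 0xff000000 from the ARM ELF specification; the function names are my own:

```python
import struct

EF_ARM_EABIMASK = 0xFF000000
E_FLAGS_OFFSET = 36  # offset of e_flags in an ELF32 header


def arm_eabi_version(e_flags: int) -> int:
    """Return the EABI version encoded in e_flags (0 means no EABI version set)."""
    return (e_flags & EF_ARM_EABIMASK) >> 24


def e_flags_from_header(header: bytes) -> int:
    """Read e_flags from the first 40 bytes of an ELF32 file."""
    if header[:4] != b"\x7fELF" or header[4] != 1:
        raise ValueError("not an ELF32 file")
    endian = "<" if header[5] == 1 else ">"
    return struct.unpack_from(endian + "I", header, E_FLAGS_OFFSET)[0]


print(arm_eabi_version(0x600))      # vmlinux built without CONFIG_AEABI
print(arm_eabi_version(0x5000002))  # busybox from the rootfs
```

The two flag values from the readelf output decode to 0 and 5, which is the mismatch described above. To check a file directly, pass its first 40 bytes to `e_flags_from_header`.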

The gdb script used with OpenOCD in this section:

target extended-remote localhost:3333
monitor reset
monitor halt
load
tb _main
c
monitor load_image u-boot.img 0x337fffc0
monitor bp 0x30008200 4 hw
set confirm off
c
add-symbol-file ~/linux-5.9.11/vmlinux
monitor step
monitor step
monitor rbp all

tb do_work_pending
tb slow_work_pending
tb ret_fast_syscall
tb vector_swi

udevd failed to start

I see the following error message every time the kernel finishes booting, right after udevd starts:

[    3.014711] Run linuxrc as init process
Starting udevd ...
error initializing control socketifconfig: socket: Function not implemented

Welcome to the Embedded World.

Default board ip: 192.168.2.15
ifconfig: socket: Function not implemented
route: socket: Function not implemented
chmod: /dev/ptmx: No such file or directory
chmod: /dev/android_adb: No such file or directory
Starting adb daemon
can't open /dev/tty2: No such file or directory
can't open /dev/tty3: No such file or directory
can't open /dev/tty4: No such file or directory

I checked the kernel config file: CONFIG_TTY is enabled and the tty devices exist in sysfs, so it could be an issue in user space:

[root@mini2440]$ ls -l /sys/class/tty/tty[2-4]
lrwxrwxrwx    1 0        0               0 Jan  1 00:00 /sys/class/tty/tty2 -> ../../devices/virtual/tty/tty2
lrwxrwxrwx    1 0        0               0 Jan  1 00:00 /sys/class/tty/tty3 -> ../../devices/virtual/tty/tty3
lrwxrwxrwx    1 0        0               0 Jan  1 00:00 /sys/class/tty/tty4 -> ../../devices/virtual/tty/tty4

After a close look at the boot messages, this line looked the most suspicious; it seems to be related to network sockets:

error initializing control socketifconfig: socket: Function not implemented

There are no device nodes under the /dev/ directory, which explains why linuxrc cannot open the tty devices. One way to fix this, without any kernel changes, is to switch from the udev daemon to mdev with mdev -s; the other is to add the following options to the kernel config so that udevd works:

+CONFIG_NET=y
+CONFIG_UNIX=y