Summary: In kernel 6.6.28, the e1000e driver (Intel wired Ethernet) has a bug that freezes the kernel.
If your hardware uses e1000e, stay on 6.6.25.
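If you are not sure whether your machine uses this driver, here is a rough sketch of one way to check. It assumes the usual Linux sysfs layout, where each physical interface under /sys/class/net has a device/driver symlink pointing at its bound driver. The function name and its parameters are made up for illustration.

```python
import os

def interfaces_using_driver(driver="e1000e", sysfs_root="/sys/class/net"):
    """Return the sorted interface names whose bound kernel driver is `driver`.

    Assumes the sysfs layout <sysfs_root>/<iface>/device/driver, where
    `driver` is a symlink whose last path component is the driver name.
    """
    matches = []
    if not os.path.isdir(sysfs_root):
        return matches
    for iface in sorted(os.listdir(sysfs_root)):
        link = os.path.join(sysfs_root, iface, "device", "driver")
        # Virtual interfaces (lo, bridges, ...) have no device/driver link.
        if os.path.islink(link):
            if os.path.basename(os.readlink(link)) == driver:
                matches.append(iface)
    return matches

if __name__ == "__main__":
    print("running kernel:", os.uname().release)
    print("e1000e interfaces:", interfaces_using_driver() or "none found")
```

If the list is empty, the machine is probably not using e1000e and this particular report likely does not apply to it.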
After booting 6.6.28-pclos1 with the Wi-Fi network working (wlan1) and eth0 and eth1 up but not plugged in, the machine froze after sitting idle for a couple of hours. Looking in /var/log/messages, I see:
Apr 25 17:32:56 kernel: [ 8641.828619] e1000e 0000:00:1f.6 eth1: NIC Link is Up 10 Mbps Half Duplex, Flow Control: None
Apr 25 17:32:56 kernel: [ 8641.828671] BUG: scheduling while atomic: kworker/2:0/5258/0x00000002
Apr 25 17:32:56 kernel: [ 8641.828926] CPU: 2 PID: 5258 Comm: kworker/2:0 Tainted: P O 6.6.28-pclos1 #1
Apr 25 17:32:56 kernel: [ 8641.828934] Hardware name: ASRock Z790M-ITX WiFi/Z790M-ITX WiFi, BIOS 11.06 03/20/2024
Apr 25 17:32:56 kernel: [ 8641.828938] Workqueue: events e1000_watchdog_task [e1000e]
Apr 25 17:32:56 kernel: [ 8641.828981] Call Trace:
Apr 25 17:32:56 kernel: [ 8641.828986] <TASK>
Apr 25 17:32:56 kernel: [ 8641.828989] dump_stack_lvl+0x32/0x50
Apr 25 17:32:56 kernel: [ 8641.829005] __schedule_bug+0x4d/0x60
Apr 25 17:32:56 kernel: [ 8641.829014] __schedule+0xeb0/0x11a0
Apr 25 17:32:56 kernel: [ 8641.829021] ? try_to_wake_up+0x1cb/0x3f0
Apr 25 17:32:56 kernel: [ 8641.829028] schedule+0x52/0xa0
Apr 25 17:32:56 kernel: [ 8641.829032] schedule_hrtimeout_range_clock+0xa4/0x120
Apr 25 17:32:56 kernel: [ 8641.829042] ? __pfx_hrtimer_wakeup+0x10/0x10
Apr 25 17:32:56 kernel: [ 8641.829051] usleep_range_state+0x4b/0x60
Apr 25 17:32:56 kernel: [ 8641.829060] e1000e_read_phy_reg_mdic+0x7a/0x160 [e1000e]
Apr 25 17:32:56 kernel: [ 8641.829106] e1000e_update_stats+0x4e2/0x700 [e1000e]
Apr 25 17:32:56 kernel: [ 8641.829144] e1000_watchdog_task+0xb4/0xa10 [e1000e]
Apr 25 17:32:56 kernel: [ 8641.829177] process_one_work+0x15b/0x280
Apr 25 17:32:56 kernel: [ 8641.829188] worker_thread+0x2ec/0x410
Apr 25 17:32:56 kernel: [ 8641.829198] ? __pfx_worker_thread+0x10/0x10
Apr 25 17:32:56 kernel: [ 8641.829207] kthread+0xdc/0x110
Apr 25 17:32:56 kernel: [ 8641.829214] ? __pfx_kthread+0x10/0x10
Apr 25 17:32:56 kernel: [ 8641.829220] ret_from_fork+0x28/0x40
Apr 25 17:32:56 kernel: [ 8641.829230] ? __pfx_kthread+0x10/0x10
Apr 25 17:32:56 kernel: [ 8641.829235] ret_from_fork_asm+0x1b/0x30
Apr 25 17:32:56 kernel: [ 8641.829244] </TASK>
Then, a minute later:
Apr 25 17:33:57 kernel: [ 8702.002511] rcu: INFO: rcu_preempt self-detected stall on CPU
Apr 25 17:33:57 kernel: [ 8702.002514] rcu: 2-....: (59999 ticks this GP) idle=cf6c/1/0x4000000000000000 softirq=10297/10297 fqs=15000
Apr 25 17:33:57 kernel: [ 8702.002519] rcu: (t=60000 jiffies g=38245 q=119 ncpus=20)
Apr 25 17:33:57 kernel: [ 8702.002521] CPU: 2 PID: 1367 Comm: kworker/2:2 Tainted: P W O 6.6.28-pclos1 #1
Apr 25 17:33:57 kernel: [ 8702.002523] Hardware name: ASRock Z790M-ITX WiFi/Z790M-ITX WiFi, BIOS 11.06 03/20/2024
Apr 25 17:33:57 kernel: [ 8702.002524] Workqueue: events linkwatch_event
Apr 25 17:33:57 kernel: [ 8702.002531] RIP: 0010:queued_spin_lock_slowpath+0x3f/0x1a0
The latter message repeated many times until I rebooted.
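To see whether your own log shows the same pattern, something like the following could count the two signatures. This is only a sketch: the regular expressions are modelled on the lines quoted above, the default path /var/log/messages is what my system uses, and the names PATTERNS and count_signatures are invented for this example.

```python
import re
import sys
from collections import Counter

# Signatures modelled on the log lines quoted in this post.
PATTERNS = {
    "scheduling_while_atomic": re.compile(r"BUG: scheduling while atomic"),
    "rcu_stall": re.compile(r"rcu: INFO: \w+ self-detected stall on CPU"),
}

def count_signatures(lines):
    """Count how many lines match each signature in PATTERNS."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts

if __name__ == "__main__":
    # Reading the system log usually needs root privileges.
    path = sys.argv[1] if len(sys.argv) > 1 else "/var/log/messages"
    with open(path, errors="replace") as fh:
        print(dict(count_signatures(fh)))
```

One scheduling-while-atomic hit followed by a growing number of RCU stall reports would match what I saw here.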
Searching the internet for the initial BUG message turns up many recent reports, for example here:
https://www.reddit.com/r/voidlinux/comments/1c9s8ut/bug_scheduling_while_atomic/?rdt=46272
That post claims that 6.6.25 does not have this problem.
Also see:
https://patchwork.ozlabs.org/project/intel-wired-lan/patch/[email protected]/