The hunt for a kernel bug, part 5

Armed with the information from my previous research on a possible kernel bug, I opened a bug report on the Ubuntu bug tracker: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1963555.

It wasn’t long until my bug got confirmed. Someone else chimed in that they had also experienced USB issues. In their case it was external drives. Definitely a showstopper!

As of this date, there is a beta for Ubuntu 22.04, and my hope is that this version will either include a new enough kernel (5.16 or up), or that Ubuntu developers have manually cherry-picked the commit that fixes the issue. Let’s check with the Ubuntu Kernel Team:

Ubuntu Kernel Team

Oops… based on upstream 5.15… that’s not good. Maybe they cherry-picked upstream commits? I checked https://packages.ubuntu.com/jammy/linux-generic and the kernel is currently at 5.15.0.25.27. The changelog doesn’t mention anything about xhci or usb. I guess I still have to wait a bit longer…

I have a ridiculous amount of kernels

In previous blogposts I wrote about how I found a possible bug in the Linux kernel, or more precisely, in the kernel that Ubuntu derived from the mainline kernel.

To be able to install any kernel version 5.15.7 or higher, I also had to install libssl3.

The result is that I now have 37 kernels installed, taking up a little over 2 GiB of disk space:

$ (cd /boot ; ls -hgo initrd.img-* ; ls /boot/initrd.img-* | wc -l)
-rw-r--r-- 1 39M mrt  9 09:54 initrd.img-5.13.0-051300-generic
-rw-r--r-- 1 40M mrt  9 09:58 initrd.img-5.13.0-19-generic
-rw-r--r-- 1 40M mrt  9 09:58 initrd.img-5.13.0-20-generic
-rw-r--r-- 1 40M mrt  9 09:57 initrd.img-5.13.0-21-generic
-rw-r--r-- 1 44M mrt 30 17:46 initrd.img-5.13.0-22-generic
-rw-r--r-- 1 40M mrt  9 09:56 initrd.img-5.13.0-23-generic
-rw-r--r-- 1 40M mrt  9 09:56 initrd.img-5.13.0-25-generic
-rw-r--r-- 1 40M mrt  9 09:56 initrd.img-5.13.0-27-generic
-rw-r--r-- 1 40M mrt  9 09:55 initrd.img-5.13.0-28-generic
-rw-r--r-- 1 40M mrt  9 09:55 initrd.img-5.13.0-30-generic
-rw-r--r-- 1 45M mrt  9 12:02 initrd.img-5.13.0-35-generic
-rw-r--r-- 1 45M mrt 24 23:17 initrd.img-5.13.0-37-generic
-rw-r--r-- 1 45M mrt 30 17:49 initrd.img-5.13.0-39-generic
-rw-r--r-- 1 39M mrt  9 09:54 initrd.img-5.13.1-051301-generic
-rw-r--r-- 1 39M mrt  9 09:54 initrd.img-5.13.19-051319-generic
-rw-r--r-- 1 37M mrt  9 09:53 initrd.img-5.13.19-ubuntu-5.13.0-22.22
-rw-r--r-- 1 37M mrt  9 09:53 initrd.img-5.13.19-ubuntu-5.13.0-22.22-0-g3ab15e228151
-rw-r--r-- 1 37M mrt  9 09:52 initrd.img-5.13.19-ubuntu-5.13.0-22.22-317-g398351230dab
-rw-r--r-- 1 37M mrt  9 09:52 initrd.img-5.13.19-ubuntu-5.13.0-22.22-356-g8ac4e2604dae
-rw-r--r-- 1 37M mrt  9 09:52 initrd.img-5.13.19-ubuntu-5.13.0-22.22-376-gfab6fb5e61e1
-rw-r--r-- 1 37M mrt  9 09:51 initrd.img-5.13.19-ubuntu-5.13.0-22.22-386-gce5ff9b36bc3
-rw-r--r-- 1 37M mrt  9 09:51 initrd.img-5.13.19-ubuntu-5.13.0-22.22-387-g0fc979747dec
-rw-r--r-- 1 37M mrt  9 09:50 initrd.img-5.13.19-ubuntu-5.13.0-22.22-388-g20210d51e24a
-rw-r--r-- 1 37M mrt  9 09:50 initrd.img-5.13.19-ubuntu-5.13.0-22.22-388-gab2802ea6621
-rw-r--r-- 1 37M mrt  9 09:50 initrd.img-5.13.19-ubuntu-5.13.0-22.22-391-ge24e59fa409c
-rw-r--r-- 1 37M mrt  9 09:49 initrd.img-5.13.19-ubuntu-5.13.0-22.22-396-gc3d35f3acc3a
-rw-r--r-- 1 37M mrt  9 09:49 initrd.img-5.13.19-ubuntu-5.13.0-22.22-475-g79b62d0bba89
-rw-r--r-- 1 37M mrt  9 09:48 initrd.img-5.13.19-ubuntu-5.13.0-23.23
-rw-r--r-- 1 40M mrt  9 09:48 initrd.img-5.14.0-051400-generic
-rw-r--r-- 1 40M mrt  9 10:31 initrd.img-5.14.21-051421-generic
-rw-r--r-- 1 44M mrt  9 12:39 initrd.img-5.15.0-051500-generic
-rw-r--r-- 1 46M mrt  9 12:16 initrd.img-5.15.0-22-generic
-rw-r--r-- 1 46M mrt 28 23:27 initrd.img-5.15.32-051532-generic
-rw-r--r-- 1 46M mrt 17 21:12 initrd.img-5.16.0-051600-generic
-rw-r--r-- 1 48M mrt 28 23:19 initrd.img-5.16.16-051616-generic
-rw-r--r-- 1 45M mrt 28 23:11 initrd.img-5.17.0-051700-generic
-rw-r--r-- 1 46M apr  8 17:02 initrd.img-5.17.2-051702-generic
37
  • Versions 5.xx.yy-zz-generic are installed with apt.
  • Versions 5.xx.yy-05xxyy-generic are installed with the Ubuntu Mainline Kernel Installer.
  • Versions 5.xx.yy-ubuntu-5.13.0-zz.zz-nnn-g<commithash> are compiled from source, where <commithash> is the commit of the kernel repository that I compiled.
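
The three naming patterns above can be told apart with a few shell patterns. This is only a sketch: the function name and the labels are made up for this example, and it assumes the release strings look exactly like the ones in the listing.

```shell
# Classify a kernel release string by how the kernel was installed.
# kernel_origin and its labels are hypothetical names for this example.
kernel_origin() {
    case "$1" in
        *-ubuntu-*)
            echo "compiled from source" ;;
        *-[0-9][0-9][0-9][0-9][0-9][0-9]-generic)
            echo "mainline installer" ;;
        *-generic)
            echo "apt" ;;
        *)
            echo "unknown" ;;
    esac
}

for release in 5.13.0-22-generic 5.15.32-051532-generic \
               5.13.19-ubuntu-5.13.0-22.22-387-g0fc979747dec; do
    printf '%s: %s\n' "$release" "$(kernel_origin "$release")"
done
```

On a live system the release string of the running kernel comes from uname --kernel-release.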

The kernels in bold are the kernels where something unexpected happens with my USB devices:

  • Ubuntu kernels 5.13.0-23 and up – including 5.15 kernels of Ubuntu 22.04 LTS (Jammy Jellyfish).
  • Ubuntu compiled kernels, starting 387 commits after kernel 5.13.0-22.
  • Mainline kernels 5.15.xx.

Once Ubuntu bases its kernel on mainline 5.16 or higher, the USB bug should be solved.

Install libssl3 on Ubuntu versions before Jammy

Ubuntu mainline kernel packages 5.15.7 and later bump a dependency from libssl1.1 (>= 1.1.0) to libssl3 (>= 3.0.0~~alpha1).

However, package libssl3 is not available for Ubuntu 21.10 Impish Indri. It’s only available for Ubuntu 22.04 Jammy Jellyfish (which is still in beta as of the time of writing) and later.

libssl3 further depends on libc6 (>= 2.34) and debconf, but both are available in the 21.10 repositories.
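
A quick way to reason about such a version requirement is a small comparison helper. The sketch below uses GNU sort's version ordering, which only approximates Debian's version rules; on a Debian or Ubuntu system, dpkg --compare-versions is the authoritative tool. The function name and the installed_libc6 value are assumptions for illustration.

```shell
# Succeed when version $1 is at least version $2.
# Relies on GNU sort --version-sort, an approximation of Debian's ordering.
version_ge() {
    [ "$(printf '%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical value; on a real system it could come from:
#   dpkg-query --show --showformat='${Version}' libc6
installed_libc6="2.34"

if version_ge "$installed_libc6" "2.34"; then
    echo "libc6 $installed_libc6 satisfies the libssl3 dependency"
else
    echo "libc6 $installed_libc6 is too old for libssl3"
fi
```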

Here are a few different ways to resolve the dependency:

Option 1

Use apt pinning to install libssl3 from a Jammy repo, without pulling in everything else from Jammy.

This is more complicated, but it allows the libssl3 package to receive updates automatically.
Do all the following as root.

  • Create an apt config file to specify your system’s current release as the default release for installing packages, instead of simply the highest version number found. We are about to add a Jammy repo to apt, which will contain a lot of packages with higher version numbers, and we want apt to ignore them all.
$ echo 'APT::Default-Release "impish";' \
    | sudo tee /etc/apt/apt.conf.d/01ubuntu
  • Add the Jammy repository to the apt sources. If your system isn’t “impish”, change that below.
$ awk '($1$3$4=="debimpishmain"){$3="jammy" ;print}' /etc/apt/sources.list \
    | sudo tee /etc/apt/sources.list.d/jammy.list
  • Pin libssl3 to the jammy version in apt preferences. This overrides the Default-Release above, just for the libssl3 package.
$ sudo tee /etc/apt/preferences.d/libssl3 >/dev/null <<%%EOF
Package: libssl3
Pin: release n=jammy
Pin-Priority: 900
%%EOF
  • Install libssl3:
$ sudo apt update
$ sudo apt install libssl3
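
The awk filter from the second step is easier to trust after trying it on sample input first. A minimal sketch, with two made-up sources.list lines and a hypothetical function name; only the plain "impish main" entry should come out, rewritten to "jammy".

```shell
# Same awk program as in the step above, wrapped in a function that
# reads sources.list lines from standard input.
impish_to_jammy() {
    awk '($1$3$4=="debimpishmain"){$3="jammy"; print}'
}

# Sample input: the second line is an "impish-updates" entry and is skipped.
printf '%s\n' \
    'deb http://archive.ubuntu.com/ubuntu impish main restricted' \
    'deb http://archive.ubuntu.com/ubuntu impish-updates main restricted' \
    | impish_to_jammy
# prints: deb http://archive.ubuntu.com/ubuntu jammy main restricted
```

After sudo apt update, apt-cache policy libssl3 shows which repository the candidate version comes from, which is a handy check that the pin took effect.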

Later, when Jammy is officially released, delete all 3 files created above:

$ sudo rm --force \
    /etc/apt/apt.conf.d/01ubuntu \
    /etc/apt/sources.list.d/jammy.list \
    /etc/apt/preferences.d/libssl3

Option 2

Download the libssl3 deb package for Jammy and install it manually with dpkg -i filename.deb.

This only works if there aren’t any additional dependencies, which you would also have to install, with a risk of breaking your system. Here Be Dragons…

The hunt for a kernel bug, part 4: git bisect

Now that I have a way to compile kernels from source, I want to find the exact commit where my input devices stop working. That means doing a git checkout of a certain commit, building the kernel, installing it, rebooting, selecting the new kernel in Grub, and checking whether my keyboard works. I am quite sure that I need to search between 5.13.0-22 and 5.13.0-23, but that’s still 634 commits!

$ git rev-list Ubuntu-5.13.0-22.22..Ubuntu-5.13.0-23.23 | wc --lines
634

This is where git bisect comes in. It’s sort of a wizard that guides you to find a bad commit. You tell it on which commit your software was known to work ok, and a commit where it doesn’t. It then picks a commit somewhere in the middle, you build your software and do your tests, and then tell git bisect if the result was good or bad. It will then give you a new commit to inspect, each time narrowing the search.
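
Because every test roughly halves the remaining range, the number of builds grows with the logarithm of the number of commits, not with the number itself. The sketch below does that arithmetic for the 634 commits counted above. The function name is made up, and the result is a worst-case bound; git's own "roughly N steps" estimate can differ by a step.

```shell
# Worst-case number of tests a binary search needs for N candidates,
# i.e. the smallest k with 2^k >= N.
bisect_steps() {
    n=$1
    steps=0
    while [ "$n" -gt 1 ]; do
        n=$(( (n + 1) / 2 ))    # halve the range, rounding up
        steps=$(( steps + 1 ))
    done
    echo "$steps"
}

bisect_steps 634    # prints 10
```

Ten builds instead of 634: that is what makes the whole exercise feasible.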

git bisect

Let’s do this!

$ git bisect start
$ git bisect good Ubuntu-5.13.0-22.22
$ git bisect bad Ubuntu-5.13.0-23.23
Bisecting: 316 revisions left to test after this (roughly 8 steps)
[398351230dab42d654036847a49a5839705abdcb] powerpc/bpf ppc32: Fix BPF_SUB when imm == 0x80000000
$ git describe --long 
Ubuntu-5.13.0-22.22-317-g398351230dab

In this first step, I get the 317th commit after 5.13.0-22. Let’s compile that commit:

$ time make clean olddefconfig bindeb-pkg \
    --jobs=$(getconf _NPROCESSORS_ONLN) \
    LOCALVERSION=-$(git describe --long | tr '[:upper:]' '[:lower:]')

This creates 3 .deb packages in the directory above:

$ ls -1 ../*$(git describe --long | tr '[:upper:]' '[:lower:]')*.deb
../linux-headers-5.13.19-ubuntu-5.13.0-22.22-317-g398351230dab_5.13.19-ubuntu-5.13.0-22.22-317-g398351230dab-10_amd64.deb
../linux-image-5.13.19-ubuntu-5.13.0-22.22-317-g398351230dab_5.13.19-ubuntu-5.13.0-22.22-317-g398351230dab-10_amd64.deb
../linux-libc-dev_5.13.19-ubuntu-5.13.0-22.22-317-g398351230dab-10_amd64.deb

I only need to install the headers and the image, libc-dev isn’t needed.

$ sudo dpkg --install ../linux-{headers,image}-*$(git describe --long | tr '[:upper:]' '[:lower:]')*.deb

Verify that the kernel files are in the /boot directory:

$ ls -1 /boot/*$(git describe --long | tr '[:upper:]' '[:lower:]')*
/boot/config-5.13.19-ubuntu-5.13.0-22.22-317-g398351230dab
/boot/initrd.img-5.13.19-ubuntu-5.13.0-22.22-317-g398351230dab
/boot/System.map-5.13.19-ubuntu-5.13.0-22.22-317-g398351230dab
/boot/vmlinuz-5.13.19-ubuntu-5.13.0-22.22-317-g398351230dab

Now I can reboot, select the new kernel in Grub, and test the keyboard. With commit 317, the keyboard worked, so the first bad commit has to be somewhere between commit 317 and 634:

$ git bisect good ; git describe --long
Bisecting: 158 revisions left to test after this (roughly 7 steps)
[79b62d0bba892e8367cb46ca09b623c885852c29] drm/msm/a4xx: fix error handling in a4xx_gpu_init()
Ubuntu-5.13.0-22.22-475-g79b62d0bba89

Now it’s time again for make clean olddefconfig bindeb-pkg, dpkg --install and reboot. Turns out that commit 475 was a “bad” commit (one where the keyboard didn’t work):

$ git bisect bad ; git describe --long
Bisecting: 78 revisions left to test after this (roughly 6 steps)
[c3d35f3acc3a11b726959c7b2c25ab9e46310273] USB: serial: option: add Telit LE910Cx composition 0x1204
Ubuntu-5.13.0-22.22-396-gc3d35f3acc3a

I’m not going to describe all the steps in full detail; by now you should get the gist of it. This was the sequence of steps that git bisect gave me:

  • 317: good
  • 475: bad
  • 396: bad
  • 356: good
  • 376: good
  • 386: good
  • 391: bad
  • 388: bad
  • 387: bad
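
The list above can be replayed to see how the search window closes in. This is a simplified sketch, not what git does internally: it treats the commit offsets as positions on a straight line, which ignores merges, and the function name is made up.

```shell
# Read "offset verdict" lines and track the highest known-good and the
# lowest known-bad offset. Assumes a linear history.
bisect_boundary() {
    good=0      # the tag Ubuntu-5.13.0-22.22 itself
    bad=634     # the tag Ubuntu-5.13.0-23.23
    while read -r offset verdict; do
        case "$verdict" in
            good) if [ "$offset" -gt "$good" ]; then good=$offset; fi ;;
            bad)  if [ "$offset" -lt "$bad" ];  then bad=$offset;  fi ;;
        esac
    done
    echo "last good: $good, first bad: $bad"
}

bisect_boundary <<'EOF'
317 good
475 bad
396 bad
356 good
376 good
386 good
391 bad
388 bad
387 bad
EOF
# prints: last good: 386, first bad: 387
```

In a real bisect session, git bisect log prints the recorded verdicts, and git bisect replay can feed such a log back in.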

And then we finally get the first bad commit, the 387th commit after 5.13.0-22, Ubuntu-5.13.0-22.22-387-g0fc979747dec:

$ git bisect bad ; git describe --long
0fc979747dece96c189bc29ef604e61afbddfa2a is the first bad commit
commit 0fc979747dece96c189bc29ef604e61afbddfa2a
Author: Pavankumar Kondeti <pkondeti@codeaurora.org>
Date:   Fri Oct 8 12:25:46 2021 +0300

    xhci: Fix command ring pointer corruption while aborting a command
    
    BugLink: https://bugs.launchpad.net/bugs/1951880
    
    commit ff0e50d3564f33b7f4b35cadeabd951d66cfc570 upstream.
    
    The command ring pointer is located at [6:63] bits of the command
    ring control register (CRCR). All the control bits like command stop,
    abort are located at [0:3] bits. While aborting a command, we read the
    CRCR and set the abort bit and write to the CRCR. The read will always
    give command ring pointer as all zeros. So we essentially write only
    the control bits. Since we split the 64 bit write into two 32 bit writes,
    there is a possibility of xHC command ring stopped before the upper
    dword (all zeros) is written. If that happens, xHC updates the upper
    dword of its internal command ring pointer with all zeros. Next time,
    when the command ring is restarted, we see xHC memory access failures.
    Fix this issue by only writing to the lower dword of CRCR where all
    control bits are located.
    
    Cc: stable@vger.kernel.org
    Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
    Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
    Link: https://lore.kernel.org/r/20211008092547.3996295-5-mathias.nyman@linux.intel.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Signed-off-by: Kamal Mostafa <kamal@canonical.com>
    Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

 drivers/usb/host/xhci-ring.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)
Ubuntu-5.13.0-22.22-387-g0fc979747dec

At first sight the commit description is quite cryptic, and the actual code change doesn’t tell me a lot either. But it’s a change in drivers/usb/host/xhci-ring.c, and xhci stands for eXtensible Host Controller Interface, an interface specification for USB host controllers. If it’s an issue with the USB host controller, then it makes sense that if I use 2 keyboards from different brands, neither of them would work. It also suggests that other USB devices, like external hard drives, wouldn’t work either, but that’s a bit harder to test. A keyboard is easy: just look at the NumLock LED, and if it doesn’t light up, there’s an issue.

The first link in the commit description is just a long list of patches that were taken from upstream and integrated in the Ubuntu kernel, so that doesn’t help me. The second link is a thread on the kernel.org mailing list, and there it gets interesting.

kernel.org mailing list thread

Some excerpts from the thread:

This patch cause suspend to disk resume usb not work, xhci_hcd 0000:00:14.0: Abort failed to stop command ring: -110.

youling257

Thanks for the report, this is odd.
Could you double check that by reverting this patch resume start working again.
If this is the case maybe we need to write all 64bits before this xHC hardware reacts to CRCR register changes.
Maybe following changes on top of current patch could help:

Mathias Nyman

Every time a developer says “this is odd”, my alarm bells go off. 😀
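
An error like the one quoted from the thread ends up in the kernel log, so that is a place to look for the same symptom. A minimal sketch of a filter, with a made-up function name; the sample input is the message quoted above, not output from my machine.

```shell
# Keep only kernel log lines that mention the xHCI driver or the command ring.
xhci_lines() {
    grep -i -E 'xhci|command ring'
}

# Sample input: the error message quoted in the mailing list thread.
echo 'xhci_hcd 0000:00:14.0: Abort failed to stop command ring: -110' | xhci_lines
```

On a real system the input could come from sudo dmesg, or from journalctl --dmesg --boot=-1 for the previous boot, which helps when the keyboard was dead during that boot.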

Further down in the thread there is a proposed update to the change. I’m going to try that patch, but that’s for another blog post.

The hunt for a kernel bug, part 3: compiling a kernel

Compiling a Linux kernel sounds scary and complicated, but I found out it actually isn’t.

The first thing to do is to install some prerequisites:

$ sudo apt install --yes asciidoc binutils-dev bison build-essential ccache \
    crash dwarves fakeroot flex git git-core git-doc git-email kernel-package \
    kernel-wedge kexec-tools libelf-dev libncurses5 libncurses5-dev libssl-dev \
    makedumpfile zstd
$ sudo apt-get --yes build-dep linux

Next I cloned the Ubuntu Impish repository. This takes a while…

$ git clone git://kernel.ubuntu.com/ubuntu/ubuntu-impish.git
$ cd ubuntu-impish

Now let’s see which versions are in the repository:

$ git tag --list
Ubuntu-5.11.0-16.17
Ubuntu-5.11.0-18.19+21.10.1
Ubuntu-5.11.0-20.21+21.10.1
Ubuntu-5.13.0-11.11
Ubuntu-5.13.0-12.12
Ubuntu-5.13.0-13.13
Ubuntu-5.13.0-14.14
Ubuntu-5.13.0-15.15
Ubuntu-5.13.0-16.16
Ubuntu-5.13.0-17.17
Ubuntu-5.13.0-18.18
Ubuntu-5.13.0-19.19
Ubuntu-5.13.0-20.20
Ubuntu-5.13.0-21.21
Ubuntu-5.13.0-22.22
Ubuntu-5.13.0-23.23
Ubuntu-5.13.0-24.24
Ubuntu-5.13.0-25.26
Ubuntu-5.13.0-26.27
Ubuntu-5.13.0-27.29
Ubuntu-5.13.0-28.31
Ubuntu-5.13.0-29.32
Ubuntu-5.13.0-30.33
Ubuntu-5.13.0-31.34
Ubuntu-5.13.0-32.35
freeze-20211018
freeze-20211108
freeze-20220131
freeze-20220221
v5.11
v5.13

The two tags that interest me are Ubuntu-5.13.0-22.22 and Ubuntu-5.13.0-23.23. I’m starting with the former.

$ git checkout Ubuntu-5.13.0-22.22

First I copy the configuration of the current running kernel to the working directory:

$ cp /boot/config-$(uname --kernel-release) .config

I don’t want or need full debugging. That makes an enormous kernel and it takes twice as long to compile, so I turn debugging off:

$ scripts/config --disable DEBUG_INFO

I also need to disable the kernel’s trusted key and revocation key settings:

$ scripts/config --disable SYSTEM_TRUSTED_KEYS
$ scripts/config --disable SYSTEM_REVOCATION_KEYS

Next: update the kernel config and set all new symbols to their default value.

$ make olddefconfig
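
To confirm that the options ended up as intended, their values can be read back from .config. Inside a kernel tree, scripts/config --state SYMBOL does this. The sketch below performs a similar lookup with awk on a throwaway file, so it can be tried anywhere; the function name and the sample contents are assumptions for illustration, not my actual config.

```shell
# Print the state of a config symbol: its value, "n" when it is
# explicitly not set, or "undef" when it does not occur at all.
# Usage: config_state FILE SYMBOL   (SYMBOL without the CONFIG_ prefix)
config_state() {
    awk -v sym="CONFIG_$2" '
        $0 == "# " sym " is not set" { print "n"; found = 1 }
        index($0, sym "=") == 1      { print substr($0, length(sym) + 2); found = 1 }
        END                          { if (!found) print "undef" }
    ' "$1"
}

# Illustrative sample, standing in for a real .config.
sample=$(mktemp)
cat > "$sample" <<'EOF'
# CONFIG_DEBUG_INFO is not set
CONFIG_SYSTEM_TRUSTED_KEYS=""
CONFIG_USB_XHCI_HCD=y
EOF

for symbol in DEBUG_INFO SYSTEM_TRUSTED_KEYS USB_XHCI_HCD; do
    printf '%s: %s\n' "$symbol" "$(config_state "$sample" "$symbol")"
done
rm -f "$sample"
```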

Then the most exciting thing can start: actually compiling the kernel!

$ make clean
$ time make --jobs=$(getconf _NPROCESSORS_ONLN) bindeb-pkg \
    LOCALVERSION=-$(git describe --long | tr '[:upper:]' '[:lower:]')
  • time is to see how long the compilation took.
  • getconf _NPROCESSORS_ONLN queries the number of processors on the computer. make will then try to run that many jobs in parallel.
  • bindeb-pkg will create .deb packages in the directory above.
  • LOCALVERSION appends a string to the kernel name.
  • git describe --long shows how far after a tag a certain commit is. In this case: Ubuntu-5.13.0-22.22-0-g3ab15e228151
    • Ubuntu-5.13.0-22.22 is the tag.
    • 0 is how many commits after the tag. In this case it’s the tag itself.
    • 3ab15e228151 is the abbreviated hash of the current commit.
  • tr '[:upper:]' '[:lower:]' is needed because .deb package names can’t contain uppercase letters (I found out the hard way).
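
The pieces of the git describe --long output can also be pulled apart with plain shell parameter expansion. A small sketch, assuming the string always ends in -<count>-g<hash>; the function and variable names are made up for this example.

```shell
# Split "TAG-COUNT-gHASH" into its three parts and print them.
split_describe() {
    describe=$1
    hash=${describe##*-g}     # everything after the last "-g"
    rest=${describe%-g*}      # drop the "-g<hash>" suffix
    count=${rest##*-}         # number of commits after the tag
    tag=${rest%-*}            # what remains is the tag itself
    echo "$tag $count $hash"
}

describe='Ubuntu-5.13.0-22.22-0-g3ab15e228151'
split_describe "$describe"
# prints: Ubuntu-5.13.0-22.22 0 3ab15e228151

# The lowercased string, with a leading dash, is what goes into LOCALVERSION.
printf 'LOCALVERSION=-%s\n' "$(printf '%s' "$describe" | tr '[:upper:]' '[:lower:]')"
# prints: LOCALVERSION=-ubuntu-5.13.0-22.22-0-g3ab15e228151
```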

Now go grab a coffee, tea or chai latte. Compilation took 22 minutes on my computer.

Chai latte

When the compilation is done, there are 3 .deb packages in the directory above:

$ ls -1 ../*.deb
../linux-headers-5.13.19-ubuntu-5.13.0-22.22-0-g3ab15e228151_5.13.19-ubuntu-5.13.0-22.22-0-g3ab15e228151-21_amd64.deb
../linux-image-5.13.19-ubuntu-5.13.0-22.22-0-g3ab15e228151_5.13.19-ubuntu-5.13.0-22.22-0-g3ab15e228151-21_amd64.deb
../linux-libc-dev_5.13.19-ubuntu-5.13.0-22.22-0-g3ab15e228151-21_amd64.deb

Install the linux-headers and the linux-image packages; you don’t need the libc-dev package.

$ sudo dpkg --install \
    ../linux-{headers,image}-*$(git describe --long | tr '[:upper:]' '[:lower:]')*.deb

The kernel is now installed in the /boot directory, and it’s available in the Grub menu after reboot.

$ ls -1 /boot/*$(git describe --long | tr '[:upper:]' '[:lower:]')*
/boot/config-5.13.19-ubuntu-5.13.0-22.22-0-g3ab15e228151
/boot/initrd.img-5.13.19-ubuntu-5.13.0-22.22-0-g3ab15e228151
/boot/System.map-5.13.19-ubuntu-5.13.0-22.22-0-g3ab15e228151
/boot/vmlinuz-5.13.19-ubuntu-5.13.0-22.22-0-g3ab15e228151

Kernel ubuntu-5.13.0-22.22-0-g3ab15e228151 is, for all intents and purposes, the same as kernel 5.13.0-22-generic, so I expected it to be a “good” kernel, and it was.

For kernel Ubuntu-5.13.0-23.23 I did the same thing: starting from the git checkout. I skipped copying and editing the config file, because between minor releases I don’t expect there to be much change. I did run make olddefconfig for good measure, though. As expected, the keyboard and mouse didn’t work with the compiled ...-23 kernel.

Next up: using git bisect to find the exact commit where it went wrong. It’s got to be somewhere between ...-22 and ...-23!

The hunt for a kernel bug, part 2: an easy way to install mainline kernels

As I wrote previously, I suspect a Linux kernel bug somewhere between versions 5.13.0-22 and 5.13.0-23, in the Ubuntu kernels. I wanted to know if the issue only surfaced in Ubuntu-flavored kernels, or also in the upstream (mainline) kernels from kernel.org.

There is an Ubuntu Mainline PPA with all the upstream kernels, but I found it a bit too opaque to use. Fortunately I found the Ubuntu Mainline Kernel Installer (UMKI), a tool for installing the latest Linux kernels on Ubuntu-based distributions.

Ubuntu Mainline Kernel Installer (UMKI)

The UMKI is pretty straightforward. It fetches a list of kernels from the Ubuntu Mainline PPA and a GUI displays available and installed kernels, regardless of how they were installed. It installs the kernel, headers and modules. There is also a CLI client.

To install the UMKI:

sudo add-apt-repository ppa:cappelikan/ppa
sudo apt update
sudo apt install mainline

With that out of the way, there’s the matter of deciding which kernels to try. The “interesting” Ubuntu kernels are 5.13.0-22 and 5.13.0-23, so the mainline kernels I definitely want to test are around those versions. That means 5.13.0 and 5.13.1. I also want to try the latest 5.13.x kernel, so that’s 5.13.19, and the most recent stable kernel, 5.16.11 (as of 2022-03-01).

To summarize, I have tested these mainline kernels:

  • 5.13.0
  • 5.13.1
  • 5.13.19
  • 5.16.11

The result (after several reboots)? With all of them, my keyboard and mouse worked without a hitch. That means the issue most likely doesn’t occur in (stable) mainline kernels, only in kernels with additional patches from Ubuntu.
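
With this many kernels in the Grub menu, it’s worth double-checking after each reboot that the intended one is actually running. The sketch below builds the release string that a mainline kernel should report, assuming the naming pattern 5.xx.yy-05xxyy-generic that these packages use on my system. The function name is made up.

```shell
# Turn a mainline version such as 5.13.19 into the release string
# the installed kernel is expected to report.
mainline_release() {
    IFS=. read -r major minor patch <<EOF
$1
EOF
    printf '%s.%s.%s-%02d%02d%02d-generic\n' \
        "$major" "$minor" "$patch" "$major" "$minor" "$patch"
}

for version in 5.13.0 5.13.1 5.13.19 5.16.11; do
    mainline_release "$version"
done
```

Comparing the result with the output of uname --kernel-release shows whether Grub booted the kernel I picked.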

Up next: compiling kernels from source.

Lasciate ogne speranza, voi ch’intrate.

Dante Alighieri

The hunt for a kernel bug, part 1

The operating system on my computer is Ubuntu Linux, version 21.10 (Impish Indri). Recently I had an issue where, after a kernel update (and reboot), my USB keyboard and mouse didn’t work any more at the login screen. Huh, that’s unexpected.
The issue was:

  • At the Grub boot menu, the keyboard worked: I could use the keys, the numlock led lit up, the LCD of the Logitech G19 displayed a logo.
  • At the Ubuntu login screen, the keyboard (and the mouse) went dark: no backlight of the keys, no numlock led, no logo on the display. And the mouse cursor didn’t move on screen.

Must be a problem at my end, I initially thought, because surely, something so essential as input devices wouldn’t break by a simple kernel update? So I did some basic troubleshooting:

  • Have you tried to turn it off and on again?
  • Plug the keyboard in another USB port.
  • Try a different keyboard.
  • Start with the older kernel, which was still in the Grub menu. And indeed, this gave me back control over my input devices!

So if the only thing I changed was the kernel, then maybe it’s a kernel bug after all?

I know that Ubuntu 21.10 uses kernel 5.something, and I know that I use the generic kernels. So which kernels are we talking about, actually?

$ apt-cache show linux-image-5*-generic | grep Package: | sed 's/Package: //g'
linux-image-5.13.0-19-generic
linux-image-5.13.0-20-generic
linux-image-5.13.0-21-generic
linux-image-5.13.0-22-generic
linux-image-5.13.0-23-generic
linux-image-5.13.0-25-generic
linux-image-5.13.0-27-generic
linux-image-5.13.0-28-generic
linux-image-5.13.0-30-generic

9 kernels, that’s not too bad. All of them 5.13.0-XX-generic. So I just installed all the kernels:

$ sudo apt install --yes \
    linux-{image,headers,modules,modules-extra,tools}-5.13.0-*-generic
One Eternity Later

My /boot directory is quite busy now:

$ ls -ahl /boot
total 1,2G
drwxr-xr-x  4 root root  12K mrt  1 18:11 .
drwxr-xr-x 20 root root 4,0K mrt  1 18:11 ..
-rw-r--r--  1 root root 252K okt  7 11:09 config-5.13.0-19-generic
-rw-r--r--  1 root root 252K okt 15 15:53 config-5.13.0-20-generic
-rw-r--r--  1 root root 252K okt 19 10:41 config-5.13.0-21-generic
-rw-r--r--  1 root root 252K nov  5 10:21 config-5.13.0-22-generic
-rw-r--r--  1 root root 252K nov 26 12:14 config-5.13.0-23-generic
-rw-r--r--  1 root root 252K jan  7 16:16 config-5.13.0-25-generic
-rw-r--r--  1 root root 252K jan 12 15:43 config-5.13.0-27-generic
-rw-r--r--  1 root root 252K jan 13 18:13 config-5.13.0-28-generic
-rw-r--r--  1 root root 252K feb  4 17:40 config-5.13.0-30-generic
drwx------  4 root root 4,0K jan  1  1970 efi
drwxr-xr-x  5 root root 4,0K mrt  1 18:11 grub
lrwxrwxrwx  1 root root   28 feb 28 04:26 initrd.img -> initrd.img-5.13.0-22-generic
-rw-r--r--  1 root root  40M mrt  1 16:02 initrd.img-5.13.0-19-generic
-rw-r--r--  1 root root  40M mrt  1 17:39 initrd.img-5.13.0-20-generic
-rw-r--r--  1 root root  40M mrt  1 17:38 initrd.img-5.13.0-21-generic
-rw-r--r--  1 root root  40M feb 26 13:55 initrd.img-5.13.0-22-generic
-rw-r--r--  1 root root  40M mrt  1 17:40 initrd.img-5.13.0-23-generic
-rw-r--r--  1 root root  40M mrt  1 17:40 initrd.img-5.13.0-25-generic
-rw-r--r--  1 root root  40M mrt  1 17:41 initrd.img-5.13.0-27-generic
-rw-r--r--  1 root root  40M mrt  1 17:41 initrd.img-5.13.0-28-generic
-rw-r--r--  1 root root  40M mrt  1 17:38 initrd.img-5.13.0-30-generic
-rw-------  1 root root 5,7M okt  7 11:09 System.map-5.13.0-19-generic
-rw-------  1 root root 5,7M okt 15 15:53 System.map-5.13.0-20-generic
-rw-------  1 root root 5,7M okt 19 10:41 System.map-5.13.0-21-generic
-rw-------  1 root root 5,7M nov  5 10:21 System.map-5.13.0-22-generic
-rw-------  1 root root 5,7M nov 26 12:14 System.map-5.13.0-23-generic
-rw-------  1 root root 5,7M jan  7 16:16 System.map-5.13.0-25-generic
-rw-------  1 root root 5,7M jan 12 15:43 System.map-5.13.0-27-generic
-rw-------  1 root root 5,7M jan 13 18:13 System.map-5.13.0-28-generic
-rw-------  1 root root 5,7M feb  4 17:40 System.map-5.13.0-30-generic
lrwxrwxrwx  1 root root   25 feb 28 04:27 vmlinuz -> vmlinuz-5.13.0-22-generic
-rw-------  1 root root 9,8M okt  7 19:37 vmlinuz-5.13.0-19-generic
-rw-------  1 root root 9,8M okt 15 15:56 vmlinuz-5.13.0-20-generic
-rw-------  1 root root 9,8M okt 19 10:43 vmlinuz-5.13.0-21-generic
-rw-------  1 root root 9,8M nov  5 13:51 vmlinuz-5.13.0-22-generic
-rw-------  1 root root 9,8M nov 26 11:52 vmlinuz-5.13.0-23-generic
-rw-------  1 root root 9,8M jan  7 16:19 vmlinuz-5.13.0-25-generic
-rw-------  1 root root 9,8M jan 12 16:19 vmlinuz-5.13.0-27-generic
-rw-------  1 root root 9,8M jan 13 18:10 vmlinuz-5.13.0-28-generic
-rw-------  1 root root 9,8M feb  4 17:46 vmlinuz-5.13.0-30-generic

I tried all these kernels. The last kernel where my input devices still worked was 5.13.0-22-generic, and the first where they stopped working was 5.13.0-23-generic. Which leads me to assume that some unintended change was introduced between those two versions, and it hasn’t been fixed since.

For now, I’m telling Ubuntu to keep kernel 5.13.0-22-generic and not upgrade to a more recent version.

$ sudo apt-mark hold linux-image-5.13.0-22-generic
linux-image-5.13.0-22-generic set on hold.

I also want Grub to show me the known working kernel as the default choice. To do that, I’ve put this in /etc/default/grub:

GRUB_DEFAULT="Advanced options for Ubuntu>Ubuntu, with Linux 5.13.0-22-generic"

followed by sudo update-grub.
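
The value of GRUB_DEFAULT is the submenu title and the entry title, joined with ">". Both titles have to match the generated menu exactly, so it helps to list them. A minimal sketch, with a made-up function name and a simplified sample standing in for /boot/grub/grub.cfg:

```shell
# Print the titles of all menuentry and submenu lines read from standard
# input. The title is the text between the first pair of single quotes.
grub_titles() {
    awk -F"'" '/^[ \t]*(menuentry|submenu) / { print $2 }'
}

# Simplified sample; a real grub.cfg has more options on each line.
grub_titles <<'EOF'
submenu 'Advanced options for Ubuntu' {
	menuentry 'Ubuntu, with Linux 5.13.0-22-generic' {
	}
}
EOF
```

On a real system the input would be /boot/grub/grub.cfg, which may need sudo to read.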

I’ll do the following things next, to get to the bottom of this: