Debian testing kernel update snafu
I have a few Linux machines (one box at home, and a few VPSes) running Debian testing (currently Forky). In the past few days, its Linux kernel was upgraded to 6.19.6+deb14, then a few days later upgraded again to 6.19.6+deb14+1. And that caused some problems.
Usually kernel upgrades on Debian are trivial. After the new kernel is installed, grub is updated to list both kernels, with the newer one as the default. Then you remotely reboot the machine, and once it has booted into the new kernel, you can remove the old one.
But here, grub somehow thought 6.19.6+deb14 was newer than 6.19.6+deb14+1 (likely because of how string sorting works), so after the new kernel was installed, 6.19.6+deb14 was still the default. And on remote headless machines (VPSes) you can’t manually select a non-default entry during boot, which makes the “booting into the new kernel” step annoying.
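I haven’t dug into grub’s exact comparison logic, but plain byte-wise string sorting does reproduce the inversion once the usual Debian arch suffix is appended to the image names (the file names below are my assumption of what these packages would install):

```shell
# Hypothetical image names (assuming the usual Debian "-amd64" suffix):
old='vmlinuz-6.19.6+deb14-amd64'
new='vmlinuz-6.19.6+deb14+1-amd64'
# Byte-wise, '-' (0x2d) sorts after '+' (0x2b), so the OLD name
# compares as "greater" and lands last in an ascending sort:
printf '%s\n%s\n' "$old" "$new" | LC_ALL=C sort
# → prints the +deb14+1 name first, the +deb14 name last
```

So anything that picks the lexically “greatest” name as the latest kernel would pick the old one here.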
For one of them, I didn’t even notice that it hadn’t booted into the new kernel (I checked uname -a after rebooting, but the two versions look too similar), and removed the old kernel while still running on it. It did warn me that I was trying to remove the currently running kernel, but the Yes and No buttons were not in their conventional order and I accidentally selected the wrong one. For these two particular kernel versions, the apt trigger also seems to have a bug: after removing one, it doesn’t remove the removed kernel from grub.cfg correctly (manually running update-grub does update it correctly). That made the VPS actually unbootable, because the default entry in the grub menu pointed at a kernel version that no longer existed, and I had to use rescue mode to manually edit grub.cfg into a usable state.
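A quick sanity check before rebooting would have caught this. Here is a sketch; the path pattern assumes a typical single-partition layout where grub.cfg references kernels as /boot/vmlinuz-*:

```shell
#!/bin/sh
# Sketch: verify that every kernel image referenced by grub.cfg
# actually exists on disk, before trusting a reboot to it.
check_grub_kernels() {
  cfg=$1
  # Pull out every /boot/vmlinuz-* path mentioned in the config,
  # then report any that no longer exist on disk.
  grep -oE '/boot/vmlinuz-[^ ]+' "$cfg" | sort -u | while read -r k; do
    [ -e "$k" ] || echo "MISSING: $k"
  done
}

if [ -r /boot/grub/grub.cfg ]; then
  check_grub_kernels /boot/grub/grub.cfg
fi
```

Any `MISSING:` line means grub.cfg still points at a kernel that apt has already deleted.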
The rescue mode provided by the VPS host boots into a rescue image with the original disk of my VPS attached (but not mounted). So I first used lsblk to find the device name of the disk (/dev/vda1 in my case), then mounted it via mount /dev/vda1 /mnt.
Then I tried to edit /mnt/boot/grub/grub.cfg, but realized that the only editor on the rescue image is nano, which is not something I’m used to; and while apt is available on the rescue image, I couldn’t actually install vim or neovim with it.
So I ran chroot /mnt, which made the tools installed on that VPS (including nvim) available again. But update-grub still didn’t work in the chroot environment.
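I didn’t investigate further at the time, but the usual reason update-grub fails in a bare chroot is that /dev, /proc, and /sys aren’t there. Bind-mounting them from the rescue system before chrooting often fixes it; a sketch, untested on this particular rescue image (/dev/vda1 is from my setup):

```shell
mount /dev/vda1 /mnt
# Give the chroot the host's device and kernel views, which
# update-grub and the grub probing tools rely on:
for d in dev proc sys; do
  mount --bind "/$d" "/mnt/$d"
done
chroot /mnt update-grub
```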
So inside /boot/grub/grub.cfg, I found the entries for 6.19.6+deb14+1 in the submenu section, and copied the lines (most importantly the two lines starting with linux and initrd) into the default menuentry section. After that, I exited rescue mode, and the machine booted into the new kernel successfully.
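For reference, the two copied lines sit inside a menuentry and look roughly like this (the root= arguments and UUID are elided; the exact values come from the submenu entry you copy from):

```
menuentry 'Debian GNU/Linux' {
        ...
        linux   /boot/vmlinuz-6.19.6+deb14+1-amd64 root=UUID=... ro quiet
        initrd  /boot/initrd.img-6.19.6+deb14+1-amd64
}
```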
On Debian systems, /boot/grub/grub.cfg is auto-generated by update-grub, and in most cases you shouldn’t edit it manually. So after the machine successfully booted into the new kernel, I made sure the old kernel version was cleaned up in apt, and manually ran update-grub to regenerate /boot/grub/grub.cfg into a good state. It’s finally back to normal.
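The cleanup itself is short; the package name below is my guess at what these hypothetical versions would be called, so check the dpkg listing first:

```shell
dpkg -l 'linux-image-*' | grep '^ii'          # see which kernel packages remain
apt purge 'linux-image-6.19.6+deb14-amd64'    # hypothetical old-kernel package name
update-grub                                   # regenerate grub.cfg from what's installed
```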
For the other machines that were still bootable (just into the old kernel version by default), I simply repeated the steps: manually edit /boot/grub/grub.cfg so that the new kernel is the default, reboot, clean up the old kernel versions, and run update-grub.