Issue detected after updating Ubuntu kernel: action required

sabrinasadik · February 28, 2024, 9:17pm

As of February 28, 2024 this advisory appears to be no longer relevant.

Ubuntu 22.04 full VMs deployed on the ThreeFold Grid will now update to the 5.15.0-1051-kvm kernel when a system update is triggered. This kernel does not have the issue described below.

Updating full VMs normally should now be okay.

Hi there,

When updating some VMs to Ubuntu kernel version 5.15.0-88, we noticed that rebooting the VM resulted in the VM getting stuck and failing to reboot.

After some investigation, a bug in the Ubuntu kernel has been discovered. The bug has been fixed in a newer kernel version, but a regular Ubuntu kernel update does not install that version.

This means that if a VM is updated and rebooted, the VM will be lost and there is no way to get it back.

The only way to avoid this is to either avoid the Ubuntu kernel update or follow these steps to get to a better kernel:

Please note that the steps below only apply to VMs that are already running a generic kernel. If you're using an official ThreeFold full VM image, please see the alternate instructions in the next post below.


add-apt-repository ppa:canonical-kernel-team/ppa
apt update
apt install linux-image-5.15.0-90-generic

Do NOT upgrade AND reboot existing deployments. If an upgrade is needed, install the above kernel.
Faulty kernel after an apt upgrade is linux-image-5.15.0-88-generic
Resolved with: linux-image-5.15.0-90-generic
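
To see whether a VM already has one of the problem kernels installed, something like the following could be used. This is only a sketch: the list holds the versions named in this thread (the generic one above and the two kvm ones from the post below) and is not exhaustive, and the function names find_faulty_images and list_installed_images are made up for illustration.

```shell
#!/bin/sh
# Kernel image packages named in this thread as failing to boot.
# Taken from the posts here; not an exhaustive list.
FAULTY_IMAGES="linux-image-5.15.0-88-generic linux-image-5.15.0-1046-kvm linux-image-5.15.0-1047-kvm"

# find_faulty_images: reads package names on stdin (one per line) and
# prints those that appear in FAULTY_IMAGES.
find_faulty_images() {
    while IFS= read -r pkg; do
        for bad in $FAULTY_IMAGES; do
            if [ "$pkg" = "$bad" ]; then
                echo "$pkg"
            fi
        done
    done
}

# list_installed_images: prints linux-image-* packages that are fully
# installed (skips packages that were removed but left config files).
list_installed_images() {
    dpkg-query -W -f='${Package} ${Status}\n' 'linux-image-*' 2>/dev/null |
        awk '$4 == "installed" { print $1 }'
}

# Only query the package database where dpkg-query is available.
if command -v dpkg-query >/dev/null 2>&1; then
    found="$(list_installed_images | find_faulty_images || true)"
    if [ -n "$found" ]; then
        echo "Faulty kernel image(s) installed:"
        echo "$found"
    else
        echo "No kernel image from the faulty list is installed."
    fi
fi
```

If a faulty image shows up, the VM should not be rebooted until one of the fixes in this thread has been applied.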

For any questions or assistance, please feel free to reach out to our live chat support.

Kind regards,
Sabrina
On behalf of the ThreeFold Team

scott · November 27, 2023, 11:06pm

TL;DR - If you have an Ubuntu 22.04 full VM deployed from our official image, it’s liable to stop working the next time it’s rebooted (including if the node it’s running on loses power), regardless of whether you trigger a system update. Please follow the steps in the fix section below ASAP.

Additional notes

Having done some research on this, there are a couple more important points to note:

Ubuntu has automatic updates enabled by default. This means that affected VMs can be broken with just a reboot, and avoiding a manual update is not enough to protect from this issue.
By default, this seems to affect only Ubuntu 22.04 full VMs (20.04 can be upgraded manually to the 5.15.x kernel versions, but that won’t happen automatically).
Our official image for Ubuntu 22.04 full VM ships with the linux-kvm kernel, and the latest version, linux-image-5.15.0-1046-kvm, seems to have the same problem.
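
Whether automatic updates are actually active on a given VM can be read from the APT configuration. The following is a minimal sketch, assuming the stock unattended-upgrades setup where the APT::Periodic::Unattended-Upgrade setting controls them. The function name auto_upgrade_enabled is made up for illustration.

```shell
#!/bin/sh
# auto_upgrade_enabled: reads `apt-config dump` output on stdin and succeeds
# if APT::Periodic::Unattended-Upgrade is set to something other than "0".
# An unset value is treated as disabled.
auto_upgrade_enabled() {
    awk -F'"' '
        /^APT::Periodic::Unattended-Upgrade / { v = $2 }
        END { exit !(v != "" && v != "0") }
    '
}

# Only run the check where apt-config is available.
if command -v apt-config >/dev/null 2>&1; then
    if apt-config dump | auto_upgrade_enabled; then
        echo "Unattended upgrades are enabled; a new kernel may be installed without a manual update."
    else
        echo "Unattended upgrades do not appear to be enabled in the APT configuration."
    fi
fi
```

This only inspects configuration. It does not tell you whether a new kernel has already been installed in the background.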

Fix based on freezing kernel version

Rather than update to an early release build as shown above, I’ll show instead how to freeze the kernel version in our official Ubuntu 22.04 full VM images.

Friendly reminder: this would be a great time to back up any valuable data you might have on your VMs.

Undo the fix for generic kernels

If you already did the steps from the original post above, just undo them as follows:

add-apt-repository --remove ppa:canonical-kernel-team/ppa
apt remove linux-image-5.15.0-90-generic

Remove broken kernel version

Now remove the bad kernel version if it’s already installed and also remove the kernel meta packages:

apt remove linux-image-5.15.0-1047-kvm linux-headers-5.15.0-1047-kvm linux-modules-5.15.0-1047-kvm linux-kvm-headers-5.15.0-1047 linux-kvm linux-image-kvm linux-headers-kvm

Since I originally wrote this post, the kernel version was updated from 1046 to 1047 but the issue is still there in the new version. If the command above produces messages saying that the packages weren’t found, you might want to check the installed kernel version like this:

apt list --installed | grep linux

If you’re seeing version numbers including 1046, you should instead remove them like this:

apt remove linux-image-5.15.0-1046-kvm linux-headers-5.15.0-1046-kvm linux-modules-5.15.0-1046-kvm linux-kvm-headers-5.15.0-1046 linux-kvm linux-image-kvm linux-headers-kvm

The version that ships in the image is 1002. There’s no need to remove that version, since it will automatically be removed during the next step.
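
If the output of the grep above is hard to read, the kvm build numbers can be pulled out on their own to decide which remove command applies. This is a rough sketch. The function name kvm_build_numbers is invented here, and the pattern only covers the 5.15.0 kvm packages discussed in this post.

```shell
#!/bin/sh
# kvm_build_numbers: reads package names on stdin and prints the build number
# (e.g. 1047) of each linux-image-5.15.0-*-kvm package, sorted numerically.
kvm_build_numbers() {
    sed -n 's/^linux-image-5\.15\.0-\([0-9][0-9]*\)-kvm$/\1/p' | sort -n
}

# Only query the package database where dpkg-query is available.
if command -v dpkg-query >/dev/null 2>&1; then
    builds="$(dpkg-query -W -f='${Package} ${Status}\n' 'linux-image-*' 2>/dev/null |
        awk '$4 == "installed" { print $1 }' | kvm_build_numbers || true)"
    # Unquoted on purpose so several builds print on one line.
    echo "Installed kvm kernel builds:" $builds
fi
```

Seeing 1046 or 1047 in the list means the matching remove command above is the one to use.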

Install known good kernel

Next, we can install the latest kernel that’s available from the more conservative “release (main)” line. This makes sure that a working kernel is used on the next boot:

apt update
apt install linux-image-5.15.0-1004-kvm linux-headers-5.15.0-1004-kvm linux-modules-5.15.0-1004-kvm linux-kvm-headers-5.15.0-1004

You can then go ahead and reboot the VM to make sure everything is working properly, but that’s not required.
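
Before rebooting, it may be worth confirming that the freeze took effect, meaning the 1004 kernel is installed and none of the kernel meta packages are left to pull in a newer one. Here is a minimal sketch of such a check. The function name installed_meta_packages is made up for illustration.

```shell
#!/bin/sh
# installed_meta_packages: reads package names on stdin (one per line) and
# prints any of the three kernel meta packages found among them.
installed_meta_packages() {
    grep -x -F -e linux-kvm -e linux-image-kvm -e linux-headers-kvm
}

# Only query the package database where dpkg-query is available.
if command -v dpkg-query >/dev/null 2>&1; then
    # Names of all fully installed packages.
    installed="$(dpkg-query -W -f='${Package} ${Status}\n' 2>/dev/null |
        awk '$4 == "installed" { print $1 }' || true)"

    leftover="$(printf '%s\n' "$installed" | installed_meta_packages || true)"
    if [ -n "$leftover" ]; then
        echo "Meta packages still installed (kernel updates are NOT frozen):"
        echo "$leftover"
    else
        echo "No kernel meta packages installed."
    fi

    if printf '%s\n' "$installed" | grep -x -F linux-image-5.15.0-1004-kvm >/dev/null; then
        echo "linux-image-5.15.0-1004-kvm is installed."
    else
        echo "linux-image-5.15.0-1004-kvm is NOT installed."
    fi
fi
```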

Resume updates at a later time

When you’re ready to resume updates to the kernel (after a new version without the issue has been released), just reinstall the kernel meta packages:

apt install linux-kvm linux-image-kvm linux-headers-kvm

This will pull in the latest kernel and headers, and will also cause automatic updates to resume on the kernel.

Mik · November 11, 2023, 2:04pm

Did a test on a farmerbot full VM running Ubuntu 22.04.

The reboot worked.

isyal · November 12, 2023, 8:22am

How do I connect to the VM when it's not reachable via the planetary network? Should I delete the farmerbot and set up a new one?

Mik · November 12, 2023, 2:55pm

If you can’t SSH into the VM because of the problem mentioned in this post, it can be quicker indeed to just deploy the farmerbot on another VM. I did this in the past.

But maybe we can troubleshoot the issue with your current VM.

Is the farmerbot properly running now?
Can you ping it with planetary?
Have you been able to connect/SSH into the VM with planetary in the past?

isyal · November 15, 2023, 10:01pm

I had to redeploy my VM - I couldn’t connect to it in any way. After the new deployment, everything works fine.

I used the guide from the post below to make sure it won’t happen again.