Bug in ZFS 0.7.3-1 on CentOS - the zfs module is missing and the system is not booting (kmod-zfs-0.7.3-1.el7_3.x86_64) [WORKAROUND, NOT FIXED!]
Don't panic if you are using zfs-dkms; it looks like the bug affects only kmod-zfs.
Update: I was wrong. I had not changed the repo version from 7.3 to 7.4. When updating a CentOS system that uses kmod-zfs, please also update your zfs.repo.
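As a rough sketch of that repo update: the helper below rewrites the release number inside a zfs.repo file and keeps a backup copy. The function name bump_zfs_repo_release is hypothetical, and the baseurl layout (".../epel/7.3/...") is an assumption about how the repo file embeds the release, so check your own file first.

```shell
# Rewrite the EL release embedded in a zfs.repo baseurl, keeping a backup copy.
# Assumption: baseurl lines contain ".../epel/<release>/..." (e.g. /epel/7.3/).
bump_zfs_repo_release() {
    repo_file="$1"     # normally /etc/yum.repos.d/zfs.repo
    old_release="$2"   # e.g. 7.3
    new_release="$3"   # e.g. 7.4
    # Escape the dots so "7.3" is matched literally by sed.
    old_pattern=$(printf '%s\n' "$old_release" | sed 's/\./\\./g')
    # Keep the original next to the file; yum only reads *.repo files.
    cp -p "$repo_file" "$repo_file.bak" || return 1
    sed "s|/epel/$old_pattern/|/epel/$new_release/|g" "$repo_file.bak" > "$repo_file"
}
```

It would be called as bump_zfs_repo_release /etc/yum.repos.d/zfs.repo 7.3 7.4, followed by 'yum clean all' so that yum re-reads the repo metadata.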
If you don't install the correct kernel as described here, you might see this error message (the system will not boot if it is installed on a ZFS partition):
This bug occurs because the ZFS module was compiled for another kernel.
(A simple workaround is to migrate from kmod-zfs to zfs-dkms as described here.)
The current kernel is 3.10.0-514.10.2.el7.x86_64.
But the ZFS kernel module is for 3.10.0-514.26.2.el7.x86_64:
[root@localhost ~]# find /lib/modules/ | grep zfs
/lib/modules/3.10.0-514.26.2.el7.x86_64/extra/zfs
/lib/modules/3.10.0-514.26.2.el7.x86_64/extra/zfs/avl
/lib/modules/3.10.0-514.26.2.el7.x86_64/extra/zfs/avl/zavl.ko
/lib/modules/3.10.0-514.26.2.el7.x86_64/extra/zfs/nvpair
/lib/modules/3.10.0-514.26.2.el7.x86_64/extra/zfs/nvpair/znvpair.ko
/lib/modules/3.10.0-514.26.2.el7.x86_64/extra/zfs/unicode
/lib/modules/3.10.0-514.26.2.el7.x86_64/extra/zfs/unicode/zunicode.ko
/lib/modules/3.10.0-514.26.2.el7.x86_64/extra/zfs/icp
/lib/modules/3.10.0-514.26.2.el7.x86_64/extra/zfs/icp/icp.ko
/lib/modules/3.10.0-514.26.2.el7.x86_64/extra/zfs/zfs
/lib/modules/3.10.0-514.26.2.el7.x86_64/extra/zfs/zfs/zfs.ko
/lib/modules/3.10.0-514.26.2.el7.x86_64/extra/zfs/zpios
/lib/modules/3.10.0-514.26.2.el7.x86_64/extra/zfs/zpios/zpios.ko
/lib/modules/3.10.0-514.26.2.el7.x86_64/extra/zfs/zcommon
/lib/modules/3.10.0-514.26.2.el7.x86_64/extra/zfs/zcommon/zcommon.ko
[root@localhost ~]#
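The same comparison can be scripted instead of read off the find output. This is only a sketch: the function name zfs_module_matches_kernel is made up for illustration, and it assumes the kmod-zfs layout seen in the listing above (extra/zfs/zfs/zfs.ko under the kernel's module directory).

```shell
# Report whether a zfs.ko exists for a given kernel release.
# $1 is the kernel release (normally from `uname -r`),
# $2 is the modules root (defaults to /lib/modules).
zfs_module_matches_kernel() {
    kernel_release="$1"
    modules_root="${2:-/lib/modules}"
    # Assumed layout: kmod-zfs places the module under extra/zfs/zfs/.
    if [ -f "$modules_root/$kernel_release/extra/zfs/zfs/zfs.ko" ]; then
        echo "match: zfs.ko found for $kernel_release"
        return 0
    fi
    echo "mismatch: no zfs.ko for $kernel_release under $modules_root"
    return 1
}
```

On the affected system, zfs_module_matches_kernel "$(uname -r)" would report a mismatch, because the running kernel is 3.10.0-514.10.2 while the module sits under 3.10.0-514.26.2.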
Let's assume we have an old CentOS install (installed as described in this guide):
[root@localhost ~]# cat /proc/version
Linux version 3.10.0-514.10.2.el7.x86_64 (builder@kbuilder.dev.centos.org) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Fri Mar 3 00:04:05 UTC 2017
[root@localhost ~]# cat /etc/redhat-release
CentOS Linux release 7.3.1611 (Core)
[root@localhost ~]#
Here are the installed kernels:
[root@localhost ~]# ls /boot/
config-3.10.0-514.10.2.el7.x86_64
config-3.10.0-514.el7.x86_64
grub
grub2
initramfs-0-rescue-263ede4315d14b8ba4b3a93abc833aea.img
initramfs-3.10.0-514.10.2.el7.x86_64.img
initramfs-3.10.0-514.el7.x86_64.img
initrd-plymouth.img
symvers-3.10.0-514.10.2.el7.x86_64.gz
symvers-3.10.0-514.el7.x86_64.gz
System.map-3.10.0-514.10.2.el7.x86_64
System.map-3.10.0-514.el7.x86_64
vmlinuz-0-rescue-263ede4315d14b8ba4b3a93abc833aea
vmlinuz-3.10.0-514.10.2.el7.x86_64
vmlinuz-3.10.0-514.el7.x86_64
[root@localhost ~]#
Let's make a backup of the 3.10.0-514.10.2.el7.x86_64 kernel just in case:
[root@localhost ~]# cd /boot/
[root@localhost boot]# cp initramfs-3.10.0-514.10.2.el7.x86_64.img initramfs-3.10.0-514.10.2.bak1.el7.x86_64.img
[root@localhost boot]# cp vmlinuz-3.10.0-514.10.2.el7.x86_64 vmlinuz-3.10.0-514.10.2.bak1.el7.x86_64
[root@localhost boot]# cd
[root@localhost ~]#
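Since this backup step is repeated later for the new kernel, it could be wrapped in a small helper. The sketch below follows the ".bak1" naming used above; the name backup_kernel and the optional boot directory argument are hypothetical, and it uses cp -p (to keep timestamps and permissions) where the commands above use plain cp.

```shell
# Copy a kernel image and its initramfs to ".bak1" names.
# $1 is the kernel release, $2 the boot directory (defaults to /boot).
backup_kernel() {
    release="$1"            # e.g. 3.10.0-514.10.2.el7.x86_64
    boot_dir="${2:-/boot}"
    # Insert ".bak1" before ".el7": 3.10.0-514.10.2.bak1.el7.x86_64
    backup_release=$(printf '%s\n' "$release" | sed 's/\.el7/.bak1.el7/')
    # Stop if the release string had no ".el7" part to rename.
    [ "$backup_release" != "$release" ] || return 1
    cp -p "$boot_dir/vmlinuz-$release" "$boot_dir/vmlinuz-$backup_release" &&
    cp -p "$boot_dir/initramfs-$release.img" "$boot_dir/initramfs-$backup_release.img"
}
```

For the kernel above this would be backup_kernel 3.10.0-514.10.2.el7.x86_64, which produces the same two .bak1 files as the manual cp commands.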
When updating the system we exclude the kernel:
# yum update --exclude='kernel*' -y
We confirm the bug: the ZFS module is for another kernel version, not the installed one:
[root@localhost ~]# find /lib/modules/ | grep zfs
/lib/modules/3.10.0-514.26.2.el7.x86_64/extra/zfs
/lib/modules/3.10.0-514.26.2.el7.x86_64/extra/zfs/avl
/lib/modules/3.10.0-514.26.2.el7.x86_64/extra/zfs/avl/zavl.ko
/lib/modules/3.10.0-514.26.2.el7.x86_64/extra/zfs/nvpair
/lib/modules/3.10.0-514.26.2.el7.x86_64/extra/zfs/nvpair/znvpair.ko
/lib/modules/3.10.0-514.26.2.el7.x86_64/extra/zfs/unicode
/lib/modules/3.10.0-514.26.2.el7.x86_64/extra/zfs/unicode/zunicode.ko
/lib/modules/3.10.0-514.26.2.el7.x86_64/extra/zfs/icp
/lib/modules/3.10.0-514.26.2.el7.x86_64/extra/zfs/icp/icp.ko
/lib/modules/3.10.0-514.26.2.el7.x86_64/extra/zfs/zfs
/lib/modules/3.10.0-514.26.2.el7.x86_64/extra/zfs/zfs/zfs.ko
/lib/modules/3.10.0-514.26.2.el7.x86_64/extra/zfs/zpios
/lib/modules/3.10.0-514.26.2.el7.x86_64/extra/zfs/zpios/zpios.ko
/lib/modules/3.10.0-514.26.2.el7.x86_64/extra/zfs/zcommon
/lib/modules/3.10.0-514.26.2.el7.x86_64/extra/zfs/zcommon/zcommon.ko
[root@localhost ~]#
We install wget and download the correct kernel:
# yum install wget
# wget https://buildlogs.centos.org/c7.1611.u/kernel/20170704132018/3.10.0-514.26.2.el7.x86_64/kernel-3.10.0-514.26.2.el7.x86_64.rpm
We install the kernel we just downloaded:
[root@localhost ~]# rpm -Uvh --oldpackage kernel-3.10.0-514.26.2.el7.x86_64.rpm
Preparing...                          ################################# [100%]
Updating / installing...
   1:kernel-3.10.0-514.26.2.el7       ################################# [ 33%]
grubby fatal error: unable to find a suitable template
Cleaning up / removing...
   2:kernel-3.10.0-514.10.2.el7       ################################# [ 67%]
   3:kernel-3.10.0-514.el7            ################################# [100%]
[root@localhost ~]#
Before regenerating the GRUB configuration we need to do this (workaround for another bug):
[root@localhost ~]# cd /dev
[root@localhost dev]# ln -s /dev/disk/by-id/* . -i
[root@localhost dev]# cd
[root@localhost ~]#
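The -i flag in the ln call above makes it prompt for every name that already exists in /dev. A non-interactive variant might look like the sketch below, which simply skips existing names. The function name link_disk_ids and its two directory arguments are hypothetical and are there only so the loop can be tried on a scratch directory first.

```shell
# Symlink every entry of a by-id directory into a target directory,
# leaving any name that already exists in the target untouched.
link_disk_ids() {
    src_dir="${1:-/dev/disk/by-id}"
    dest_dir="${2:-/dev}"
    for path in "$src_dir"/*; do
        # Skip the unexpanded glob when the source directory is empty.
        [ -e "$path" ] || [ -L "$path" ] || continue
        name=$(basename "$path")
        # Do not overwrite or prompt: existing entries are left alone.
        if [ -e "$dest_dir/$name" ] || [ -L "$dest_dir/$name" ]; then
            continue
        fi
        ln -s "$path" "$dest_dir/$name"
    done
}
```

Calling link_disk_ids with no arguments links /dev/disk/by-id/* into /dev, like the command above.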
We make a backup of the newly installed kernel:
[root@localhost ~]# cd /boot/
[root@localhost boot]# ls
config-3.10.0-514.26.2.el7.x86_64
efi
grub
grub2
initramfs-0-rescue-263ede4315d14b8ba4b3a93abc833aea.img
initramfs-3.10.0-514.10.2.bak1.el7.x86_64.img
initramfs-3.10.0-514.26.2.el7.x86_64.img
initrd-plymouth.img
symvers-3.10.0-514.26.2.el7.x86_64.gz
System.map-3.10.0-514.26.2.el7.x86_64
vmlinuz-0-rescue-263ede4315d14b8ba4b3a93abc833aea
vmlinuz-3.10.0-514.10.2.bak1.el7.x86_64
vmlinuz-3.10.0-514.26.2.el7.x86_64
[root@localhost boot]# cp vmlinuz-3.10.0-514.26.2.el7.x86_64 vmlinuz-3.10.0-514.26.2.bak1.el7.x86_64
[root@localhost boot]# cp initramfs-3.10.0-514.26.2.el7.x86_64.img initramfs-3.10.0-514.26.2.bak1.el7.x86_64.img
[root@localhost boot]# cd
[root@localhost ~]#
We update the GRUB menu:
[root@localhost ~]# grub2-mkconfig -o /boot/grub2/grub.cfg
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-3.10.0-514.26.2.el7.x86_64
Found initrd image: /boot/initramfs-3.10.0-514.26.2.el7.x86_64.img
Found linux image: /boot/vmlinuz-3.10.0-514.26.2.bak1.el7.x86_64
Found initrd image: /boot/initramfs-3.10.0-514.26.2.bak1.el7.x86_64.img
Found linux image: /boot/vmlinuz-3.10.0-514.10.2.bak1.el7.x86_64
Found initrd image: /boot/initramfs-3.10.0-514.10.2.bak1.el7.x86_64.img
Found linux image: /boot/vmlinuz-0-rescue-263ede4315d14b8ba4b3a93abc833aea
Found initrd image: /boot/initramfs-0-rescue-263ede4315d14b8ba4b3a93abc833aea.img
done
[root@localhost ~]#
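Because an initramfs without the zfs module is what leaves the system unbootable, it may be worth checking the new image before rebooting. The sketch below assumes that lsinitrd (shipped with dracut) is available to list the image contents. The filter name listing_has_zfs_module is hypothetical, and the pattern assumes the module file is called zfs.ko, as in the find output earlier.

```shell
# Succeeds if a file listing read from stdin mentions the zfs kernel module.
# Feed it the output of lsinitrd for the initramfs you are about to boot.
listing_has_zfs_module() {
    # Matches ".../zfs.ko" (and compressed variants such as zfs.ko.xz).
    grep -q '/zfs\.ko'
}

# Usage on the real system (shown as a comment, not run here):
#   lsinitrd /boot/initramfs-3.10.0-514.26.2.el7.x86_64.img | listing_has_zfs_module \
#       && echo "zfs module present" || echo "zfs module MISSING"
```

If the check reports the module as missing, rebooting into that image is unlikely to work on a ZFS root, so it is safer to fix the initramfs first.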
Rebooting the system...
# reboot
And now it should work:
[root@localhost ~]# cat /proc/version
Linux version 3.10.0-514.26.2.el7.x86_64 (mockbuild@c1bm.rdu2.centos.org) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Jul 4 13:29:22 UTC 2017
[root@localhost ~]# cat /etc/redhat-release
CentOS Linux release 7.4.1708 (Core)
[root@localhost ~]#
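To double-check after the reboot that the module and the running kernel really agree, the module's vermagic string can be compared with the kernel release. This is a sketch only: the helper name vermagic_matches_kernel is hypothetical, and it assumes the vermagic string begins with the kernel release followed by a space, which is the usual form.

```shell
# Succeeds when a module's vermagic string starts with the given kernel release.
vermagic_matches_kernel() {
    kernel_release="$1"   # normally "$(uname -r)"
    vermagic="$2"         # normally "$(modinfo -F vermagic zfs)"
    # vermagic looks like "<release> SMP mod_unload modversions";
    # strip everything from the first space onwards.
    module_release=${vermagic%% *}
    [ "$module_release" = "$kernel_release" ]
}
```

On the repaired system, vermagic_matches_kernel "$(uname -r)" "$(modinfo -F vermagic zfs)" should succeed. On the broken one it would fail, or modinfo would not find the module at all.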
My bug report is here: https://github.com/zfsonlinux/zfs/issues/6834.
* * *
Comment from Reddit:
Just boot into a previous kernel. If you don't have one, just boot a live medium.
In some cases there is no previous kernel, the 'live medium' does not contain the ZFS kernel module, and the system is a VPS with no way to run a 'live medium' with ZFS support.
This was the case with my old install (not updated since the first install).
The 'yum update' rebuilds (overwrites) the initramfs without the zfs kernel module in it, and there is no way to boot into a previous kernel, because there is none.
If your previous kernel is not the exact version (3.10.0-514.26.2.el7.x86_64), it will not boot. I tested this on my 'production' install: it boots only with 3.10.0-514.26.2.el7.x86_64. The other initramfs images do not contain the zfs module.
The correct kernel version is not available with 'yum install kernel-3.10.0-514.26.2.el7.x86_64' (error: 'No package kernel-3.10.0-514.26.2.el7.x86_64 available.'). It can be installed only from the RPM package now.
Luckily, on my 'production' install, I had run 'yum update' several times, so there was one working 'previous' kernel. Many people use their systems for months without a single 'yum update' and don't have a working 'previous' kernel (their initramfs is overwritten after the 'yum update').
This is why we should make backups of the kernels before 'yum install'.
Thanks Microsoft for shipping a broken update AGAIN!