From a78b9682b77cf0fb2d301f82d55ca5d887cbcafb Mon Sep 17 00:00:00 2001
From: Alois Wohlschlager
Date: Sun, 24 Sep 2023 13:30:36 +0200
Subject: [PATCH] docs: Improve troubleshooting documentation

Due to the temporarily doubled ESP space usage, it is now easier to run
into the out of space issue (once). Document how to proceed in this case
without having to delete any generations.

Furthermore, recovery in case of ESP corruption is now slightly more
involved, because not all files are rewritten all the time. Adjust the
documentation accordingly.
---
 docs/TROUBLESHOOTING.md | 24 +++++++++++++++++-------
 1 file changed, 17 insertions(+), 7 deletions(-)

diff --git a/docs/TROUBLESHOOTING.md b/docs/TROUBLESHOOTING.md
index 0e136fc..d9ab13c 100644
--- a/docs/TROUBLESHOOTING.md
+++ b/docs/TROUBLESHOOTING.md
@@ -14,6 +14,10 @@ It is recommended run a garbage collection regularly, and monitor the ESP usage
 
 **Warning:** It is recommended to not delete the currently booted kernel and initrd, and to not reboot the system before running `nixos-rebuild boot` again, to minimize the risk of accidentally rendering the system unbootable.
 
+**Note:** When upgrading Lanzaboote from version 0.3.0, or from git master prior to the merge of PR #204, ESP space usage is temporarily doubled.
+Hence it is possible for this error to occur even if there was plenty (but less than half) free space available prior to the installation.
+In this case, it is not necessary to delete any generations, and you can proceed directly to deleting some kernels and initrds before running the installation again.
+
 ## Power failed during bootloader installation, and now the system does not boot any more
 
 Due to the shortcomings of the FAT32 filesystem, in rare cases, it is possible for the ESP to become corrupted after power loss.
@@ -26,10 +30,15 @@ In this case, the steps below will not help, and standard rollback procedures sh
 ### The system can still boot an older generation
 
 In case an older generation still works, the recovery can be carried out from within the booted system.
+Run `nix-shell -p openssl sbctl` to ensure the tools required for recovery are available.
 
-1. Run `nixos-rebuild boot`.
-   This should reinstall all generations and thus overwrite the corrupted files.
-2. Reboot the system, it should now work again.
+1. Run `sudo sbctl verify /boot/EFI/Linux/nixos-generation-*.efi` to check the Lanzaboote stubs.
+   Files that have a crossmark on their left are corrupted and must be deleted.
+2. Run `for file in /boot/EFI/nixos/*.efi; do hash=$(openssl dgst -sha256 -binary "$file" | base32 | tr -d = | LC_ALL=C tr [:upper:] [:lower:]); if [[ $file != *$hash.efi ]]; then echo $file; fi; done` to check the kernels and initrds.
+   Any files that are printed are corrupted and must be deleted.
+3. Run `nixos-rebuild boot`.
+   This should reinstate all files that are required for the newer generations to boot.
+4. Reboot the system, it should now work again.
 
 ### The system cannot boot any generation anymore
 
@@ -45,11 +54,12 @@ A more recent medium must be used for the recovery procedure to work reliably.
 3. Mount all partitions belonging to the system to be recovered under `/mnt`, just like you would for installation.
    1. In case the ESP does not mount, or only mounts in read-only mode, due to corruption, try `fsck.fat` first.
       If that fails as well or the ESP still does not mount, it needs to be reformatted using `mkfs.fat`.
-4. Enter the recovery shell by running `nixos-enter`.
+4. Delete the corrupted files on the ESP, using `rm -fr /mnt/boot/EFI/Linux/nixos-generation-*.efi /mnt/boot/efi/nixos`.
+5. Enter the recovery shell by running `nixos-enter`.
    Then, run `nixos-rebuild boot` to install the bootloader again.
-5. Exit the recovery shell and unmount all filesystems.
-6. Reboot the system to verify that everything works again.
-7. Enable Secure Boot again in the firmware settings.
+6. Exit the recovery shell and unmount all filesystems.
+7. Reboot the system to verify that everything works again.
+8. Enable Secure Boot again in the firmware settings.
 
 ## The system doesn't boot with Secure Boot enabled
 
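
Reviewer note: the kernel and initrd check that this patch adds as a one-liner can be hard to read inline, so here is a minimal sketch of the same pipeline unrolled into a function. It assumes, as the patch's command implies, that each file under `/boot/EFI/nixos` has a name ending in the lowercase, unpadded base32 SHA-256 digest of its contents followed by `.efi`. The function name `find_corrupted_efi` is hypothetical and not part of Lanzaboote, and the sketch swaps the `[[ ... ]]` test for a `case` pattern and quotes the `tr` character classes so that it also runs under a plain POSIX `sh`.

```sh
# Print every *.efi file in the given directory whose name does not end in
# the base32-encoded SHA-256 digest of its own contents.
# find_corrupted_efi is a hypothetical helper name used for illustration.
find_corrupted_efi() {
    for file in "$1"/*.efi; do
        # If the glob matched nothing, the literal pattern is left; skip it.
        [ -e "$file" ] || continue

        # Raw digest -> base32 -> strip '=' padding -> lowercase.
        hash=$(openssl dgst -sha256 -binary "$file" \
            | base32 \
            | tr -d '=' \
            | LC_ALL=C tr '[:upper:]' '[:lower:]')

        case "$file" in
            *"$hash.efi") ;;                  # name matches contents
            *) printf '%s\n' "$file" ;;       # mismatch: likely corrupted
        esac
    done
}

# Example invocation on a mounted ESP (path taken from the patch):
#   find_corrupted_efi /boot/EFI/nixos
```

Any path printed corresponds to a file the patch's instructions would have you delete before running `nixos-rebuild boot`. The function only reports files and never deletes anything itself.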