-
Notifications
You must be signed in to change notification settings - Fork 177
kola: Add soft-reboot support for external tests #4119
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Skipping CI for Draft Pull Request. |
depends on: coreos/coreos-assembler#4119
Is there a case where a soft-reboot would be a better test than a hard reboot? Is the goal here to reduce test time? |
Right now soft-reboots are not really supported by ostree but we are implementing soft-reboots for ostree on : ostreedev/ostree#3420 |
Ok, Just comment here when you move this out of draft |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me overall! Main issue is code duplication with the "hard" reboot path but that's probably hard to fix and not worth doing.
echo "test beginning" | ||
# Check that boot_id stays the same across soft-reboot | ||
INITIAL_BOOT_ID=$(cat /proc/sys/kernel/random/boot_id) | ||
echo "Initial boot ID: $INITIAL_BOOT_ID" | logger |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't need the |logger
, we always log tests to the journal
# Verify boot_id is the same (soft-reboot should not change it) | ||
CURRENT_BOOT_ID=$(cat /proc/sys/kernel/random/boot_id) | ||
echo "Current boot ID after soft-reboot: $CURRENT_BOOT_ID" | logger |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is this verified though? I think we need to save the boot id to a persistent file like /var/cache/kola-boot-id
or something in the first boot and then compare it here.
mark2) | ||
echo "test in mark2" | ||
FINAL_BOOT_ID=$(cat /proc/sys/kernel/random/boot_id) | ||
echo "Final boot ID after forced soft-reboot: $FINAL_BOOT_ID" | logger |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto
OK test now dies with a timeout which is the same behavior I see when I do edit: looking at the console log the VM appears to come back. It's just cosa that loses the ability to ssh back in, even during a |
OK needed to modify qemu.go to get the test to comeback too. Now need to figureout this failed service.
|
Implements soft-reboot capabilities for Kola, it enables tests to use systemd's soft-reboot functionality. The implementation follows the same pattern as regular reboots but for `systemctl soft-reboot`, tracks systemd boot timestamps rather than kernel boot IDs for state detection.
I am not sure of the story of that service but it looks like we could ignore that failed service without any real issues. |
My offhand guess as to what's happening here is that service isn't prepared for soft reboots. https://github.com/coreos/fedora-coreos-config/blob/20feb176f19c3142b7256c1eb5bf1cb7c53b29b9/overlay.d/05core/usr/lib/systemd/system/coreos-ignition-firstboot-complete.service#L4 is still triggered because we didn't change the kernel commandline across the soft reboot. I think CoreOS probably does need to fix this because soft rebooting from the initial boot is a sane thing to do and I see real world use cases for it. It is perhaps as simple as |
Implements soft-reboot capabilities for Kola,
it enables tests to use systemd's soft-reboot functionality.
The implementation follows the same pattern as regular reboots but for
systemctl soft-reboot
, tracks systemd boottimestamps rather than kernel boot IDs for state detection.