machine pool surge, max unavailable and instance delete #1105
Conversation
@@ -154,7 +154,7 @@ def capz():
    local_resource(
        "manager",
        cmd = 'mkdir -p .tiltbuild;CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -ldflags \'-extldflags "-static"\' -o .tiltbuild/manager',
-       deps = ["api", "cloud", "config", "controllers", "exp", "feature", "pkg", "go.mod", "go.sum", "main.go"]
+       deps = ["api", "azure", "config", "controllers", "exp", "feature", "pkg", "go.mod", "go.sum", "main.go"]
I should probably pull this out into another PR, but what's 1 line on 4k+ 😞
// NewCoalescingReconciler returns a reconcile wrapper that will delay new reconcile.Requests
// after the cache expiry of the request string key.
// A successful reconciliation is defined as one where no error is returned.
func NewCoalescingReconciler(upstream reconcile.Reconciler, cache *CoalescingRequestCache, log logr.Logger) reconcile.Reconciler {
This wraps the AzureMachinePool and AzureMachinePoolMachine reconcilers so that they debounce: the reconcilers rate limit incoming events so that only so many are processed within a window of time, to avoid overwhelming Azure API limits. There is no good way to do this in controller-runtime today. I'll introduce a proposal over there for middleware or incoming rate limiting.
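A minimal sketch of the debouncing idea, assuming a controller-runtime version whose reconcile.Reconciler takes a context; the PR's actual wrapper uses a CoalescingRequestCache and a logger, which are omitted here:

```go
package coalescing

import (
	"context"
	"sync"
	"time"

	"sigs.k8s.io/controller-runtime/pkg/reconcile"
)

// coalescingReconciler debounces requests: after a successful reconcile of a
// request key, further requests for the same key are requeued until the
// window (ttl) has elapsed.
type coalescingReconciler struct {
	upstream reconcile.Reconciler
	ttl      time.Duration

	mu   sync.Mutex
	seen map[string]time.Time // request key -> time of last successful reconcile
}

// NewCoalescingReconciler wraps upstream so that successful reconciles for the
// same request key are not repeated until ttl has elapsed.
func NewCoalescingReconciler(upstream reconcile.Reconciler, ttl time.Duration) reconcile.Reconciler {
	return &coalescingReconciler{upstream: upstream, ttl: ttl, seen: map[string]time.Time{}}
}

func (r *coalescingReconciler) Reconcile(ctx context.Context, req reconcile.Request) (reconcile.Result, error) {
	key := req.String()

	r.mu.Lock()
	last, ok := r.seen[key]
	r.mu.Unlock()
	if ok && time.Since(last) < r.ttl {
		// Debounce: come back once the quiet period for this key expires.
		return reconcile.Result{RequeueAfter: r.ttl - time.Since(last)}, nil
	}

	res, err := r.upstream.Reconcile(ctx, req)
	if err == nil {
		// Only a successful reconciliation starts a quiet period.
		r.mu.Lock()
		r.seen[key] = time.Now()
		r.mu.Unlock()
	}
	return res, err
}
```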
	// inform the controller that if the parent MachinePool.Spec.Template.Spec.Version field is updated, this image
	// will be updated with the corresponding default image. If Defaulted is set to false, the controller will not
	// update the image reference when the MachinePool.Spec.Template.Spec.Version changes.
	AzureDefaultingImage struct {
I thought we should be explicit about image defaulting. This lets users rely on the default image for the Kubernetes version specified on a MachinePool, while remaining free to specify their own image without MachinePool updates overriding their image reference.
This sounds like a good idea. Should these AzureDefaultingImage changes be part of a different PR instead?
// 2) over-provisioned machines prioritized by out of date models first
// 3) over-provisioned ready machines
// 4) ready machines within budget which have out of date models
func (m *MachinePoolScope) selectMachinesToDelete(machinesByProviderID map[string]infrav1exp.AzureMachinePoolMachine) map[string]infrav1exp.AzureMachinePoolMachine {
This contains the logic for selecting machines to delete when over-provisioned or upgrading. The AzureMachinePool informs the AzureMachinePoolMachine that it should be deleted, and the AzureMachinePoolMachine is expected to safely remove itself from the pool.
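A rough, self-contained sketch of the ordering described above (the first priority in the doc comment is not visible in the diff, so only the over-provisioned and budgeted cases are shown). The machine shape and the overProvisioned / budget counts are hypothetical simplifications, not the PR's actual AzureMachinePoolMachine handling:

```go
package scope // illustrative

// machine is a simplified, hypothetical stand-in for AzureMachinePoolMachine.
type machine struct {
	ProviderID         string
	Ready              bool
	LatestModelApplied bool
}

// selectMachinesToDelete sketches the priority order: over-provisioned machines
// with out-of-date models first, then over-provisioned ready machines, then
// out-of-date machines that can be rolled within the unavailability budget.
func selectMachinesToDelete(machines []machine, overProvisioned, budget int) []machine {
	var outOfDate, upToDateReady []machine
	for _, m := range machines {
		if !m.LatestModelApplied {
			outOfDate = append(outOfDate, m)
		} else if m.Ready {
			upToDateReady = append(upToDateReady, m)
		}
	}

	var toDelete []machine
	// 1) burn down the surge capacity with out-of-date machines first.
	for len(outOfDate) > 0 && overProvisioned > 0 {
		toDelete, outOfDate = append(toDelete, outOfDate[0]), outOfDate[1:]
		overProvisioned--
	}
	// 2) then with up-to-date ready machines if still over-provisioned.
	for len(upToDateReady) > 0 && overProvisioned > 0 {
		toDelete, upToDateReady = append(toDelete, upToDateReady[0]), upToDateReady[1:]
		overProvisioned--
	}
	// 3) finally, roll remaining out-of-date machines within the budget.
	for len(outOfDate) > 0 && budget > 0 {
		toDelete, outOfDate = append(toDelete, outOfDate[0]), outOfDate[1:]
		budget--
	}
	return toDelete
}
```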
	return ampml.Items, nil
}

func (m *MachinePoolScope) applyAzureMachinePoolMachines(ctx context.Context) error {
This contains the logic to compare the state of the Azure VMSS with the state of the AzureMachinePool(Machine)s. If a machine exists in Azure, an AzureMachinePoolMachine is created. If an AzureMachinePoolMachine exists but doesn't have an Azure counterpart, it is asked to delete itself. At the end, the upgrade / over-provision state is evaluated and machines can be deleted if the state requires it.
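A minimal sketch of just the diffing step, keying both sides by provider ID; the function name and inputs are hypothetical, and the real applyAzureMachinePoolMachines also performs the creates/deletes and evaluates the surge budget afterwards:

```go
package scope // illustrative

// diffPoolMachines returns which Azure VMSS instances still need an
// AzureMachinePoolMachine created, and which existing AzureMachinePoolMachines
// have lost their Azure counterpart and should be asked to delete themselves.
func diffPoolMachines(azureInstanceIDs, machineProviderIDs []string) (toCreate, toDelete []string) {
	inAzure := make(map[string]bool, len(azureInstanceIDs))
	for _, id := range azureInstanceIDs {
		inAzure[id] = true
	}
	hasMachine := make(map[string]bool, len(machineProviderIDs))
	for _, id := range machineProviderIDs {
		hasMachine[id] = true
	}

	// Instances in Azure without a corresponding AzureMachinePoolMachine get one created.
	for _, id := range azureInstanceIDs {
		if !hasMachine[id] {
			toCreate = append(toCreate, id)
		}
	}
	// AzureMachinePoolMachines without an Azure counterpart are asked to delete themselves.
	for _, id := range machineProviderIDs {
		if !inAzure[id] {
			toDelete = append(toDelete, id)
		}
	}
	return toCreate, toDelete
}
```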
@nader-ziada and @CecileRobertMichon I believe this PR is now ready for review. I'm sorry for the enormity of the change set. With that in mind, I'm going to introduce [...]. One design aspect I've vacillated on is how to represent [...]
@devigned I've been thinking about this and I think we need a proposal to explain what this does, why, what future work is planned, and what alternatives were considered (and why they don't work). This is a pretty big PR and it's not obvious why we're doing upgrade this way, so we should document it for the record. Also, it can serve as developer-facing documentation once the feature is merged. What do you think?
I went through the changes and it all makes sense, but I have a comment about the structure of the code: the main logic that figures out which machines to create/delete is in the scope instead of in the controller, even setting the conditions, which is different from how we have done things in other places. Was this by design?
It was a conscious decision. It was a bit of an experiment to see how it would turn out.
That was a perceptive review. Thank you. WDYT?
@devigned: PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
I don't feel strongly about it, but I expected the controller / reconciler to be responsible for also updating the status on the k8s resources, which would include setting the condition.
// Reconcile idempotently gets, creates, and updates a scale set.
func (s *Service) Reconcile(ctx context.Context) error {
	ctx, span := tele.Tracer().Start(ctx, "scalesets.Service.Reconcile")
-	ctx, span := tele.Tracer().Start(ctx, "scalesets.Service.Reconcile")
+	ctx, span := tele.Tracer().Start(ctx, "scalesetvms.Service.Reconcile")
	}
}

// Reconcile idempotently gets, creates, and updates a scale set.
This needs to be changed. Maybe something like "Reconcile fetches the latest data about the scaleset instance".
)

type (
	// ScaleSetVMScope defines the scope interface for a scale sets service.
-	// ScaleSetVMScope defines the scope interface for a scale sets service.
+	// ScaleSetVMScope defines the scope interface for a scaleset vms service.
// Delete deletes a scaleset instance asynchronously returning a future which encapsulates the long running operation.
func (s *Service) Delete(ctx context.Context) error {
	ctx, span := tele.Tracer().Start(ctx, "scalesets.Service.Delete")
-	ctx, span := tele.Tracer().Start(ctx, "scalesets.Service.Delete")
+	ctx, span := tele.Tracer().Start(ctx, "scalesetvms.Service.Delete")
@@ -271,3 +317,108 @@ func AzureClusterToAzureMachinePoolsFunc(kClient client.Client, log logr.Logger)
		return result
	}
}

// AzureMachinePoolToAzureMachinePoolMachines maps an AzureMachinePool to it's child AzureMachinePoolMachines through
-// AzureMachinePoolToAzureMachinePoolMachines maps an AzureMachinePool to it's child AzureMachinePoolMachines through
+// AzureMachinePoolToAzureMachinePoolMachines maps an AzureMachinePool to its child AzureMachinePoolMachines through
	// Defaulted informs the controller that the image reference was defaulted so that it can be updated by changes
	// to the MachinePool.Spec.Template.Spec.Version field by default.
	// +optional
	Defaulted bool `json:"defaulted,omitempty"`
In this case Defaulted means not only that the value was defaulted but that the Image will be managed by the controller. What do you think about calling this field Managed instead? IMHO the word better conveys the behavior. I believe managed is also how we refer to resources managed by the controllers in the code base.
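For illustration, a hedged sketch of what the suggested rename could look like (hypothetical; the PR itself uses Defaulted):

```go
	// Managed indicates the image reference is managed by the controller: it was
	// defaulted from MachinePool.Spec.Template.Spec.Version and will be kept in
	// sync when that version changes. A user-specified image would leave Managed
	// false and never be overwritten.
	// +optional
	Managed bool `json:"managed,omitempty"`
```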
	c, err := ctrl.NewControllerManagedBy(mgr).
		WithOptions(options.Options).
		For(&infrav1exp.AzureMachinePoolMachine{}).
		WithEventFilter(predicates.ResourceNotPaused(log)). // don't queue reconcile if resource is paused
-		WithEventFilter(predicates.ResourceNotPaused(log)). // don't queue reconcile if resource is paused
+		WithEventFilter(predicates.ResourceNotPausedAndHasFilterLabel(ctrl.LoggerFrom(ctx), r.WatchFilterValue)).
func NewAzureMachinePoolMachineController(c client.Client, log logr.Logger, recorder record.EventRecorder, reconcileTimeout time.Duration) *AzureMachinePoolMachineController {
	return &AzureMachinePoolMachineController{
		Client:            c,
		Log:               log,
		Recorder:          recorder,
		ReconcileTimeout:  reconcileTimeout,
		reconcilerFactory: newAzureMachinePoolMachineReconciler,
	}
}
-func NewAzureMachinePoolMachineController(c client.Client, log logr.Logger, recorder record.EventRecorder, reconcileTimeout time.Duration) *AzureMachinePoolMachineController {
-	return &AzureMachinePoolMachineController{
-		Client:            c,
-		Log:               log,
-		Recorder:          recorder,
-		ReconcileTimeout:  reconcileTimeout,
-		reconcilerFactory: newAzureMachinePoolMachineReconciler,
-	}
-}
+func NewAzureMachinePoolMachineController(c client.Client, log logr.Logger, recorder record.EventRecorder, reconcileTimeout time.Duration, watchFilterValue string) *AzureMachinePoolMachineController {
+	return &AzureMachinePoolMachineController{
+		Client:            c,
+		Log:               log,
+		Recorder:          recorder,
+		ReconcileTimeout:  reconcileTimeout,
+		reconcilerFactory: newAzureMachinePoolMachineReconciler,
+		WatchFilterValue:  watchFilterValue,
+	}
+}
	Log              logr.Logger
	Scheme           *runtime.Scheme
	Recorder         record.EventRecorder
	ReconcileTimeout time.Duration
 	ReconcileTimeout time.Duration
+	WatchFilterValue string
		machineScope.SetFailureReason(capierrors.UpdateMachineError)
		machineScope.SetFailureMessage(errors.Errorf("Azure VM state is %s", state))
	case infrav1.VMStateDeleting:
		// for some reason, the
seems like the comment got truncated
@@ -97,6 +97,30 @@ spec:
  location:
    description: Location is the Azure region location e.g. westus2
    type: string
  maxSurge:
The maxSurge field in the k8s Deployment object accepts an absolute number but also a percentage. Should we support percentages as well? I believe it improves the UX, as the user doesn't have to adapt the maxSurge value when scaling their VMSS up or down. If we decide to support percentages, that would mean changing the type of the field.
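For reference, a sketch of how Kubernetes Deployments model this with intstr.IntOrString, so a field can hold either an absolute number or a percentage (the type and field names here are illustrative, not the PR's final API):

```go
package v1exp // illustrative only

import "k8s.io/apimachinery/pkg/util/intstr"

// MachineRollingUpdateStrategy sketches a strategy that accepts either an
// absolute number or a percentage, mirroring appsv1.RollingUpdateDeployment.
type MachineRollingUpdateStrategy struct {
	// MaxSurge can be an absolute number (e.g. 1) or a percentage of desired
	// machines (e.g. "20%").
	// +optional
	MaxSurge *intstr.IntOrString `json:"maxSurge,omitempty"`

	// MaxUnavailable can likewise be an absolute number or a percentage.
	// +optional
	MaxUnavailable *intstr.IntOrString `json:"maxUnavailable,omitempty"`
}
```

A controller can then resolve the effective count against the desired replica count using apimachinery's intstr helpers (e.g. GetScaledValueFromIntOrPercent in recent versions).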
	return vm, nil
}

// DeleteAsync is the operation to delete a virtual machine scale set asynchronously. DeleteAsync sends a DELETE
-// DeleteAsync is the operation to delete a virtual machine scale set asynchronously. DeleteAsync sends a DELETE
+// DeleteAsync is the operation to delete a virtual machine scale set vm asynchronously. DeleteAsync sends a DELETE
//
// Parameters:
//   resourceGroupName - the name of the resource group.
//   vmssName - the name of the VM scale set to create or update. parameters - the scale set object.
 //   vmssName - the name of the VM scale set to create or update. parameters - the scale set object.
+//   instanceID - the ID of the VM scale set VM.
//   resourceGroupName - the name of the resource group.
//   vmssName - the name of the VM scale set to create or update. parameters - the scale set object.
func (ac *azureClient) DeleteAsync(ctx context.Context, resourceGroupName, vmssName, instanceID string) (*infrav1.Future, error) {
	ctx, span := tele.Tracer().Start(ctx, "scalesets.AzureClient.DeleteAsync")
-	ctx, span := tele.Tracer().Start(ctx, "scalesets.AzureClient.DeleteAsync")
+	ctx, span := tele.Tracer().Start(ctx, "scalesetvms.AzureClient.DeleteAsync")
// Get retrieves the Virtual Machine Scale Set Virtual Machine
func (ac *azureClient) Get(ctx context.Context, resourceGroupName, vmssName, instanceID string) (compute.VirtualMachineScaleSetVM, error) {
	ctx, span := tele.Tracer().Start(ctx, "scalesets.AzureClient.Get")
-	ctx, span := tele.Tracer().Start(ctx, "scalesets.AzureClient.Get")
+	ctx, span := tele.Tracer().Start(ctx, "scalesetvms.AzureClient.Get")
I'm going to close this PR and open a new PR with an updated branch based off of the proposal in #1191. I think it will help us to apply fresh eyes and a clean slate. /close
@devigned: Closed this PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
What type of PR is this?
/kind feature
What this PR does / why we need it:
This PR introduces API changes to AzureMachinePool to enable users of CAPZ to perform safe, fast rolling upgrades, building off of #1067. In this change set, MaxSurge and MaxUnavailable fields are introduced to the AzureMachinePool spec. MaxSurge enables machine pools to over-provision machines (increase the number of machines above the desired count) during an upgrade, which allows faster upgrades at the cost of Azure compute core quota. MaxUnavailable enables one to specify how many nodes must remain available during a rolling upgrade. Instance delete enables deleting Azure Machine Pool Machines individually, with controller-initiated node drain. As part of the instance delete / node drain state tracking for AzureMachinePoolMachines, it is advantageous to track state on these resources individually. That is why in this PR the AzureMachinePool.Status.Instances array is removed in favor of AzureMachinePoolMachine resources. Moving AzureMachinePool instances into individual AzureMachinePoolMachine resources will be a breaking change in the experimental API.

This PR is work in progress. Please provide early feedback.
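A small illustration (hypothetical numbers, not taken from the PR) of how the two knobs bound the pool size during a rolling upgrade:

```go
package main

import "fmt"

func main() {
	desired := 5        // desired machine count in the pool
	maxSurge := 1       // extra machines allowed during the upgrade
	maxUnavailable := 0 // machines allowed to be unavailable during the upgrade

	upperBound := desired + maxSurge        // pool may temporarily grow to 6
	lowerBound := desired - maxUnavailable  // pool must keep at least 5 machines available

	fmt.Printf("pool may range between %d and %d machines while rolling\n", lowerBound, upperBound)
}
```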
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #819
Special notes for your reviewer:
Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.
TODOs:
Release note: