fix/e2e/loadbalancer: added nlb preventing binding on privileged ports #1153

Open
wants to merge 2 commits into
base: master

mtulio

@mtulio mtulio commented May 30, 2025

What type of PR is this?

/kind bug
/kind failing-test
/kind flake

What this PR does / why we need it:

This change fixes the e2e loadbalancer tests by no longer binding the test pods to privileged ports, which makes the tests fail on distributions that restrict that configuration.

See issue #1150 for details.
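As a minimal sketch of the constraint described above, assuming the Linux default where ports below 1024 are privileged: the load balancer can keep listening on a low port while the pod binds a high one. The `portMapping` type and `validate` method are hypothetical names used only for illustration; they are not part of this PR or of the e2e framework.

```go
package main

import "fmt"

// privilegedPortLimit is the first port an unprivileged process can bind
// on Linux by default (net.ipv4.ip_unprivileged_port_start = 1024).
const privilegedPortLimit = 1024

// isPrivileged reports whether binding the port normally needs extra privileges.
func isPrivileged(port int32) bool {
	return port > 0 && port < privilegedPortLimit
}

// portMapping is a hypothetical pairing of the port exposed by the load
// balancer and the port the test pod actually binds.
type portMapping struct {
	ServicePort int32 // listener port on the load balancer
	TargetPort  int32 // port bound inside the pod
}

// validate rejects mappings whose pod-side port is privileged.
func (m portMapping) validate() error {
	if isPrivileged(m.TargetPort) {
		return fmt.Errorf("target port %d is privileged; use a port >= %d",
			m.TargetPort, privilegedPortLimit)
	}
	return nil
}
```

Under these assumptions, a mapping such as `portMapping{ServicePort: 80, TargetPort: 8080}` passes validation, while a pod binding port 80 directly does not.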

It also makes the test cases more generic and adds two scenarios that were not covered yet:

  • NLB
  • NLB with node selectors
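The two scenarios listed above are driven by Service annotations. The sketch below builds the annotation map for an NLB Service, optionally restricted to nodes carrying given labels. The two annotation keys are the ones cloud-provider-aws documents; the `nlbAnnotations` helper is a hypothetical name and is not taken from this PR.

```go
package main

import (
	"sort"
	"strings"
)

const (
	// Selects the load balancer type; "nlb" requests a Network Load Balancer.
	annotationLBType = "service.beta.kubernetes.io/aws-load-balancer-type"
	// Comma-separated key=value pairs selecting the nodes to register as targets.
	annotationTargetNodeLabels = "service.beta.kubernetes.io/aws-load-balancer-target-node-labels"
)

// nlbAnnotations returns the annotations for an NLB Service. When nodeLabels
// is non-empty, the target-node-labels annotation is added as well.
func nlbAnnotations(nodeLabels map[string]string) map[string]string {
	annotations := map[string]string{annotationLBType: "nlb"}
	if len(nodeLabels) == 0 {
		return annotations
	}
	pairs := make([]string, 0, len(nodeLabels))
	for key, value := range nodeLabels {
		pairs = append(pairs, key+"="+value)
	}
	sort.Strings(pairs) // stable output regardless of map iteration order
	annotations[annotationTargetNodeLabels] = strings.Join(pairs, ",")
	return annotations
}
```

Sorting the pairs keeps the annotation value deterministic, which makes it easier to compare in test assertions.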

Which issue(s) this PR fixes:

Fixes #1150

Special notes for your reviewer:

The test framework already implements most of the code, but the target (pod) port is hard-wired inside the library (jig) and is not exported, so some code had to be duplicated to provide a one-off fix.

There is a discussion here with some options.

Does this PR introduce a user-facing change?:

NONE

@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress, release-note-none, kind/bug, kind/failing-test, kind/flake, and cncf-cla: yes labels May 30, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign kishorj for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the needs-triage label May 30, 2025
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If cloud-provider-aws contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot
Contributor

Welcome @mtulio!

It looks like this is your first PR to kubernetes/cloud-provider-aws 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/cloud-provider-aws has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test label May 30, 2025
@k8s-ci-robot
Contributor

Hi @mtulio. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.


@k8s-ci-robot k8s-ci-robot added the size/L label May 30, 2025
@mtulio
Author

mtulio commented May 30, 2025

$ ./e2e.test  --ginkgo.focus="loadbalancer" 
  W0530 00:08:35.130637 2533371 test_context.go:478] Unable to find in-cluster config, using default host : https://127.0.0.1:6443
  May 30 00:08:35.130: INFO: The --provider flag is not set. Continuing as if --provider=skeleton had been used.
Running Suite: AWS Cloud Provider End-to-End Tests - /devel/rosa
==============================================================================================
Random Seed: 1748574515 - will randomize all specs

Will run 3 of 5 specs
------------------------------
[cloud-provider-aws-e2e] loadbalancer should configure the loadbalancer based on annotations
(...)
• [134.221 seconds]
------------------------------
S
------------------------------
[cloud-provider-aws-e2e] loadbalancer NLB should configure the loadbalancer with target-node-labels
(....)
 May 30 00:11:16.617: INFO: Found 2 worker nodes
(...)

• [188.525 seconds]
------------------------------
S
------------------------------
[cloud-provider-aws-e2e] loadbalancer NLB should configure the loadbalancer based on annotations
(...)
• [178.780 seconds]
------------------------------

Ran 3 of 5 Specs in 501.527 seconds
SUCCESS! -- 3 Passed | 0 Failed | 0 Pending | 2 Skipped
PASS

@mtulio mtulio changed the title fix/e2e: preventing binding on privileged ports and intro nlb tests fix/e2e: added nlb tests preventing binding on privileged ports May 30, 2025
@mtulio mtulio changed the title fix/e2e: added nlb tests preventing binding on privileged ports fix/e2e/loadbalancer: added nlb preventing binding on privileged ports May 30, 2025
@kmala
Member

kmala commented May 30, 2025

/ok-to-test

@k8s-ci-robot k8s-ci-robot added the ok-to-test label and removed the needs-ok-to-test label May 30, 2025
@mtulio
Author

mtulio commented May 30, 2025

@mtulio: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-cloud-provider-aws-e2e | fdc9e65 | link | true | /test pull-cloud-provider-aws-e2e |
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

oops, missed local changes. retesting:

/test pull-cloud-provider-aws-e2e

@mtulio mtulio force-pushed the e2e-issue-1150 branch 2 times, most recently from 65ebb0d to 9dc6148 Compare May 30, 2025 20:55
@mtulio
Author

mtulio commented May 30, 2025

Fixing node label discovery to support different k8s distributions - e.g. CI infra uses node-role.kubernetes.io/Node

@mtulio
Author

mtulio commented May 30, 2025

Fixing node label discovery to support different k8s distributions - e.g. CI infra uses node-role.kubernetes.io/Node

/test pull-cloud-provider-aws-e2e

@mtulio
Author

mtulio commented May 31, 2025

@mtulio: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-cloud-provider-aws-e2e | 9dc6148 | link | unknown | /test pull-cloud-provider-aws-e2e |
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Learning from the CI infra here: the node-role label shows up differently from what get nodes reports. Fixing it to lower case and trying again:

/test pull-cloud-provider-aws-e2e
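To illustrate the label mismatch discussed in the comments above, here is a minimal sketch of a case-insensitive lookup of a node-role label. It is not the code from this PR: `findWorkerRoleLabel` and the list of candidate roles are assumptions made for illustration only.

```go
package main

import (
	"sort"
	"strings"
)

// nodeRolePrefix is the conventional prefix of Kubernetes node-role labels.
const nodeRolePrefix = "node-role.kubernetes.io/"

// findWorkerRoleLabel returns the first label key on a node that matches one
// of the candidate roles, ignoring case. Roles are tried in the given order,
// so callers can list the preferred role first.
func findWorkerRoleLabel(labels map[string]string, roles []string) (string, bool) {
	keys := make([]string, 0, len(labels))
	for key := range labels {
		keys = append(keys, key)
	}
	sort.Strings(keys) // deterministic result regardless of map iteration order

	for _, role := range roles {
		want := nodeRolePrefix + role
		for _, key := range keys {
			if strings.EqualFold(key, want) {
				return key, true
			}
		}
	}
	return "", false
}
```

Called with the candidate roles `worker` and `node`, this would match a node labeled `node-role.kubernetes.io/Node` as well as one labeled `node-role.kubernetes.io/node`, and return the key exactly as it is stored on the node.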

@mtulio
Author

mtulio commented May 31, 2025

/test all

@mtulio mtulio marked this pull request as ready for review May 31, 2025 20:22
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress label May 31, 2025
@k8s-ci-robot k8s-ci-robot requested review from dims and kmala May 31, 2025 20:22
@mtulio
Author

mtulio commented Jun 1, 2025

/retest-required

This change enhances the test scenarios by:
- supporting more distributions that do not allow pods to bind on
  privileged ports (default behavior of libjig, see issue
- refactoring the tests to allow adding more cases
- introducing NLB tests, including advanced tests that validate the node
  selector annotation. The AWS SDK is added to support this validation.
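
The node selector validation mentioned in the commit message above can be pictured as a set comparison: the instances registered as NLB targets should be exactly the nodes matching the selector. The sketch below assumes the registered instance IDs were already fetched (for example through the ELBv2 DescribeTargetHealth API) and that node provider IDs follow the AWS form `aws:///<zone>/<instance-id>`. Both helper names are hypothetical and not taken from this PR.

```go
package main

import (
	"sort"
	"strings"
)

// instanceIDFromProviderID extracts the EC2 instance ID from a node's
// spec.providerID, e.g. "aws:///us-east-1a/i-0123456789abcdef0".
// A value without any "/" is returned unchanged.
func instanceIDFromProviderID(providerID string) string {
	return providerID[strings.LastIndex(providerID, "/")+1:]
}

// diffTargets compares the instance IDs expected from the node selector with
// the IDs registered in the target group. It returns the expected IDs that
// are not registered and the registered IDs that were not expected.
func diffTargets(expected, registered []string) (missing, unexpected []string) {
	want := make(map[string]bool, len(expected))
	for _, id := range expected {
		want[id] = true
	}
	got := make(map[string]bool, len(registered))
	for _, id := range registered {
		if got[id] {
			continue // ignore duplicates
		}
		got[id] = true
		if !want[id] {
			unexpected = append(unexpected, id)
		}
	}
	for id := range want {
		if !got[id] {
			missing = append(missing, id)
		}
	}
	sort.Strings(missing)
	sort.Strings(unexpected)
	return missing, unexpected
}
```

A test would pass when both returned slices are empty; reporting the two differences separately makes a failure easier to diagnose.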
@mtulio
Author

mtulio commented Jun 2, 2025

This PR is ready for review. I just added minimal changes to improve readiness.

It looks like the failures in the last run were a CI infra flake. I will check again in the next round.

Hi @kmala, would you mind taking a look here? Thanks!

@kmala
Member

kmala commented Jun 2, 2025

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm label Jun 2, 2025
@mtulio
Author

mtulio commented Jun 3, 2025

Assigning the reviewers recommended by GitHub. PTAL:
/assign dims nckturner

Labels
cncf-cla: yes, kind/bug, kind/failing-test, kind/flake, lgtm, needs-triage, ok-to-test, release-note-none, size/L
Development

Successfully merging this pull request may close these issues.

e2e tests for 'loadbalancer' is failing in distributions which restricts privileged ports
5 participants