Skip to content

Make STSCredentialsProvider prefetch and stale times configurable #1995

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Sep 15, 2020

Conversation

Helmsdown
Copy link
Contributor

Description

Allow clients to specify the time (relative to expiration) for when to refresh STS session tokens.

Motivation and Context

The default STSCredentialsProvider's strategy for refreshing STS sessions is to wait until the session is almost stale and to refresh it either 5 minutes (prefetch) or 1 minute (stale) before it expires. This works great for most use cases. However, it doesn't work well for pre-signed S3 urls. For example, let's say I want to pre-sign an s3 URL that will be good for 90 minutes AND I'm signing the URL with an assumed STS role's credentials. The URL's effective expiration is the min(url expiration, sts session expiration). In other words, if you have STS session credentials that last for 60 minutes, and you want to hand out a URL that will last for 30 minutes, any URLs signed with said session credentials from minutes 31 to 60 will not actually be valid for the full 30 minutes.

The solution is to allow the client to specify the stale and prefetch times of the to use when determining when a session is ready for prefetch and when it's stale. I expose these values as Java Function objects because I intend to reference runtime changeable configuration properties to compute the stale and prefetch times (e.g. prefetch = actual - ${url_length} - ${prefetch_relative_to_url_expiration})

Testing

I modified the StsCredentialsProviderTestBase. For each current test case I replicated it and specified stale and prefetch time functions that provide the same values they did before.

Screenshots (if appropriate)

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)

Checklist

  • I have read the CONTRIBUTING document

  • [Not quite] Local run of mvn install succeeds
    I've seen several failures for classes that are not related to anything I've touched. Re-running the tests in intellij succeeds
    For example: aws-sdk-java-v2/test/codegen-generated-classes-test/target/surefire-reports

  • My code follows the code style of this project

  • My change requires a change to the Javadoc documentation

  • I have updated the Javadoc documentation accordingly

  • I have read the README document

  • I have added tests to cover my changes

  • All new and existing tests passed

  • A short description of the change has been added to the CHANGELOG

  • My change is to implement 1.11 parity feature and I have updated LaunchChangelog

License

  • I confirm that this pull request can be released under the Apache 2 license

@codecov-commenter
Copy link

codecov-commenter commented Aug 21, 2020

Codecov Report

Merging #1995 into master will increase coverage by 0.06%.
The diff coverage is 100.00%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master    #1995      +/-   ##
============================================
+ Coverage     76.62%   76.69%   +0.06%     
  Complexity      225      225              
============================================
  Files          1113     1113              
  Lines         33679    33696      +17     
  Branches       2608     2623      +15     
============================================
+ Hits          25806    25842      +36     
+ Misses         6594     6579      -15     
+ Partials       1279     1275       -4     
Flag Coverage Δ Complexity Δ
#unittests 76.69% <100.00%> (+0.06%) 225.00 <0.00> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ Complexity Δ
...ssdk/services/sts/auth/StsCredentialsProvider.java 88.57% <100.00%> (+4.57%) 0.00 <0.00> (ø)
...nhanced/dynamodb/extensions/WriteModification.java 52.00% <0.00%> (ø) 0.00% <0.00%> (ø%)
...d/dynamodb/internal/extensions/ChainExtension.java 100.00% <0.00%> (ø) 0.00% <0.00%> (ø%)
...amodb/internal/operations/UpdateItemOperation.java 99.04% <0.00%> (+<0.01%) 0.00% <0.00%> (ø%)
...dynamodb/internal/operations/PutItemOperation.java 96.29% <0.00%> (+0.06%) 0.00% <0.00%> (ø%)
.../software/amazon/awssdk/codegen/AddOperations.java 85.33% <0.00%> (+0.19%) 0.00% <0.00%> (ø%)
...ssdk/auth/credentials/HttpCredentialsProvider.java 69.44% <0.00%> (+0.43%) 0.00% <0.00%> (ø%)
...tials/WebIdentityTokenFileCredentialsProvider.java 37.77% <0.00%> (+0.56%) 0.00% <0.00%> (ø%)
...signer/internal/AwsChunkedEncodingInputStream.java 47.29% <0.00%> (+0.67%) 0.00% <0.00%> (ø%)
.../awssdk/core/util/DefaultSdkAutoConstructList.java 39.39% <0.00%> (+3.03%) 0.00% <0.00%> (ø%)
... and 3 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4ee2d29...d0d2d3e. Read the comment docs.

@bmaizels
Copy link
Contributor

bmaizels commented Sep 9, 2020

Hi @Helmsdown , thank you very much for putting this PR together. Taking a closer look at your use-case, we have concerns around using custom Functions to calculate the refresh thresholds. Primarily this is breaking the concept of immutability in the behavior of these objects which could easily lead to bad practices and unexpected race conditions, especially as these functions will have no context around why they are being invoked. We have no objections to exposing these thresholds as configurable values, but we would like to see them statically configured (with a duration of some kind) and thus immutable. What are your thoughts?

@Helmsdown
Copy link
Contributor Author

Using a java Function was a way for me to tie this to runtime changeable configuration properties. If this is undesirable to you for immutability reasons then I'll take static values (over nothing) defined once and know that I'll have to reboot my servers should I wish to have new values take effect.

Would you like me to amend my PR or can you make the changes and get them merged?

@bmaizels
Copy link
Contributor

bmaizels commented Sep 10, 2020

@Helmsdown

Using a java Function was a way for me to tie this to runtime changeable configuration properties.

Understood, but we would prefer you to construct and switch to providers with different configuration on demand.

Would you like me to amend my PR or can you make the changes and get them merged?

If you're able to amend your PR, then we can move forward with it in principle. I would still need to review the actual implementation, I wanted to having this more directional discussion first so that we're all on the same page around the goals of this change.

If you don't have time or inclination to make the change, we'd be happy for you to move this to an issue (as a feature request) and we'll work on it in priority order with other features.

@Helmsdown
Copy link
Contributor Author

I will amend my PR. How do you feel about specifying the static values for prefetch and stale time as java Duration objects?

@bmaizels
Copy link
Contributor

@Helmsdown

I will amend my PR. How do you feel about specifying the static values for prefetch and stale time as java Duration objects?

Yep I think that's what I would do. Thanks!

@Helmsdown Helmsdown force-pushed the sts-refresh-timing branch 3 times, most recently from 462e891 to b6a51b3 Compare September 11, 2020 19:25
@Helmsdown
Copy link
Contributor Author

@bmaizels I made the changes as per our discussion and rebased to the tip of master.

@Helmsdown
Copy link
Contributor Author

Additionally, the travis CI build errors don't seem to be related to my changes. Can you confirm?

Copy link
Contributor

@bmaizels bmaizels left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your testing strategy adds the new [override] codepath to existing tests, but I think we should also add tests that explicitly verify the behavior of the override. For instance we should have a test that proves that when I change the duration with an override that changed duration is respected and the behavior of the credentials provider changes from the default. I believe that if I removed the code from the credentials provider that actually stored and used the override, all these tests would still pass.

@bmaizels
Copy link
Contributor

Additionally, the travis CI build errors don't seem to be related to my changes. Can you confirm?

@Helmsdown Thanks for continuing to work on this! I can confirm that this error has nothing to do with your changes. We're looking into it separately.

@bmaizels
Copy link
Contributor

@Helmsdown Thanks! Did you see my note about testing? I'll restate it in case you missed it in the review:

Your testing strategy adds the new [override] codepath to existing tests, but I think we should also add tests that explicitly verify the behavior of the override. For instance we should have a test that proves that when I change the duration with an override that changed duration is respected and the behavior of the credentials provider changes from the default. I believe that if I removed the code from the credentials provider that actually stored and used the override, all these tests would still pass.

@Helmsdown
Copy link
Contributor Author

Helmsdown commented Sep 15, 2020

Your testing strategy adds the new [override] codepath to existing tests, but I think we should also add tests that explicitly verify the behavior of the override. For instance we should have a test that proves that when I change the duration with an override that changed duration is respected and the behavior of the credentials provider changes from the default. I believe that if I removed the code from the credentials provider that actually stored and used the override, all these tests would still pass.

@bmaizels so there are a few things to do discuss here. Would it be a fair characterization to say that the testing of the StsCredentialsProvider implementations on master is somewhat light?

  1. No tests actually set the asyncCredentialUpdateEnabled flag when constructing the providers.
  2. The test distantExpiringCredentialsUpdatedInBackground on master doesn't really test the async behavior. Partially because of item 1 but partially because it doesn't actually get ahold of the credential provider and try and invoke it AFTER initializing it with callClientWithCredentialsProvider. In fact, the expiration time it gives, 90 seconds, means the first invocation of the provider will result in an invocation (because the provider was "empty") and the second call that happens immediately after will result in an immediate prefetch call because 90 seconds is within the default 5 minute prefetch window.

So if we were to improve upon this it would mean that the default distantExpiringCredentialsUpdatedInBackground would need to do several things:

  1. Set asyncCredentialUpdateEnabled to true in at least one concrete test class
  2. Get access to the provider and invoke it inside the time boxed while loop to "prove" that nothing is happening while the the credentials are in neither the prefetch or stale time windows.
  3. Modify the code such that we don't close down the provider in the try with resource block in the base test (at least in this use case)
  4. Set the credential expiration to be greater than 5 minutes

So are you ok with a ~5 minute unit test testing the default behavior? A test using the overridden stale/prefetch times could be faster given we can modify the prefetch time.

I'm ok do the following if you concur its right direction. If my read of the current test code is correct, and I am willing to be educated otherwise, it is disheartening to have to re-imagine the SDK test code in this case.

@Helmsdown
Copy link
Contributor Author

@bmaizels additionally, if you want to jump on a Google Meet to discuss this in a shorter feedback loop I am more than willing to set one up.

@Helmsdown
Copy link
Contributor Author

I'm already rethinking part of what I said. The test does not need to be 5 minutes.

@bmaizels
Copy link
Contributor

Your comments are fair, I definitely don't want to put paying down pre-existing technical debt on you (unless you think you can easily write those missing tests and want to take a stab at it anyway). I think at a minimum I would propose the following :

  1. Add getters to the new properties.
  2. Write tests to assert a) default values are correct, b) overridden values are correct

Anything beyond that we'll treat as a bonus. Happy to meet and discuss in person if you want to do that too!

@Helmsdown
Copy link
Contributor Author

Could you jump in here for a moment to chat:
https://meet.google.com/jux-fbug-kmb

@Helmsdown Helmsdown force-pushed the sts-refresh-timing branch 2 times, most recently from 9931dc8 to e1e851f Compare September 15, 2020 21:14
@Helmsdown
Copy link
Contributor Author

@bmaizels updated per our discussion in the video call. Let me know if I missed anything.

Copy link
Contributor

@bmaizels bmaizels left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good! Just a couple of doc things to take care of. Sorry I didn't flag these the first time.

Copy link
Contributor

@bmaizels bmaizels left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, our static code analysis flagged a couple of trivial things.

@bmaizels bmaizels self-requested a review September 15, 2020 22:32
@bmaizels bmaizels merged commit ff3f0ae into aws:master Sep 15, 2020
@bmaizels
Copy link
Contributor

Merged. Thanks for your contribution!

@sonarqubecloud
Copy link

Kudos, SonarCloud Quality Gate passed!

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities (and Security Hotspot 0 Security Hotspots to review)
Code Smell A 1 Code Smell

100.0% 100.0% Coverage
0.0% 0.0% Duplication

warning The version of Java (1.8.0_252) you have used to run this analysis is deprecated and we will stop accepting it from October 2020. Please update to at least Java 11.
Read more here

aws-sdk-java-automation added a commit that referenced this pull request Apr 21, 2022
…0816b4ecd

Pull request: release <- staging/0d9f813b-e310-4c7e-8973-f9d0816b4ecd
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants