Description
Describe the bug
At my company, we download and process millions of files per day from S3, so reducing download times is a key priority. The service is written in .NET 8 and uses version 3.7.305.30 of the AWSSDK.S3 package. A couple of months ago, we noticed through experimentation that the CLI produced much quicker download times than the SDK, so we wrote a small class that executes the S3 CLI to download larger files.
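For readers who want to reproduce the CLI path from .NET, here is a minimal sketch of what such a wrapper could look like. It is not the class we run in production: `S3CliDownloader`, `BuildStartInfo`, and `DownloadAsync` are hypothetical names, and the sketch assumes the `aws` executable is on the PATH with credentials already configured.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

// Hypothetical wrapper (illustration only): shells out to `aws s3 cp`
// and throws if the CLI reports a failure.
public static class S3CliDownloader
{
    // Builds the process description. ArgumentList avoids manual quoting
    // of keys and paths that contain spaces.
    public static ProcessStartInfo BuildStartInfo(string bucket, string key, string destination)
    {
        var info = new ProcessStartInfo("aws")
        {
            RedirectStandardError = true,
            UseShellExecute = false,
        };
        info.ArgumentList.Add("s3");
        info.ArgumentList.Add("cp");
        info.ArgumentList.Add($"s3://{bucket}/{key}");
        info.ArgumentList.Add(destination);
        info.ArgumentList.Add("--only-show-errors"); // suppress progress output
        return info;
    }

    public static async Task DownloadAsync(string bucket, string key, string destination)
    {
        using var process = Process.Start(BuildStartInfo(bucket, key, destination))
            ?? throw new InvalidOperationException("Could not start the aws process.");

        // Drain stderr before waiting so a full pipe cannot block the CLI.
        string errors = await process.StandardError.ReadToEndAsync();
        await process.WaitForExitAsync();

        if (process.ExitCode != 0)
        {
            throw new InvalidOperationException(
                $"aws s3 cp exited with code {process.ExitCode}: {errors}");
        }
    }
}
```

A call such as `await S3CliDownloader.DownloadAsync(bucket, key, "cli-download.mp4");` can then be timed with a `Stopwatch` in the same way as the SDK calls.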
I ran a small test to compare the S3 CLI to the TransferUtility and got the following averages for download times. Version 1.0.0.542 is the one with the TransferUtility, while the other 4 are using the CLI downloader.
I later ran a more precise test on a single 5.8 GB file in a bucket in us-east-1, on our most common EC2 instance size (c5.xlarge) in the us-east-1c AZ, and measured the following download times:
- From CLI: 39s
- From GetObjectAsync: 147s
- From TransferUtility: 144s
I tried creating an AWS Support case, but the rep was asking for specific S3 RequestIDs, which seemed irrelevant since these download times impact all of our downloads.
Let me know if there's anything obviously wrong with the implementation of each of the SDK downloaders; if not, hopefully, this is an issue that can be resolved within the dotnet SDK. Thanks.
Regression Issue
- Select this option if this issue appears to be a regression.
Expected Behavior
S3 SDK should have comparable download times to the S3 CLI.
Current Behavior
The S3 SDK is 4 to 6 times slower than the S3 CLI when downloading large files.
Reproduction Steps
For the S3 CLI, I used a simple cp command:
aws s3 cp s3://<bucket>/<key> .
Code:
using System.Diagnostics;
using Amazon.Runtime.CredentialManagement;
using Amazon.S3;
using Amazon.S3.Transfer;
var sharedFile = new SharedCredentialsFile();
sharedFile.TryGetProfile("default", out var profile);
if (!AWSCredentialsFactory.TryGetAWSCredentials(profile, sharedFile, out var credentials))
{
    throw new UnauthorizedAccessException();
}
var s3Client = new AmazonS3Client(credentials);
var transferUtility = new TransferUtility(s3Client);
// TODO: populate these fields with S3 Object target
var bucket = "";
var key = "";
Console.WriteLine($"Starting GetObjectAsync download...");
var timer = Stopwatch.StartNew();
var getFilename = "get-download.mp4";
// Scope the response so it is disposed as soon as the file has been written.
using (var res = await s3Client.GetObjectAsync(new() { BucketName = bucket, Key = key }))
{
    await res.WriteResponseStreamToFileAsync(getFilename, false, CancellationToken.None);
}
timer.Stop();
Console.WriteLine($"GetObjectAsync downloaded {new FileInfo(getFilename).Length} in {timer.Elapsed.TotalSeconds} seconds.");
timer.Restart();
Console.WriteLine($"Starting TransferUtility download...");
var transferFilename = "transfer-download.mp4";
await transferUtility.DownloadAsync(transferFilename, bucket, key);
timer.Stop();
Console.WriteLine($"TransferUtility downloaded {new FileInfo(transferFilename).Length} in {timer.Elapsed.TotalSeconds} seconds.");
Possible Solution
Unsure.
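One untested hypothesis, not confirmed anywhere in this report, is that the CLI splits large objects into byte ranges and fetches them concurrently, while a single GetObjectAsync call streams the whole object over one connection. The sketch below is only a way to test that idea. `RangedDownload`, `SplitRanges`, and `DownloadInPartsAsync` are hypothetical names, and the 8 MiB part size and concurrency of 10 are illustrative values, not tuned settings.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading;
using System.Threading.Tasks;
using Amazon.S3;
using Amazon.S3.Model;

// Hypothetical experiment (illustration only): download one object with
// concurrent ranged GETs, writing each part at its offset in the file.
public static class RangedDownload
{
    // Yields inclusive (start, end) byte ranges that cover [0, length).
    public static IEnumerable<(long Start, long End)> SplitRanges(long length, long partSize)
    {
        for (long start = 0; start < length; start += partSize)
        {
            yield return (start, Math.Min(start + partSize, length) - 1);
        }
    }

    public static async Task DownloadInPartsAsync(
        IAmazonS3 s3, string bucket, string key, string path,
        long partSize = 8L * 1024 * 1024, int maxConcurrency = 10)
    {
        var metadata = await s3.GetObjectMetadataAsync(bucket, key);
        long length = metadata.ContentLength;

        // Pre-size the file so every part can be written at its final offset.
        using (var file = new FileStream(path, FileMode.Create, FileAccess.Write, FileShare.ReadWrite))
        {
            file.SetLength(length);
        }

        using var gate = new SemaphoreSlim(maxConcurrency);
        var tasks = new List<Task>();
        foreach (var (start, end) in SplitRanges(length, partSize))
        {
            await gate.WaitAsync(); // cap the number of in-flight requests
            tasks.Add(Task.Run(async () =>
            {
                try
                {
                    var request = new GetObjectRequest
                    {
                        BucketName = bucket,
                        Key = key,
                        ByteRange = new ByteRange(start, end),
                    };
                    using var response = await s3.GetObjectAsync(request);
                    await using var output = new FileStream(
                        path, FileMode.Open, FileAccess.Write, FileShare.ReadWrite);
                    output.Seek(start, SeekOrigin.Begin);
                    await response.ResponseStream.CopyToAsync(output);
                }
                finally
                {
                    gate.Release();
                }
            }));
        }
        await Task.WhenAll(tasks);
    }
}
```

Timing `await RangedDownload.DownloadInPartsAsync(s3Client, bucket, key, "ranged-download.mp4");` against the figures above would show whether request concurrency accounts for the gap. The sketch has no retries or integrity checks, so it is only suitable as an experiment.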
Additional Information/Context
No response
AWS .NET SDK and/or Package version used
AWSSDK.S3 3.7.305.30
Targeted .NET Platform
net8.0
Operating System and version
Ubuntu