S3 SDK is far slower than S3 CLI #3806

Open
@alehechka

Describe the bug

At my company, we download and process millions of files per day from S3, so reducing download times is a key priority. This service is written in .NET 8 and uses version 3.7.305.30 of the AWSSDK.S3 package. A couple of months ago, we noticed through experimentation that the CLI produced much quicker download times than the SDK, so we wrote a small class that executes the S3 CLI to download larger files.

I ran a small test to compare the S3 CLI to the TransferUtility and got the following average download times. Version 1.0.0.542 uses the TransferUtility, while the other four versions use the CLI downloader.
[Image: average download times per version]

I later ran a more precise test on a single 5.8GB file in a bucket in us-east-1, downloading it on our most common EC2 instance size, c5.xlarge, running in the us-east-1c AZ. The download times were:

  • From CLI: 39s
  • From GetObjectAsync: 147s
  • From TransferUtility: 144s
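
For scale, the timings above can be converted into rough throughput figures. This is only a back-of-the-envelope sketch: it assumes "5.8GB" means 5.8 × 10^9 bytes (the test does not say whether GB or GiB is meant), and the class and method names are made up for illustration.

```csharp
using System;

public static class Throughput
{
    // Assumption: "5.8GB" is decimal gigabytes (5.8e9 bytes).
    public const double FileBytes = 5.8e9;

    // Average throughput in MB/s (1 MB = 1e6 bytes) for a download that took `seconds`.
    public static double MegabytesPerSecond(double seconds) => FileBytes / 1e6 / seconds;

    // How many times longer a download took than the baseline download.
    public static double Slowdown(double seconds, double baselineSeconds) => seconds / baselineSeconds;

    public static void Main()
    {
        // Timings taken from the single-file test above.
        Console.WriteLine($"CLI:             {MegabytesPerSecond(39):F1} MB/s");
        Console.WriteLine($"GetObjectAsync:  {MegabytesPerSecond(147):F1} MB/s ({Slowdown(147, 39):F1}x the CLI time)");
        Console.WriteLine($"TransferUtility: {MegabytesPerSecond(144):F1} MB/s ({Slowdown(144, 39):F1}x the CLI time)");
    }
}
```

Under that assumption, the CLI averages roughly 149 MB/s while both SDK paths land at about 39 to 40 MB/s, so on this particular file the SDK takes about 3.7 to 3.8 times as long.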

I tried creating an AWS Support case, but the representative asked for specific S3 RequestIDs, which seemed irrelevant since these download times affect all of our downloads.

Let me know if there's anything obviously wrong with the implementation of each of the SDK downloaders; if not, hopefully, this is an issue that can be resolved within the dotnet SDK. Thanks.

Regression Issue

  • Select this option if this issue appears to be a regression.

Expected Behavior

S3 SDK should have comparable download times to the S3 CLI.

Current Behavior

S3 SDK takes 4 to 6 times longer than the S3 CLI to download large files.

Reproduction Steps

For the S3 CLI, I used a simple cp command:

aws s3 cp s3://<bucket>/<key> . 

Code:

using System.Diagnostics;
using Amazon.Runtime.CredentialManagement;
using Amazon.S3;
using Amazon.S3.Transfer;

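// Load the "default" profile from the shared credentials file and fail fast if
// either the profile or its credentials cannot be resolved.
var sharedFile = new SharedCredentialsFile();
if (!sharedFile.TryGetProfile("default", out var profile) ||
    !AWSCredentialsFactory.TryGetAWSCredentials(profile, sharedFile, out var credentials))
{
    throw new UnauthorizedAccessException();
}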

var s3Client = new AmazonS3Client(credentials);
var transferUtility = new TransferUtility(s3Client);

// TODO: populate these fields with S3 Object target
var bucket = "";
var key = "";

// Test 1: a single GetObject request streamed straight to a file.
Console.WriteLine("Starting GetObjectAsync download...");

var timer = Stopwatch.StartNew();
var getFilename = "get-download.mp4";
// Scope the response so it is disposed before the second test starts.
using (var res = await s3Client.GetObjectAsync(new() { BucketName = bucket, Key = key }))
{
    await res.WriteResponseStreamToFileAsync(getFilename, false, CancellationToken.None);
}
timer.Stop();
Console.WriteLine($"GetObjectAsync downloaded {new FileInfo(getFilename).Length} in {timer.Elapsed.TotalSeconds} seconds.");

// Test 2: the same object downloaded through TransferUtility.
Console.WriteLine("Starting TransferUtility download...");

timer.Restart();
var transferFilename = "transfer-download.mp4";
transferUtility.Download(transferFilename, bucket, key);
timer.Stop();
Console.WriteLine($"TransferUtility downloaded {new FileInfo(transferFilename).Length} in {timer.Elapsed.TotalSeconds} seconds.");

Possible Solution

Unsure.
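
One direction that might be worth testing: the CLI splits large downloads into ranged GET requests and issues several of them in parallel (its documented defaults are `max_concurrent_requests` = 10 and `multipart_chunksize` = 8MB), whereas the GetObjectAsync code above reads the whole object over a single response stream. The following is a minimal sketch of that idea, not a confirmed fix and not how the SDK or CLI is implemented internally. `RangedDownload`, `SplitRanges`, and the parameter names are hypothetical; the default part size and concurrency simply mirror the CLI defaults.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using Amazon.S3;
using Amazon.S3.Model;

public static class RangedDownload
{
    // Splits [0, length) into inclusive (Start, End) byte ranges of at most partSize bytes.
    public static List<(long Start, long End)> SplitRanges(long length, long partSize)
    {
        if (partSize <= 0) throw new ArgumentOutOfRangeException(nameof(partSize));
        var ranges = new List<(long Start, long End)>();
        for (long start = 0; start < length; start += partSize)
        {
            ranges.Add((start, Math.Min(start + partSize, length) - 1));
        }
        return ranges;
    }

    // Downloads bucket/key to path with up to maxConcurrency ranged GETs in flight.
    public static async Task DownloadAsync(IAmazonS3 s3, string bucket, string key, string path,
        long partSize = 8L * 1024 * 1024, int maxConcurrency = 10, CancellationToken ct = default)
    {
        // A HEAD request provides the object size needed to plan the ranges.
        var head = await s3.GetObjectMetadataAsync(bucket, key, ct);
        var length = head.ContentLength;

        // Pre-size the file so each part can be written at its own offset.
        using (var file = new FileStream(path, FileMode.Create, FileAccess.Write, FileShare.Write))
        {
            file.SetLength(length);
        }

        using var gate = new SemaphoreSlim(maxConcurrency);
        var tasks = SplitRanges(length, partSize).Select(async range =>
        {
            await gate.WaitAsync(ct);
            try
            {
                var request = new GetObjectRequest
                {
                    BucketName = bucket,
                    Key = key,
                    ByteRange = new ByteRange(range.Start, range.End)
                };
                using var response = await s3.GetObjectAsync(request, ct);
                // Each part uses its own stream positioned at the part's offset.
                using var output = new FileStream(path, FileMode.Open, FileAccess.Write, FileShare.ReadWrite);
                output.Seek(range.Start, SeekOrigin.Begin);
                await response.ResponseStream.CopyToAsync(output, ct);
            }
            finally
            {
                gate.Release();
            }
        });
        await Task.WhenAll(tasks);
    }
}
```

In the reproduction program this could be timed the same way as the other two downloads, for example with `await RangedDownload.DownloadAsync(s3Client, bucket, key, "ranged-download.mp4");`. The sketch has no retries, checksum validation, or cleanup of partial files, so it is only meant for comparing timings.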

Additional Information/Context

No response

AWS .NET SDK and/or Package version used

AWSSDK.S3 3.7.305.30

Targeted .NET Platform

net8.0

Operating System and version

Ubuntu

Metadata

    Labels

    feature-request (A feature should be added or improved.), p1 (This is a high priority issue), queued, s3
