Improve TileSingle image performance #945

Open
mzur opened this issue Oct 10, 2024 · 0 comments
mzur commented Oct 10, 2024

A tiled image can easily produce 100,000 files. It is extremely slow to upload them sequentially to an S3 backend, as is done here:

public function uploadToStorage()
{
    // +1 for the connecting slash.
    $prefixLength = strlen($this->tempPath) + 1;
    $iterator = $this->getIterator($this->tempPath);
    $disk = Storage::disk(config('image.tiles.disk'));
    $fragment = fragment_uuid_path($this->image->uuid);
    try {
        foreach ($iterator as $pathname => $fileInfo) {
            $disk->putFileAs($fragment, $fileInfo, substr($pathname, $prefixLength));
        }
    } catch (Exception $e) {
        $disk->deleteDirectory($fragment);
        throw $e;
    }
}

To speed this up, we could use asynchronous requests when an S3 backend is used. This circumvents Laravel's filesystem abstraction and uses the S3 SDK directly. Basically, in the method shown above, detect whether the storage disk uses an S3 adapter. Then extract the S3 client from it and upload the files asynchronously as described here: https://stackoverflow.com/a/65365224/1796523
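The adapter detection could look like the following sketch. It assumes a Laravel version (9+) where the S3 disk is wrapped in `Illuminate\Filesystem\AwsS3V3Adapter`, which exposes the underlying SDK client via `getClient()`; older versions would need to go through the Flysystem adapter instead.

```php
use Illuminate\Filesystem\AwsS3V3Adapter;
use Illuminate\Support\Facades\Storage;

$disk = Storage::disk(config('image.tiles.disk'));

if ($disk instanceof AwsS3V3Adapter) {
    // Returns the \Aws\S3\S3Client instance used by the disk.
    $client = $disk->getClient();
    // ...upload asynchronously with $client->putObjectAsync().
} else {
    // Fall back to the sequential putFileAs() loop for other disks
    // (e.g. the local disk used in development).
}
```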

Make the number of parallel connections configurable. The default number should be 10.
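A possible config entry for this (the `upload_concurrency` key and env variable name are placeholders, not existing BIIGLE config):

```php
// config/image.php
'tiles' => [
    'disk' => env('IMAGE_TILES_DISK', 'tiles'),
    // Number of parallel S3 upload connections.
    'upload_concurrency' => env('IMAGE_TILES_UPLOAD_CONCURRENCY', 10),
],
```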

Code from StackOverflow:

$files = glob('/path/to/your/files/*'); // This will return an array of all files in your folder

try {
    // Init of S3 client
    $s3Client = new \Aws\S3\S3Client([
        'version'       => 'latest',
        'region'        => '', // Desired AWS region
        'credentials'   => [
            'key'       => '', // Your AWS key
            'secret'    => '', // Your AWS key secret
        ],
    ]);
    
    // Logic about your requests and how to execute them
    $uploads = function($files) use ($s3Client) {
        foreach ($files as $file) {
            yield $s3Client->putObjectAsync([
                'Bucket'        => '', // Name of your bucket
                'Key'           => basename($file),
                'SourceFile'    => $file,
            ]);
        }
    };
    
    // Execute your requests with Guzzle because $s3Client->putObjectAsync() returns \GuzzleHttp\Promise\Promise
    \GuzzleHttp\Promise\Each::ofLimit(
        $uploads($files),
        3, // How many concurrent requests to run at once
        function($response, $index) { // Callback on success
            var_dump('Success: ' . $index); 
        },
        function($reason, $index) { // Callback on failure
            var_dump('Error: ' . $index);
        }
    )->wait();
} catch (\Aws\S3\Exception\S3Exception $e) {
    var_dump($e->getMessage());
}
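Putting both parts together, `uploadToStorage()` could branch on the adapter like in this sketch. The bucket lookup and the `image.tiles.upload_concurrency` config key are assumptions from the proposal above; rejecting the aggregate promise on the first failure makes the existing catch block clean up as before.

```php
public function uploadToStorage()
{
    // +1 for the connecting slash.
    $prefixLength = strlen($this->tempPath) + 1;
    $iterator = $this->getIterator($this->tempPath);
    $disk = Storage::disk(config('image.tiles.disk'));
    $fragment = fragment_uuid_path($this->image->uuid);

    try {
        if ($disk instanceof \Illuminate\Filesystem\AwsS3V3Adapter) {
            $client = $disk->getClient();
            // Assumed lookup of the bucket configured for the tiles disk.
            $bucket = config('filesystems.disks.'.config('image.tiles.disk').'.bucket');

            $uploads = function () use ($iterator, $client, $bucket, $fragment, $prefixLength) {
                foreach ($iterator as $pathname => $fileInfo) {
                    yield $client->putObjectAsync([
                        'Bucket' => $bucket,
                        'Key' => $fragment.'/'.substr($pathname, $prefixLength),
                        'SourceFile' => $pathname,
                    ]);
                }
            };

            \GuzzleHttp\Promise\Each::ofLimit(
                $uploads(),
                config('image.tiles.upload_concurrency', 10),
                null,
                function ($reason) {
                    // Reject the aggregate promise so wait() throws and
                    // the catch block below deletes the partial upload.
                    throw $reason;
                }
            )->wait();
        } else {
            foreach ($iterator as $pathname => $fileInfo) {
                $disk->putFileAs($fragment, $fileInfo, substr($pathname, $prefixLength));
            }
        }
    } catch (Exception $e) {
        $disk->deleteDirectory($fragment);
        throw $e;
    }
}
```

Because the generator yields promises lazily, at most the configured number of uploads is in flight at any time, regardless of how many tile files the iterator produces.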
@mzur mzur moved this to Medium Priority in BIIGLE Roadmap Oct 10, 2024