Is it possible to use different S3 buckets for cache and storage and just copy between them? For instance, say I'm using an EU bucket for fast uploads from users in the EU. After the cache phase, during processing and prior to promoting from cache to store (the store being a US bucket as the final destination, so it can feed the CDN), I download the file to my servers, generate some versions, then upload the original and the versions to the US store bucket. It would be nice, though, to copy the original from the EU bucket to the US bucket while uploading only the new versions from my server, to avoid re-uploading a file that already exists in the cache bucket (EU) just to get it into the store bucket (US).
Shrine's S3 storage automatically does a copy when the input file is also stored on S3. So, in your case, a copy should automatically happen between temporary and permanent storage, provided you're using the same S3 credentials for both buckets.
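For reference, a minimal storage setup for this scenario might look like the following. The bucket names and regions are placeholders; the important part is that both storages share the same credentials, so the S3 storage can issue a server-side copy between them:

```ruby
require "shrine"
require "shrine/storage/s3"

# Shared credentials for both buckets, so promotion can use S3 COPY
# instead of a download/re-upload round trip.
s3_options = {
  access_key_id:     ENV["AWS_ACCESS_KEY_ID"],
  secret_access_key: ENV["AWS_SECRET_ACCESS_KEY"],
}

Shrine.storages = {
  cache: Shrine::Storage::S3.new(bucket: "my-app-cache-eu", region: "eu-west-1", **s3_options),
  store: Shrine::Storage::S3.new(bucket: "my-app-store-us", region: "us-east-1", **s3_options),
}
```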
Awesome. Good to know
But wait, how would it know that the file being promoted is the exact same as the original? Let's say I processed and transformed the original file on my server, and DID want to upload that iteration as the main file along with any new versions/derivatives, rather than just copying over the original and uploading its new derivatives. How would it know whether to upload a transformed original file from my server during promotion, or just copy the original file from the other bucket if no transformation happened?
With the versions and processing plugins, Shrine will upload any files returned by the `process(:store)` block to permanent storage. If the original file is included in the result hash, it will get uploaded to permanent storage and saved; otherwise it will be left out.
Note that it's highly recommended to keep the original file, as shown in the documentation examples. In Shrine 3.0, the versions plugin will be replaced by the new derivatives plugin, which separates processing from promotion, so there you won't have the option to leave out the original file (though you still could if you really wanted to).
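To illustrate the Shrine 3.0 direction: with the derivatives plugin, processing produces a hash of derivative files while the original is stored regardless. This is a sketch; the thumbnail size and the ImageProcessing dependency are illustrative, not prescribed by the thread:

```ruby
require "image_processing/mini_magick"

class ImageUploader < Shrine
  plugin :derivatives

  # Derivatives are created separately from promotion; the original
  # file is always kept and stored alongside them.
  Attacher.derivatives do |original|
    magick = ImageProcessing::MiniMagick.source(original)

    { small: magick.resize_to_limit!(300, 300) }
  end
end
```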
Maybe I’m misunderstanding, but per the last reply, I thought it would copy the originally cached file from bucket to bucket if the cache location was also S3? If that’s the case, in Shrine 2.x, how would it know if the original file has been untouched since we can add it back to any key in the hash during processing, and even manipulate it before doing that? If it just copies the original from bucket to bucket, it would obviously lose whatever changes the server made to the downloaded original copy during promotion.
I think you’re assuming there is more going on than it actually is. When processing, the server doesn’t make any “changes” to the original file, it just generates a new file on disk. The original file always stays untouched. When you’re done generating new processed files, you decide at the end of the block what you want to upload to permanent storage and save.
The original file yielded to the process block is a `Shrine::UploadedFile` object, which points to the file on the temporary S3 bucket. If you return it as part of the result hash, then when Shrine's S3 storage begins "uploading" that `Shrine::UploadedFile` object, it will detect that the file is already on S3 and do a copy.
If you choose to do processing, you'll additionally return the processed files at the end of the block (usually as `Tempfile` objects). Shrine is completely agnostic as to how you've created those files; it doesn't know that you've processed them from the original file. So, to Shrine the result hash is just a hash of files it should upload, most of which are `Tempfile` objects, with one being a `Shrine::UploadedFile` object (the original cached file).
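Putting the explanation above into code, a `process(:store)` block in Shrine 2.x might look like this. The `resize` helper is hypothetical, standing in for whatever processing produces a `Tempfile`:

```ruby
class ImageUploader < Shrine
  plugin :processing
  plugin :versions

  # `io` is a Shrine::UploadedFile pointing at the cached file
  # on the temporary (EU) S3 bucket.
  process(:store) do |io, context|
    original = io.download            # Tempfile on the app server
    thumb    = resize(original, 300)  # hypothetical helper returning a Tempfile

    # `io` (already on S3) gets server-side copied to the store bucket;
    # `thumb` (a Tempfile) gets uploaded from this server.
    { original: io, thumb: thumb }
  end
end
```

If you instead wanted a transformed file to replace the original, you would put that `Tempfile` under the `:original` key rather than returning `io`, and Shrine would upload it from your server like any other file.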
I hope that clears it up. If it’s still unclear, I would recommend reading the source code.
That was super helpful and totally cleared it up for me. Thanks so much!