Hi there!
Sidenote I had to jump through quite some hoops because I wanted to be helpful and link to various source files and documentation files, but discource won’t allow me to use more than 2 links in a post. I’ve spend more time trying to figure out how to get around this, or how to become trusted than it took to write this entire post.
Here are the URLs, I’m trying to link to, annotated with [label]* in the post
- The guide:
https://shrinerb.com/docs/changing-location#1-update-the-location-generation
- Direct upload guide:
https://shrinerb.com/docs/direct-s3#strategy-a-dynamic
- Source (using id as the location):
https://github.com/shrinerb/shrine/blob/master/lib/shrine.rb#L204-L217
- Source (atomic persist equality):
https://github.com/shrinerb/shrine/blob/master/lib/shrine/uploaded_file.rb#L222-L229
Following this guide*, I’m trying to move files.
S3 and the file ID
When there is no generate_location
present (so no method and no pretty_location
plugin, or any other plugin that defines a generate_location
method), the default of Shrine
is to store files at the filename of id
, as determined by the Direct Upload guide*, which states the "id"
is used to determine where to store the file on S3.
According to [this source code]*, the location
is used as the "id"
value.
I don’t know if this is standard for all storage backends, but it was enough to find the (potential) underlying issue.
Atomic persist
The atomic persist implementation does an equality check on the uploaded file[s]. That’s implemented [here]*. In short: it compares the type (class), id and storage_key (e.g. "cache"
/ "store"
). A file with the exact same id and class but in different storages is, as expected, a different file. If everything matches, its the same file.
When changing metadata, derived versions, this is great. You can generate new derivatives, changed metadata and throw away the data if the underlying file is changed between the start of that process.
Problem
[The guide]* lays down the following steps:
- change or implement
generate_location
- migrate (move) the existing files, either directly or via backgrounding
# Example code for step 2
begin
attacher.atomic_persist # persist changes if attachment has not changed in the meantime
old_attacher.destroy_attached # delete files on old location
rescue Shrine::AttachmentChanged, # attachment has changed during reuploading
ActiveRecord::RecordNotFound # record has been deleted during reuploading
attacher.destroy_attached # delete now orphaned files
end
Since s3 moves the file by changing the ID, and the ID is used to determine equality, atomic_persist
will always throw a Shrine::AttachmentChanged
error, because it’s changing the location, which changes the ID, which makes them inequal.
Questions
- Is the guide incorrect? Or is the guide incomplete?
- Is there a Shrine-idiomatic solution for S3?
- Is using non-atomic persist "Good enough"™?
Workaround
I’m using #persist
instead of #atomic_persist
right now; and I can get some atomicity by transacting around persist, seeing if the original record filedata
hasn’t changed. It works.
I hope everything is clear. I can set-up a minimal reproducible example, but this is all copied straight from the guide.