Delete record without deleting attachments

Hi,

I’m building some sort of image hosting service and I have built something that copies attachment_data to a new record if the file is already uploaded in another record, so it’s not hosted twice in s3.

I’m looking for a way to delete a record (with .destroy !) without deleting the file from s3 so the other records that use the same attachment won’t lose it.

Thanks!

Hi

There is a keep_files plugin to handle that.

You’ll need to take care of deleting the files when all the records referring to the file are deleted with custom code if you care about that. One strategy is having a background job running periodically purging orphan files.
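For reference, enabling the plugin is roughly a one-liner. This is a minimal sketch assuming Shrine 3.x; `ImageUploader` is a hypothetical uploader name, and on Shrine 2.x the plugin took options (`destroyed: true`, `replaced: true`) instead, so check the docs for your version.

```ruby
require "shrine"

# Hypothetical uploader; substitute your own uploader class.
class ImageUploader < Shrine
  # Assumes Shrine 3.x: stored files are left in place when a record is
  # destroyed or its attachment is replaced.
  plugin :keep_files
end
```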

Thank you!
That should help me out.

One question though. If I delete all records that hold information about files, how do I find orphan files in, for example, S3 storage?

Use a brute-force approach: fetch a list of all files in S3 and filter out every file that is still referenced in your model database. Any files that remain are no longer referenced and are thus orphaned, so you can delete them from S3 using the S3 client.
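The filtering step itself could look something like the sketch below. It is plain Ruby and only covers the set difference; `orphaned_keys` and the sample keys are made up for illustration, and how you list the bucket and collect the referenced keys depends on your setup.

```ruby
require "set"

# stored_keys:     every object key listed from the bucket (assumed input)
# referenced_keys: every key still referenced by a database record (assumed input)
# Returns the stored keys that no record refers to any more.
def orphaned_keys(stored_keys, referenced_keys)
  referenced = referenced_keys.to_set # Set lookup keeps the filter fast on large buckets
  stored_keys.reject { |key| referenced.include?(key) }
end

# Illustrative data only.
stored     = ["a.jpg", "b.jpg", "c.jpg"]
referenced = ["a.jpg", "c.jpg"]
orphans    = orphaned_keys(stored, referenced)
```

In a real job, the stored keys would come from listing the bucket and the referenced keys from the `id` values inside each record's attachment data, taking any storage prefix into account. It is probably wise to skip very recent objects so uploads that are still in progress are not purged.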

Another strategy might be designing a db model to help handle that. Something like using a generic File model that only deals with files and exists as long as a file exists in S3. Then you use db relations from other models to the File model, so on creation Model A creates a File record Z and links to it. Model B then links to File record Z as well. Now say you want to remove the reference to File Z in Model A; you can do that. Then later you remove the reference to File Z from Model B. Now File Z is orphaned. It’s easy to query for orphaned files and delete the record, which in turn causes Shrine to delete the file from S3. This might be more trouble than it’s worth… it’s an idea.
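To make the bookkeeping concrete, here is a small in-memory sketch of the idea. `FileRegistry` is a hypothetical stand-in for the File table and its relations, not something Shrine or Rails provides; in a real app these would be database records and associations.

```ruby
# Hypothetical in-memory stand-in for a File table plus its references.
class FileRegistry
  def initialize
    # file id => list of owners that currently reference it
    @references = Hash.new { |hash, key| hash[key] = [] }
  end

  # Record that owner references the file.
  def link(file_id, owner)
    @references[file_id] << owner unless @references[file_id].include?(owner)
  end

  # Drop owner's reference to the file, if the file is known.
  def unlink(file_id, owner)
    @references[file_id].delete(owner) if @references.key?(file_id)
  end

  # Files that no owner references any more.
  def orphans
    @references.select { |_file_id, owners| owners.empty? }.keys
  end
end

registry = FileRegistry.new
registry.link("Z", "Model A")
registry.link("Z", "Model B")
registry.unlink("Z", "Model A")
still_referenced = registry.orphans # Model B still links to Z
registry.unlink("Z", "Model B")
now_orphaned = registry.orphans     # nothing links to Z any more
```

With real models, the orphan query would be a lookup for File records that have no remaining associated records, and destroying those records is what would let Shrine remove the stored file.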

Hi there,

If you’re able to post the code re: the above functionality, I would be curious to see your implementation of this.

rgds
Ben

OK sure, let me try to get this condensed down.

I’m using the signature plugin to generate MD5 hashes for the files and metadata_attributes to store the hash in a separate column for easy access. The documentation on this is good, so I don’t think I need to explain it.
I then do a select on the database for the hash of the new upload*, and if it exists I remove the attachment (set it to nil again) and copy over the values for _data and _md5. I’m using ActiveRecord with Rails and it looks like this:

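# Look for an existing post of the same type with the same file hash.
existing = Post.find_by(type: @post.type, attachment_md5: params[f].md5)
if existing
  # Reuse the already-stored file: copy its data instead of uploading again.
  @post.attachment_md5  = existing.attachment_md5
  @post.attachment_data = existing.attachment_data
  @post.video_type      = existing.video_type
  post_empty = false
else
  # No match, so attach the new upload as usual.
  @post.attachment = params[f]
  @post.video_type = "upload" if f == :video_upload
end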
* Added bonus:

As I’m using Rails, the uploaded files get processed by ActionDispatch before they are handed over to my code and Shrine. I generate the hash there so I can have it before the Shrine validations run. For this I created a (probably pretty bad) monkey patch (it’s my first one, be gentle!):

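require 'digest'

module MonkeyPatch
  module ActionDispatch
    # Adds an md5 accessor to uploaded files, filled in when the upload is wrapped.
    module IncludeMd5
      attr_accessor :md5

      def initialize(hash)
        super

        # Stream the tempfile through the digest instead of reading it all into memory.
        @md5 = Digest::MD5.file(hash[:tempfile].path).hexdigest
      end
    end
  end
end

class ActionDispatch::Http::UploadedFile
  prepend MonkeyPatch::ActionDispatch::IncludeMd5
end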

Hope this helps!


Second strategy sounds like a good idea, I’ll be going with that. Thank you!

Thanks for posting this.

A further question: where are you placing this code? In the controller?

First part is from the controller, the second (monkey patch) I have in an initializer.
