Concurrent Derivative Processing

First, I want to say thanks for the great project and the excellent documentation.

I’m attempting to use Shrine for video processing where I create two versions of a video (mp4 and webm), and it works great if I create the two versions of the video sequentially.

Unfortunately, the webm version takes a significant amount of time compared to the mp4 (higher compression/quality settings), and with the way the derivatives work right now I have to wait for both of them to finish before they’re available.

I’m trying to figure out a way to create the derivatives concurrently, and I’m using the guide I found in the documentation here: https://shrinerb.com/docs/processing#c-creating-derivatives-concurrently

Everything works great, and I’m able to fire off two different jobs to transcode the video (one for mp4, and another for webm).

I’m running into an issue, though: the webm job has to run twice because of how the atomic persist method from the documentation behaves. One job finishes quickly and persists, so when the slower job attempts to persist, the Shrine::AttachmentChanged exception is rightfully raised. This fails the job, which is then retried and succeeds.

I’m wondering if there is a suggested way around this? Not using atomic persists?

Also, perhaps the sample in the documentation can be updated as well, as I imagine this could be a common issue.

Any help is appreciated - thanks!

Hmm, the Shrine::AttachmentChanged exception should be raised only when the main file changes, not when derivatives change. I’m assuming that you’re processing the cheaper derivative in the PromoteJob, which also promotes the main file to permanent storage. When the other job finishes processing, it thinks the file has changed, because it was spawned with the cached file data, while the record now points to the promoted file.
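To make the failure mode above concrete, here is a toy model in plain Ruby. It is not Shrine's implementation; `ToyRecord`, `ToyAttachmentChanged`, `persist_derivatives` and `simulate_stale_job` are hypothetical names, and the file data hashes are made up. It only illustrates the idea that a job carrying a snapshot of the cached file gets rejected once the record points to the promoted file.

```ruby
# Toy model only: NOT Shrine's code, just the shape of the "did the main file change?" check.
class ToyAttachmentChanged < StandardError; end

# Hypothetical stand-in for a database row with an attachment column.
class ToyRecord
  attr_accessor :file_data, :derivatives

  def initialize(file_data)
    @file_data   = file_data
    @derivatives = {}
  end
end

# Persist derivatives only if the record still references the file the job was spawned with.
def persist_derivatives(record, spawned_with, new_derivatives)
  unless record.file_data == spawned_with
    raise ToyAttachmentChanged, "main file changed since the job was spawned"
  end

  record.derivatives = record.derivatives.merge(new_derivatives)
end

def simulate_stale_job
  cached   = { "storage" => "cache", "id" => "abc.mov" }
  promoted = { "storage" => "store", "id" => "abc.mov" }
  record   = ToyRecord.new(cached)

  # The slow job is spawned while the record still references the cached file.
  stale_snapshot = record.file_data

  # Meanwhile, promotion moves the main file to permanent storage.
  record.file_data = promoted

  stale_result =
    begin
      persist_derivatives(record, stale_snapshot, { "webm" => "video.webm" })
      :persisted
    rescue ToyAttachmentChanged
      :attachment_changed
    end

  # A job spawned after promotion carries the promoted file data and persists cleanly.
  persist_derivatives(record, promoted, { "webm" => "video.webm" })

  { stale_result: stale_result, derivatives: record.derivatives }
end
```

In this model the job that holds the cached snapshot is rejected, while the same work started from the promoted file data goes through.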

I would recommend processing both mp4 and webm in separate background jobs that you spawn after calling #atomic_promote. #atomic_promote should be quick if you’re using S3 storage for both temporary and permanent storage, because in this case Shrine just issues an S3 copy operation.

Both of these jobs could be processed by the same worker class, and you’d just call #atomic_persist in the end. I believe that should work.
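A rough sketch of that arrangement follows, modeled on the job shape used in the Shrine guides. Treat it as an illustration under assumptions rather than a drop-in implementation: it assumes Shrine 3 with the backgrounding and derivatives plugins, an Active Record style `find`, a Sidekiq-style `perform_async` on the job class, and a derivatives processor that accepts a `name:` option. The `DERIVATIVE_NAMES` constant is hypothetical, and error handling is left out. Please check the Shrine calls against the version you are running.

```ruby
# Sketch only. Include your job backend's module (e.g. Sidekiq::Worker) in both
# classes; it is omitted here so the snippet loads without any gems.

class DerivativeJob
  def perform(attacher_class, record_class, record_id, name, file_data, derivative_name)
    record   = Object.const_get(record_class).find(record_id)
    attacher = Object.const_get(attacher_class).retrieve(model: record, name: name, file: file_data)

    # Assumption: the uploader's derivatives processor takes a `name:` option
    # and builds only that one derivative.
    attacher.create_derivatives(name: derivative_name)

    attacher.atomic_persist do |reloaded_attacher|
      # Keep derivatives that the other job may have persisted in the meantime.
      attacher.merge_derivatives(reloaded_attacher.derivatives)
    end
  end
end

class PromoteJob
  DERIVATIVE_NAMES = %w[mp4 webm].freeze # hypothetical derivative names

  def perform(attacher_class, record_class, record_id, name, file_data)
    record   = Object.const_get(record_class).find(record_id)
    attacher = Object.const_get(attacher_class).retrieve(model: record, name: name, file: file_data)

    # Promote first, so the jobs below are spawned with the promoted file data.
    attacher.atomic_promote

    DERIVATIVE_NAMES.each do |derivative_name|
      DerivativeJob.perform_async(
        attacher_class, record_class, record_id, name, attacher.file_data, derivative_name
      )
    end
  end
end
```

The ordering is the important part: because each DerivativeJob receives `attacher.file_data` taken after `atomic_promote`, its later `atomic_persist` compares against the promoted file rather than the cached one.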

Hi Janko,

You were right: the issue was that I was triggering the video processing from within the PromoteJob, which was the job promoting the file to storage (I had set it up for an image thumbnail).

Triggering the background jobs after the atomic_promote call within the PromoteJob fixed the issue.

Great catch - I never would have suspected that to be the issue. Thanks!