Optimizing loading of thousands of records

Hi there! I am struggling with some performance optimisation and would like to see if I am missing something.

I am storing the image files belonging to a gallery with Shrine, in a Rails app using ActiveRecord.
Some galleries have thousands of images, and I am looking at optimising the loading times.

For each gallery I am generating a JSON object representing the gallery and its files. My code looks like this:

class Image < ApplicationRecord
  include ImageFileUploader::Attachment(:file)

  def as_json
    { … } # Here I generate the JSON for one image
  end
end

# JSON generation
{
  id: gallery.id,
  title: gallery.title,
  images: gallery.images.map(&:as_json),
}

I ran some profiling, and the request is spending an awfully long time generating each image’s JSON. I rewrote my Image#as_json method to read directly from file_data, hoping to avoid instantiating any Shrine-related code, but the profile still looks like this (zoomed):

Is there any way to avoid having define_entity_methods called? I am deliberately not calling any Shrine methods in this part of the code, but it still takes more than a third of the request’s time.

My end goal here is to add a caching layer for the whole JSON, but I would like first to reduce the un-cached response times.
I am also considering bypassing ActiveRecord entirely for this specific use-case and generating my JSON directly from the SQL results, but I would like to avoid it if possible.

I am open to any ideas you might have :slight_smile:

Hi Renaud,

If you’re calling the file_data column method directly, then Shrine::Attacher should not be instantiated, and Shrine should not be loading the attachment data. I suspect you have some code which is still calling some Shrine methods, which is then in turn instantiating the attacher.

The define_entity_methods is a metaprogramming method that’s called on Shrine::Attachment.new, which gets executed when your model file is loaded. In other words, this method is not called at runtime. Your profiling tool is just pointing to the method that’s loading the attached file, which the define_entity_methods method defined.
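To illustrate why the profiler names define_entity_methods even though it only runs at load time, here is a rough sketch in plain Ruby. ToyAttachment and ToyImage are made-up names and a simplified stand-in, not Shrine's actual source; the only assumption is that the attachment module defines its methods with define_method inside a method of that name.

```ruby
# Simplified stand-in for an attachment module (hypothetical, NOT Shrine's code).
class ToyAttachment < Module
  def initialize(name)
    super()
    # Runs once, while the including class body is being evaluated.
    define_entity_methods(name)
  end

  private

  def define_entity_methods(name)
    define_method(name) do
      # This block is the body of the generated method and runs on every call.
      # Return the label of the current frame, which is the kind of name a
      # backtrace or profiler shows for time spent here.
      caller_locations(0, 1).first.label
    end
  end
end

class ToyImage
  include ToyAttachment.new(:file)
end

# Calling the generated method reports a frame labelled as a block inside
# define_entity_methods, although define_entity_methods itself is not re-run.
puts ToyImage.new.file
```

So a profile entry mentioning define_entity_methods most likely means that one of the generated methods (such as the attachment reader) is being called at runtime.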

Just a few thoughts:

  def as_json
    super(:only => [:id])
  end
  • Completely eliminate the shrine code and see if it’s still slow?

  • Also, can you paste some of the JSON that is generated by the as_json method, and/or include the full as_json method you’ve defined?

I did indeed make a mistake and was calling file somewhere. Now Shrine does not add any overhead, but deserializing thousands of JSON documents is very slow. Even instantiating the AR objects consumes a good part of the response time.
I think I have no choice but to fetch the array of records directly from Postgres and parse it with a fast JSON parser, maybe by concatenating all the JSON documents into one big string and parsing it at once, then iterating over the result to generate my output.
I will try to do some benchmarks and post numbers here, if somebody finds it useful :slight_smile:
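For anyone following along, a minimal sketch of the "concatenate and parse once" idea, using only Ruby's standard json library. It assumes the raw column values arrive as JSON strings (for example from something like gallery.images.pluck(:file_data) on a text column; a jsonb column may already be typecast by ActiveRecord). The sample rows, the parse_all helper and the output keys are made up for illustration, and whether this is actually faster would need benchmarking.

```ruby
require "json"

# Hypothetical raw column values, one JSON document per image row.
raw_rows = [
  '{"id":"a1.jpg","storage":"store","metadata":{"size":1024}}',
  '{"id":"b2.jpg","storage":"store","metadata":{"size":2048}}',
]

# Wrap all documents in a single JSON array so the parser is invoked once
# instead of once per row. NULL columns (nil) are dropped first, because an
# empty element would make the combined string invalid JSON.
def parse_all(raw_rows)
  JSON.parse("[#{raw_rows.compact.join(',')}]")
end

# Build plain hashes for the output, without instantiating any model objects.
images = parse_all(raw_rows).map do |data|
  { id: data["id"], size: data.dig("metadata", "size") }
end

puts JSON.generate(images)
```

A faster parser could be swapped in for JSON.parse later; the concatenation step stays the same.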