Rails find_each/ find_in_batches pitfalls

Alex Korolev
1 min readMay 19, 2021

Every rails-developer knows about methods find_each or find_in_batches (if you check the implementation for the first one you find that it uses the second one under the hood) from ActiveRecord. Even more, it is a good practice to use these methods when you need to iterate throw a big amount of records in DB.
If you don't familiar with it you can check this https://apidock.com/rails/ActiveRecord/Batches/find_each.

Looks like, we always can/must use find_each instead of each when we need to iterate throw some bunch of records. We also thought this way. But one of our applications struggled with DB: load for DB was around 95%, it was obvious that DB is a bottleneck.

I’d started to analyze slow-query logs in MySQL and reports in NewRelic. It was quite simple to find the place in the code which produce this SQL query:

Survey.select(:id, :offer_uuid).active.where(provider: provider).where.not(survey_id: current_surveys).find_in_batches(batch_size: 10) do |survey_batch|
DeactivateSurveysService.perform(survey_batch, provider)
end

Because of using where.not MySQL should fetch all records in order to sort them by id (this sorting used in find_each/find_in_batches) and only after this takes 10 records - it can very slow if the scope of the records is big how we had.

CONCLUSION

It is good practice using find_each/find_in_batches in order to not consume a lot of memory but need to be aware of what resulting SQL is.

--

--