Unnoticed Elephant
There are many reasons a Ruby on Rails application can fail in production. I'll share them in the weird-production series so you can avoid them.
Most Ruby on Rails applications use Redis for caching and for Sidekiq job queues. This works until you run a big migration, schedule a lot of jobs, or cache a lot of data.
config.cache_store = :redis_store, ENV['REDIS_URL']
I have seen Redis run out of memory (OOM) many times. One common cause is that most bug trackers use background jobs to send error reports asynchronously, and the jobs' payloads are usually big. Once Redis is out of memory, operations keep failing, and each failure schedules new jobs. The loop continues until you remove the jobs from the queue.
- Some teams pass large data, such as HTML responses, as Sidekiq job arguments. Imagine scheduling thousands of jobs, each carrying around 1 MB of HTML. That adds up to gigabytes of Redis memory.
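One way to catch oversized jobs early is to measure the serialized size of the arguments before enqueueing. The sketch below is only an illustration, not a Sidekiq API: `MAX_JOB_PAYLOAD_BYTES`, `JobPayloadTooLarge`, `job_payload_bytes`, and `ensure_small_payload!` are hypothetical names, and the 10 KB limit is an arbitrary assumption. Sidekiq serializes job arguments to JSON, so the JSON size is a reasonable approximation of what lands in Redis.

```ruby
require "json"

# Hypothetical limit; pick a value that fits your Redis memory budget.
MAX_JOB_PAYLOAD_BYTES = 10 * 1024

class JobPayloadTooLarge < StandardError; end

# Approximate the bytes a job's arguments occupy once serialized to JSON.
def job_payload_bytes(args)
  JSON.generate(args).bytesize
end

# Return the payload size, or raise if the arguments exceed the limit.
def ensure_small_payload!(args, limit: MAX_JOB_PAYLOAD_BYTES)
  size = job_payload_bytes(args)
  return size if size <= limit

  raise JobPayloadTooLarge, "job arguments are #{size} bytes (limit: #{limit})"
end

html = "<p>hello</p>" * 100_000 # roughly 1.2 MB of HTML
ensure_small_payload!([42])     # a record ID is only a few bytes
# ensure_small_payload!([html]) # would raise JobPayloadTooLarge
```

A guard like this pushes you toward storing the HTML somewhere else and passing only its ID to the job, so each job stays a few bytes instead of 1 MB.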
The worst part of a background-job incident is that you have to remove the jobs from the queue to fix the problem immediately. But after you empty the queue, Redis usually runs out of memory (OOM) again because new jobs keep arriving.
So the first thing to do is remove or disable the code that schedules those background jobs. That stops the loop.
Redis is just one common example. Some teams store every HTML response in a database and never clean it up. As the application grows, the database keeps growing, and it costs a lot of money to store and back up.
So think twice before you store any large data. It will become a problem later.
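A retention window is one way to keep stored responses from growing forever. The plain-Ruby sketch below only shows the selection logic. `StoredResponse`, `RETENTION_SECONDS`, and `expired_responses` are hypothetical names, and the 30-day window is an assumption. In a Rails app the same idea would usually run as a scheduled job that deletes old rows in batches.

```ruby
# Hypothetical record shape; in a Rails app this would be a database table.
StoredResponse = Struct.new(:id, :created_at)

RETENTION_SECONDS = 30 * 24 * 60 * 60 # assumed 30-day retention window

# Return the records older than the retention window, i.e. the ones to delete.
def expired_responses(records, now: Time.now, retention: RETENTION_SECONDS)
  cutoff = now - retention
  records.select { |record| record.created_at < cutoff }
end

now = Time.now
records = [
  StoredResponse.new(1, now - 45 * 24 * 60 * 60), # 45 days old
  StoredResponse.new(2, now - 1 * 24 * 60 * 60)   # 1 day old
]
expired_responses(records, now: now).map(&:id) # => [1]
```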