Mastering Low Level Caching in Rails

Sometimes when your app is slow, it's not your fault. Your code might be optimized to the teeth, but it won't matter if it has to perform intrinsically slow tasks, like fetching data from an external API. In these situations, Rails' low-level caching can be a life-saver. But caching is infamously tricky. It's dangerous to go alone. In this article, Jonathan Miles guides us through the landscape of low-level caching. He covers the basics, but more importantly, digs into essential details of cache invalidation and points out common pitfalls.

Caching is a general term for storing the result of some code so that we can quickly retrieve it later. This allows us to, for example, perform some heavy number-crunching once and then re-use the value without having to recalculate it. Although the general concept is the same for all types of caching, there are various mechanisms we can use depending on what we are trying to cache.

For Rails developers the most common forms of caching are things like memoization (covered in a previous part of this caching series), view caching (stay tuned for the next article), and low-level caching, which we will cover here.

What is Low-Level Caching?

What Rails calls low-level caching is really just reading and writing data to a key-value store. Out of the box, Rails supports an in-memory store, files on the filesystem, and external stores like Redis or memcached. It is called "low-level" caching because you deal with the Rails.cache object directly, telling it what value to store and what key to use. This is in contrast to view caching, where Rails has built-in helper methods to handle these nitty-gritty details for you.

The most common use-cases I encounter for low-level caching are read-only external API requests and heavy ActiveRecord computations. In the ActiveRecord case there are some alternatives to caching covered in the first part of this series that you may want to look into first, since introducing caching also increases the complexity and bug attack-surface of your application.

By default, Rails disables caching in development, because you usually want fresh data when you're working on a feature. You can easily toggle caching on and off using the rails dev:cache command.

How it works

Rails provides three methods to deal with the cache: read, write, and fetch. All of them take a cache "key" which is how we look up the value:

> Rails.cache.write("my-cache-key", 123)
> Rails.cache.read("my-cache-key")
=> 123
> Rails.cache.read("key-not-written")
=> nil

read and write are good to know about, but when implementing low level caching the fetch method is what you'll probably use the most.

fetch provides a nice wrapper around reading and writing. You pass it a key and a block, and if a value is present for that key in the cache it will be returned and the block is not executed. If there is no cached value for that key (or it has expired, more on expiration later) it will execute the block and store the result in the cache for next time.

def cached_result
  Rails.cache.fetch(:cached_result) do
    # Only executed if the cache does not already have a value for this key
    puts "Crunching the numbers..."
    12345
  end
end

> cached_result
Crunching the numbers...
=> 12345
> cached_result
=> 12345
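Conceptually, fetch is a read followed by a conditional write. Here is a simplified pure-Ruby sketch of that behavior (TinyCache is a made-up stand-in for Rails.cache; the real store also handles expiry, serialization, and concurrent access):

```ruby
# A simplified stand-in for Rails.cache, showing the read-then-write
# behavior of fetch. TinyCache is a hypothetical class for illustration.
class TinyCache
  def initialize
    @store = {}
  end

  def fetch(key)
    return @store[key] if @store.key?(key) # cache hit: skip the block
    @store[key] = yield                    # cache miss: run block, store result
  end
end

cache = TinyCache.new
calls = 0
first  = cache.fetch(:answer) { calls += 1; 42 }
second = cache.fetch(:answer) { calls += 1; 42 }
# Both calls return 42, but the block only ran once (calls == 1)
```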

When to Use Low-Level Caching

A great use case for this kind of caching is when you are hitting an external API to get a value that may not change that often. In one client app we had some calculations based on the current futures price of some commodities. Rather than hit the API on every page refresh, we cache the value for a period of time (in our case 10 minutes).

class ExternalApiWrapper
...
  def fetch_price
    Rails.cache.fetch([self, :fetch_price], expires_in: 10.minutes) { read_api_price }
  end
end

Keys and expiration

The value you pass to the cache method (read, write, or fetch) is the "cache key", that is, the key in the key-value pair stored in the cache. By the time it hits the cache store this will be a String, but Rails allows us to pass in some other common objects too:

  • A string with whatever content you like
  • A symbol
  • An object that responds to cache_key_with_version or cache_key (such as an ActiveRecord model, we'll dig into these shortly)
  • An array with any combination of the above

A common technique I've used when adding low level caching to an ActiveRecord model is to pass an array containing self (so the cached value is scoped to the current object) and the name of the method as a symbol, like:

class SomeModel < ApplicationRecord
  def calculated_value
    Rails.cache.fetch([self, :calculated_value]) do
      ...
    end
  end
end

To see what the actual generated cache key will look like you can call the ActiveSupport method directly:

> ActiveSupport::Cache.expand_cache_key([SomeModel.last, :test, :one, "two"])
=> "some_model/17-20200304104455464584/test/one/two"

The blob of numbers here is a combination of the model's id and updated_at timestamp. The id part ensures this cached value is not overwritten by other instances of the model. The updated_at timestamp means that if the model is updated, the key automatically changes, saving us the hassle of manually invalidating the cached value.

Earlier I listed two methods for generating cache keys: cache_key and cache_key_with_version. ActiveRecord::Base implements both. cache_key_with_version takes precedence and includes the updated_at timestamp, as shown above. cache_key, on the other hand, only returns the model name and id:

> SomeModel.last.cache_key
=> "some_model/17"
> SomeModel.last.cache_key_with_version
=> "some_model/17-20200323114436755491"

In older versions of Rails, there was only cache_key, which for ActiveRecord models included the timestamp. Rails 5.2 split this into cache_key and cache_key_with_version to allow for "recyclable cache keys". The basic problem being solved is this: every time a model's updated_at timestamp changes, its cache key changes. This is great for cache invalidation, but it means the cache accumulates stale values that we will never access again (because we will never generate the old cache keys).

> widget = Widget.create!
> old_key = widget.cache_key_with_version
=> "widgets/1-20200304104455464584"
> Rails.cache.fetch(old_key) { widget }
=> <Widget:0x00007fc2fe5da930, id: 1 ...
> widget.touch
> new_key = widget.cache_key_with_version
=> "widgets/1-20200323114436755491"
> Rails.cache.fetch(new_key) { widget }
=> <Widget:0x00007fc2fe5da930, id: 1 ...
> Rails.cache.read(old_key)
=> <Widget:0x00007fc2fe5da930, id: 1 ...

As you can see, the cache is now storing two copies of widget, even though the old one will never be looked up again. Eventually, the cache will hit its memory limit and start dropping old values to free up space. In apps with a lot of cached data this could mean dropping values that we still want cached but are accessed less often.

Recyclable cache keys solve this problem by allowing us to explicitly pass the version to the cache method. The underlying key used in the cache will include just the ID, and the cache store will handle checking if the version we're giving it matches what is stored in the cache:

> old_version = Widget.last.cache_version
=> "20200320201134416105"
> Rails.cache.fetch(Widget.last, version: old_version) { "Test Value" }
=> "Test Value"
> Rails.cache.read("widgets/17")
=> "Test Value"
> Rails.cache.fetch(Widget.last, version: Time.current) { "New Value" }
=> "New Value"
> Rails.cache.read("widgets/17")
=> "New Value"

Touching models

There are times when changes to one model require changes to a related model. Say you have Cart and Product models for an e-commerce store, and if the product is updated you need the carts to be updated. This is where you'd specify touch: true on the relationship:

class Cart < ApplicationRecord
 has_many :products
end

class Product < ApplicationRecord
 belongs_to :cart, touch: true
end

This means any change to a Product will automatically update the updated_at timestamp of the Cart it belongs to. This happens no matter which fields on Product are being updated, so be mindful that it introduces some overhead: what used to be a single database call to update a product now also updates the related Cart (and, with chains of touch: true associations, potentially many more rows).

If needed, you can also call touch on a model yourself to update its timestamp, which can be very useful for manual cache invalidation via the Rails console, or when you want finer-grained control over which particular rows are updated.

Time-based Expiration

One of the options you can pass to the cache methods is when you want that key-value entry to be deleted. Personally, I often set this to a low number (or better yet, an environment variable) when deploying a new set of caching code, so that if things need tweaking you don't have to do much manual invalidation before testing again.

Rails.cache.fetch(Product.last, expires_in: 1.day) { ... }

You can also set a default expiration value on the cache store.
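For example, any options passed when configuring the store become defaults for every entry it holds (a per-call expires_in still overrides them). A hedged sketch for an environment file; the store choice and values here are illustrative, not a recommendation:

```ruby
# config/environments/production.rb (illustrative values)
# Options in this hash become default cache options for every entry.
config.cache_store = :redis_cache_store, {
  url: ENV["REDIS_URL"],
  expires_in: 1.hour
}
```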

Example Use Cases

I'll be honest here, almost all the apps I've worked on have not needed to use this form of caching, with one important exception. We inherited an application that was, well, to say it was not well architected would be an understatement.

Even after a considerable amount of cleanup, we had two issues to deal with:

  1. A lot of calculations depended on the "current price" of commodities fetched from an API
  2. Various levels of nested aggregations like children.map { |c| c.computed_field }.sum, where computed_field itself contained another map { ... }.sum

In an ideal world #2 would be resolved by boiling those calculations down and getting the database to do the number crunching. Consultancy work always requires balancing developer-cost against client-benefit though and this would require a non-trivial amount of hours to complete, so instead, we targeted the model methods that were causing the main performance issues and cached them.

This then tied into the solution for #1 as well; we added a scheduled job to update the price every 10 minutes. If the price has changed, the relevant models will be touched, meaning their cached calculations will be invalidated.

As a simple example:

class UpdatePricesJob < ApplicationJob
  def perform
    Commodity.find_each { |commodity| commodity.update!(price: <fetch_api_price>) }
  end
end

class Commodity < ApplicationRecord
  belongs_to :invoice, touch: true
end

class Invoice < ApplicationRecord
  has_many :commodities

  def total_value
    Rails.cache.fetch([self, :total_value]) { commodities.map(&:price).sum }
  end
end

Gotchas

Because Rails' low-level caching is designed with ActiveRecord's updated_at timestamp in mind, code that uses it can easily stray into one of two extremes:

  1. The cached value should change, but the model's updated_at did not (e.g., the method being cached takes an argument), resulting in a cache-invalidation bug.
  2. Liberal use of touch: true on ActiveRecord associations solves the cache-invalidation issues but starts to heavily tax the database instead.
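To illustrate gotcha #1: if a cached method takes an argument but the argument is not part of the cache key, every argument collides on the same entry. A pure-Ruby sketch, with a plain Hash standing in for Rails.cache (the method names are hypothetical):

```ruby
# A plain Hash standing in for Rails.cache, showing why a cached
# method's arguments must be part of its cache key.
CACHE = {}

def bad_discount(rate)
  # Bug: the key ignores `rate`, so every rate shares one cached value.
  CACHE[:discount] ||= 100 * rate
end

def good_discount(rate)
  # Fix: the argument is part of the key, so each rate gets its own entry.
  CACHE[[:discount, rate]] ||= 100 * rate
end

bad_discount(0.1)  # => 10.0
bad_discount(0.5)  # => 10.0 -- stale: still the result for 0.1
good_discount(0.1) # => 10.0
good_discount(0.5) # => 50.0 -- correct: distinct keys
```

In real Rails code the same fix applies: pass the argument in the key array, e.g. Rails.cache.fetch([self, :method_name, argument]).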

An additional note on #2 is that adding a lot of touch settings to objects can also dramatically increase database log entries. I have seen a production site go down simply because of this issue (i.e. the database server ran out of hard drive space, even though the actual DB load was normal).

When The View Is The Bottleneck

I've mostly talked about caching methods within an ActiveRecord model here, as I believe that's the most common use-case for low-level caching in Rails. Rails.cache can be called from anywhere in your Rails application though, so there's no reason it can't be used inside your business-logic classes as well.

You could even call it inside Rails views, but if you want to cache content for the view layer, Rails has support for that baked in, which is what we'll dive into in the next part of this series.
