We're working on something new! Hook Relay gives you Stripe-quality webhooks in minutes. Sign up for free today! Check out Hook Relay

Associative arrays in Ruby...what?

Have you ever had a bunch of data in an array, but needed to do a key/value lookup like you would with a hash? Fortunately, Ruby provides a mechanism for treating arrays as key-value structures. Let's check it out!

Have you ever had a bunch of data in an array, but needed to do a key/value lookup like you would with a hash?  Fortunately, Ruby provides a mechanism for treating arrays as key-value structures. Let's check it out!

Introducing Array#assoc and Array#rassoc

Imagine that you've been given a magical stock-picking machine. Every few minutes it spits out a recommendation to buy or sell a stock. You've managed to hook it up to your computer, and receiving a stream of data that looks like this:

picks = [
  ["AAPL", "buy"],
  ["GOOG", "sell"],
  ["MSFT", "sell"]

To find the most recent guidance for Google, you could make use of the Array#assoc method. Here's what that looks like:

# Returns the first row of data where row[0] == "GOOG"
picks.assoc("GOOG") # => ["GOOG", "sell"]

To find the most recent "sell" recommendation, you could use the Array#rassoc method.

# Returns the first row of data where row[1] == "sell"
picks.rassoc("sell") # => ["GOOG", "sell"]

If no match is found, the methods return nil:

picks.assoc("CSCO") # => nil
picks.rassoc("hold") # => nil

Historical data

Hashes can't have more than one value for a single key. But arrays can have as many duplicates as you like. The assoc and rassoc methods do the sensible thing in this case and return the first matching row they find. This lets us do some pretty interesting things.

Our imaginary stock picking machine provides a stream of data. Eventually, it's going to change its mind about a particular company and tell me to buy what it previously told me to sell. In that case our data looks like:

picks = [
  ["GOOG", "buy"],
  ["AAPL", "sell"],
  ["AAPL", "buy"],
  ["GOOG", "sell"],
  ["MSFT", "sell"]

If I were putting all of this data into  a hash, updating the recommendation for a particular stock would cause me to lose any previous recommendations for that stock. Not so with the array. I can keep prepending recommendations to the array, knowing that Array#assoc will always give me the most recent recommendation.

# Returns the first row of data where row[0] == "GOOG"
picks.assoc("GOOG") # => ["GOOG", "buy"]

So we get the key-value goodness of a hash, along with a free audit trail.

More than two columns

Another neat thing about assoc is that you're not limited to just two columns per array. You can have as many columns as you like. Suppose you added a timestamp to each buy/sell recommendation.

picks = [
  ["AAPL", "buy", "2015-08-17 12:11:55 -0700"],
  ["GOOG", "sell", "2015-08-17 12:10:00 -0700"],
  ["MSFT", "sell", "2015-08-17 12:09:00 -0700"]

Now when we use assoc or rassoc, we'll  get the timestamp as well:

# The entire row is returned
picks.assoc("GOOG") # => ["GOOG", "sell", "2015-08-17 12:10:00 -0700"]

I hope you can see how useful this could be when dealing with data from CSV and other file formats that can have lots of columns.


Ruby's hashes will definitely outperform Array#assoc in most benchmarks. As the dataset gets bigger, the differences become more apparent. After all, hash table searches are O(1), while array searches are O(n). However in may cases the difference won't large enough for you to worry about - it depends on the details.

Just for fun, I wrote simple benchmark comparing hash lookup vs assoc for a 10 row dataset and for a 100,000 row dataset. As expected, the hash and array performed similarly with the small data set. With the large dataset, the hash dominated the array.

...though to be fair, I'm searching for the last element in the array, which is the worst case scenario for array searches.

require 'benchmark/ips'
require 'securerandom'

Benchmark.ips do |x|
  x.time = 5
  x.warmup = 2

  short_array = (0..10).map { |i| [SecureRandom.hex(), i] }
  short_hash = Hash[short_array]
  short_key = short_array.last.first

  long_array = (0..100_000).map { |i| [SecureRandom.hex(), i] }
  long_hash = Hash[long_array]
  long_key = short_array.last.first

  x.report("short_array") { short_array.assoc(short_key) }
  x.report("short_hash") { short_hash[short_key] }
  x.report("long_array") { long_array.assoc(long_key) }
  x.report("long_hash") { long_hash[long_key] }


# Calculating -------------------------------------
#          short_array    91.882k i/100ms
#           short_hash   149.430k i/100ms
#           long_array    19.000  i/100ms
#            long_hash   152.086k i/100ms
# -------------------------------------------------
#          short_array      1.828M (± 3.4%) i/s -      9.188M
#           short_hash      6.500M (± 4.8%) i/s -     32.426M
#           long_array    205.416  (± 3.9%) i/s -      1.026k
#            long_hash      6.974M (± 4.2%) i/s -     34.828M

# Comparison:
#            long_hash:  6974073.6 i/s
#           short_hash:  6500207.2 i/s - 1.07x slower
#          short_array:  1827628.6 i/s - 3.82x slower
#           long_array:      205.4 i/s - 33950.98x slower

Honeybadger has your back when it counts. We're the only error tracker that combines exception monitoring, uptime monitoring, and cron monitoring into a single, simple to use platform.

Our mission: to tame production and make you a better, more productive developer. Learn more

author photo

Starr Horne

Starr Horne is a Rubyist and Chief JavaScripter at Honeybadger.io. When she's not neck-deep in other people's bugs, she enjoys making furniture with traditional hand-tools, reading history and brewing beer in her garage in Seattle.

“We’ve looked at a lot of error management systems. Honeybadger is head and shoulders above the rest and somehow gets better with every new release.”
Michael Smith
Try Error Monitoring Free for 15 Days
Are you using Bugsnag, Rollbar, or Airbrake for your monitoring? Honeybadger includes exception, uptime, and check-in monitoring — all for probably less than you’re paying now. Discover why so many companies are switching to Honeybadger here.
Try Error Monitoring Free for 15 Days