Going deep on UUIDs and ULIDs

A chance conversation led me to the realization that the world of unique identifiers is larger and more wondrous than I ever could have imagined. In this post we discuss five types of UUIDs and their upstart cousin, the ULID. We explore what makes each of them special and when they may be particularly useful.

The other day the HB team was chatting and Ben, our dev-ops master, mentioned that he wished he'd used ULIDs instead of UUIDs for a particular system.

Like any seasoned engineer, my reaction was to mumble something non-committal then sneak over to Google to try to figure out what the hell a ULID is.

Two hours later I emerged with a thousand-yard stare and the realization that the world of unique identifiers is larger and more wondrous than I ever could have imagined.

Before we get started with ULIDs, let's go back to the basics and look at the problem UUIDs were designed to solve.

What's the problem with "regular" ids?

Most web applications that use databases default to numeric ids that increment automatically. For example, in Rails you might see behavior like this:

p1 = Person.create!
p1.id
# => 1

p2 = Person.create!
p2.id
# => 2

The database can generate sequential ids because it stores a counter that increments on record creation.

This pattern can also be seen outside of databases. Sometimes we need to assign ids manually, and we might store a custom counter in - say - a Redis instance.

Sequential ids are easy to implement for low-volume use-cases, but they become more problematic as volume increases:

  • Concurrent inserts contend for the same counter, so each one has to wait in line to receive its id.
  • Requesting a sequential id may require a network round trip and result in slower performance.
  • It's difficult to scale out data stores that provide sequential ids. You have to worry about counters on different servers getting out of sync.
  • It's easy for the node with the counter to become a single point of failure.
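To make the "wait in line" problem concrete, here's a minimal sketch of a sequential id generator. The SequentialIdGenerator class is a hypothetical in-process stand-in for a database sequence or Redis counter, not code from any real library. The ids only stay unique because every caller passes through one shared lock.

```ruby
# Hypothetical in-process stand-in for a database sequence or a Redis counter.
class SequentialIdGenerator
  def initialize
    @next_id = 0
    @lock = Mutex.new
  end

  # Every caller has to take the same lock, so id generation is serialized.
  def next_id
    @lock.synchronize { @next_id += 1 }
  end
end

generator = SequentialIdGenerator.new

# Eight threads each request 1,000 ids from the one shared generator.
threads = Array.new(8) do
  Thread.new { Array.new(1_000) { generator.next_id } }
end
ids = threads.flat_map(&:value)

puts ids.uniq.size
# => 8000
```

Swap the Mutex for a network call to a single server and you get the round trips and the single point of failure described above.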

Sequential ids also leak data, which may be a problem in some cases:

  • You can easily guess the ids of resources that may not belong to you.
  • If you create a user and its id is 20, you know that the service has had roughly 20 users.

UUIDs are web-scale

UUIDs look a little different from sequential ids. They are 128-bit numbers, typically expressed as 32 hexadecimal digits in five hyphen-separated groups:

123e4567-e89b-12d3-a456-426655440000

UUIDs are created using specific algorithms defined in RFC 4122. They attempt to solve many of the problems that occur with sequential ids:

  • You can generate UUIDs on any number of nodes without any shared state or coordination between nodes.
  • They're a little less guessable than sequential ids (more on that later).
  • They don't divulge the size of your dataset.

The catch is that there's a small chance of two nodes independently generating the same id. This event is called a "collision."
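How small is "small"? Here's a rough back-of-the-envelope sketch for randomly generated (type 4) UUIDs. It assumes 122 random bits per id (128 bits minus the 4 version bits and 2 variant bits) and uses the standard birthday approximation, which only holds while the probability is tiny. The collision_probability helper is made up for this illustration.

```ruby
require "securerandom"

# Assumption: a type 4 UUID carries 122 random bits
# (128 total, minus 4 version bits and 2 variant bits).
RANDOM_BITS = 122

# Birthday approximation: p ~ n^2 / (2 * 2^bits), valid while p is small.
def collision_probability(count, bits = RANDOM_BITS)
  (count.to_f**2) / (2.0 * (2.0**bits))
end

# SecureRandom.uuid is in Ruby's standard library and returns a random UUID.
puts SecureRandom.uuid

# Approximate chance of any collision among one billion random UUIDs.
puts collision_probability(1_000_000_000)
```

For a billion ids the estimate lands somewhere around 10^-19, which is why most applications treat collisions as a non-issue.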

Many Flavors of UUID

There are five types of UUID algorithm defined in RFC 4122. They fall into two categories:

  • Time and randomness-based algorithms are the ones we've been discussing. They result in a new UUID for every run.
    • Type 4: A randomly-generated id. Probably our best bet for new code.
    • Type 1: The ID contains the host's MAC address and the current timestamp. These are generally discouraged for new code because they're too easy to guess and they reveal where and when they were generated.
    • Type 2: The "DCE Security" version. These seem to be uncommon. They appear to be purpose-built for an antiquated form of RPC.
  • Name-based algorithms are a little different. They always produce the same UUID for a given set of inputs.
    • Type 5: Uses a SHA-1 hash to generate the UUID. Recommended.
    • Type 3: Uses an MD5 hash. RFC 4122 prefers type 5 when backward compatibility isn't a concern, and MD5 is widely considered too weak for new code.

In Ruby, you can generate UUIDs via the uuidtools gem. It supports every type except the mysterious type 2:

# Code stolen from the uuidtools readme. :)
require "uuidtools"

# Type 1
UUIDTools::UUID.timestamp_create
# => #<UUID:0x2adfdc UUID:64a5189c-25b3-11da-a97b-00c04fd430c8>

# Type 4
UUIDTools::UUID.random_create
# => #<UUID:0x19013a UUID:984265dc-4200-4f02-ae70-fe4f48964159>

# Type 3
UUIDTools::UUID.md5_create(UUIDTools::UUID_DNS_NAMESPACE, "www.widgets.com")
# => #<UUID:0x287576 UUID:3d813cbb-47fb-32ba-91df-831e1593ac29>

# Type 5
UUIDTools::UUID.sha1_create(UUIDTools::UUID_DNS_NAMESPACE, "www.widgets.com")
# => #<UUID:0x2a0116 UUID:21f7f8de-8051-5b89-8680-0195ef798b6a>
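If you're curious why name-based UUIDs always come out the same, here's a minimal sketch of the type 5 recipe using only Ruby's standard library: hash the namespace bytes plus the name with SHA-1, keep the first 16 bytes, then overwrite the version and variant bits. The uuid_v5 helper is hypothetical and is not how uuidtools is implemented internally. The namespace constant is the DNS namespace UUID listed in RFC 4122.

```ruby
require "digest/sha1"

# The DNS namespace UUID listed in RFC 4122, Appendix C.
DNS_NAMESPACE = "6ba7b810-9dad-11d1-80b4-00c04fd430c8"

# Hypothetical helper: builds a type 5 UUID from a namespace UUID and a name.
def uuid_v5(namespace, name)
  namespace_bytes = [namespace.delete("-")].pack("H*") # 16 raw bytes
  hash = Digest::SHA1.digest(namespace_bytes + name.b)

  bytes = hash.bytes.first(16)          # SHA-1 yields 20 bytes, a UUID has 16
  bytes[6] = (bytes[6] & 0x0f) | 0x50   # set the version nibble to 5
  bytes[8] = (bytes[8] & 0x3f) | 0x80   # set the RFC 4122 variant bits

  hex = bytes.pack("C*").unpack1("H*")
  [hex[0, 8], hex[8, 4], hex[12, 4], hex[16, 4], hex[20, 12]].join("-")
end

first  = uuid_v5(DNS_NAMESPACE, "www.widgets.com")
second = uuid_v5(DNS_NAMESPACE, "www.widgets.com")

puts first
puts first == second
# => true
```

Nothing here depends on the clock or a random source, so the same inputs always produce the same id. A good sanity check is to compare the output against UUIDTools::UUID.sha1_create for the same namespace and name.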

Moving on to ULIDs

Note: In the original version of this blog post I forgot to link to the ULID spec. Here it is. It provides links to implementations in Ruby and other languages.

ULIDs are a useful new take on unique identifiers. The most obvious difference is that they look a little different:

01ARZ3NDEKTSV4RRFFQ69G5FAV

They are made up of two numbers encoded in Crockford's base32: a UNIX timestamp in milliseconds followed by a random number. Here's the structure, as defined in the specification:

01AN4Z07BY      79KA1307SR9X4MV3

|----------|    |----------------|
 Timestamp          Randomness
   48bits             80bits

This structure is fascinating! If you recall, UUIDs rely either on timestamps or randomness, but ULIDs use both timestamps and randomness.
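To show how the two halves fit together, here's a minimal sketch of a ULID generator. The generate_ulid and encode_base32 helpers are hypothetical and aren't the API of any ULID gem. The sketch also skips the monotonic ordering option described in the spec, so treat it as an illustration rather than something to ship.

```ruby
require "securerandom"

# Crockford's base32 alphabet: no I, L, O, or U.
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"

# Encodes a non-negative integer as a fixed-width Crockford base32 string.
def encode_base32(number, length)
  chars = Array.new(length) do
    digit = number % 32
    number /= 32
    CROCKFORD[digit]
  end
  chars.reverse.join # digits were produced least-significant first
end

# Hypothetical helper, not the API of any ULID gem.
def generate_ulid(time = Time.now)
  millis = (time.to_r * 1000).to_i            # milliseconds since the UNIX epoch
  random = SecureRandom.random_number(2**80)  # 80 bits of randomness
  encode_base32(millis, 10) + encode_base32(random, 16)
end

puts generate_ulid
```

Ten base32 characters hold the 48-bit timestamp and sixteen hold the 80 random bits, which gives the 26-character ids shown above.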

As a result, ULIDs have some interesting properties:

  • They are lexicographically (i.e., alphabetically) sortable.
  • The timestamp is accurate to the millisecond.
  • They're prettier than UUIDs :)

These open up some cool possibilities:

  • If you're partitioning your database by date, you can use the timestamp embedded in the ULID to select the correct partition.
  • You can sort by ULID instead of a separate created_at column if millisecond precision is acceptable.
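Both ideas can be sketched in a few lines. The ulid_time helper below is hypothetical. It assumes canonical uppercase ULIDs and reads the first ten characters as a base32 count of milliseconds since the UNIX epoch. The ids in the list are sample values used purely for illustration.

```ruby
# Crockford's base32 alphabet: no I, L, O, or U.
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"

# Hypothetical helper: decodes the timestamp half of a ULID.
# Assumes a canonical uppercase ULID; other characters raise an error.
def ulid_time(ulid)
  millis = ulid[0, 10].each_char.reduce(0) do |sum, char|
    sum * 32 + CROCKFORD.index(char)
  end
  Time.at(Rational(millis, 1000)).utc
end

# Sample ids used purely for illustration.
ids = [
  "01BX5ZZKBKACTAV9WEVGEMMVS0",
  "01ARZ3NDEKTSV4RRFFQ69G5FAV",
  "01BX5ZZKBKACTAV9WEVGEMMVRZ"
]

# A plain string sort puts the ids in timestamp order.
sorted = ids.sort
puts sorted

# The embedded timestamp could be used to pick a date-based partition.
puts ulid_time(sorted.first).strftime("%Y-%m")
```

Notice that the last two ids share a timestamp, so their relative order comes from the random half alone. That's the millisecond-precision caveat in action.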

There are some possible downsides too:

  • If exposing the timestamp is a bad idea for your application, ULIDs may not be the best option.
  • The sort-by-ULID approach may not work if you need sub-millisecond accuracy.
  • Some ULID implementations are reportedly less than bulletproof, so it's worth vetting a library before you depend on it.

Conclusion

UUIDs are and will continue to be the standard. They've been around forever, and libraries are available in every language imaginable. However, new approaches are worth considering, especially as we enter a world that's increasingly run by distributed systems. New unique-id approaches may help us solve problems that weren't prevalent at the publication of RFC 4122.

    Starr Horne

    Starr Horne is a Rubyist and Chief JavaScripter at Honeybadger.io. When she's not neck-deep in other people's bugs, she enjoys making furniture with traditional hand-tools, reading history and brewing beer in her garage in Seattle.
