Why is URI.join so counterintuitive?

I recently found myself using URI.join to construct some redirect URLs, but I quickly ran into a problem: URI.join wasn't behaving like I expected. In this post we trace the unexpected behavior through the source of URI.join and back to the original RFC.

We just reached a milestone here at Honeybadger. Our sales pages are no longer part of our main Rails app. It's been on my wish list for years, but not exactly top priority.

As part of this migration, I found myself using URI.join to construct particular redirect links. But I quickly ran into a problem. URI.join wasn't behaving as I expected.

I expected it to take a bunch of path fragments and string them together like so:

# This is what I was expecting. It didn't happen.
URI.join("https://www.honeybadger.io", "plans", "change")
# => #<URI::HTTPS https://www.honeybadger.io/plans/change>

What the join method actually did was much stranger. It dropped one of my path fragments and used only the last one, "change".

# This is what happened.
URI.join("https://www.honeybadger.io", "plans", "change")
# => #<URI::HTTPS https://www.honeybadger.io/change>

So why the heck does it work like this?

The misunderstanding

It turns out that I was expecting URI.join to behave similarly to a specialized version of Array#join, taking URL fragments and combining them to make a whole URL.

That's not what it does. Big surprise.

If we take a look at the join method's code, we see that it converts the first argument to a URI and then calls merge with each remaining argument in turn.

# File uri/rfc2396_parser.rb, line 236
def join(*uris)
  uris[0] = convert_to_uri(uris[0])
  uris.inject :merge
end

The merge method does two things:

  1. It converts a string such as "plans" into a relative URI object.
  2. It resolves that relative URI against the base URI, exactly as specified in RFC2396, Section 5.2.
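As a rough illustration of what that inject call amounts to for the example in this post, the fold can be unrolled by hand. The variable names below are mine, not the library's.

```ruby
require "uri"

# Unroll `uris.inject :merge` for three arguments.
base  = URI.parse("https://www.honeybadger.io")
step1 = base.merge("plans")    # resolve "plans" against the base
step2 = step1.merge("change")  # resolve "change" against the previous result

joined = URI.join("https://www.honeybadger.io", "plans", "change")

puts step1            # intermediate result after the first merge
puts step2            # final result after the second merge
puts step2 == joined  # the unrolled version matches URI.join
```

Each merge only sees the result of the previous one, which is why the intermediate URI matters so much.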

So that's cool, but how does it explain the unexpected behavior I mentioned before?

URI.join("https://www.honeybadger.io", "plans", "change")
# => #<URI::HTTPS https://www.honeybadger.io/change>

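Let's step through it. The first merge resolves "plans" against the base URL, whose path is empty, and produces "https://www.honeybadger.io/plans". So by the time "change" gets merged, the code above is equivalent to: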

URI.parse("https://www.honeybadger.io/plans").merge("change")

The code above attempts to resolve the relative URI "change" against the absolute URI "https://www.honeybadger.io/plans".

To do this, it follows RFC2396, Section 5.2, step 6, which states:

a) All but the last segment of the base URI's path component is copied to the buffer. In other words, any characters after the last (right-most) slash character, if any, are excluded.

b) The reference's path component is appended to the buffer string.

Let's play along:

  1. Copy everything but the final segment of the absolute URL. That gives me "https://www.honeybadger.io/"
  2. Append the relative path, resulting in "https://www.honeybadger.io/change"
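Those two steps can be sketched in a few lines of Ruby. This is only an illustration of sub-steps (a) and (b): merge_relative_path is a hypothetical helper of my own, not a method from the URI library, and it skips the "." and ".." handling that the full algorithm also performs.

```ruby
# Hypothetical helper: sub-steps (a) and (b) of RFC2396, Section 5.2, step 6.
# Dot-segment ("." and "..") handling is deliberately left out.
def merge_relative_path(base_path, ref_path)
  last_slash = base_path.rindex("/")
  # (a) keep everything up to and including the right-most slash
  buffer = last_slash ? base_path[0..last_slash] : ""
  # (b) append the reference's path component
  buffer + ref_path
end

puts merge_relative_path("/plans", "change")   # prints /change
puts merge_relative_path("/plans/", "change")  # prints /plans/change
```

Note how a single trailing slash on the base path changes the outcome: with it, "plans" is no longer the last segment, so it survives step (a).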

The world makes sense again!

Conclusion

While URI.join can be used to build URLs from various path fragments, that's not really what it's designed to do. It's designed to do something a little more complicated: resolve each URI reference against the result of the previous one, following the rules laid out in the RFC.
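If you do want to keep using URI.join, the resolution rules suggest that slashes are what decide the outcome. Here is a small sketch of that, using the same example URL; the variable names are just for illustration.

```ruby
require "uri"

# Trailing slashes make "plans/" a directory-like segment, so it is kept
# when the next reference is resolved against it.
with_slashes = URI.join("https://www.honeybadger.io/", "plans/", "change")
puts with_slashes  # prints https://www.honeybadger.io/plans/change

# A leading slash makes the reference an absolute path, which replaces
# the base URI's path entirely.
absolute_ref = URI.join("https://www.honeybadger.io/plans/", "/change")
puts absolute_ref  # prints https://www.honeybadger.io/change
```

This works, but it means every fragment has to carry the right slashes, which is easy to get wrong.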

As for my personal project — building URLs to use in redirects to our new sales pages — well, I just used Array#join instead. :)

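EDIT 8/12/2016: After publishing this article I received a couple of tweets suggesting I use File.join for this purpose. This has the benefit of avoiding double slashes, i.e. /my//path. One caveat: File.join is meant for filesystem paths rather than URLs. In Ruby it joins with File::SEPARATOR, which is "/" on every platform, Windows included, so it shouldn't introduce backslashes, but it is still a tool borrowed from a different job.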
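For anyone who wants the Array#join approach without the double-slash problem, here is a minimal sketch. join_url_fragments is a hypothetical helper of my own, not something Ruby ships with, and it assumes the fragments are already safe to put in a URL (it does no percent-encoding).

```ruby
# Hypothetical helper (not part of Ruby's URI library): join a base URL and
# path fragments with exactly one slash between each piece.
def join_url_fragments(base, *fragments)
  # Strip leading and trailing slashes from each fragment, drop empty ones.
  cleaned = fragments.map { |f| f.to_s.gsub(%r{\A/+|/+\z}, "") }.reject(&:empty?)
  # Strip trailing slashes from the base, then glue everything with "/".
  [base.to_s.sub(%r{/+\z}, ""), *cleaned].join("/")
end

puts join_url_fragments("https://www.honeybadger.io", "plans", "change")
puts join_url_fragments("https://www.honeybadger.io/", "/plans/", "/change")
```

Both calls should print the same URL, regardless of how the slashes were sprinkled across the inputs.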


    Starr Horne

    Starr Horne is a Rubyist and Chief JavaScripter at Honeybadger.io. When she's not neck-deep in other people's bugs, she enjoys making furniture with traditional hand-tools, reading history and brewing beer in her garage in Seattle.
