Mixing code and data in Ruby with DATA and __END__
Did you know that Ruby provides a way for your script to use its own source file as a source of data? It's a neat trick that can save you some time when writing one-off scripts and proofs of concept. Let's check it out!
DATA and __END__
In the example below, I'm using a funny keyword called __END__. Everything below __END__ will be ignored by the Ruby interpreter. But more interestingly, Ruby provides you with an IO object called DATA, which lets you read everything below __END__ just like you can read from any other file.
In the following example, we iterate over each line and print it.
DATA.each_line do |line|
  puts line
end

__END__
Doom
Quake
Diablo
My favorite practical example of this technique uses DATA to contain an ERB template. It also works with YAML, CSV and so on.
require 'erb'

time = Time.now
renderer = ERB.new(DATA.read)
puts renderer.result()

__END__
The current time is <%= time %>.
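Because DATA is an IO object, anything that parses a stream or string can consume it. Here is a minimal sketch for YAML. The name load_inline_yaml is a hypothetical helper, and StringIO stands in for DATA so that the snippet runs on its own; in a script run directly you would pass DATA instead.

```ruby
require 'yaml'
require 'stringio'

# Hypothetical helper: parses YAML from any IO-like object.
# In a script run directly, pass DATA instead of a StringIO.
def load_inline_yaml(io)
  YAML.safe_load(io.read)
end

# StringIO stands in for DATA; the string is what would sit below __END__.
games = load_inline_yaml(StringIO.new("- Doom\n- Quake\n- Diablo\n"))
puts games.inspect # => ["Doom", "Quake", "Diablo"]
```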
You can actually use DATA to read content above the __END__ keyword, too. That's because DATA is a handle on the entire source file, fast-forwarded to just past the __END__ line. You can see this if you rewind the IO object before printing it. The example below prints out the entire source file.
DATA.rewind
puts DATA.read # prints the entire source file

__END__
meh
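To make the "fast-forwarded" idea concrete, here is a rough hand-rolled equivalent: open a file and read lines until one is exactly __END__, which leaves the handle positioned at the data. The name open_data_section is a hypothetical helper. This is a sketch of the idea, not how the interpreter itself sets up DATA.

```ruby
# Hypothetical helper: returns an open File positioned just after the first
# line that is exactly "__END__", or nil when the file has no such line.
def open_data_section(path)
  file = File.open(path)
  file.each_line do |line|
    # Returning here leaves the read position right after the __END__ line.
    return file if line.chomp == "__END__"
  end
  file.close
  nil
end

# Usage (the path is a placeholder):
#   io = open_data_section("script.rb")
#   puts io.read   # the text below __END__
#   io.rewind
#   puts io.read   # the entire file, code included
```

The caller is responsible for closing the returned file.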
The multiple-file dilemma
One of the big downsides to this technique is that it only really works if your script fits into a single source file, and you're running that file directly, rather than including it.
In the example below, I've got two files, each with its own __END__ section. However, there can be only one DATA global, so the __END__ section of the second file is inaccessible.
# first.rb
require "./second"

puts "First file\n----------------------"
puts DATA.read

print_second_data()

__END__
First end clause
# second.rb
def print_second_data
  puts "Second file\n----------------------"
  puts DATA.read # Prints no data: DATA is first.rb's IO, already read to the end
end

__END__
Second end clause
snhorne ~/tmp $ ruby first.rb
First file
----------------------
First end clause
Second file
----------------------
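A related wrinkle: Ruby only defines DATA when the file being run directly contains an __END__ line, so code that may be loaded from other scripts can check for the constant before touching it. A small sketch, with main_script_data as a hypothetical helper name. Note that this guard does not rescue second.rb above, where DATA exists but belongs to first.rb.

```ruby
# Hypothetical helper: returns the main script's __END__ section as a string,
# or nil when the script being run has no __END__ line (DATA is then undefined).
def main_script_data
  return nil unless defined?(DATA)

  DATA.read
end

data = main_script_data
puts(data || "no __END__ section in the main script")
```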
A work-around for multiple files
Sinatra has a pretty cool feature that allows you to add multiple inline templates to your apps by putting them after an __END__ statement. It looks like this:
# This code is from the Sinatra docs at http://www.sinatrarb.com/intro.html
require 'sinatra'

get '/' do
  haml :index
end

__END__

@@ layout
%html
  = yield

@@ index
%div.title Hello world.
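Once the text below __END__ is available as a string, splitting it into named templates takes only a few lines. The sketch below is an illustration of the "@@ name" format shown above, not Sinatra's actual code. The name parse_inline_templates is a hypothetical helper.

```ruby
# Hypothetical helper: turns "@@ name" delimited text into a hash of
# template name (symbol) => template body (string).
def parse_inline_templates(data)
  templates = {}
  current = nil
  data.each_line do |line|
    if line =~ /^@@\s*(\S+)\s*$/
      # A new "@@ name" header starts a fresh, mutable template body.
      current = templates[$1.to_sym] = String.new
    elsif current
      current << line
    end
    # Lines before the first header are ignored.
  end
  templates
end

source = "@@ layout\n%html\n  = yield\n@@ index\n%div.title Hello world.\n"
puts parse_inline_templates(source).keys.inspect # => [:layout, :index]
```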
But how exactly can Sinatra do this? After all, your app is probably going to be loaded by Rack. You're not going to run ruby myapp.rb in production! They must have figured out a way to use DATA with multiple files.
If you dig into the Sinatra source a little, though, you'll see that they're kind of cheating. They're not using DATA at all. Instead they're doing something similar to the code below.
# I'm paraphrasing. See the original at https://github.com/sinatra/sinatra/blob/master/lib/sinatra/base.rb#L1284
app, data = File.read(__FILE__).split(/^__END__$/, 2)
It's actually a little more complicated, because they don't want to read __FILE__. That would just be the sinatra/base.rb file. Instead they want the content of the file that invoked the function. They get this by parsing the result of caller.
The caller method returns the current call stack as an array of strings, so it tells you where the currently running method was invoked. Here's a quick example:
def some_method
  puts caller
end

some_method # => caller.rb:5:in `<main>'
Now it's a pretty simple matter to pull the filename out of there, and extract something equivalent to DATA for that file.
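def get_caller_data
  path = caller.first.split(":").first
  # Split on a line that is exactly __END__, so a "__END__" inside the code is not matched.
  puts File.read(path).split(/^__END__$/, 2).last
end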
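Splitting the caller string on ":" breaks down when the path itself contains a colon, as Windows drive letters do. Kernel#caller_locations returns location objects with a path method, which avoids the string parsing. A sketch of the same idea on that API follows. The name data_for_caller is a hypothetical helper, and this is one possible variation rather than what Sinatra does.

```ruby
# Hypothetical helper: returns the text below __END__ in the file that called
# this method, or nil if that file has no __END__ line.
def data_for_caller
  # caller_locations(1, 1) is the single stack frame of our direct caller.
  path = caller_locations(1, 1).first.path
  _code, data = File.read(path).split(/^__END__$/, 2)
  # Drop the newline that ends the __END__ line itself.
  data&.delete_prefix("\n")
end
```

Unlike DATA, this works from any file that calls the helper, whether that file was run directly or required.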
Use it for good, not evil
Hopefully it's obvious that tricks like these aren't something that you'll want to use every day. They don't exactly make for clean, maintainable large code bases.
However, occasionally you need something quick and dirty, either for a one-off utility script or for a proof of concept. In that case, __END__ can be pretty useful.