# Building a Programming Language in Ruby: The Interpreter, Part 2

## Full Source on Github

A complete implementation of the Stoffle programming language is available on GitHub. Feel free to open an issue if you find bugs or have questions.

In this blog post, we will continue implementing the interpreter for Stoffle, a toy programming language built entirely in Ruby. We started the interpreter in a previous post. You can read more about this project in the first part of this series.

In the last post, we covered how to implement the simpler features of Stoffle: variables, conditionals, unary and binary operators, data types, and printing to the console. Now, it is time to roll up our sleeves and tackle the more challenging remaining bits: function definition, function calling, variable scoping, and loops.

As we did previously, we will use the same example program from the start to the end of this post. We will go through it line by line, exploring the implementation necessary at the interpreter to bring to life each different structure in our Stoffle example program. Finally, we will see the interpreter in action and run the program by using the CLI we created in the previous article of the series.

## Gauss is Back

If you have a good memory, you can probably remember that in part two of the series, we discussed how to build a Lexer. In that post, we took a look at a program to sum up numbers in a series to illustrate Stoffle's syntax. At the end of this article, we will finally be able to run the aforementioned program! So, here is the program again:

``````fn sum_integers: first_integer, last_integer
i = first_integer
sum = 0
while i <= last_integer
sum = sum + i

i = i + 1
end

println(sum)
end

sum_integers(1, 100)
``````

The abstract syntax tree (AST) for our integer summation program is the following: ## The Mathematician Who Inspired our Stoffle Sample Program

Carl Friedrich Gauss supposedly figured out on his own, at only 7 years old, a formula to sum up numbers in a series.

As you may have noticed, our program does not use the formula devised by Gauss. Since we have computers nowadays, we have the luxury of solving this problem in a "brute-force" way. Let our silicon friends do the hard work for us.

## Function Definition

The first thing we do in our program is define the `sum_integers` function. What does it mean to declare a function? As you might be guessing, it is an action similar to assigning a value to a variable. When we define a function, we are associating a name (i.e., the function name, an identifier) with one or more expressions (i.e., the function's body). We also register to which names the values passed in during the function call should be bound. These identifiers become local variables during the execution of the function and are called parameters. The values passed in when the function is called (and are bound to the parameters) are arguments.

Let's take a look at `#interpret_function_definition`:

``````def interpret_function_definition(fn_def)
env[fn_def.function_name_as_str] = fn_def
end
``````

Pretty straightforward, huh? As you probably remember from the last post in this series, when our interpreter gets instantiated, we create an environment. This is a place used to hold the program state, and in our case, it is simply a Ruby hash. In the last post, we saw how variables and the values bound to them are stored in the `env`. Function definitions will also be stored there. The key is the function name, and the value is the AST node used to define a function (`Stoffle::AST::FunctionDefinition`). Here's a refresher on this AST node:

``````class Stoffle::AST::FunctionDefinition < Stoffle::AST::Expression
attr_accessor :name, :params, :body

def initialize(fn_name = nil, fn_params = [], fn_body = nil)
@name = fn_name
@params = fn_params
@body = fn_body
end

def function_name_as_str
# The instance variable @name is an AST::Identifier.
name.name
end

def ==(other)
children == other&.children
end

def children
[name, params, body]
end
end
``````

Having the function name associated with a `Stoffle::AST::FunctionDefinition` means that we can access all the information needed to execute the function. For example, we have on hand the number of arguments expected and can easily emit an error if a function call does not provide it. This and other details we will see when we explore, next, the code responsible for interpreting a function call.

## Calling a Function

Continuing the walk-through of our example, let us now focus on the function call. After defining the `sum_integers` function, we call it passing the numbers 1 and 100 as arguments:

``````fn sum_integers: first_integer, last_integer
i = first_integer
sum = 0
while i <= last_integer
sum = sum + i

i = i + 1
end

println(sum)
end

sum_integers(1, 100)
``````

The interpretation of a function call happens at `#interpret_function_call`:

``````def interpret_function_call(fn_call)
return if println(fn_call)

fn_def = fetch_function_definition(fn_call.function_name_as_str)

stack_frame = Stoffle::Runtime::StackFrame.new(fn_def, fn_call)

assign_function_args_to_params(stack_frame)

# Executing the function body.
call_stack << stack_frame
value = interpret_nodes(fn_def.body.expressions)
call_stack.pop
value
end
``````

This is a complex function, so we will need to take our time here. As explained in the last article, the first line is responsible for checking whether the function getting called is `println`. If we are dealing with a user-defined function, which is the case here, we move on and fetch its definition using `#fetch_function_definition`. As shown below, this function is a plain sail, and we basically retrieve the `Stoffle::AST::FunctionDefinition` AST node we previously stored in the environment or emit an error if the function does not exist.

``````def fetch_function_definition(fn_name)
fn_def = env[fn_name]
raise Stoffle::Error::Runtime::UndefinedFunction.new(fn_name) if fn_def.nil?

fn_def
end
``````

Returning to `#interpret_function_call`, things start to get more interesting. When thinking about functions in our simple toy language, we have two special concerns. First, we need a strategy to keep track of variables local to the function. We also have to handle `return` expressions. To tackle these challenges, we will instantiate a new object, which we will call frame, every time a function is called. Even if the same function is called multiple times, each new call will instantiate a new frame. This object will hold all the variables local to the function. Since one function can call another one and so on, we must have a way to represent and keep track of the execution flow of our program. To do so, we will use a stack data structure, which we will name call stack. In Ruby, a standard array with its `#push` and `#pop` methods will do as a stack implementation.

## Call Stack and Stack Frame

Keep in mind that we are using the terms call stack and stack frame loosely. Processors and lower-level programming languages generally will also have call stacks and stack frames, but they are not exactly what we have here in our toy language.

If these concepts piqued your curiosity, I highly recommend researching call stacks and stack frames. If you really want to get closer to the metal, I would suggest specifically looking into processor call stacks.

Here is the code for implementing a `Stoffle::Runtime::StackFrame`:

``````module Stoffle
module Runtime
class StackFrame

def initialize(fn_def_ast, fn_call_ast)
@fn_def = fn_def_ast
@fn_call = fn_call_ast
@env = {}
end
end
end
end
``````

Now, back to `#interpret_function_call`, the next step is to assign the values passed in the function call to the respective expected parameters, which will be accessible as local variables inside the function body. `#assign_function_args_to_params` is responsible for this step:

``````def assign_function_args_to_params(stack_frame)
fn_def = stack_frame.fn_def
fn_call = stack_frame.fn_call

given = fn_call.args.length
expected = fn_def.params.length
if given != expected
raise Stoffle::Error::Runtime::WrongNumArg.new(fn_def.function_name_as_str, given, expected)
end

# Applying the values passed in this particular function call to the respective defined parameters.
if fn_def.params != nil
fn_def.params.each_with_index do |param, i|
if env.has_key?(param.name)
# A global variable is already defined. We assign the passed in value to it.
env[param.name] = interpret_node(fn_call.args[i])
else
# A global variable with the same name doesn't exist. We create a new local variable.
stack_frame.env[param.name] = interpret_node(fn_call.args[i])
end
end
end
end
``````

Before we explore `#assign_function_args_to_params` implementation, it is first necessary to briefly discuss variable scoping. This is a complex and nuanced subject. For Stoffle, let us be very pragmatic and adopt a simple solution. In our tiny language, the only constructs that create new scopes are functions. Furthermore, global variables always come first. As a consequence, all variables declared (i.e., first usage) outside a function are global and are stored in `env`. Variables inside functions are local to them and are stored in the `env` of the stack frame created during the interpretation of the function call. There is one exception, though: a variable name that collides with an existing global variable. If a collision occurs, a local variable will not be created, and we will be reading and assigning to the existing global variable.

Okay, now that our variable scoping strategy is clear, let’s return to `#assign_function_args_to_params`. In the first segment of the method, we first retrieve the function definition and function call nodes from the stack frame object that was passed in. Having these on hand, it is easy to check whether the number of arguments provided match the number of parameters the function being called expects. We raise an error when there is a mismatch between the given arguments and the expected parameters. In the last part of `#assign_function_args_to_params`, we assign the arguments (i.e., the values) provided during the function call to their respective parameters (i.e., local variables inside the function). Note that we do check whether a parameter name collides with an existing global variable. As explained before, in these cases, we do not create a local variable inside the function's stack frame and instead just apply the passed in value to the existing global variable.

Returning to `#interpret_function_call`, we finally push our newly created stack frame to the call stack. Then, we call our old friend `#interpret_nodes` to start interpreting the function body:

``````def interpret_function_call(fn_call)
return if println(fn_call)

fn_def = fetch_function_definition(fn_call.function_name_as_str)

stack_frame = Stoffle::Runtime::StackFrame.new(fn_def, fn_call)

assign_function_args_to_params(stack_frame)

# Executing the function body.
call_stack << stack_frame
value = interpret_nodes(fn_def.body.expressions)
call_stack.pop
value
end
``````

## Interpreting the Function Body

Now that we have interpreted the function call itself, it’s time to interpret the function body:

``````fn sum_integers: first_integer, last_integer
i = first_integer
sum = 0
while i <= last_integer
sum = sum + i

i = i + 1
end

println(sum)
end

sum_integers(1, 100)
``````

The first two lines of our `sum_integers` function are variable assignments. We covered this topic in the previous blog post of this series. However, we now have variable scoping, and as a consequence, the code that deals with assignment has changed a bit. Let us explore it:

``````def interpret_var_binding(var_binding)
if call_stack.length > 0
# We are inside a function. If the name points to a global var, we assign the value to it.
# Otherwise, we create and / or assign to a local var.
if env.has_key?(var_binding.var_name_as_str)
env[var_binding.var_name_as_str] = interpret_node(var_binding.right)
else
call_stack.last.env[var_binding.var_name_as_str] = interpret_node(var_binding.right)
end
else
# We are not inside a function. Therefore, we create and / or assign to a global var.
env[var_binding.var_name_as_str] = interpret_node(var_binding.right)
end
end
``````

Do you remember when we pushed the stack frame created for the function call into `call_stack`? This comes handy now since we can check whether we are inside a function by verifying that `call_stack` has a length greater than zero (i.e., has at least one stack frame). If we are inside a function, which is the case in the code we are currently interpreting, we check whether we already have a global variable with the same name of the variable to which we are now trying to bind a value. As you already know, if a collision happens, we will simply assign the value to the existing global variable, and a local one will not be created. When the name is not being used, we create a new local variable and assign the intended value to it. Since `call_stack` is a stack (i.e., last in first out data structure), we know that this local variable should be stored in the `env` of the last stacked frame (i.e., the frame created for the function currently being processed). Finally, the last part of `#interpret_var_binding` deals with assignments happening outside functions. Since only functions create new scopes in Stoffle, nothing changes here, as variables created outside functions are always global and stored at the instance variable `env`.

Returning to our program, the next step is to interpret the loop responsible for summing up the integers. Let us refresh our memory and take a look at the AST of our Stoffle program again: The node representing the loop is `Stoffle::AST::Repetition`:

``````class Stoffle::AST::Repetition < Stoffle::AST::Expression
attr_accessor :condition, :block

def initialize(cond_expr = nil, repetition_block = nil)
@condition = cond_expr
@block = repetition_block
end

def ==(other)
children == other&.children
end

def children
[condition, block]
end
end
``````

Note that this AST node basically brings together concepts we have explored in previous articles. For its conditional, we will have an expression that will generally have at its root (think about the expression's AST root node) a `Stoffle::AST::BinaryOperator` (e.g., '>', 'or' and so on). For the body of the loop, we will have an `Stoffle::AST::Block`. This makes sense, right? The most basic form of a loop is one or more expressions (a block) to be repeated while an expression is truthy (i.e., while the conditional evaluates to a truthy value).

The corresponding method at our interpreter is `#interpret_repetition`:

``````def interpret_repetition(repetition)
while interpret_node(repetition.condition)
interpret_nodes(repetition.block.expressions)
end
end
``````

Here, you may be surprised by the simplicity (and, dare I say, beauty) of this method. We can implement the interpretation of loops by combining methods that we have already explored in past articles. By using Ruby's `while` loop, we can make sure that we continue interpreting the nodes that compose our Stoffle loop (by repeatedly calling `#interpret_nodes`) while the evaluation of the conditional is true. The job of evaluating the conditional is as easy as calling the usual suspect, the `#interpret_node` method.

## Returning from the Function

We are almost at the finish line! After the loop, we proceed and print out the result of the summation to the console. We are not going through it again since we already covered it in the last part of the series. As a quick recap, remember that the `println` function is provided by Stoffle itself and, internally in the interpreter, we are simply using Ruby's own `puts` method.

To finish this post, we have to revisit `#interpret_nodes`. Its final version is a bit different from the one we saw in times past. Now, it includes code to handle returning from a function and unwinding the call stack. Here is the completed version of `#interpret_nodes` in its full glory:

``````def interpret_nodes(nodes)
last_value = nil

nodes.each do |node|
last_value = interpret_node(node)

if return_detected?(node)
raise Stoffle::Error::Runtime::UnexpectedReturn unless call_stack.length > 0

self.unwind_call_stack = call_stack.length # We store the current stack level to know when to stop returning.
return last_value
end

if unwind_call_stack == call_stack.length
# We are still inside a function that returned, so we keep on bubbling up from its structures (e.g., conditionals, loops etc).
return last_value
elsif unwind_call_stack > call_stack.length
# We returned from the function, so we reset the "unwind indicator".
self.unwind_call_stack = -1
end
end

last_value
end
``````

As you already know, `#interpret_nodes` is used every time we need to interpret a bunch of expressions. It is used to start interpreting our program and on every occasion when we encounter nodes that have a block associated with them (such as `Stoffle::AST::FunctionDefinition`). Specifically, when dealing with functions, there are two scenarios: interpreting a function and hitting a `return` expression or interpreting a function to its end and not hitting any `return` expressions. In the second case, it means either the function does not have any explicit `return` expressions or the code path that we went through did not have a `return`.

Let us refresh our memories before continuing. As you probably remember from a couple paragraphs above, `#interpret_nodes` was called when we started interpreting the `sum_integers` function (i.e., when it got called in our program). Again, here is the source code of the program we are going through:

``````fn sum_integers: first_integer, last_integer
i = first_integer
sum = 0
while i <= last_integer
sum = sum + i

i = i + 1
end

println(sum)
end

sum_integers(1, 100)
``````

We are at the end of interpreting the function. As you might be guessing, our function does not possess an explicit `return`. This is the easiest path of `#interpret_nodes`. We basically iterate through all the function nodes, returning the value of the last interpreted expression at the end (quick reminder: Stoffle has implicit returns). This takes us to the finish line, concluding the interpretation of our program.

Although our program has now been fully interpreted, the main purpose of this article is to explain the implementation of the interpreter, so let’s take a bit more time here and see how the interpreter deals with cases in which we do hit a `return` inside a function.

First, the `return` expression gets evaluated at the beginning of the operation. Its value will be the evaluation of what is being returned. Here is the code for `Stoffle::AST::Return`:

``````class Stoffle::AST::Return < Stoffle::AST::Expression
attr_accessor :expression

def initialize(expr)
@expression = expr
end

def ==(other)
children == other&.children
end

def children
[expression]
end
end
``````

Then, we have a simple conditional that will detect `return` AST nodes. Having done this, we first perform a sanity check to verify that we are inside a function. To do so, we can simply check the length of the call stack. A length greater than zero means we are indeed inside a function. In Stoffle, we do not allow the use of `return` expressions outside functions, so we raise an error if this check fails. Before returning the value intended by the programmer, we first keep record of the current length of the call stack, storing it at the instance variable `unwind_call_stack`. To understand why this is important, let’s review `#interpret_function_call`:

``````def interpret_function_call(fn_call)
return if println(fn_call)

fn_def = fetch_function_definition(fn_call.function_name_as_str)

stack_frame = Stoffle::Runtime::StackFrame.new(fn_def, fn_call)

assign_function_args_to_params(stack_frame)

# Executing the function body.
call_stack << stack_frame
value = interpret_nodes(fn_def.body.expressions)
call_stack.pop
value
end
``````

Here, at the end of `#interpret_function_call`, note that we pop the stack frame from the call stack after interpreting the function. As a consequence, the length of the call stack will be reduced by one. Since we have preserved the length of the stack at the moment we detected the return, we can compare this initial length every time we interpret a new node at `#interpret_nodes`. Let’s take a look at the segment that does this inside the node iterator of `#interpret_nodes`:

``````def interpret_nodes(nodes)
# ...

nodes.each do |node|
# ...

if unwind_call_stack == call_stack.length
# We are still inside a function that returned, so we keep on bubbling up from its structures (e.g., conditionals, loops etc).
return last_value
elsif unwind_call_stack > call_stack.length
# We returned from the function, so we reset the "unwind indicator".
self.unwind_call_stack = -1
end

# ...
end

# ...
end
``````

This may be a bit difficult to grasp at first. I encourage you to check the full implementation on GitHub and play around with it if you think it may help you understand this last bit of the interpreter. The crucial point to have in mind here is that a typical program has many deeply nested structures. Ergo, executing `#interpret_nodes` generally results in a new call to `#interpret_nodes`, which may result in more calls to `#interpret_nodes` and so on! When we hit a `return` inside a function, we may be inside a deeply nested structure. Imagine, for example, that the `return` is inside a conditional that is part of a loop. To return from the function, we have to return from all `#interpret_nodes` calls until we return from the one initiated by `#interpret_function_call` (i.e., the call to `#interpret_nodes` that started the interpretation of the function body).

At the segment of code above, we highlight exactly how we do this. By having a positive value at `@unwind_call_stack` and one that is equal to the current length of the call stack, we know for certain that we are inside a function and that we still did not `return` from the original call initiated by `#interpret_function_call`. When this finally happens, `@unwind_call_stack` will be greater than the current length of the call stack; thus, we know that we have exited the function that returned, and we no longer need to continue the process of bubbling up. Then, we reset `@unwind_call_stack`. Just to make the use of `@unwind_call_stack` crystal clear, here are its possible values:

• -1, its initial and neutral value, indicating that we are not inside any functions that returned.
• positive value equal to the call stack length, indicating that we are still inside a function that returned.
• positive value greater than the call stack length, indicating that we are no longer inside the function that returned.

## Running our Program Using the Stoffle CLI

In the previous article of the series, we created a simple CLI to make interpreting Stoffle programs easier. Now that we have explored the interpreter's implementation, let’s see it in action, running our program. As shown above in many different sections, our code defines and then calls the `sum_integers` function passing the arguments `1` and `100`. If our interpreter is working properly, we should see `5050.0` (the sum of the set of integers starting at 1 and ending at 100) printed to the console: ## Closing Thoughts

In this post, we implemented the final parts needed to complete our interpreter. We tackled function definition, function calling, variable scoping, and loops. Now, we have a simple but working programming language!

In the next and final part of this series, I will share some resources that I consider great options for those that want to continue their programming language implementation studies. I will also propose some challenges you can take on to continue your learning while improving your version of Stoffle. See you later; ciao!

Honeybadger has your back when it counts. We're the only error tracker that combines exception monitoring, uptime monitoring, and cron monitoring into a single, simple to use platform.

Our mission: to tame production and make you a better, more productive developer. Learn more #### Alex Braha Stoll

Alex is a software developer that cannot get tired of attempting to write the next line of code at least a little better than the one before it. In his free time, he likes to study and practice Photography.

“We’ve looked at a lot of error management systems. Honeybadger is head and shoulders above the rest and somehow gets better with every new release.” Michael Smith
Are you using Bugsnag, Rollbar, or Airbrake for your monitoring? Honeybadger includes exception, uptime, and check-in monitoring — all for probably less than you’re paying now. Discover why so many companies are switching to Honeybadger here.
Stop digging through chat logs to find the bug-fix someone mentioned last month. Honeybadger's built-in issue tracker keeps discussion central to each error, so that if it pops up again you'll be able to pick up right where you left off.
"Wow — Customers are blown away that I email them so quickly after an error." Chris Patton