Taming Legacy Code With Characterization Tests

Developers make fun of legacy systems because we're scared of them. We're afraid that the tiniest change will cause the app to break in unexpected ways. We're afraid we won't realize it until a customer complains. One way to combat this fear is through testing. In this article, José Manuel shows us how to retrofit legacy systems with acceptance test suites so we can maintain them with less fear and more confidence.

Have you ever needed to change code that already works, but has no tests and is difficult to fully understand? How did you feel deploying those changes? Were you nervous?

I recently had to upgrade a 7 year old API endpoint with hundreds of lines of code. This was part of a new feature release, and in order to make the changes safely, while also making sure the final code was cleaner and well tested, I made use of some techniques that I'd love to share using a short but interesting example.

For our example, let's say that we mask sensitive data in some places in our product. Our product manager asks us if we can change the way emails are masked so that they look like this:

*****@example.com

Instead of this:

***************.com

We then open our text editor and find the method where this is happening:

def mask(str)
 if str.length <= 4
   return str
 end

 # Email
 if str =~ URI::MailTo::EMAIL_REGEXP
   limit = str.length - 4
   return "#{'*' * limit}#{str[limit..-1]}"
 end

 # Phone Number or ID
 if str.is_a?(Numeric)
   limit = str.length - 4
   return "#{'*' * limit}#{str[limit..-1]}"
 end

 # String, like a name
 limit = str.length - 4
 return "#{'*' * limit}#{str[limit..-1]}"
end

The bad news is that we don't have any tests in place, which means we won't know if we're breaking something. How should we move forward?

How to work with legacy code

The first goal is to add tests without changing anything in the code if possible. This kind of test will be useful for two reasons:

  • We'll better understand the code as it exists before the changes we need to make.

  • We'll have some tests that will tell us if we break something when we make changes.

Let's create a simple test first.

First test

We can use any testing library we want, but in this case we already have rspec set up in the project so we'll use that.

describe 'mask' do
  it "masks regular text" do
    expect(mask('simple text')).to eq('*******text')
  end
end

We're guessing by looking at the code in the last case where it says "string, like a name" that the whole string except for the last 4 characters will be asterisks. And the test passes!

$ rspec
.

Finished in 0.0026 seconds (files took 0.12025 seconds to load)
1 example, 0 failures

At this point, we might be tempted to do the same thing with other lines of the method by trying to guess beforehand what the result should be.

But instead, we'll use a different technique.

Characterization tests

A characterization test is a technique coined by Michael Feathers where we write tests for code where we don't have them. It also means that, to some extent, we shouldn't be guessing what the code should do and instead we should be in "exploratory" mode.

We should create tests without really knowing what's going to happen and then document those cases. With this approach, our first test could look like this:

it "x" do
 expect(mask('simple text')).to eq(nil)
end

And we get this output:

F

Failures:

  1) mask x
     Failure/Error: expect(mask('simple text')).to eq(nil)

       expected: nil
            got: "*******text"

       (compared using ==)
     # ./spec/mask_spec.rb:5:in `block (2 levels) in <top (required)>'

Finished in 0.01882 seconds (files took 0.12916 seconds to load)
1 example, 1 failure

Failed examples:

rspec ./spec/mask_spec.rb:4 # mask x

So now we can "document" this behavior in the test, like we did before but we will use the test output to guide us:

it "masks regular text" do
  expect(mask('simple text')).to eq('*******text')
end

Let's add another test.

Email test

Like before, let's use an "X" test:

it "x" do
  expect(mask('example@example.com')).to eq(nil)
end
$ rspec
.F

Failures:

  1) mask x
     Failure/Error: expect(mask('example@example.com')).to eq(nil)

       expected: nil
            got: "***************.com"

       (compared using ==)
     # ./spec/mask_spec.rb:9:in `block (2 levels) in <top (required)>'

Finished in 0.03376 seconds (files took 0.34069 seconds to load)
2 examples, 1 failure

Failed examples:

rspec ./spec/mask_spec.rb:8 # mask x

So it seems the behavior is the same. We get asterisks and four characters. Let's change the test:

it "masks an email address" do
  expect(mask('example@example.com')).to eq('***************.com')
end

Numeric test

We see in the middle of the method that if we pass a string with a numeric form (str.is_a?(Numeric)) there's a different "if" condition:

if str.is_a?(Numeric)
 limit = str.length - 4
 return "#{'*' * limit}#{str[limit..-1]}"
end

Let's see what happens if we pass a number as a string:

it "x" do
  expect(mask('635914526')).to eq(nil)
end
$ rspec
..F

Failures:

  1) mask x
     Failure/Error: expect(mask('635914526')).to eq(nil)

       expected: nil
            got: "*****4526"

       (compared using ==)
     # ./spec/mask_spec.rb:13:in `block (2 levels) in <top (required)>'

Finished in 0.02038 seconds (files took 0.17098 seconds to load)
3 examples, 1 failure

Failed examples:

rspec ./spec/mask_spec.rb:12 # mask x

So, again, we can document this behavior:

it "masks numbers as strings" do
  expect(mask('635914526')).to eq('*****4526')
end

Adding more characterization tests

So far, we've only covered some very simple cases, so we should include more tests with the same approach:

  1. Write a failing test "X" to see the result
  2. Execute it and copy the output
  3. Update that test to better document the behavior and include the right output
  4. Now the test passes and we can go back to step 1 for another case.

I'd probably repeat these steps until I'm relatively comfortable with all the cases covered. In this example, I'd probably include cases like:

  • Short string with 4 characters
  • Short string with 3 characters
  • Empty string
  • Other formats of strings with symbols like dashes or dots.

After doing this we end up with the following spec file:

require_relative './../mask'

describe 'mask' do
  it "masks regular text" do
    expect(mask('simple text')).to eq('*******text')
  end

  it "masks an email address" do
    expect(mask('example@example.com')).to eq('***************.com')
  end

  it "masks numbers as strings" do
    expect(mask('635914526')).to eq('*****4526')
  end

  it "does not mask a string with 4 characters" do
    expect(mask('asdf')).to eq('asdf')
  end

  it "does not mask a string with 3 characters" do
    expect(mask('asd')).to eq('asd')
  end

  it "does not do anything with empty strings" do
    expect(mask('')).to eq('')
  end

  it "masks symbols like regular characters" do
    expect(mask('text .-@$ with symbols-')).to eq('*******************ols-')
  end
end

Next steps, refactoring.

After documenting the current behavior, it's important to see if we can simplify the code before introducing a change.

Refactoring is now safer since we have tests in place in case we break something.

Refactoring step by step would make this article too long, so I'm going to skip that process and show a potential solution after some steps:

  • Extracting the number of visible characters to a constant.
  • Simplifying repeated case for numeric strings and regular strings.
  • Extract condition to check if a string is an email.
  • Move the whole method to a StringUtils class as a static one.

After these refactorings the tests still pass and the code looks like this:

class StringUtils
  N_VISIBLE_CHARACTERS = 4

  def self.mask(str)
    return str unless should_mask(str)

    if is_email(str)
      mask_email(str)
    else
      mask_string(str)
    end
  end

  def self.should_mask(str)
    str.length > N_VISIBLE_CHARACTERS
  end

  def self.mask_email(str)
    limit = str.length - N_VISIBLE_CHARACTERS
    "#{'*' * limit}#{str[limit..-1]}"
  end

  def self.is_email(str)
    str =~ URI::MailTo::EMAIL_REGEXP
  end

  def self.mask_string(str)
    limit = str.length - N_VISIBLE_CHARACTERS
    "#{'*' * limit}#{str[limit..-1]}"
  end
end

Note that I've left maskemail_ as a different method because we need to change it now, so it's not going to be the same code we have for the other case.

I would probably clean up some tests, since now some cases are repeated such as numeric strings and regular strings, but it's not necessary for the purpose of this article.

Introducing our change with TDD

Finally, after we have a cleaner method with tests, we can make the requested change. Emails should be masked like:

*****@example.com

Instead of the current behavior:

***************.com

To implement this change, we can use TDD (Test Driven Development) where we start with a failing test before the final code is written.

Since we already have a test for this case, let's change it to represent this new behavior:

it "masks an email address" do
  expect(StringUtils::mask('example@example.com')).to eq('*******@example.com')
end

And, as expected, this test fails:

$ rspec
.F......

Failures:

  1) StringUtils mask masks an email address
     Failure/Error: expect(StringUtils::mask('example@example.com')).to eq('*******@example.com')

       expected: "*******@example.com"
            got: "***************.com"

       (compared using ==)
     # ./spec/string_utils_spec.rb:10:in `block (3 levels) in <top (required)>'

Finished in 0.08843 seconds (files took 0.2876 seconds to load)
8 examples, 1 failure

Failed examples:

rspec ./spec/string_utils_spec.rb:9 # StringUtils mask masks an email address

A valid solution could be changing the maskemail_ method to:

def self.mask_email(str)
  email_sections = str.split("@")
  email_username = email_sections.first
  email_domain = email_sections.last
  "#{'*' * email_username.length}@#{email_domain}"
end

And now our test passes:

$ rspec
........

Finished in 0.00385 seconds (files took 0.14405 seconds to load)
8 examples, 0 failures

Review of the process

We've gone through several different phases, let's review them together:

  • We had a piece of code that was difficult to read. We knew it was working since it was being used in production, so we couldn't change its current behavior.
  • We documented its behavior with characterization tests.
  • After having tests to check that we aren't breaking anything, we refactored the code to make it cleaner.
  • We modified one of the tests to represent the requested change so it failed.
  • Finally, we modified the code with the requested change and the tests passed again.

This process helps solve the common challenge of needing to modify code that has no tests, and yet we know the code is working relatively well. I hope that with characterization tests you now have a new technique in your toolbox to make changes more safely in the future.

author photo

José M. Gilgado

José is a senior software engineer with over 10 years of experience. He works remotely at Buffer from Spain and loves good books and coffee.


“We’ve looked at a lot of error management systems. Honeybadger is head and shoulders above the rest and somehow gets better with every new release.”
Michael Smith
Try Error Monitoring Free for 15 Days
Are you using Bugsnag, Rollbar, or Airbrake for your monitoring? Honeybadger includes exception, uptime, and check-in monitoring — all for probably less than you’re paying now. Discover why so many companies are switching to Honeybadger here.
Try Error Monitoring Free for 15 Days
Stop digging through chat logs to find the bug-fix someone mentioned last month. Honeybadger's built-in issue tracker keeps discussion central to each error, so that if it pops up again you'll be able to pick up right where you left off.
Try Error Monitoring Free for 15 Days
"Wow — Customers are blown away that I email them so quickly after an error."
Chris Patton
Try Error Monitoring Free for 15 Days