Taming Legacy Code With Characterization Tests

Developers make fun of legacy systems because we're scared of them. We're afraid that the tiniest change will cause the app to break in unexpected ways. We're afraid we won't realize it until a customer complains. One way to combat this fear is through testing. In this article, José Manuel shows us how to retrofit legacy systems with acceptance test suites so we can maintain them with less fear and more confidence.

Have you ever needed to change code that already works, but has no tests and is difficult to fully understand? How did you feel deploying those changes? Were you nervous?

I recently had to upgrade a 7 year old API endpoint with hundreds of lines of code. This was part of a new feature release, and in order to make the changes safely, while also making sure the final code was cleaner and well tested, I made use of some techniques that I'd love to share using a short but interesting example.

For our example, let's say that we mask sensitive data in some places in our product. Our product manager asks us if we can change the way emails are masked so that they look like this:

*****@example.com

Instead of this:

***************.com

We then open our text editor and find the method where this is happening:

def mask(str)
 if str.length <= 4
   return str
 end

 # Email
 if str =~ URI::MailTo::EMAIL_REGEXP
   limit = str.length - 4
   return "#{'*' * limit}#{str[limit..-1]}"
 end

 # Phone Number or ID
 if str.is_a?(Numeric)
   limit = str.length - 4
   return "#{'*' * limit}#{str[limit..-1]}"
 end

 # String, like a name
 limit = str.length - 4
 return "#{'*' * limit}#{str[limit..-1]}"
end

The bad news is that we don't have any tests in place, which means we won't know if we're breaking something. How should we move forward?

How to work with legacy code

The first goal is to add tests without changing anything in the code if possible. This kind of test will be useful for two reasons:

  • We'll better understand the code as it exists before the changes we need to make.

  • We'll have some tests that will tell us if we break something when we make changes.

Let's create a simple test first.

First test

We can use any testing library we want, but in this case we already have rspec set up in the project so we'll use that.

describe 'mask' do
  it "masks regular text" do
    expect(mask('simple text')).to eq('*******text')
  end
end

We're guessing by looking at the code in the last case where it says "string, like a name" that the whole string except for the last 4 characters will be asterisks. And the test passes!

$ rspec
.

Finished in 0.0026 seconds (files took 0.12025 seconds to load)
1 example, 0 failures

At this point, we might be tempted to do the same thing with other lines of the method by trying to guess beforehand what the result should be.

But instead, we'll use a different technique.

Characterization tests

A characterization test is a technique coined by Michael Feathers where we write tests for code where we don't have them. It also means that, to some extent, we shouldn't be guessing what the code should do and instead we should be in "exploratory" mode.

We should create tests without really knowing what's going to happen and then document those cases. With this approach, our first test could look like this:

it "x" do
 expect(mask('simple text')).to eq(nil)
end

And we get this output:

F

Failures:

  1) mask x
     Failure/Error: expect(mask('simple text')).to eq(nil)

       expected: nil
            got: "*******text"

       (compared using ==)
     # ./spec/mask_spec.rb:5:in `block (2 levels) in <top (required)>'

Finished in 0.01882 seconds (files took 0.12916 seconds to load)
1 example, 1 failure

Failed examples:

rspec ./spec/mask_spec.rb:4 # mask x

So now we can "document" this behavior in the test, like we did before but we will use the test output to guide us:

it "masks regular text" do
  expect(mask('simple text')).to eq('*******text')
end

Let's add another test.

Email test

Like before, let's use an "X" test:

it "x" do
  expect(mask('example@example.com')).to eq(nil)
end
$ rspec
.F

Failures:

  1) mask x
     Failure/Error: expect(mask('example@example.com')).to eq(nil)

       expected: nil
            got: "***************.com"

       (compared using ==)
     # ./spec/mask_spec.rb:9:in `block (2 levels) in <top (required)>'

Finished in 0.03376 seconds (files took 0.34069 seconds to load)
2 examples, 1 failure

Failed examples:

rspec ./spec/mask_spec.rb:8 # mask x

So it seems the behavior is the same. We get asterisks and four characters. Let's change the test:

it "masks an email address" do
  expect(mask('example@example.com')).to eq('***************.com')
end

Numeric test

We see in the middle of the method that if we pass a string with a numeric form (str.is_a?(Numeric)) there's a different "if" condition:

if str.is_a?(Numeric)
 limit = str.length - 4
 return "#{'*' * limit}#{str[limit..-1]}"
end

Let's see what happens if we pass a number as a string:

it "x" do
  expect(mask('635914526')).to eq(nil)
end
$ rspec
..F

Failures:

  1) mask x
     Failure/Error: expect(mask('635914526')).to eq(nil)

       expected: nil
            got: "*****4526"

       (compared using ==)
     # ./spec/mask_spec.rb:13:in `block (2 levels) in <top (required)>'

Finished in 0.02038 seconds (files took 0.17098 seconds to load)
3 examples, 1 failure

Failed examples:

rspec ./spec/mask_spec.rb:12 # mask x

So, again, we can document this behavior:

it "masks numbers as strings" do
  expect(mask('635914526')).to eq('*****4526')
end

Adding more characterization tests

So far, we've only covered some very simple cases, so we should include more tests with the same approach:

  1. Write a failing test "X" to see the result
  2. Execute it and copy the output
  3. Update that test to better document the behavior and include the right output
  4. Now the test passes and we can go back to step 1 for another case.

I'd probably repeat these steps until I'm relatively comfortable with all the cases covered. In this example, I'd probably include cases like:

  • Short string with 4 characters
  • Short string with 3 characters
  • Empty string
  • Other formats of strings with symbols like dashes or dots.

After doing this we end up with the following spec file:

require_relative './../mask'

describe 'mask' do
  it "masks regular text" do
    expect(mask('simple text')).to eq('*******text')
  end

  it "masks an email address" do
    expect(mask('example@example.com')).to eq('***************.com')
  end

  it "masks numbers as strings" do
    expect(mask('635914526')).to eq('*****4526')
  end

  it "does not mask a string with 4 characters" do
    expect(mask('asdf')).to eq('asdf')
  end

  it "does not mask a string with 3 characters" do
    expect(mask('asd')).to eq('asd')
  end

  it "does not do anything with empty strings" do
    expect(mask('')).to eq('')
  end

  it "masks symbols like regular characters" do
    expect(mask('text .-@$ with symbols-')).to eq('*******************ols-')
  end
end

Next steps, refactoring.

After documenting the current behavior, it's important to see if we can simplify the code before introducing a change.

Refactoring is now safer since we have tests in place in case we break something.

Refactoring step by step would make this article too long, so I'm going to skip that process and show a potential solution after some steps:

  • Extracting the number of visible characters to a constant.
  • Simplifying repeated case for numeric strings and regular strings.
  • Extract condition to check if a string is an email.
  • Move the whole method to a StringUtils class as a static one.

After these refactorings the tests still pass and the code looks like this:

class StringUtils
  N_VISIBLE_CHARACTERS = 4

  def self.mask(str)
    return str unless should_mask(str)

    if is_email(str)
      mask_email(str)
    else
      mask_string(str)
    end
  end

  def self.should_mask(str)
    str.length > N_VISIBLE_CHARACTERS
  end

  def self.mask_email(str)
    limit = str.length - N_VISIBLE_CHARACTERS
    "#{'*' * limit}#{str[limit..-1]}"
  end

  def self.is_email(str)
    str =~ URI::MailTo::EMAIL_REGEXP
  end

  def self.mask_string(str)
    limit = str.length - N_VISIBLE_CHARACTERS
    "#{'*' * limit}#{str[limit..-1]}"
  end
end

Note that I've left mask_email as a different method because we need to change it now, so it's not going to be the same code we have for the other case.

I would probably clean up some tests, since now some cases are repeated such as numeric strings and regular strings, but it's not necessary for the purpose of this article.

Introducing our change with TDD

Finally, after we have a cleaner method with tests, we can make the requested change. Emails should be masked like:

*****@example.com

Instead of the current behavior:

***************.com

To implement this change, we can use TDD (Test Driven Development) where we start with a failing test before the final code is written.

Since we already have a test for this case, let's change it to represent this new behavior:

it "masks an email address" do
  expect(StringUtils::mask('example@example.com')).to eq('*******@example.com')
end

And, as expected, this test fails:

$ rspec
.F......

Failures:

  1) StringUtils mask masks an email address
     Failure/Error: expect(StringUtils::mask('example@example.com')).to eq('*******@example.com')

       expected: "*******@example.com"
            got: "***************.com"

       (compared using ==)
     # ./spec/string_utils_spec.rb:10:in `block (3 levels) in <top (required)>'

Finished in 0.08843 seconds (files took 0.2876 seconds to load)
8 examples, 1 failure

Failed examples:

rspec ./spec/string_utils_spec.rb:9 # StringUtils mask masks an email address

A valid solution could be changing the mask_email method to:

def self.mask_email(str)
  email_sections = str.split("@")
  email_username = email_sections.first
  email_domain = email_sections.last
  "#{'*' * email_username.length}@#{email_domain}"
end

And now our test passes:

$ rspec
........

Finished in 0.00385 seconds (files took 0.14405 seconds to load)
8 examples, 0 failures

Review of the process

We've gone through several different phases, let's review them together:

  • We had a piece of code that was difficult to read. We knew it was working since it was being used in production, so we couldn't change its current behavior.
  • We documented its behavior with characterization tests.
  • After having tests to check that we aren't breaking anything, we refactored the code to make it cleaner.
  • We modified one of the tests to represent the requested change so it failed.
  • Finally, we modified the code with the requested change and the tests passed again.

This process helps solve the common challenge of needing to modify code that has no tests, and yet we know the code is working relatively well. I hope that with characterization tests you now have a new technique in your toolbox to make changes more safely in the future.

What to do next:
  1. Try Honeybadger for FREE
    Honeybadger helps you find and fix errors before your users can even report them. Get set up in minutes and check monitoring off your to-do list.
    Start free trial
    Easy 5-minute setup — No credit card required
  2. Get the Honeybadger newsletter
    Each month we share news, best practices, and stories from the DevOps & monitoring community—exclusively for developers like you.
    author photo

    José M. Gilgado

    José is a senior software engineer with over 10 years of experience. He works remotely at Buffer from Spain and loves good books and coffee.

    More articles by José M. Gilgado
    Stop wasting time manually checking logs for errors!

    Try the only application health monitoring tool that allows you to track application errors, uptime, and cron jobs in one simple platform.

    • Know when critical errors occur, and which customers are affected.
    • Respond instantly when your systems go down.
    • Improve the health of your systems over time.
    • Fix problems before your customers can report them!

    As developers ourselves, we hated wasting time tracking down errors—so we built the system we always wanted.

    Honeybadger tracks everything you need and nothing you don't, creating one simple solution to keep your application running and error free so you can do what you do best—release new code. Try it free and see for yourself.

    Start free trial
    Simple 5-minute setup — No credit card required

    Learn more

    "We've looked at a lot of error management systems. Honeybadger is head and shoulders above the rest and somehow gets better with every new release."
    — Michael Smith, Cofounder & CTO of YvesBlue

    Honeybadger is trusted by top companies like:

    “Everyone is in love with Honeybadger ... the UI is spot on.”
    Molly Struve, Sr. Site Reliability Engineer, Netflix
    Start free trial
    Are you using Sentry, Rollbar, Bugsnag, or Airbrake for your monitoring? Honeybadger includes error tracking with a whole suite of amazing monitoring tools — all for probably less than you're paying now. Discover why so many companies are switching to Honeybadger here.
    Start free trial
    Stop digging through chat logs to find the bug-fix someone mentioned last month. Honeybadger's built-in issue tracker keeps discussion central to each error, so that if it pops up again you'll be able to pick up right where you left off.
    Start free trial
    “Wow — Customers are blown away that I email them so quickly after an error.”
    Chris Patton, Founder of Punchpass.com
    Start free trial