Rule of thumb: Preconditions Should be Checked Explicitly

posted by Craig Gidney on April 2, 2013

When I started programming, all I cared about was “does it work?”. All of the challenge was in figuring out how to achieve that. A decade and a half later, I care about more than just working/not-working.

I care about code not breaking when being changed. I care about code being understood. I care about not relying on assumptions that happen to work now, but may not apply in the future.

These goals imply many rules of thumb. One of them, my topic for this week, is that you should start methods with code that verifies that the input is valid.

Preconditions

Every method has preconditions: things that are expected to be true of the input. Sometimes these conditions can be checked at compile-time (e.g. “X must be a 32-bit integer”), but often they can only be checked at runtime (e.g. “X must be a non-null String” in Java).

In C/C++ land, where “you don’t pay for what you don’t use” is king, preconditions are almost never checked. Prudent callers will ensure their inputs are correct, and they will not have to pay for unnecessary checks. Not-so-prudent callers are jerks and deserve whatever happens to them (we’d say what that was, but nothing’s scarier than an undefined threat).

I take the opposite stance: correctness and error-tolerance first, then speed. Assuming your callers are prudent is insane. Your callers are human. They make mistakes, even when being very careful. It would be nice if you helped them find their mistakes, instead of silently overwriting random bytes in memory.

Actually, I go even further. It’s not enough for a method’s implementation to happen to throw the right exceptions, helping out at runtime. The preconditions should be explicitly listed in the code at the start of the method, helping out during development.
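Here is a minimal sketch of the difference. The class and method names are hypothetical, made up for illustration. Both versions are meant to reject a null `text` and a negative `count`, but the first only does so if the body happens to trip over the bad value.

```csharp
using System;
using System.Text;

public static class PreconditionStyles {
    // Implicit: invalid input is rejected only if the body happens to touch it.
    public static string RepeatTrimmedImplicit(string text, int count) {
        var builder = new StringBuilder();
        for (int i = 0; i < count; i++) {
            // A null text throws NullReferenceException here, but only when count > 0.
            builder.Append(text.Trim());
        }
        return builder.ToString();
    }

    // Explicit: the contract is listed up front and enforced on every call.
    public static string RepeatTrimmedExplicit(string text, int count) {
        if (text == null) throw new ArgumentNullException("text");
        if (count < 0) throw new ArgumentOutOfRangeException("count", "count < 0");

        var builder = new StringBuilder();
        for (int i = 0; i < count; i++) {
            builder.Append(text.Trim());
        }
        return builder.ToString();
    }
}
```

The implicit version quietly returns an empty string for `(null, 0)` and for `("a", -1)`. The explicit version throws for both, and a reader can see why without tracing the loop.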

Exceptions

This is a rule of thumb, not a law. There are cases where you should not use it:

  • Sometimes correctness requires speed (e.g. real time systems). When a method is a performance bottleneck, checking anything may be too expensive.
  • Many preconditions can be enforced statically, and checking them at runtime is unnecessary (e.g. “X must be an integer”).
  • Some preconditions are far too expensive to check (e.g. “F must be a commutative function”).
  • Some preconditions are too difficult for a computer to check (e.g. “S must contain English text”).
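As a sketch of the "too expensive" case, consider a binary search. This example and its names are my own assumption, not something from a particular library. The cheap precondition is checked; the sortedness precondition is only documented, because verifying it would cost more than the search.

```csharp
using System;

public static class SortedSearch {
    // Precondition NOT checked: sortedItems must be in ascending order.
    // Verifying it takes O(n) time, which would swamp the O(log n) search.
    public static bool ContainsSorted(int[] sortedItems, int value) {
        // Cheap precondition: still checked explicitly.
        if (sortedItems == null) throw new ArgumentNullException("sortedItems");

        int low = 0;
        int high = sortedItems.Length - 1;
        while (low <= high) {
            int mid = low + (high - low) / 2; // avoids overflow of low + high
            if (sortedItems[mid] == value) return true;
            if (sortedItems[mid] < value) low = mid + 1;
            else high = mid - 1;
        }
        return false;
    }
}
```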

I should also note that, ideally, this same rule of thumb would apply to postconditions. The problem is that it’s so darn inconvenient to check them. For example, you end up having to duplicate code to cover all the exit points and invariably miss one here and there. You need a tool like code contracts (that doesn’t break things like assembly signing every other version…) to make explicit postconditions viable.
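Without such a tool, one manual workaround is to funnel every exit point through a single wrapper, so the postcondition is written once. This is only a sketch under the assumption that an extra method call is acceptable; the names are hypothetical, and the check covers just part of the full postcondition.

```csharp
using System;

public static class Postconditions {
    // Public entry point: checks the postcondition once, whichever exit the core took.
    public static int IndexOfFirstNegative(int[] values) {
        if (values == null) throw new ArgumentNullException("values");

        int result = IndexOfFirstNegativeCore(values);

        // Partial postcondition: result is -1, or the index of a negative element.
        // (Not checked: that nothing before the index is negative.)
        bool valid = result == -1
            || (result >= 0 && result < values.Length && values[result] < 0);
        if (!valid)
            throw new InvalidOperationException("result == -1 || values[result] < 0");
        return result;
    }

    private static int IndexOfFirstNegativeCore(int[] values) {
        for (int i = 0; i < values.Length; i++) {
            if (values[i] < 0) return i; // exit point 1
        }
        return -1; // exit point 2
    }
}
```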

Pros

Preconditions are part of a method’s signature / interface. Putting them in the code makes it self-documenting, and allows developers and automated tools to discover the preconditions without doing any serious deduction or inference.

Without explicit preconditions, it’s very easy to accidentally change a method’s behavior in the invalid cases. People have a nasty habit of relying on the behavior of invalid cases, even if the documentation says it’s unspecified (see: half the posts on The Old New Thing), so changing it is likely to cause bugs.

Checking the preconditions everywhere causes code to fail fast, so mistakes surface closer to their source. This reduces the amount of time spent debugging.
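A small sketch of failing fast, using a hypothetical `Greeter` class: the null is rejected in the constructor, where the faulty caller is still on the stack, rather than in a later call to `Greet`.

```csharp
using System;

public sealed class Greeter {
    private readonly string _name;

    public Greeter(string name) {
        // Fails here, in the call that actually passed the bad value...
        if (name == null) throw new ArgumentNullException("name");
        _name = name;
    }

    public string Greet() {
        // ...instead of here, possibly much later and far from the culprit.
        return "Hello, " + _name.Trim() + "!";
    }
}
```

Without the check in the constructor, `new Greeter(null)` would succeed and the failure would show up as a `NullReferenceException` inside `Greet`, with nothing in the stack trace pointing at whoever supplied the null.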

Cons

Checking the preconditions costs time at runtime. The checks contribute to “bloat”.

Including the preconditions in the code increases the size of the code. That means more chances to introduce a bug, more places for bugs to hide, and more typing. Thankfully the most common precondition, is-not-null, can be automatically inserted by tools.

It’s easy to waste a lot of time giving each error case a description in prose. I avoid that particular pitfall by always setting the message to be the expression that failed, but that’s another rule of thumb I’m sure others disagree with.
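On a recent compiler, the expression text does not even have to be typed twice. The sketch below assumes C# 10 or later, where the `CallerArgumentExpression` attribute is honored; the `Require` helper and the `Stats` class are hypothetical. Note that the message here is the condition that was required, the inverse of the "expression that failed" wording.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class Preconditions {
    // Hypothetical helper. The compiler fills in `expression` with the source
    // text of the `condition` argument (assumes C# 10 or later).
    public static void Require(
            bool condition,
            [CallerArgumentExpression("condition")] string expression = "") {
        if (!condition) throw new ArgumentException("required: " + expression);
    }
}

public static class Stats {
    public static double Mean(int[] values) {
        Preconditions.Require(values != null);
        Preconditions.Require(values.Length > 0);

        long sum = 0; // long, so summing many ints cannot overflow
        foreach (int v in values) sum += v;
        return (double)sum / values.Length;
    }
}
```

Calling `Stats.Mean(new int[0])` throws an `ArgumentException` whose message is `required: values.Length > 0`.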

Example

The following code determines if a subrange of an array contains a value. It is expected that the caller specify a valid array and subrange. Three interchangeable implementation options are shown together for comparison; real code would keep only one of them:

public static bool RangeContains<T>(T[] array, int rangeOffset, int rangeCount, T value) {
    // explicit preconditions
    if (array == null) throw new ArgumentNullException("array");
    if (rangeOffset < 0) throw new ArgumentOutOfRangeException("rangeOffset", "rangeOffset < 0");
    if (rangeCount < 0) throw new ArgumentOutOfRangeException("rangeCount", "rangeCount < 0");
    // compared by subtraction, because rangeOffset + rangeCount can overflow int
    if (rangeCount > array.Length - rangeOffset)
        throw new ArgumentOutOfRangeException("rangeCount", "rangeOffset + rangeCount > array.Length");

    // (implementation option A)
    return array.Skip(rangeOffset)
                .Take(rangeCount)
                .Any(e => Equals(e, value));

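    // (implementation option B; SkipExact and TakeExact are not standard LINQ,
    //  they are assumed to be custom variants that throw instead of clamping)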
    return array.SkipExact(rangeOffset)
                .TakeExact(rangeCount)
                .Any(e => Equals(e, value));

    // (implementation option C)
    for (int i = 0; i < rangeCount; i++) {
        if (Equals(array[i + rangeOffset], value)) {
            return true;
        }
    }
    return false;
}

Consider what happens without the explicit preconditions:

  • Do the implementation options do different things in the invalid cases?
  • Does anything get checked when rangeCount is zero?
  • Does it matter that Skip and Take are extremely permissive, clamping too-small and too-large values into range?
  • If rangeOffset+rangeCount is past the end of array, can a matching item in the subset of the range that's valid determine if an exception is thrown or not?
  • Does it matter that TakeExact checks lazily, instead of eagerly, whether or not the given count is in range?

Each of the implementation options looks similar. I wouldn't be surprised to see a commit change from one to another, in the name of optimization or succinctness. But it turns out that they all differ in how invalid cases are treated. Even worse, none of them would throw in all cases deemed invalid by the explicit preconditions.
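To make the divergence concrete, here is a sketch of options A and C with the explicit preconditions stripped out. The class and method names are mine; option B is left out because `SkipExact` and `TakeExact` are not standard LINQ methods.

```csharp
using System;
using System.Linq;

public static class UncheckedRangeContains {
    // Option A without preconditions: Skip and Take clamp out-of-range arguments.
    public static bool OptionA<T>(T[] array, int rangeOffset, int rangeCount, T value) {
        return array.Skip(rangeOffset)
                    .Take(rangeCount)
                    .Any(e => Equals(e, value));
    }

    // Option C without preconditions: the array indexer only complains
    // if the loop actually reaches an index outside the array.
    public static bool OptionC<T>(T[] array, int rangeOffset, int rangeCount, T value) {
        for (int i = 0; i < rangeCount; i++) {
            if (Equals(array[i + rangeOffset], value)) {
                return true;
            }
        }
        return false;
    }
}
```

With `data = { 1, 2, 3 }`, the two disagree on every invalid case I tried:

  • `(data, 1, 10, 3)`: both return true. The range runs past the end, but C finds the match before it gets there.
  • `(data, 1, 10, 9)`: A returns false, C throws `IndexOutOfRangeException`.
  • `(data, -1, 2, 1)`: A returns true, C throws `IndexOutOfRangeException`.
  • `(null, 0, 0, 1)`: A throws `ArgumentNullException`, C returns false.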

Summary

Explicit preconditions are a good way to communicate intent. They reduce the likelihood of particular types of bugs and legacy issues being introduced.

At least... that's what I expect. Someone should do a study.
