Tuesday, January 5, 2010

Recursion and Inductive Proofs

I've discovered that programmers frequently stumble when writing recursive algorithms. There are some simple rules that programmers can steal from mathematics to make writing recursive algorithms easy. I should point out that I'm only talking about writing recursive algorithms for interacting with recursive data-structures (lists, trees, etc.). Let's get started by reviewing some basic steps for proving a statement holds for all natural numbers n using an inductive proof.
  1. Base case:  Show the statement holds for a value of n (usually n=1 or n=0).
  2. Inductive step:  Assuming the statement holds for any value of n then show it also holds for a value of n+1
Rule 2.5 would be to make sure that the inductive step reduces the problem space.  Using a base case of n=0 and then and inductive step of n-1 wouldn't be a valid proof because the inductive step does not really reduce the problem space (remember we're talking about natural numbers for this example).  So let's use an inductive proof to show that 0+1+2...n=(n(n+1))/2.

Base case: Show the statement holds for n=0
0=0(0+1)/2
This equation clearly holds.

Inductive step:  Assuming the statement holds for any value of n then show it will also hold for n+1.
0+1+2...n+(n+1) = ((n+1)((n+1)+1)/2
If we assume that the statement holds for n then we can reduce the expression.
(n(n+1))/2 + (n+1) = ((n+1)((n+1)+1)/2
Using some basic algebra we can reduce both sides and discover that the inductive step holds.

Now we can apply same (perhaps slightly tweaked) rules to writing a recursive algorithm for finding the length of a linked-list (a recursive data-structure).

Let's start out with the function signature.
int length(Node *node)
{
  // Do stuff here
}
Base case: Write the algorithm for n=0 (the linked-list with 0 nodes).
int length(Node *node)
{
  if (NULL == node)
  {
    return 0;
  }
  // Do more stuff here
}
Congratulations! If you've make it this far you're above average. You've solved the base case without dumping core.

Now onto the inductive step: Assume that your length function will work for lists of length n now write code to make it work for lists of length n + 1.
int length(Node *node)
{
  if (NULL == node)
  {
    return 0;
  }
  else
  {
    return 1 + length(node->next);
  }
}
Notice that if we assume that "length(...)" works for lists of length n then for lists of length n + 1 all we need to do is add 1 to the length. You should also notice that we passed node->next to the "length(...)" function not just node because rule 2.5 states that the inductive step must reduce the problem space and just passing node would not reduce the problem space. Congratulations! If you've make it this far you're now well above average. Now not only will your algorithm not core dump, but it won't hang either.

So let's review what we've done. We've used the rules for inductive proofs to guide us in writing a recursive algorithm. Does this mean we've proved that this algorithm is correct? The answer is, not really or at least, not formally. Even though we haven't proved this algorithm is correct we can be much more confident that it will work.

If we were to write unit-tests for this algorithm (which you should *always* do before writing the algorithm) we could use the rules of induction to guide our unit tests. We could write a test for testing the base case and another test for testing the inductive case.

Since math has been around longer than computers, it makes sense that we could borrow a few ideas from it to make our jobs easier.

1 comment:

  1. This comment has been removed by a blog administrator.

    ReplyDelete