Recursion is a fundamental concept that most programmers will find familiar. It is often a more graceful way to express performing some action over and over than a purely procedural approach, such as a while-loop.

Recursion is especially important as a basic tenet in the theory of computation. In fact, the very determination of what is and is not computable is often described in terms of recursively definable functions. We will see, however, that as elegant as recursion can be, the traditional model can still be quite messy when used to describe computation with mathematical and logical exactitude. The theory of fixed points provides a delightful yet unintuitive solution.

From a simple algebraic point of view, most of us have first encountered fixed points in algebra or geometry, although we may not be familiar with the terminology. Simply put, given a function f(x), some particular value of x is called a fixed point if f(x)=x. As an example, consider the function g(x)=1/x. In this case the number 1 is a fixed point of g, since g(1)=1. The general idea is simple enough, although the fixed points we will be talking about are of a much more advanced breed.
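To make the elementary notion concrete, here is a minimal sketch in Python (the name g mirrors the example above):

```python
# Sketch of the elementary fixed-point example: g(x) = 1/x.
def g(x):
    return 1 / x

# 1 is a fixed point of g, since g(1) = 1.
assert g(1.0) == 1.0
# (-1 is a second fixed point of the same function.)
assert g(-1.0) == -1.0
```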

Factorials revisited

The classic problem used to introduce recursion is the factorial. The factorial of a non-negative integer n is the product of all the positive integers up to and including n, i.e., n * (n-1) * ... * 3 * 2 * 1, with the factorial of 0 defined as 1. A classic recursive function to compute the factorial is:

define function f(y):
  if equal(y,0) then 1 else y * f(y-1)
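A direct transcription of the pseudocode into Python might look like this (a sketch; the name fact is mine, not from the definition above):

```python
# Recursive factorial, transcribed from the pseudocode above.
def fact(y):
    return 1 if y == 0 else y * fact(y - 1)

assert fact(0) == 1
assert fact(5) == 120
```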

There is something unsettling about this recursive function: the definition depends upon itself; indeed, that is the primary property of recursion. However, from a rigorous axiomatic perspective such cyclic definitions are absurd and nonsensical. Imagine if some dictionary routinely defined a word in terms of itself; it would be essentially useless. So instead of the above recursive definition, we need to find an equivalent equational definition.

We may reformulate our recursive function f(y) into an equation by abstracting out the quantity f. If you've never studied so-called first-class functions, that may sound very strange, but it will make sense. Try to think of f as just another variable instead of a function. Then the abstraction process starts with the following equation, which we will label F (presented first in the familiar equational form and then in lambda form):

F:  f = if equal(y,0) then 1 else y * f(y-1)

F:  λf.λy. if equal(y,0) then 1 else y * f(y-1)

Yes, we are factoring out the function f just as if it were a plain variable in algebra; this is permissible. Also notice that the above equation is not defined in terms of itself; f only occurs as a bound variable just like y.
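In a language with first-class functions this abstraction step can be sketched directly. Here is one rendering in Python, where f is an ordinary parameter and F itself makes no recursive call (the concrete test values are illustrative):

```python
# F abstracts f out as a plain parameter; the body never refers to F itself.
F = lambda f: lambda y: 1 if y == 0 else y * f(y - 1)

# The base case holds no matter what f we supply.
assert F(lambda y: 0)(0) == 1
# Given any f that is correct at 4 (4! = 24), F yields a function correct at 5.
assert F(lambda y: [1, 1, 2, 6, 24][y])(5) == 120
```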

Now the remarkable thing is that the factorial function is a fixed-point of the above equation. Since we called the above equation F, we will call one of its fixed points (fix F). To see how this works examine the following reduction sequence:

  1. fact n = (fix F) n
  2. fact n = F (fix F) n
  3. fact n = (λf.λy. if equal(y,0) then 1
      else y * f(y-1))  (fix F) n
  4. fact n = if equal(n,0) then 1
      else n * (fix F)(n-1)

The first line is by definition, our claim that the fixed point of F is the same as the factorial function. The second line is obtained by applying the definition of the fixed point, namely that FX = X, where X is (fix F). Then we just apply lambda reduction on the remaining lines. It becomes obvious that we have reproduced the factorial function, but in an equational form which does not depend upon a recursive definition. This may appear like a subtle distinction, but it is a very important one.
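We can spot-check the fixed-point claim itself: applying F to the genuine factorial function should give back a function that agrees with it everywhere. A small Python sketch, using the standard library's math.factorial:

```python
import math

# The abstracted equation F from the text, as a higher-order function.
F = lambda f: lambda y: 1 if y == 0 else y * f(y - 1)

# The factorial is (extensionally) a fixed point of F:
# F(factorial) agrees with factorial at every argument we test.
for n in range(10):
    assert F(math.factorial)(n) == math.factorial(n)
```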

Of course in our discussion we have stated that (fix F) is a fixed point of the equation F, but we never actually stated how to determine what the fixed point function actually is, if it is unique, or if it even exists. All we know so far is that if a fixed point of equation F can be found, it will be equivalent to the factorial function.

The fixed-point theorem

For the purposes of introductory exposition here, I will discuss fixed points within the scope of the untyped lambda calculus. Fixed points are equally applicable to typed lambda calculi as well, but I will not discuss that here for the sake of simplicity. In lambda calculus, a function F is said to have a fixed point X, if FX reduces to just X; or more simply put, if FX=X. We now visit a remarkable theorem.

Fixed Point Theorem: For every function F there exists an X such that FX=X.

This simply stated theorem is not at all expected, especially for those only familiar with the algebraic or geometric notions of fixed points. In algebra, it is certainly possible to find a function f(x) for which f(x) is never equal to x [take for example f(x)=x+1]. But the above theorem states just the opposite: no matter what function you can think of, it has at least one fixed point. It is this revelation which allows us to substitute fixed points for the clumsy semantics of recursion.

The proof of the fixed point theorem is remarkable both for its simplicity and because it actually provides us a way to find a fixed point. It is one thing to say that every function has a fixed point, but quite another to be able to demonstrate it. The proof follows...

Proof: For any function F, let W be the function λx.F(xx) and let X = WW.
We claim that X is a fixed point of F. Demonstrated as follows

X = WW
X = (λx.F(xx)) W
X = F(WW)
X = FX

The term X in the above theorem is called a fixed-point of function F.
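The proof's construction can be replayed in Python, with one caveat: Python evaluates arguments eagerly, so the self-application xx must be delayed behind a lambda (an eta-expansion); otherwise computing X = WW would loop forever before F is ever consulted. A sketch, reusing the factorial equation F:

```python
# F is the abstracted factorial equation from earlier in the article.
F = lambda f: lambda y: 1 if y == 0 else y * f(y - 1)

# W = λx.F(xx), with the self-application delayed for eager evaluation.
W = lambda x: F(lambda v: x(x)(v))
X = W(W)                       # X = WW is a fixed point of F

assert X(5) == 120             # X behaves exactly like the factorial
```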

Fixed point combinators

A combinator is simply a lambda expression which contains no free variables [see my article Combinatory Logic]. An important property of combinators is that they are self-contained: they don't rely upon an environment or external definitions. It is possible to create a fixed-point combinator, which captures the essence of a generic fixed point. That is, it provides a consistent way to generate fixed points without requiring knowledge of the function whose fixed point we are seeking.

We have already come close to a fixed-point combinator in the proof of the fixed point theorem. If we just abstract the function F from the definition of the term X, then we end up with perhaps the most legendary combinator of all, the so-called Curry Fixed-Point Combinator, identified with the symbol Y.

Y = λf.( (λx.f(xx)) (λx.f(xx)) )

The combinator Y is a generator of fixed points. Thus given any arbitrary function g we have

Yg = g(Yg)

So you can see that Yg is a fixed point of g; and it doesn't matter what function g is.
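Under eager evaluation Y itself diverges, so a Python sketch must use the eta-expanded variant (often called the Z combinator), which delays the inner self-applications:

```python
# Z: an eager-safe form of Curry's Y combinator.
Z = lambda f: (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

# g is the abstracted factorial equation; Zg is a fixed point of g: Zg = g(Zg).
g = lambda f: lambda y: 1 if y == 0 else y * f(y - 1)
fact = Z(g)
assert fact(6) == 720
```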

Many other fixed point combinators exist. For instance, one which Alan Turing proposed is identified by the symbol Θ and defined as,

Θ = AA,   where
   A = λxy.y(xxy)
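Turing's combinator can be sketched the same way; again the self-applications are eta-expanded so that Python's eager evaluation does not diverge:

```python
# A = λxy.y(xxy), delayed for eager evaluation; Θ = AA.
A = lambda x: lambda y: y(lambda v: x(x)(y)(v))
theta = A(A)

# Θg = g(Θg), so applying Θ to the factorial equation computes the factorial.
g = lambda f: lambda y: 1 if y == 0 else y * f(y - 1)
assert theta(g)(5) == 120
```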