P4P: May 2011

The Y Combinator is among the most famous functions in theoretical computer science, which is quite ironic given that it's anonymous (and only really useful because it's anonymous). There's even a massive venture capital firm that took its name. And - somewhat unusually for theoretical CS constructs - it's quite useful.

There are no shortage of tutorials describing how it works, but I couldn't find many which worked through an example. So here's my go at it.

Here's a definition of the combinator that you've probably seen before:

	-- \| Y combinator. Note that this doesn't actually compile in Haskell, but it's simpler
	-- than the one which works
	Y f = (\x -> f (x x)) (\x -> f (x x))

view raw yc.hs hosted with ❤ by GitHub

Let's roll it out to see what happens when we actually use it:

	-- An example of how the Y combinator expands into an infinite
	-- list of function applications
	Y f = (\x -> f (x x)) (\x -> f (x x))
	= f $ (\x -> f (x x)) (\x -> f $ x x)
	= f $ f $ (\x -> f $ x x) (\x -> f $ x x)
	= f $ f $ f $ (\x -> f $ x x) (\x -> f $ x x)
	= f $ f $ f $ f $ (\x -> f $ x x) (\x -> f $ x x)

view raw yce.hs hosted with ❤ by GitHub

So basically, what the Y combinator does is apply a function with itself an infinite number of times. That's already remarkable, because it essentially allows for an infinite amount of recursion with only anonymous functions.

In order to really make it useful though, we need to create an analogy to convergent sequences like $1/2, 1/4, 1/8, \dots$.

A functional equivalent in haskell is something like:

f g x = 1/2 * (g x)

This just uses recursion to generate the next term in the sequence. We can try it out:

Prelude> f id 1
0.5
Prelude> f (f id) 1
0.25
Prelude> f (f (f id)) 1
0.125

The sequence f, f(f), f(f(f)), ... is analogous to the sequence $1/2, 1/4,\dots$ and so it converges to $f^\infty(x) = 0$. To generate an infinite application of f with itself, we can use a fixed point operator:

Y f = (\x -> f (x x)) (\x -> f (x x))
    = f $ (\x -> f (x x)) (\x -> f $ x x)
    = f $ f $ (\x -> f $ x x) (\x -> f $ x x)
    = f $ f $ f $ (\x -> f $ x x) (\x -> f $ x x)
    = f $ f $ f $ f $ (\x -> f $ x x) (\x -> f $ x x)

If we try to use this on our specific $f$ though, Haskell will hang, since, well, it takes an infinite amount of time to converge.

The only kind of convergent sequences that we can use in the "real world" are those which converge in a finite amount of time. No article about functional programming would be complete without a factorial implementation, so here's an example which is readily convergent:

fact g x = if x == 1 then 1 
    else x * (g (x-1))

Two examples which give us insight:

Prelude> fact undefined 1
1
Prelude> fact undefined 2
*** Exception: Prelude.undefined

The function $fact(undefined)$ looks like the following:

if x == 1 then 1
else x * (undefined (x-1))

Since Haskell is lazy, the else branch is never evaluated if x = 1, so we don't need to worry about whether the else is well-defined. It's only when we try to call $fact(2)$ that we're going to get an error.

We can consider the sequence $f(\perp), f\left(f(\perp)\right), f\left(f\left(f(\perp)\right)\right), \dots$. ($\perp$ being the formal way of saying "undefined".) In code, this looks something like:

if x == 1 then 1
else x * (
  if x-1 == 1 then 1
  else (x-1) * (
    if x-2 == 1 then 1
    else (x-2) * (
      ...
    )
  )
)

The $n$th term in this sequence has $n$ of these else clauses. As $n$ goes to infinity, the function approaches the real factorial function, as it has fewer and fewer undefined segments. So we can say that this sequence of functions "converges" to the factorial function.

And the key point is that we only need as many function applications as we need recursive calls, i.e. $fact(n)$ needs only $n$ applications of $fact$ as an input - the $n+1$th can all be undefined. So even though this sequence takes an infinite amount of time to converge, the approximation is "good enough" for finite $n$.

To come full circle: in order to make our approximation arbitrarily correct, we need an arbitrary number of function applications as an input. And this is of course what $fix$ does.

Here is another usage which might make more sense:

	-- \| Mu is just a wrapper since we can't create infinitely recursive type
	-- signatures for functions, but we can create recursive ADTs.
	-- You can glean the point of Mu from the type signature of actualFn
	data Mu a = Mu { actualFn :: [Mu a] -> a -> a }

	-- \| Factorial function
	fact :: Num a => [Mu a] -- ^ An infinite list of factorial functions
	-> a -- ^ The number to find the factorial of
	-> a
	fact (f:fs) n = if n == 1 then 1
	else n * ((actualFn f) fs (n-1))

	-- \| Create an infinite list, each element is a factorial function
	-- wrapped in a Mu
	genFact :: Num a => [Mu a]
	genFact = (Mu fact):genFact

view raw fyc.hs hosted with ❤ by GitHub

The idea is that we have an infinite list of functions to apply, and when we need a new one we just pop one off. Which just gives us the arbitrary number of function applications that we need.

To summarize:

The Y combinator is just a higher-order function which applies functions to themselves
By applying functions to themselves, we get a sequence which converges to the "normal" recursive function.
So the Y combinator (as well as any fixed-point operator) can create an "arbitrarily good" approximation of a recursive function.

I had difficulty finding a readable implementation of Booth's algorithm; hopefully this will prove useful to others.

	from bitstring import BitArray

	'''
	Returns m * r using Booth's algorithm.
	x = len(m) and y = len(r). Note that this is the length in base 2.
	See http://en.wikipedia.org/wiki/Booth%27s_algorithm
	'''
	def booth(m, r, x, y):
	# Initialize
	totalLength = x + y + 1
	mA = BitArray(int = m, length = totalLength)
	rA = BitArray(int = r, length = totalLength)
	A = mA << (y+1)
	S = BitArray(int = -m, length = totalLength) << (y+1)
	P = BitArray(int = r, length = y)
	P.prepend(BitArray(int = 0, length = x))
	P = P << 1
	print "Initial values"
	print "A", A.bin
	print "S", S.bin
	print "P", P.bin
	print "Starting calculation"
	for i in range(1,y+1):
	if P[-2:] == '0b01':
	P = BitArray(int = P.int + A.int, length = totalLength)
	print "P + A:", P.bin
	elif P[-2:] == '0b10':
	P = BitArray(int = P.int +S.int, length = totalLength)
	print "P + S:", P.bin
	P = arith_shift_right(P, 1)
	print "P >> 1:", P.bin
	P = arith_shift_right(P, 1)
	print "P >> 1:", P.bin
	return P.int

	def arith_shift_right(x, amt):
	l = x.len
	x = BitArray(int = (x.int >> amt), length = l)
	return x

	# Sample usage: find 86 * 41
	b = booth(86, 41, 8, 8)
	print b

view raw booth.py hosted with ❤ by GitHub

	(function ($) {
	$.fn.toggleAttr = function (attr, val1, val2) {
	///<summary>Toggles an attribute between having one of two possible states</summary>
	///<param name="attr">Attribute name</param>
	///<param name="val1">First value</param>
	///<param name="val2">Second value</param>

	return this.each(function () {
	var $this = $(this);
	if ($this.attr(attr) === val1) {
	$this.attr(attr, val2);
	} else {
	$this.attr(attr, val1);
	}
	});
	};
	}
	)(jQuery);

P4P

How I Figured out the Y Combinator

Toggle arbitrary attributes in jQuery

Booth's multiplication algorithm in Python

Blog Archive

Followers

About Me

Friends' Blogs