Why I'm not super-enthusiastic about humane meat

The argument from marginal cases asks us to consider a child with a severe mental handicap. They are able to do simple physical tasks like feed themselves and move around, but unable to communicate beyond basic grunts.

What, the argument asks, is the difference between this child and, say, a cow? There are of course many superficial distinctions, but it's hard to find some difference which could plausibly be considered relevant. And of course if some distinction were found (maybe our hypothetical child has a stronger sense of self), we can simply change our theoretical mental handicap appropriately.

Peter Singer made this argument famous, and in On Being Silenced In Germany he talks about how it makes him uniquely vulnerable to misquotation. He'll have some line like "so we should consider severely mentally handicapped people roughly morally equivalent to pigs" and his opponents will gleefully describe how Prof. Singer wants to raise mentally handicapped children in factory farms and torture them to death so we can eat their sweet, sweet flesh.1

Of course even a quick glance at Singer's Wikipedia page will make it obvious that he means the reverse: we should treat pigs with much more respect than we currently do. But it's at least a philosophically interesting question: why shouldn't we be free to kill mentally handicapped people?

While reading Nicollette Niman's article claiming that "humane" meat is ethical and the associated responses the article generated, it seems like the argument from marginal cases - which acted like a bombshell in my life - has had little impact on most people.

Most of the article is unremarkable; it solidly hits the three N's of meat eating (it's natural, necessary, and normal), but I suspect that a more careful reworking of her essay could remove some of the more ridiculous claims while still keeping the core message: "it is ethically defensible [to eat meat] -- provided we refrain from causing gratuitous suffering."

Maybe there isn't anything wrong with killing animals for food. Given that we're killing tens of billions of them per year, I desperately hope there's nothing wrong with it. But then why would there be anything wrong with killing the mentally disabled?

  1. Apparently A Modest Proposal is not required reading in German schools.

A Simple Proof: Why you will never see an eight sided snowflake

Suppose you take a bunch of equally-sized marbles and squish them from all sides. You'll get one of two shapes:

These configurations are referred to as "rectangular" and "hexagonal":

Certain crystals have a molecular structure similar to these patterns, which is reflected in their macroscopic structure:

Halite, a rectangular crystal, and quartz, a hexagonal crystal.

A theorem known as the crystallography restriction states that hexagonal symmetry is the most that any crystal can hope to have - you will never see an eight-sided snowflake.

Formalizing a crystal

This theorem involves only basic trigonometry, but it does require us to most formally state what we mean by a "crystal". (If you know some basic linear algebra, most of this won't be a surprise to you.) The concept we're looking for is a discrete lattice, which has two relevant properties:

  1. Every two points are some finite distance apart (i.e. points aren't infinitely close together).
  2. The structure repeats along some line, meaning that you can slide the entire thing some finite distance along that line and get back what you started with. Another way of describing this is like an abstract version of addition. Basically, draw two lines which both start and end in the center of some circles. When you place the lines end to end, the last one will still be in the center of some circle. (So "adding" two lines together gets a valid third line.)

The blue line and the red line can be "added" together to create the valid green line.

Lastly, we need some way of describing what exactly we mean by having a "six-sided" crystal. Just like we can describe "sliding" repetition with addition of lines, we can describe rotational symmetry with rotations of lines.

The blue line can be rotated to create the valid green line.

Down to earth

As a warm up to our theorem that 6 is the most sides that a crystal can have, let's prove that no crystal has 5 sides.

Suppose we have something with five sides (i.e. a pentagon). If it were a lattice, we could add any two lines together to get a third line. So let's bring the yellow line down as the red one, and add it to the blue one. But this doesn't terminate on a point. So we can't add any two lines together, meaning it can't be a lattice. QED.

The general case is only somewhat more complex. Start with the yellow line, then rotate it by some angle θ to get the blue line. Subtract the yellow line from the blue line to get the green line. If θ is less than 360/6, the green line will be shorter than the yellow line.

We stated in our definition of "discrete lattice" above that every two points must be some finite distance D apart from each other. Phrased differently, every line must have a length of at least D. But we showed above that, if a rotation by θ is legal, we can get smaller and smaller lines - eventually we will get one smaller than D, causing a contradiction.

Therefore, any lattice must not have a rotation smaller than 60 degrees. QED.

So if anyone tries shows you an octagonal snowflake this holiday season, either they're lying, or they violated the laws of math to make an xmas miracle.


I didn't really prove that the green line would be smaller. It's not too long, but it's not terribly interesting, so I'll just leave a link to wikipedia.

Update: I was worried that this would make the post too long, but here's a proof that the green line will be shorter. Starting from O, draw some line out to A. We only care about if the line gets shorter, not its absolute length, so let's just assume that the line $\overline{OA}$ has length one. Rotate $\overline{OA}$ to find point $B$, and subtract $\overline{OB}-\overline{OA}$ to get $C$. Since $\overline{BC}$ and $\overline{OA}$ are parallel, the angles $\angle AOB$ and $\angle OBC$ are equal. By the law of cosines, $\overline{OC}$ has length $\overline{OB}^2 + \overline{BC}^2 - 2 \overline{OB}\cdot\overline{BC}\cdot\cos\theta$. Recalling that $\overline{OA}=\overline{OB}=\overline{BC}=1$, we get that $\overline{OC}=2-2\cos\theta$. If $\theta\lt 60^{\circ}$, then this value will be less than one, meaning that $\overline{OC}\lt\overline{OA}$. So $\theta\geq 60^{\circ}$. QED.

The ugly pictures were created by me. The good-looking pictures courtesy of Wikimedia.

You might also like

Using Modular Arithmetic to Improve Card Games

There was recently a discussion on patterns in modular arithmetic, in which the author showed some interesting pictures that result from plotting various functions. They introduced modular arithmetic by the standard appeal to time (an example wikipedia follows too) including the awesome fact that Japan uses neither a 12 nor a 24-hour clock - they will talk about things occurring at 28 o'clock (four AM tomorrow).

I get tired of hearing the same examples as common applications of modular arithmetic, so here's my contribution.

The basic point is that, for some applications, discreteness matters. If you want to divide up $5 among 3 people, and you are writing checks, you can just give each of them a fractional dollar amount. Problems only occur if you have 5 $1 bills and you can't make change.

A common example of this is card games - you frequently want to deal an equal number of cards to everyone and you can't exactly rip a card in half to make it work. Our standard deck has an abysmal 52 = 2*2*13 cards, which means that it's evenly divisible among 2, 4 or 13 players, resulting in our inability to play Hearts with 3 or 5 players.

What you really want is a number with a lot of prime factors, an idea roughly captured by the number's abundance. If our deck had 60 = 2*2*3*5 cards, we could play hearts with two, three, four, five or six players.

This idea occurred to me while playing Clue (apparently known as "Cluedo" outside the US). The game involves taking three cards out of the deck and then dividing the rest among the players. It supports 3-6 players, so finding the optimal number of cards involves finding a number that:

  1. Has a remainder of three when divided by 3
  2. Has a remainder of three when divided by 4
  3. Has a remainder of three when divided by 5
  4. And has a remainder of three when divided by 6

The solution to this problem can be found using the Chinese remainder theorem which, like all theorems discovered by people with foreign-sounding names, is titled after the discoverer's country (cf. Polish notation). It turns out that the first two solutions are 3 and 63 cards, which would make for either a really short or a really long game of Clue[do].

According to Wikipedia, the original design of the game called for 29 cards which leaves 26 after removing the three to hide in the envelope. The astute reader will see that 26 = 13 * 2, meaning the game could only work correctly if there were two or thirteen players!

Parker Brothers revamped the game to have 21 cards total, leaving 18 to divide among the players. This works out perfectly for three and six players, but leaves a couple extra cards when four or five people play. An improvement to be sure, but better options include 15 and 27 cards, both of which fail to divide in only one case.

Anyway, these are a couple examples of why we you might find the integers interesting. Of course there are many more (virtually every form of modern cryptography springs to mind) but hopefully these are some easy to understand instances.

Angle trisection for dummies

Continuing my line of posts explaining complex proofs by skipping the hard part, this post gives a proof that certain angles cannot be trisected. Terry Tau recently had a post on the same subject, which contained a very long proof using only basic geometry and a very short proof using Galois theory. The proof I use here is between the two he presents in both difficulty and length. It's a remarkable proof in that there's very little geometry per se, most of the proof would usually be classified as algebra.

As a crash course on what we mean to say an angle can be trisected: We start by assuming you have a line of length one, a ruler and a compass. A number $n$ is "constructible" if, starting with these things, you can construct a line of length $n$. It's simple to see that if $a$ and $b$ are constructible, then $a+b$ is as well (just stick the two lines together). This means we can repeatedly add 1 to itself to generate all the natural numbers.

It's also possible to construct $ab$ and $a/b$, although this piece is harder to show. To start things off on the trisection proof, let's place a limit on what numbers can be constructed:

Lemma 1: All constructible numbers are the solution to either a linear or quadratic equation with constructible coefficients

To construct some points we can either:

  1. Draw two lines and find their intersection
  2. Draw a line and a circle, and find their intersection
  3. Draw two circles, and find their intersection

The intersection will always be a solution to an equation of degree at most two. For example, if we have the lines $y=m_1 x + b_1$ and $y = m_2 x + b_2$ their intersection has x coordinate $m_1 x + b_1= m_2 x + b_2$ or $(m_1 - m_2) x + (b_1 - b_2)=0$, so $x$ is the solution to a linear equation with both coefficients constructible.

In cases 2 and 3, the x and y coordinates will be solutions to quadratics.

Lemma 2: The only constructible points are solutions to equations of degree $2^n$ for some integer $n$

For example, we could find the point which satisfies $x^2-2=0$ which is $x=\sqrt{2}$. Now that we can construct $\sqrt{2}$, so we can use it in our equation; for example we can find $x=\sqrt[4]{2}$ by solving $x^2-\sqrt{2}=0$.

We can repeat this process to find the eigth root, sixteenth root etc. but we can only find roots that are powers of two. More generally, we can only construct equations of degree two; their coefficients might themselves be solutions to quadratics so we have an equation of degree 4 over our original set, but we can never create an equation of degree 3 or 5 or 379.

Lemma 3: If an angle $\theta$ is constructible, then $\cos\theta$ is constructible.

$\cos\theta$ represents the adjacent side of a right triangle with hypotenuse 1 and adjacent angle $\theta$; since we can draw a line of length 1 we can use that as a hypotenuse, and if we could draw the angle $\theta$ we can construct the entire triangle.

Proposition: $\cos 20^\circ$ is not constructible

Once we've proven this, we'll have proven that $20^\circ$ is not constructible, and so an angle of $60^\circ$ cannot be trisected.

I'm sure you all remember the triple angle formula $\cos 3\theta = 4 \cos ^3\theta-3\cos\theta$, plugging in $\cos 60=1/2$ and rearranging we get that $4\cos ^3\theta - 3\cos\theta - 1/2 = 0$; to make it more obvious I'll substitute $x$ for $\cos\theta$: $4x^3 - 3x - 1/2 = 0$. So $x$ is the solution to this cubic equation, but Lemma 1 told us that constructible points are only solutions to equations of degree $2^n$. So $x$ is not constructible. QED.

Undecidability for dummies

It's been my experience that many important "complex" theorems are complicated in only one or two parts. Oftentimes, if you're willing to accept a lemma or two without proof, you can understand the entire thing. So this is a post explaining what I think is a fairly complicated topic, but there is a piece I don't try to prove. I hope the result is something that's understandable to anyone with a little programming experience.

One of the things which fascinates me most are "incompleteness" theorems. I understand how we can look at some individual solution and determine that it doesn't solve some problem, but the proof that no solutions exist seems a lot harder.

A foundational result in theoretical computer science is the undecidability of the halting problem. This states that, given some program P, there is no general way to tell if P will run forever or if it will eventually stop. For example, the following program:

for each integer x:
	if x + 1 = x, then halt
	else go on to the next integer

will run forever, because there's not integer which is the same as itself with one added to it. This program:

for each integer x:
	if x + 0 = x, then halt
	else go on to the next integer

will halt immediately.

So how can we prove that there is no general method to tell whether a program will halt? It's a proof by contradiction: suppose we have some method, let's say M, of telling if a program will halt. Consider the following program:

if M says *this program* will halt, then run forever
else if M says this program will run forever, then halt

For any method M, it will fail to correctly predict the behavior of our program. So for any method M, there must be at least one program whose behavior it fails to predict, which proves our theorem.

When I first learned this, I felt (and to a certain extent still feel) that it was kind of a cheap trick. In retrospect, I think this was due to the program's ability to refer to itself. This is the complexity that I left out of the post, but the basic idea isn't that hard: we can assign a natural number to every program (i.e. there's a "first" program, a second program, ...). Once we've done that, a program "referring to itself" can be accomplished via these numbers, e.g. program 12345 referring to program 12345.

One simple yet important extension to the halting problem is Rice's Theorem, which basically says that you can never prove anything about a program. Want to prove that your program won't crash? Tough. Someone asks you to prove your algorithm always gives the right answer? No can do.

If you're interested, you can see a list of undecidable problems, the most famous of which is Hilbert's tenth.

Divisible by 3

You probably know the rule of thumb to check if a number is divisible by three: add the digits and see if that number is divisible by three. E.g. 3627 is divisible by three since 3+6+2+7=18, which is divisible by three.

Here is a short proof of this, as well as some extensions. Note that $3627=3*10^3+6*10^2+2*10+7$; if we consider $p(x)=3x^3+6x^2+2x+7$, then $3627=p(10)$. Finding the sum of the digits is equivalent to finding $p(1)$. Our claim is that $p(10)$ is divisible by three if and only if $p(1)$ is divisble by three.

A more general theorem holds, in fact: if $a\equiv b \bmod m$ then $p(a)\equiv p(b) \bmod m$ for any polynomial $p$. A colloquial phrasing is "if a and b have the same remainder after dividing by m, then $p(a)$ and $p(b)$ have the same remainder after dividing by m." In our case, we're just interested in the case where the remainder is zero. The proof of this follows directly from the properties of modular arithmetic, so I won't bore you with it.

Using this theorem, we find that $10\equiv 1 \bmod 3$, so $p(10)\equiv p(1) \bmod 3$, just like we wanted. Note that $10\equiv 1\bmod 9$ as well, so the same "add up the digits" shortcut works with division by 9.

We can extend this to more complicated shortcuts. You might know the trick for divisibility by 11: alternately add and subtract the digits. For example, $1-2+1=0$ so $121$ is divisible by 11.

This is just another instance of our theorem. Since $10\equiv -1 \bmod 11$, $p(10)\equiv p(-1)\bmod 11$ and $p(-1)$ is just alternately adding and subtracting the digits.

Wikipedia has an extensive list of divisibility "shortcuts", some of which seem more complex than just doing long division, but hopefully this illuminates the reasoning behind some of them.

The Meta-Diet: Was Pollan Right?

Michael Pollan wrote a very popular book which made the claim that there are only three imperatives about diets that we can know with any certainty:

  1. Eat Food: That is, avoid processed foods.
  2. Not too much: limit calories.
  3. Mostly plants. This one's pretty self explanatory.

Not that I don't trust Pollan and all, but I would've like to see a survey of nutritionists or something which would indicate that people who have formal training in the subject agree with him. Fortunately for us, US News and World Report surveyed a bunch of nutritionists and had them rank various diets. From this ranking, I tried to divine how nutritionists feel about his three commandments. The results:

DietOverall RankNo processedLimit caloriesLimit meat
Weight Watchers2YYY
Mayo Clinic5YYY
Jenny Craig7YY
Slim Fast10YY
South Beach13YYY
Eco Atkins14YYY
Glycemic Index16
Raw food18YYY

For the most part, I just read the descriptions of each diet. The most controversial thing I did was including "collateral damage" - e.g. I said that the vegan diet is calorie-restricted since the dietitians said that vegans eat fewer calories. I think this is a better way to go about it, since we want to find out the nutritionist's opinions about these commandments, not about how some diet theoretically could be followed. If a diet had less than 1,500 calories per day for the average person, I considered it calorie-restricted.

In the ranks of "collateral damage" I also included a "low sodium" requirement as a restriction on processed foods. Four diets (Slim Fast, Nutrisystem, Zone, Medifast) limit saturated fat (i.e. their limit is significantly below the recommended daily max) so I considered these as having restrictions on meat, since a saturated fat restriction usually (but not always) results in eating less meat.

My notes and sources that weren't within the USN pages can be found here.


I'd say Pollan's rules hold up pretty damn well. The top 5 diets all follow all three of them, and the most anti-Pollan diets are ranked 16th, 19th and 20th.

It seems like the no-processed-food rule is the least followed, but I think that's just because the rankings include a number of companies whose sole mission is to sell processed foods.

I wish they had included a "control" or "do nothing" diet to tell whether any of these diets are actually harmful, or whether the bottom of the list is good but not great. Here's what they said about the worst diet:

Experts took issue with the Paleo diet on every measure. Regardless of what a dieter's goal is—weight loss, heart health, or finding a diet that's easy to follow—most experts concluded he or she is better off looking elsewhere.

Does "elsewhere" include "not going on a diet"? I don't know, but if it does this might be a mark against Pollan, since the paleo diet excludes processed foods. (An alternative explanation: since the paleo diet tends to be meat-heavy, maybe the "mostly plants" maxim is stronger than the "eat food" one.)


I scored the diets 0-3 on how many of the rules they followed and tested how well this score correlated with their ranks. Because there are a lot of ties, I used the Pearson product-moment metric. This yielded a correlation of -.52, significant at p=0.018. So it appears unlikely that the relationship between rank and extent to which they follow Pollan's criteria is random chance. However, the relationship isn't incredibly strong.

Kendall's tau and Spearman's rho tests were significant at p=.035 and .033 respectively, but keep in mind that ties are excluded, and there are lots of ties.


Go ahead: eat food, not too much, mostly plants.

A clever proof of Euler's Theorem

I've been learning abstract algebra for the last six months or so - and I have learned a lot - yet I get the distinct impression that if Euclid were to appear before me and demand to know why he should care about groups or rings or what have you I wouldn't have much to say. So I was happy to learn of a simple proof of Euler's theorem, which relies solely on Lagrange's theorem, a basic result of group theory.

Euler's theorem deals with $\phi$, the Euler totient function, where $\phi(n)$ is the number of integers less than $n$ that are coprime to $n$. For example, $\phi(6)=2$, since 1 and 5 are the only integers less than 6 that are coprime to 6. If $p$ is prime, $\phi(p)=p-1$ since every number less than $p$ is relatively prime to $p$.

Upon first hearing about this function, I thought it was too insanely complex to know anything about. How the hell should I know what $\phi(n)$ is? There's no easy formula. But it turns out that we can prove the following theorem quite easily:

Euler's theorem: Suppose $a,n$ are coprime. Then $a^{\phi(n)}\equiv 1 \pmod{n}$.

Note that, since $\phi(p)=p-1$, this is just a generalization of Fermat's little theorem. Here's a three-step proof:

  1. An integer $a$ is invertible modulo $n$ if and only if $gcd(a,n)=1$ ("invertible" means there's some $a^{-1}$ such that $aa^{-1}\equiv 1 \bmod n$). This is the Linear Congruence Theorem.
  2. Consider the group $G$ of all positive integers less than $n$ which are invertible modulo $n$ (formally, $G=\left(\mathbb{Z}/n\mathbb{Z}\right)^{\times}$). $\phi(n)$ simply counts the number of positive integers $a$ where $gcd(a,n)=1$, so by (1) there are $\phi(n)$ elements in $G$ (the order of $G$ is $\phi(n)$). You can satisfy yourself that $G$ follows the group axioms quickly.
  3. Lagrange's theorem says that there must be some $k$ such that $a^k=1$ where $k$ divides the order of the group, namely $\phi(n)$; let's say $km=\phi(n)$. Then $a^{\phi(n)}=a^{km}=\left(a^k\right)^m=1^m=1$ which proves our theorem!

Isn't that cool? I went from thinking that $\phi$ was an impossibly complex function to knowing a foundational result about it within 5 minutes, all because of what seemed like an esoteric result in group theory. 90% of this short post is explaining the terms, the proof itself is only a paragraph!

Make LocalLinks work for \\ links

LocalLinks is an add-on for Chrome which lets you open local files in Chrome. There is one minor problem: it doesn't work for links which start "\\". Fortunately, there's a simple fix: add the following lines to your locallinks_estension.js file (for me, this was in C:\Users\me\AppData\Local\Google\Chrome\User Data\Default\Extensions\jllpkdkcdjndhggodimiphkghodgcpida\1.1_0\content_script\locallinks_estension.js):

    live('mousedown', handleMousedownOnLink).
    live('mouseout', handleMouseoutOnLink).
    live('mouseup', handleMouseupOnLink).
    live('click', handleClickOnLink);

How I Figured out the Y Combinator

The Y Combinator is among the most famous functions in theoretical computer science, which is quite ironic given that it's anonymous (and only really useful because it's anonymous). There's even a massive venture capital firm that took its name. And - somewhat unusually for theoretical CS constructs - it's quite useful.

There are no shortage of tutorials describing how it works, but I couldn't find many which worked through an example. So here's my go at it.

Here's a definition of the combinator that you've probably seen before:

Let's roll it out to see what happens when we actually use it:

So basically, what the Y combinator does is apply a function with itself an infinite number of times. That's already remarkable, because it essentially allows for an infinite amount of recursion with only anonymous functions.

In order to really make it useful though, we need to create an analogy to convergent sequences like $1/2, 1/4, 1/8, \dots$.

A functional equivalent in haskell is something like:
f g x = 1/2 * (g x)
This just uses recursion to generate the next term in the sequence. We can try it out:
Prelude> f id 1
Prelude> f (f id) 1
Prelude> f (f (f id)) 1
The sequence f, f(f), f(f(f)), ... is analogous to the sequence $1/2, 1/4,\dots$ and so it converges to $f^\infty(x) = 0$. To generate an infinite application of f with itself, we can use a fixed point operator:
Y f = (\x -> f (x x)) (\x -> f (x x))
    = f $ (\x -> f (x x)) (\x -> f $ x x)
    = f $ f $ (\x -> f $ x x) (\x -> f $ x x)
    = f $ f $ f $ (\x -> f $ x x) (\x -> f $ x x)
    = f $ f $ f $ f $ (\x -> f $ x x) (\x -> f $ x x)
If we try to use this on our specific $f$ though, Haskell will hang, since, well, it takes an infinite amount of time to converge.

The only kind of convergent sequences that we can use in the "real world" are those which converge in a finite amount of time. No article about functional programming would be complete without a factorial implementation, so here's an example which is readily convergent:
fact g x = if x == 1 then 1 
    else x * (g (x-1)) 
Two examples which give us insight:
Prelude> fact undefined 1
Prelude> fact undefined 2
*** Exception: Prelude.undefined
The function $fact(undefined)$ looks like the following:
if x == 1 then 1
else x * (undefined (x-1))
Since Haskell is lazy, the else branch is never evaluated if x = 1, so we don't need to worry about whether the else is well-defined. It's only when we try to call $fact(2)$ that we're going to get an error.

We can consider the sequence $f(\perp), f\left(f(\perp)\right), f\left(f\left(f(\perp)\right)\right), \dots$. ($\perp$ being the formal way of saying "undefined".) In code, this looks something like:
if x == 1 then 1
else x * (
  if x-1 == 1 then 1
  else (x-1) * (
    if x-2 == 1 then 1
    else (x-2) * (
The $n$th term in this sequence has $n$ of these else clauses. As $n$ goes to infinity, the function approaches the real factorial function, as it has fewer and fewer undefined segments. So we can say that this sequence of functions "converges" to the factorial function.

And the key point is that we only need as many function applications as we need recursive calls, i.e. $fact(n)$ needs only $n$ applications of $fact$ as an input - the $n+1$th can all be undefined. So even though this sequence takes an infinite amount of time to converge, the approximation is "good enough" for finite $n$.

To come full circle: in order to make our approximation arbitrarily correct, we need an arbitrary number of function applications as an input. And this is of course what $fix$ does.

Here is another usage which might make more sense: The idea is that we have an infinite list of functions to apply, and when we need a new one we just pop one off. Which just gives us the arbitrary number of function applications that we need.

To summarize:
  • The Y combinator is just a higher-order function which applies functions to themselves
  • By applying functions to themselves, we get a sequence which converges to the "normal" recursive function.
  • So the Y combinator (as well as any fixed-point operator) can create an "arbitrarily good" approximation of a recursive function.

Toggle arbitrary attributes in jQuery

jQuery has a toggle function, which toggles whether an element is visible or not, as well as toggleClass, which toggles whether an element has a class. But I frequently want to toggle other attributes, particularly the title. This plugin lets you do that.

Booth's multiplication algorithm in Python

I had difficulty finding a readable implementation of Booth's algorithm; hopefully this will prove useful to others.

Theorems named after the wrong people

In the graph below, an edge from A to B indicates that a theorem named after B was actually proven by A.

You can see the graphviz source file here. Inspired by this question on Math Overflow.

There is actually a formal name for this phenomenon: Stigler's Law of Eponymy. Ironically enough, the law was proposed by Stigler.