Predictions of ACE's surveying results


Carl Shulman is polling people about their predictions for the results of the upcoming ACE study to encourage less biased interpretations. Here are mine.

Assuming the control group follows the data in, e.g., the Iowa Women's Health Study, they should eat 166g meat/day with sd 66g.1 (For the rest of this post, I'm going to assume everything is normally distributed, even though I realize that's not completely true.)

For mathematical ease, let's take our prior from the farm sanctuary study and say: 2% are now veg, and an additional 5% eat "a lot less" meat which I'll define as cutting in half. So the mean of this group is 159g (4.2% less) w/ sd 69g.
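As a rough check on the arithmetic above, here is a minimal sketch that treats the leafleted group as a three-part mixture. The sub-group standard deviations are my assumption (the text does not give them): the "half as much" group is taken to have half the control sd, and the veg group is taken to eat exactly 0g.

```python
from math import sqrt

# Control-group figures quoted in the text (g meat/day).
CONTROL_MEAN, CONTROL_SD = 166.0, 66.0

# (weight, mean, sd) for each sub-group of the leafleted population.
# ASSUMPTION: the halved group's sd also halves; the veg group is exactly 0.
groups = [
    (0.93, CONTROL_MEAN, CONTROL_SD),          # unchanged
    (0.05, CONTROL_MEAN / 2, CONTROL_SD / 2),  # "a lot less" = half
    (0.02, 0.0, 0.0),                          # now veg
]

def mixture_mean_sd(groups):
    """Mean and sd of a mixture, using E[X^2] - E[X]^2."""
    mean = sum(w * m for w, m, _ in groups)
    second_moment = sum(w * (s**2 + m**2) for w, m, s in groups)
    return mean, sqrt(second_moment - mean**2)

mean, sd = mixture_mean_sd(groups)
print(f"mean = {mean:.1f} g/day, sd = {sd:.1f} g/day")
```

Under these assumptions the mean lands within a gram of the 159g figure and the sd within a couple of grams of 69g; different assumptions about the sub-group spreads would shift the sd slightly.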

I don't know what tests they will do, but let's look at a t-test because that's easiest. The test statistic here is:
$$t=\frac{166-159}{\sqrt{\frac{66}{N_1}+\frac{69}{N_2}}}$$
Let's assume 5% of those surveyed were in the intervention group. Solving for $N$ in
$$1.96=\frac{7}{\sqrt{\frac{66}{.95N}+\frac{69}{.05N}}}$$
we find $N\approx 350$, meaning that I expect the null hypothesis to be rejected at the usual $\alpha=.05$ if they collected at least 350 survey responses.2 I'm leaning slightly towards it not being significant, but I'm not sure how much data they collected.
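For anyone who wants to redo this kind of calculation, here is a hedged sketch of solving for the total sample size. Note that it places variances (sd squared) under the square root, which is the textbook form of the two-sample statistic, so its output should not be expected to reproduce the $N$ quoted above. The function name and parameters are mine, purely for illustration.

```python
def required_total_n(diff, sd_control, sd_treated, frac_treated, z=1.96):
    """Total N at which the statistic for `diff` reaches the critical value `z`.

    Solves z = diff / sqrt(sd_c^2 / ((1 - f) N) + sd_t^2 / (f N)) for N,
    where f is the fraction of respondents in the intervention group.
    """
    per_unit_variance = (
        sd_control**2 / (1 - frac_treated) + sd_treated**2 / frac_treated
    )
    return per_unit_variance * (z / diff) ** 2

n = required_total_n(diff=166 - 159, sd_control=66, sd_treated=69, frac_treated=0.05)
print(round(n))
```

The answer is very sensitive to `frac_treated`: with only 5% of respondents in the intervention group, almost all of the standard error comes from that small group.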

Here's my estimate of their estimate (I can't do this analytically, so this is based on simulations):
You can see that the expected outcome is the true difference of about 4 veg equivalents per 100 leaflets, but with such a small sample size there is a 25% chance that we'll find leafleted people were less likely to go veg.
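A simulation along these lines can be sketched as follows. This is not the code behind the figure: the total sample size of 350, the 5% intervention share, the normality of both groups, and the conversion to "veg equivalents per 100 leaflets" (percentage reduction relative to the 166g control mean) are all assumptions of mine, so the numbers it prints need not match the ones quoted above.

```python
import random

def simulate_estimates(n_total=350, frac_treated=0.05, trials=2000, seed=0):
    """Simulated estimates of veg equivalents per 100 leaflets.

    ASSUMPTION: control ~ N(166, 66), leafleted ~ N(159, 69), in g meat/day.
    """
    rng = random.Random(seed)
    n_treated = max(1, round(n_total * frac_treated))
    n_control = n_total - n_treated
    estimates = []
    for _ in range(trials):
        control = sum(rng.gauss(166, 66) for _ in range(n_control)) / n_control
        treated = sum(rng.gauss(159, 69) for _ in range(n_treated)) / n_treated
        # Reduction as a percentage of the control mean.
        estimates.append(100 * (control - treated) / 166)
    return estimates

estimates = simulate_estimates()
average = sum(estimates) / len(estimates)
wrong_sign = sum(e < 0 for e in estimates) / len(estimates)
print(f"average estimate: {average:.1f} veg equivalents per 100 leaflets")
print(f"share of runs with the wrong sign: {wrong_sign:.0%}")
```

Raising `n_total` or `frac_treated` shrinks the share of wrong-sign runs, which is the sense in which the result depends on how much data was collected.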

Here's how a 50% confidence interval might shake out:

The left graph is the bottom of the CI, the right one is the top.

Putting Money where my Mouth Is

The point of this is so that I don't retro-justify my belief, which is that meta-research in animal-related fields is the most effective thing. I have a lot of model uncertainty, but I would broadly endorse the conclusions of the above. The following represent ~2.5% probability events (each), which I will take as evidence I'm wrong.
  • If a 50% CI is exclusively above 9 veg equivalents per 100 leaflets, then I think its ability to attract people to veganism outweighs the knowledge we'd gain from more studies. Therefore, I pledge $1,000 to VO or THL (or whatever top-ranked leafleting charity exists at the time).
  • If a 50% CI is exclusively below zero, then veg interventions in general are less useful than I thought. Therefore I pledge $1,000 to MIRI (or another x-risk charity, if e.g. GiveWell Labs has a recommendation by then).
I don't think my above model is completely correct, and I'm sure ACE will have a different parameterization, so I don't know that these are really the 5% tails, but I would consider either of them to be a surprising enough event that my current beliefs are probably wrong.

I am open to friendly charity bets (if result is worse than X I give money to your charity, else you give to mine), if anyone else is interested.

Footnotes
  1. I tried to use MLE to combine multiple analyses, but found that the standard deviation is > 10,000 g/day. It's a good thing ACE has professional statisticians on the job, because the data clearly is kind of complex.
  2. I used $d.f.=\infty$

An Improvement to "The Impossibility of a Satisfactory Population Ethics"


Gustaf Arrhenius has published a series of impossibility theorems involving ethics. His most recent is The Impossibility of a Satisfactory Population Ethics, which shows that several intuitive premises yield a stronger version of the repugnant conclusion.

If you know me, you know that I believe that modern ("abstract") algebra can help resolve problems in ethics. This is one example: using some basic algebra, we can get a stronger result than Arrhenius while using weaker axioms.

This is a "standing on the shoulders of giants" type of result: mathematicians have had centuries to trim their axioms to the minimal required set, so once you're able to phrase your question in more standard notation you can quickly arrive at better conclusions. Similarly, the errors in Arrhenius' proof that I've noted in the footnotes are mostly errors of omission that many extremely smart people made, until others pointed out pathological cases where their assumptions were invalid.

Assumptions


We assume that it's possible to have lives that are worth living ("positive" welfare), lives not worth living ("negative" welfare) and ones on the margin ("neutral" welfare). Arrhenius doesn't specify what the relationship is between "positive" and "negative" welfare, but I think there's a very intuitive answer: they cancel each other out. Just as $(+1) + (-1) = 0$, a world with a person of $+1$ utility and one with $-1$ utility is equivalent to a world with people at the neutral level.1

We continue the analogy with addition by writing $Z=X+Y$ if $Z$ is the union of two populations $X$ and $Y$. Just as with normal addition, we assume that $X+Y$ is always defined2 and that we can move parentheses around however we want, i.e. $(X+Y)+Z=X+(Y+Z)$. Lastly, I'm going to assume that the order in which you add people doesn't matter, i.e. $X+Y=Y+X$.3 I will finish the analogy with addition by specifying that welfare is isomorphic to the integers.4

(The above is just a long-winded way of saying that population ethics is isomorphic to the free abelian group on $\mathbb Z$.)

Also, for simplicity, I will write $nX$ for $\underbrace{X+\dots+X}_{n\text{ times}}$.5

Lastly, we need to define our ordering. I'll use the notation that $X\leq Y$ means "Population $X$ is at most as good as population $Y$" and require that $\leq$ is a quasi-order, i.e. $X\leq X$, and $X\leq Y, Y\leq Z$ implies that $X\leq Z$. Notably, this does not require us to believe that populations are totally ordered, i.e. there may be cases where we aren't sure which population is better.

The major controversial assumption we need from Arrhenius is what he calls "non-elitism": for any $X,Y$ with $X-1>Y$ there is an $n>0$ such that for any population $D$ consisting of people with welfare levels between $X$ and $Y$: $(n+1)(X-1)+D\geq X+nY+D$. In less formal terms, this is basically saying that there are no "infinitely good" welfare levels.

Claim


We claim that any group following the above axioms results in:
The Very Repugnant Conclusion: For any perfectly equal population with very high positive welfare, and for any number of lives with very negative welfare, there is a population consisting of the lives with negative welfare and lives with very low positive welfare which is better than the high welfare population, all things being equal.

Unused Assumptions


The following are assumptions Arrhenius makes which are unused. (Note: these are verbatim quotes from his paper, unlike the other assumptions.)

(Exercise for the advanced reader: figure out which of these also follow from the assumptions we did use.)
  1. The Egalitarian Dominance Condition: If population A is a perfectly
    equal population of the same size as population B, and every person in
    A has higher welfare than every person in B, then A is better than B,
    other things being equal.
  2. The General Non-Extreme Priority Condition: There is a number n
    of lives such that for any population X, and any welfare level A, a
    population consisting of the X-lives, n lives with very high welfare, and
    one life with welfare A, is at least as good as a population consisting
    of the X-lives, n lives with very low positive welfare, and one life with
    welfare slightly above A, other things being equal.
  3. The Weak Non-Sadism Condition: There is a negative welfare level and
    a number of lives at this level such that an addition of any number of
    people with positive welfare is at least as good as an addition of the
    lives with negative welfare, other things being equal.

Proof

Lemma

First we prove a lemma: what Arrhenius calls "Condition $\beta$" and what mathematicians would refer to as a proof that our group is Archimedean. This means that for any $X,Y>0$ there is an $n$ such that $nX\geq Y$.

Basically, we just observe that the "non-elitism" condition gives a simple induction. Starting from the premise that $(n+1)(X-1)+D\geq X+nY+D$, let $Y=0$ and $D=0$, giving us that $(n+1)(X-1)\geq X$, i.e. $X$ is Archimedean with respect to $X-1$. Continuing the induction we find that $X$ is Archimedean with respect to $X-k$, completing the proof.6,7
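One way to spell out a single step of that induction, as a sketch: it assumes the ordering is compatible with addition ($A\geq B$ implies $A+C\geq B+C$), and that $X-2$ is still a level to which non-elitism applies. Here $m$ is just my name for the multiplier that non-elitism gives at level $X-1$, i.e. $(m+1)(X-2)\geq X-1$.

$$(n+1)(m+1)(X-2)\geq (n+1)(X-1)\geq X$$

The first inequality adds $n+1$ copies of $(m+1)(X-2)\geq X-1$ together, and the second is the base case. So $X$ is Archimedean with respect to $X-2$ with multiplier $(n+1)(m+1)$, and repeating the argument $k$ times reaches $X-k$.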

Theorem

First, let me give a formal definition of the "Very Repugnant Conclusion": For any high level of welfare $H$, low positive level of welfare $L$ and negative level of welfare $-N$ and population sizes $c_{H},c_{N}$ there is some $c_{L}$ such that $c_{L}\cdot L+c_{N}\cdot(-N)\geq c_{H}H$.

To prove our claim: we know there is some $k_{1}$ such that
$$k_{1}\cdot L\geq c_{H}\cdot H\tag{1}$$
because of our lemma. Because it's a group, we know that $(N+(-N))+L=L$ and moreover $(c_{N}N+c_{N}\cdot(-N))+L=L$. Substituting this into (1) yields
$$k_{1}\left[\left(c_{N}N+c_{N}\cdot(-N)\right)+L\right]\geq c_{H}H\tag{2}$$
Expanding the left hand side of (2) we get
$$k_{1}c_{N}N+k_{1}c_{N}\cdot(-N)+k_{1}L\tag{3}$$
By our lemma there is some $k_{2}$ such that $k_{2}L+D\geq k_{1}c_{N}N+D$; letting $D=k_{1}c_{N}(-N)+k_{1}L$ and using transitivity we get that
$$k_{2}L+k_{1}c_{N}(-N)+k_{1}L\geq c_{H}H$$
Rewriting terms leaves us with
$$\left(k_{1}+k_{2}\right)L+k_{1}c_{N}(-N)\geq c_{H}H$$
or, writing $c_{L}=k_{1}+k_{2}$ and $c_{N'}=k_{1}c_{N}$,
$$c_L L+c_{N'}(-N)\geq c_{H}H$$
$\blacksquare$
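To make the construction concrete, here is a small sketch that follows the proof's two steps numerically. It assumes populations are compared by total welfare, which is one ordering that satisfies the axioms; the theorem itself does not depend on that model, and the function names are mine.

```python
def ceil_div(a, b):
    """Ceiling of a / b for positive integers, without floating point."""
    return -(-a // b)

def very_repugnant_witness(H, L, N, c_H, c_N):
    """Return (c_L, c_N_prime) built the way the proof builds them.

    ASSUMPTION: "at least as good as" means "has at least as much total
    welfare", so the lemma's multipliers can be found by division.
    """
    k1 = ceil_div(c_H * H, L)        # k1 * L >= c_H * H
    k2 = ceil_div(k1 * c_N * N, L)   # k2 * L >= k1 * c_N * N
    return k1 + k2, k1 * c_N

# Ten lives at welfare 100, versus lives at +1 plus lives at -50.
H, L, N, c_H, c_N = 100, 1, 50, 10, 3
c_L, c_N_prime = very_repugnant_witness(H, L, N, c_H, c_N)
total_mixed = c_L * L + c_N_prime * (-N)
total_high = c_H * H
print(c_L, c_N_prime, total_mixed >= total_high)
```

Note that the number of negative-welfare lives in the witness is $k_1c_N$, which is at least the $c_N$ we started with.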

Comments


I don't know that this shorter proof is much more convincing than Arrhenius' - my guess is that the people who disagree with an assumption are those who take a "person-affecting" view or otherwise object to the entire premise of the theorem. I would say, though, that:
  1. None of the math I've used is beyond the average high-school student. It's just making the "algebra can be about things other than numbers" leap which is hard.
  2. While abstract algebraic notation can be intimidating, it's worth realizing that using it makes you more concise. (To the extent that a 26-page paper can be rewritten into a two-page blog post.)
  3. Because we can be more concise and use standard terminology, it shines a light on what is really the controversial assumption: Non-Elitism.
  4. Similarly, because we use standard concepts it's easier to see missing assumptions (e.g. I didn't realize that Arrhenius was missing a closure axiom until I tried to cast it in group theory terms).
Lastly, because I can't finish any post without mentioning lattice theory, I'll add that some of the errors in Arrhenius' paper occurred because lattices are such a natural structure that he assumed they exist even where they weren't shown to. Of course, if you involve lattices more you end up with total utilitarianism, giving more insight into why Arrhenius' result holds.

Acknowledgements


I would like to thank Prof. Arrhenius for the idea, and Nick Beckstead for talking about it with me.

Footnotes

  1. Formally, for each $X$ there is some $-X$ such that for all $Y$, $X+(-X)+Y=Y$.
  2. This isn't an explicit assumption in Arrhenius, but it's implicitly assumed just about everywhere.
  3. This arguably is controversial, so I'll point out that commutativity isn't really required, but since it keeps the proof a lot shorter and most people will accept it, I'll keep the assumption.
  4. Arrhenius "proves" that welfare is order-isomorphic to $\mathbb Z$ incorrectly, so I'll just assume it instead of attempting to derive it from others. If you prefer, you can take his "Discreteness" axiom, add in assumptions that welfare is totally ordered and has no least or greatest element and you'll get the same thing.
  5. Which is just to say that since it's an abelian group it's also a $\mathbb Z$-module.
  6. Nick Beckstead thought that some people might not like using the neutral level like this, so I'll point out that you can use an alternative proof at the expense of an additional axiom. If you assume non-sadism, then you can find that $X+nY\geq X$ and therefore transitively $(n+1)(X-1)\geq X$.
  7. This is somewhat misleading: we've only shown that the group is archimedean for totally equitable populations. That's all we need though.

How Conscious is my Relationship?

One of the most interesting theories of consciousness is Integrated Information Theory (IIT), proposed by Giulio Tononi. One of its more radical claims is that consciousness is a spectrum, and that virtually everything in the universe from the smallest atom to the largest galaxy has at least some amount of consciousness.

Whatever criticisms one can make of IIT, the fact that it allows you to sit down and calculate how conscious a system is represents a fundamental advance in psychology. Since people say that good communication is the most important part of a relationship, and since any information-bearing system's consciousness can be calculated with IIT, I thought it would be fun to calculate how conscious Gina's and my relationship is.

A Crash Course on Information

Entropy
The fundamental measure of information is surprise. The news could be filled with stories about how gravity remains constant, the sun rose from the east instead of the west and the moon continues to orbit the earth, but there is essentially zero surprise in these stories, and hence no information. If the moon were to escape earth's orbit we would all be shocked, and hence get a lot of information from this.

Written words have information too. If I forget to type the last letter of this phras, you can probably still guess it, meaning that trailing 'e' carries little surprise/information. Claude Shannon, founder of information theory, did precisely this experiment, covering up parts of words and seeing how well one could guess the remainder. (English has around 1 bit of information per letter, for the record.)

Whatever you're dealing with, the important part to remember is that "surprise" is when a low-probability event occurs, and that "information" is proportional to "surprise". Systems which can be predicted very well in advance, such as whether the sun rises from the east or the west, have very low surprise on average. Those which cannot be predicted, such as the toss of a coin, have much more surprising outcomes. (Maximally surprising probability distributions are those where every event is equally likely.) The measure of how surprising a system is (and hence how much information the system has) was named Entropy by Shannon based on von Neumann's advice that "no one knows what entropy really is, so in a debate you will always have the advantage".
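As a minimal sketch of the idea, Shannon entropy can be computed directly from a list of probabilities. The sunrise odds below are made up for illustration.

```python
from math import log2

def entropy(probs):
    """Shannon entropy in bits; zero-probability events contribute nothing."""
    return sum(-p * log2(p) for p in probs if p > 0)

fair_coin = [0.5, 0.5]
sunrise = [0.999999, 0.000001]  # HYPOTHETICAL odds of east vs. west

print(entropy(fair_coin))  # one full bit of surprise per toss
print(entropy(sunrise))    # very close to zero
```

A uniform distribution over four outcomes gives `entropy([0.25] * 4) == 2.0`, i.e. the same as two fair coin tosses.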

Divergence
Someone who knows modern English will have a bit more surprise than usual upon reading Shakespeare - words starting with "th" will end in "ou" more often than one would expect, but overall it's not too bad. Chaucer's Canterbury Tales one can struggle through with difficulty, and Caedmon (the oldest known English poem) is so unfamiliar the letters are essentially unpredictable:
nu scylun hergan hefaenricaes uard
metudæs maecti end his modgidanc
uerc uuldurfadur swe he uundra gihwaes
eci dryctin or astelidæ
- first four lines of Caedmon. Yes, this is considered "English".
If we approximate the frequency of letters in Shakespeare based on our knowledge of modern English we won't get it too wrong (i.e. we won't frequently be surprised). But our approximation of Caedmon from modern English is horrific - we're surprised that 'u' is followed by 'u' in "uundra" and that 'd' is followed by 'æ' in "astelidæ".

Since you can make a good estimate of letter frequencies in Shakespeare based on modern English, Shakespearean English and modern English have a low divergence. The fact that we're so frequently surprised when reading Caedmon means that the probability distribution there is highly divergent from modern English.
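The usual way to put a number on this is the Kullback-Leibler divergence, sketched below. The three-symbol "alphabets" are invented for illustration and are not real letter frequencies; the function also assumes the approximating distribution never assigns zero probability to something that actually occurs.

```python
from math import log2

def kl_divergence(p, q):
    """D(p || q) in bits: extra surprise from predicting p using q.

    ASSUMPTION: q[i] > 0 wherever p[i] > 0.
    """
    return sum(pi * log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# HYPOTHETICAL symbol frequencies, not measured from any text.
modern = [0.5, 0.3, 0.2]
shakespeare = [0.45, 0.35, 0.2]
caedmon = [0.1, 0.2, 0.7]

print(kl_divergence(shakespeare, modern))  # small
print(kl_divergence(caedmon, modern))      # much larger
```

The divergence of a distribution from itself is zero, and it is not symmetric: `kl_divergence(p, q)` generally differs from `kl_divergence(q, p)`.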

Consciousness

Believe it or not, Entropy and Divergence are the tools we need to calculate a system's consciousness. Roughly, we want to approximate a system's behavior by assuming that its constituent parts behave independently. The worse that approximation is, the more "integrated" we say the system is. Knowing that, we can derive its Phi, the measure of its consciousness.

Our Relationship as a Conscious Being

Here is a completely unscientific measure of my and Gina's behavior over the last day or so:

The (i,j) entry is the fraction of time that I was doing activity i and Gina was doing activity j. (The marginal distributions are written, appropriately enough, in the margins.)

You can see that my entropy is 1.49 bits, while Gina (being the unpredictable radical she is) has 1.69 bits. This means that our lives are slightly less surprising than the result of two coin tosses (I can hear the tabloids knocking already).

However, our behavior is highly integrated: like many couples in which one person is loud and the other is a light sleeper, we're awake at the same time, and our shared hatred of driving means we only travel to see friends as a pair. Here's how it would look if we didn't coordinate our actions (i.e. assuming independence):

The divergence between these two distributions is our relationship's consciousness (Phi). Some not-terribly-interesting computations show that Phi = 1.49 bits.
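For the curious, the computation can be sketched as the divergence between a joint table and the product of its marginals. This is the simplified two-part version used in this post, not the full IIT definition, and the tables below are hypothetical stand-ins rather than the data above.

```python
from math import log2

def phi_two_parts(joint):
    """Divergence (bits) between a joint table and the product of its marginals."""
    row_marginals = [sum(row) for row in joint]
    col_marginals = [sum(col) for col in zip(*joint)]
    return sum(
        p * log2(p / (row_marginals[i] * col_marginals[j]))
        for i, row in enumerate(joint)
        for j, p in enumerate(row)
        if p > 0
    )

# HYPOTHETICAL tables (rows: my activity, columns: Gina's activity).
lockstep = [[0.5, 0.0], [0.0, 0.5]]          # always doing the same thing
independent = [[0.25, 0.25], [0.25, 0.25]]   # no coordination at all

print(phi_two_parts(lockstep), phi_two_parts(independent))
```

A couple in perfect lockstep over two equally likely activities scores one bit, while a couple whose activities are independent scores zero.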

The Pauli exclusion principle tells us that electrons in the innermost shell have 1 bit of consciousness (i.e. Phi = 1), meaning that our relationship is about as sentient as the average helium atom. So if we do decide to break up, the murder of our relationship won't be much of a crime.

Side Notes

Obviously this is a little tongue-in-cheek, but one important thing you might wonder is why my decision to consider our relationship to have two components (me and Gina) is the correct one. Wouldn't it be better to assume that there are 200 billion elements (one for each neuron in our brains) or even $10^{28}$ (one for each atom in our bodies)?

The answer is that yes, that would be better (apart from the obvious computational difficulties). IIT says that consciousness occurs at the level of the system with the highest value of Phi, so if we performed the computation correctly, we would of course find that it's Gina and myself who are conscious, not our relationship, since we have higher values of Phi.

(The commitment-phobic will notice a downside to this principle: if your relationship becomes so complex and integrated that its value of Phi exceeds your own, you and your partner would lose individual consciousness and become one joint entity!)

I should also note that I've discussed IIT's description of the quantity of consciousness, but not its definition of quality of consciousness.

Conclusion

Our beliefs about consciousness are so contradictory it's impossible for any rigorous theory to support them all, and IIT does not disappoint on the "surprising conclusions" front. But some of its predictions have been confirmed by evidence (the areas of the brain with highest values of Phi are more linked to phenomenal consciousness, for example) and the fact that it can even make empirical predictions makes it an important step forward. I'll close with Tononi's description of how IIT changes our perspective on physics:
We are by now used to considering the universe as a vast empty space that contains enormous conglomerations of mass, charge, and energy—giant bright entities (where brightness reflects energy or mass) from planets to stars to galaxies. In this view (that is, in terms of mass, charge, or energy), each of us constitutes an extremely small, dim portion of what exists—indeed, hardly more than a speck of dust.

However, if consciousness (i.e., integrated information) exists as a fundamental property, an equally valid view of the universe is this: a vast empty space that contains mostly nothing, and occasionally just specks of integrated information (Φ)—mere dust, indeed—even there where the mass-charge–energy perspective reveals huge conglomerates. On the other hand, one small corner of the known universe contains a remarkable concentration of extremely bright entities (where brightness reflects high Φ), orders of magnitude brighter than anything around them. Each bright “Φ-star” is the main complex of an individual human being (and most likely, of individual animals). I argue that such Φ-centric view is at least as valid as that of a universe dominated by mass, charge, and energy. In fact, it may be more valid, since to be highly conscious (to have high Φ) implies that there is something it is like to be you, whereas if you just have high mass, charge, or energy, there may be little or nothing it is like to be you. From this standpoint, it would seem that entities with high Φ exist in a stronger sense than entities of high mass.

Acknowledgements

The idea for this post came from Brian's essay on Suffering Subroutines, and the basis for my description of IIT came from Tononi's Consciousness as Integrated Information: a Provisional Manifesto. Gina read an earlier draft of this post.