A 60-Second Exercise That Boosts Goal Achievement By 20%

The hero of our tale, Jason Padgett.

(Content note: this is an example of what I send out to email subscribers. You can sign up to receive more like it on any of the many forms scattered throughout the website, like the one at the bottom of this post.)

In 2002, Jason Padgett got into a fight. It was the fight of the decade, maybe the century. Not because Jason trounced his two assailants (he didn’t), and not because it was a fair fight — it wasn’t — but because of what happened the next morning.

But, wait, rewind a little. Let me tell you about Jason before everything changed.

Jason Padgett: Jock, Underachiever… Time Traveler?

The year was 2002 but, looking at Jason, you wouldn’t know it.

It was as if he’d been beamed straight from the 80s. A grungy time-traveler left stranded in the future, perhaps a consequence of an evil genius’s twisted revenge plot gone awry.

His blonde hair was cut into a mullet.

Attire: t-shirt with ragged, cut-off sleeves — as if he’d gnawed them off himself, like your dog might when left alone, bored. And the finishing touch, transforming him from trucker-stop chic into a form of trailer-park fashion so common you’d mistake it for an official uniform: he tucked his browning white tee into tight, faded jeans.

Plus a leather jacket. The leather jacket.

Just as The Lord of The Rings hinged on the whims of The One Ring, Jason’s story hinges on The One Leather Jacket.

At 31, with a daughter, he looked almost like an awkward teenager, except — barring Mike Tyson and steroids — I’d never seen a teen so well-muscled.

His hobbies included drinking beer — the existence of which, he liked to say, implied that there must be a God — skydiving, cliff-jumping, and thrill seeking generally.

He’d bounced around college for a while, but books were not his scene. In his own words, “I cheated on everything, and I never cracked a book.”

At least, that was Jason before the attack.

The Attack: When A Bar Fight Is A Blessing

The attack happened on Friday the 13th — a superstitious day, to be sure. If Jason had stayed in, he wouldn’t have ended up in the hospital.

My grandmother likes to say that the one week when she doesn’t play the lotto will be the one week that her numbers are called.

For Jason, if he’d stayed in and avoided the hospital, he would have missed out on the equivalent of a winning lottery ticket.

It happened at a karaoke bar near his home.

Two men attacked him from behind, punching him in the back of the head. The blows knocked him to the ground.

They then kicked him until he handed over his prized leather jacket. Worth maybe, if we’re being generous, 40 bucks on eBay.

An exchange more than worth it for Jason.

He ended up in the hospital, with a concussion and bruised kidney, but they released him that same night.

When he awoke the next morning, everything was different.

Jock Today, Savant Tomorrow


An example of Padgett’s fractal art.

Today, Jason is one of 40 known cases of acquired savant syndrome. He sees mathematics. He can draw complicated geometric fractals by hand.

When the sun glints, he sees the arc.

Before, he worked at a furniture store.

Now, he’s an aspiring number theorist and an artist.

He draws what he can see and then sells it. He’s even written a book about the experience, Struck By Genius, with an upcoming adaptation for the silver screen.

All because someone punched him in the back of the head.


That’s what I want to be. The convincing fist that transforms you into a number theorist.

Except, no, maybe that’s not right.

…I know.

I want to be the friendly surgeon that communicates with you via email. I teach you how to remove a spleen, and then you, kitchen knife in hand, do it yourself.

Yeah. That’s who I want to be. Email-spleen-remover guy.

The Toughest Part of Behavior Change: Remembering to Change

For Jason, radical behavior change was the result of someone striking him in the back of the head.

For you and me, that sort of change is decidedly more painful than a concussion, as anyone who’s attempted to lose weight can tell you.

Let me know if this scenario sounds familiar.

You want to change something about yourself.

Maybe you want to be friendlier.

Let’s say you’ve read about operant conditioning and positive reinforcement and you think, hey, this just makes sense — I should treat the people around me better.

So this becomes a goal: treat your colleagues better.

And, to do this, your plan is not more cowbell, but more compliments. Criticism sucks. No one likes receiving it.

Solution: more positive feedback.

So you set this goal.

And then you forget about it.

You go to work, critique people like usual, come home, and then realize: I was going to make a change.

But I didn’t even think about it when the opportunity was present.

I just kept acting out of habit, on autopilot, going through the same motions. Like Sisyphus, doomed to repeat my sentence for eternity.

All intended behavior change suffers from this flaw: forgetting to execute the new behavior when it’s applicable.

Maybe you want to start taking the stairs more, but every night you’re so tired when you check into your apartment that you opt for the elevator.

Or you want to wake up earlier, but every morning you silence your alarm.

What can be done? Is it hopeless?


If-Then Rules Are A Real Life Cheat Code

…what if I told you that life has cheat codes?

That there are certain techniques you can use to make it more likely that you’ll achieve anything you want? Fully-general goal techniques that will increase your probability of success?

Sounds pretty good, right?

These exist.

They’re buried in textbooks, in scientific papers, across a dozen disciplines. Psychology, cognitive science, operations research, game theory, economics, and more.

Today’s email is about one of those cheat codes.

A way to solidify and increase the odds of permanent behavior change.

A tool to move you from who you are now, to who you want to be.

Today’s email is about if-then rules.

If-Then Rules Prevent Breast Cancer

Comic by Vicki Jacoby.

Let me tell you a story. About boobs.

Orbell, Hodgkins, & Sheeran, 1997 rounded up a bunch of women, who all shared the same goal.

They wanted to perform a breast self-examination, or BSE, sometime during the next month. You know what I’m talking about: where women feel for lumps in order to detect breast cancer.

The authors of the study split participants into two groups.

The first group recited an “implementation intention”, which is just newly invented jargon for “if-then rule.” These are of the form, “If [situation], then [behavior].”

For instance, a participant in the study might form an intention like, “If I’ve just finished washing my hair in the shower, I will perform a breast self-exam.”

Or maybe, “If it’s the first Wednesday of the month, I will perform a breast self-exam while changing into comfortable clothes after work.”

The second group didn’t create any if-then rules — they just had the goal of performing a breast self-exam.

The result?

100% of the if-then group successfully performed a breast self-exam, while only 53% of the second group did so.

With one simple if-then rule, recited in probably less than 60 seconds, participants nearly doubled their rate of goal success.

If-Then Rules Are Very Effective, Even Across Different Circumstances

The effectiveness of if-then rules for behavior change has since been confirmed many times, in many circumstances. They’ve been used to:

  • Increase the likelihood of implementing a vigorous exercise program (29% -> 91%). In contrast, an entire motivational intervention that focused on the danger of coronary heart disease raised compliance by merely 10 percentage points, from 29% to 39%.
  • Hasten activity resumption after joint replacement.
  • In one study, forming if-then rules for eating healthy foods reliably increased the rate at which people did so.
  • In another instance, drug addicts undergoing withdrawal were given the task of creating a brief resume before 5pm that evening. Of those who didn’t form implementation intentions, none were successful. Of those who did, 80% were successful.
  • This effect has even been observed in those with damage to the prefrontal cortex — the front part of the brain, sometimes called the seat of reason. Forming the implementation intention to work quickly when given a certain stimulus — in this case, the number 3 while completing a computer task — increased the speed at which participants did so.
  • Here’s my favorite example: implementation intentions can make you less sexist. In one study, participants formed the if-then rule, “If I see a woman, I will ignore her gender!” The results? No automatic activation of stereotypical beliefs.
  • This has since been replicated both for the old (“Whenever I see an old person, I tell myself: Don’t stereotype!”) and the poor (“Whenever I see a homeless person, I ignore that he is homeless.”)

At least 94 similar studies have been conducted, and since integrated into a meta-analysis (n=8461). The analysis found that implementing this extremely simple technique had an effect size of d=.65.

What does that mean?

Let’s say that, when it comes to achieving goals, you have exactly average performance — 50% of people do worse than you, and 50% do better. (This is just an example. Given that you’ve read this far, you’re almost certainly above average.)

Given an effect size of .65 for implementation intentions, this would mean that — by implementing relevant if-then rules — you’d improve your goal-achieving-ability by .65 standard deviations.

Which is enough to outperform at least 20% more people. Just by adding these if-then rules, an average goal achiever would end up outperforming more than 70% of the population.
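If you want to check that arithmetic yourself, here is a minimal sketch. It assumes goal-achieving ability follows a normal distribution (my assumption for illustration, not something the meta-analysis states), and the function names are made up for this example.

```python
import math

def normal_cdf(z: float) -> float:
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def percentile_after_shift(start_z: float, effect_size_d: float) -> float:
    """Fraction of the population outperformed after moving up by d standard deviations.

    Assumes ability is normally distributed (an illustrative assumption).
    """
    return normal_cdf(start_z + effect_size_d)

before = normal_cdf(0.0)                    # exactly average: 0.5
after = percentile_after_shift(0.0, 0.65)   # d = .65, as reported in the text
print(f"before: {before:.0%}, after: {after:.0%}")
```

Under that normality assumption the result lands at roughly 74%, so the 70% figure is, if anything, a conservative rounding.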

Oh, and here’s a neat tip: if-then rules can themselves be supercharged. Steller (1992) enhanced goal achievement by having participants form an implementation intention, and then adding “I strongly intend to follow the specified plan!”

You should use if-then rules – Here’s how

I’m excited about this technique.

It costs nothing to implement, and it will very probably have a substantial impact on your life — if you bother trying it out.

Here’s how: come up with some if-then rules, either write them down or say them aloud, and voilà, suddenly you’re more likely to achieve whatever it is that you want.

Plus, you can apply this to anything. It’s a fully general technique.

So why wouldn’t you?

The general template is straightforward: If [situation], then [behavior]. The idea is to pair a concrete scenario with a behavior you want to enact.

Here are some examples:

  • If I’m mindlessly browsing the web, refreshing Reddit, I will instead pick up and read a book.
  • When I go out to eat with friends, I will order a salad.
  • If I have just finished dinner, I will write 500 words.
  • If I’m writing and interrupted, I will ignore it.
  • If I add something to my Amazon cart, I will wait 24 hours before purchasing it.
  • When I get my paycheck, I will set aside 10% as savings.

And my personal favorite: if I’m attacked at a bar, I will become a number theorist.
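If you would rather keep your rules somewhere more durable than memory, the template translates almost directly into code. What follows is only a sketch: the class name IfThenRule and its fields are hypothetical, invented for this example, and the optional commitment sentence is the one quoted in the tip earlier in this post.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class IfThenRule:
    """One implementation intention (hypothetical helper, for illustration only)."""
    situation: str   # the concrete cue, e.g. "I have just finished dinner"
    behavior: str    # the action to take, e.g. "I will write 500 words"

    def recite(self, with_commitment: bool = False) -> str:
        """Render the rule in the 'If [situation], then [behavior]' template."""
        sentence = f"If {self.situation}, then {self.behavior}."
        if with_commitment:
            # Commitment sentence quoted from the tip earlier in the text.
            sentence += " I strongly intend to follow the specified plan!"
        return sentence

rules = [
    IfThenRule("I have just finished dinner", "I will write 500 words"),
    IfThenRule("I get my paycheck", "I will set aside 10% as savings"),
]

for rule in rules:
    print(rule.recite(with_commitment=True))
```

Keeping the situation and the behavior as separate fields forces each rule to name a concrete cue, which is the whole point of the technique.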

P.S. You’ve read this far – want more? Get articles like this emailed directly to your inbox, just fill out the form below. Thanks!



  1. Orbell, Sheina, Sarah Hodgkins, and Paschal Sheeran. “Implementation intentions and the theory of planned behavior.” Personality and Social Psychology Bulletin 23.9 (1997): 945-954.

  2. Gollwitzer, Peter M., and Paschal Sheeran. “Implementation intentions and goal achievement: A meta‐analysis of effects and processes.” Advances in experimental social psychology 38 (2006): 69-119.

  3. Gollwitzer, Peter M. “Implementation intentions: strong effects of simple plans.” American Psychologist 54.7 (1999): 493.

  4. Steller, Birgit. Vorsätze und die Wahrnehmung günstiger Gelegenheiten. [Implementation intentions and the detection of good opportunities to act]. tuduv-Verlag-Ges., 1992.

Analogical Thinking: Concepts as Example Bundles

Analogy is our best guide in all philosophical investigations; and all discoveries, which were not made by mere accident, have been made by the help of it.
—Joseph Priestley

Words are not the stuff of thought.

This is straightforward to demonstrate. Present someone with a quote — it can be anything, but for concreteness let’s say you go with a bit of Thoreau: “I was not designed to be forced. I will breathe after my own fashion. Let us see who is the strongest.”

So, you present this line to someone, and then you let some time pass, say an hour. Then, ask them to repeat the quote back to you. What do they tell you?

I’d wager that you don’t get the exact quote back, but the gist of the thing. Sort of like when reading this, you will come away not with an exact memory of each word and every comma, but instead a general idea of what it is that I’m talking about — a summary. Almost as if your mind were a lossy compression algorithm.

If words were the stuff of thought, or at least of memory, you’d expect the mind to store words as, well, words. If words were the stuff of thought, when presented with a quote, on recall you’d repeat the exact quote back.

But, instead, there seems to be some kind of mental translation that goes on. You don’t remember the exact quote but, instead, it gets stored as a “gist,” as if your mind translated it to meaning-nese.

So, words are not the stuff of thought.

Let me put it another way. What I’m saying is that, when you are offered some concept in words, you store that concept in meaning-nese. And, then, when you communicate it with someone, you translate that meaning-nese back into words.

This explains why the quotes are not exact, but become garbled — the words have to undergo translation: first, from words to meaning-nese to be stored, and then from meaning-nese back into words during recall. It’s like taking English, translating it into Chinese, and then translating it back into English.

You won’t end up with the original English.

How this relates to metaphor

Analogy is anything but a bitty blip — rather, it’s the very blue that fills the whole sky of cognition — analogy is everything, or very nearly so, in my view.
—Doug Hofstadter

Now, with this in mind, let’s consider the problem of communication. To make this easier, let’s restrict ourselves to idealized communication — communication where the goal really is communication. This is different from communication “in the wild”, where a lot of talking is not about substance, but about expressing friendliness or (perhaps unconsciously!) furthering one’s agenda.

So, idealized communication, where discussion really is about the transfer of ideas. Given this idea of meaning-nese translation, what can we say about this transfer?

Well, the goal of communication is for the speaker to translate some useful structure she has in her mind, encoded in meaning-nese, and to re-encode it in some other form — typically language, but it could also be art, or movement, whatever.

Then, the task of the listener is to take this language-encoded structure and to decode it back into the original meaning-nese — or, at least, some dialect of meaning-nese compatible with the listener’s mind.

Thus, communication is really about the transfer of useful mind structures between speakers — but, since we can’t directly transfer from one brain to another via an uplink à la The Matrix, there’s an intermediate encoding and decoding step.



What labels imply

Okay, let’s take a step back then and consider the implications. What does it mean when you encounter a word or a phrase that you don’t understand? What does that indicate?

If we take the encode-decode dance literally, it’s an indication that the speaker has some useful cognitive structure in her head, with which you’re unfamiliar. So, concretely, I recently learned the word “ostensibly” which means “as it seems on the surface, but perhaps not actually.”

I have found this a gratifying label to have in my head, now that I’ve gone through the effort of re-building the cognitive structure that it represents. I can say something like, “Big business is ostensibly pro-immigration reform because they care about the welfare of would-be immigrants.” And “ostensibly” here acts as a sort of wink that says, yeah, that’s one explanation, but maybe there’s something more to it. In this example, this something more would be that maybe business just cares about cheap labor.

So, what am I trying to say here? What’s the practical interpretation? When you come across some equation, word, phrase, or whatever, that strikes you as foreign, this signals that the person has some useful cognitive structure that you don’t.

What does this have to do with analogy

Now, in a section that is about analogical thinking, you are maybe wondering why I’ve taken you through this detour into communication and cognitive structures. The idea is that, in some sense, all language acts as a metaphor.

This notion has recently been making the rounds with the endorsement of Doug Hofstadter, of Gödel, Escher, Bach (highly recommended) fame, but the idea is at least as old as Lakoff and Johnson’s Metaphors We Live By (and, no doubt, older still than that).

Here is what I mean when I say that all communication is analogy. Consider again the encode-decode theory I just told you about: it’s about taking meaning-nese, mapping it into words, and then unmapping it back into meaning-nese.

What do you call a mapping between two different things? An analogy. So, in a sense, all communication is about constructing an analogy between cognitive structures and words, and then the task of the listeners is to decode that analogy into their own mental model.

Essentially, that’s what’s happening right now: I’m encoding my ideas here, as words, and you, the reader, are decoding them. And, if everything is going as planned, you’re building a cognitive structure in your head right now which is similar to the one that I have in mind.

This is what separates a good exposition from a bad one: with a good one, it’s easy to decode and build up the writer’s cognitive structures in your own mind. With bad exposition, you either end up with no structure or a damaged one, a misunderstanding.

Concepts as analogical bundles

In fact, we can go even a little further than this, and say, what is a concept, really? That is, what are these cognitive structures that I have in my mind?

As a concrete example, let’s consider the number three. What is the idea of three-ness?

Well, if you wanted to transfer the concept to a young child, you’d give concrete examples. A group of three rocks, three bananas, and so on, except of course you wouldn’t say “three” — you would show them three rocks, three bananas, and you’d ask them, so, what do these things have in common?

With enough examples, they would catch on, or at least so I suspect. I’ve been unable to acquire the necessary funding to experiment on young children.

The idea, though, is that with any concept, when you start unbundling it, you find that it’s really just a bunch of examples with some common core — some hidden structure, that isn’t immediately obvious when you consider just one thing in isolation, but becomes apparent with the study of tangible examples.

That is, a concept is a bundle of examples. The process of abstraction, of obtaining a useful cognitive structure, is ultimately one of comparing and contrasting these examples, until you have built up this structure in your mind.

Some evidence regarding analogical thinking

So, let’s recap for a moment:

  • Communication is the process of translating a cognitive structure into words, and then from words back into a cognitive structure.
  • This mapping and unmapping is an analogy: setting up an isomorphism between cognitive structures and words.
  • A concept is a bundle of concrete examples. Each example contains some common core, which is captured in the concept. Thus, every concept is itself an analogy.

So, really, here we have two different uses of analogy: a concept/cognitive structure is an analogy, and we use a process of analogy to transfer them between people.

If this is really true, if I’m not just spinning you a nice story, we ought to expect that the study of concrete examples is the best way to go about learning a new concept. Really, it’s probably the only way to build a new cognitive castle in your head.

Is there any evidence to support this view? Well, yes. There’s significant evidence suggesting that comparing and contrasting examples is a powerful technique when it comes to understanding something new.

Consider the inert knowledge problem. This is when you’re in a situation, and you have relevant, applicable knowledge, but you fail to apply that knowledge. So, concretely, say you’ve taken a basic calculus class, and you’re arguing with someone about population growth. You get in this heated disagreement. They say that our current growth is unsustainable, and we’re headed towards an inevitable collapse because there is not enough food to go around — a Malthusian catastrophe.

You take a contrary position, and point out that, as nations develop, birth rates fall, such that population growth is below the replacement rate in some developed nations, like Japan and Germany. At a certain tipping point of prosperity, population plateaus and then actually begins to fall.

If your calculus knowledge transferred, here you might realize that this is an argument about the shape of the derivative of population growth. And, if you so realized, you might both draw out curves of what you think it looks like, and then compare that to real-world data.

The inert knowledge problem would be 1) knowing calculus, 2) having this argument, and 3) not realizing that you’re actually arguing about derivatives.
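To make the "arguing about derivatives" point concrete, here is a rough sketch of the two positions as growth-rate functions. The models and numbers are my own stand-ins, not anything from real demographic data: unchecked exponential growth for the pessimist, and a logistic curve for the optimist. The logistic curve captures only the plateau, not a later decline.

```python
def exponential_growth_rate(population: float, r: float) -> float:
    """dP/dt under unchecked growth: proportional to the population."""
    return r * population

def logistic_growth_rate(population: float, r: float, capacity: float) -> float:
    """dP/dt under logistic growth: slows as population approaches capacity."""
    return r * population * (1.0 - population / capacity)

R = 0.02          # hypothetical growth rate per year
CAPACITY = 10.0   # hypothetical plateau, in billions of people

for p in (2.0, 5.0, 8.0, 10.0):
    exp_rate = exponential_growth_rate(p, R)
    log_rate = logistic_growth_rate(p, R, CAPACITY)
    print(f"P={p:>4} billion  exponential dP/dt={exp_rate:.3f}  logistic dP/dt={log_rate:.3f}")
```

In the first model the derivative keeps climbing as the population grows. In the second it peaks at half the capacity and shrinks to zero at the plateau. Which of those shapes better matches real-world data is what the two of you are actually disagreeing about.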

Now, depending on the amount of learning you’ve done in the past, you may or may not have noticed that inert knowledge is the devil. Why learn something if you fail to apply it? What can be done about this?

Well, okay, learning something is about the acquisition of concepts, right? So calculus knowledge is about building up calculus structures in your head.

If, as I’ve argued, this is the case, we might expect that comparing and contrasting examples (and thus promoting concept acquisition) would help us overcome the inert knowledge problem.

Is this the case?

Yes. There’s even some evidence that comparing and contrasting examples, “analogical encoding”, is potentially the only effective technique for dealing with this inert knowledge plague. One review put it this way: “The best-established way of promoting relational transfer is for the learner to compare analogous examples during learning (Catrambone & Holyoak, 1989; Gentner, Loewenstein, & Thompson, 2003; Gick & Holyoak, 1983; Reeves & Weisberg, 1994; Ross & Kennedy, 1990; Seifert et al., 1986, Experiments 1 and 2).”

The quoted study further finds that analogical encoding — comparing and contrasting examples — not only promotes future transfer, but actually works backwards, too.

What do I mean by this? I mean that, if you sit down and compare and contrast examples, you’re going to be much more effective at coming up with past, relevant experiences of the principle in question. You can use this to transform inert knowledge into animated knowledge. To piece together the once dead into a new Frankenstein’s monster.

To use our calculus example again, if you’re reading about the jerk (the rate of change in acceleration), and you compare and contrast real-world examples, you’re more likely to spontaneously realize that, when learning to drive, the jerk you felt when stopping too quickly was an example of, you know, the jerk in physics.

Benefits for the acquisition of expertise

So, at this point, I hope you’re convinced that comparing and contrasting examples is the way to go about acquiring a new concept — it’s how to absorb a bundle of concrete examples and distill them into a useful cognitive structure.

But that’s not all! This is not the only benefit. Consider what it means to be an expert at something. One of the most cited studies on expertise compared how graduate students in physics categorized physics problems, versus how novices did.

The finding? Physics experts were more likely to pick out the underlying physical principle, while novices tended to focus on irrelevant surface characteristics. Presumably, physics experts had built up a cognitive structure that they recognized in the problem. The novices, lacking this mental structure, were unable to spot it.

If the theory I have sketched here is correct, then we ought to expect that comparing and contrasting examples will accelerate the acquisition of concepts and therefore expertise. Analogical encoding allows one to swim out from shallow seas and into the depths — “comparison between two analogous examples acts to make their common relational structure more salient (Gentner & Medina, 1998; Gentner & Namy, 1999; Markman & Gentner, 1993).”

Practical implications

Okay, then, we’ve just breezed through the core ideas of analogical thinking. To sum it up:

  • Words are not the stuff of thought. Our minds translate words into something else (meaning-nese).
  • Communication is essentially about analogy: it’s about mapping a cognitive structure (“meaning-nese”) into words, and then the listener unpacks that back into a cognitive structure.
  • Thus, successful communication is about setting up understandable analogies.

Then, I covered the relationship between concepts and analogy:

  • A concept is a bundle of concrete examples that illustrate some core relationship between those examples. The concept of three-ness can be understood as the relation between concrete instances of three things (bananas, rocks, years).
  • Given that a concept is a bundle of examples, we should expect that the best way to acquire a useful cognitive structure is to compare and contrast examples (“analogical encoding”).
  • There is a significant body of evidence that suggests that this is the case: comparing and contrasting examples is a powerful way to acquire a concept.

I also touched on the inert knowledge problem, and how analogical encoding allows us to overcome it:

  • The inert knowledge problem is when you have relevant knowledge but fail to take advantage of it. Example: failing to realize that an argument about population growth is an argument about the shape of a derivative.
  • The only consistently supported method of overcoming the inert knowledge problem, and promoting the application of a concept in new situations (“transfer”), is analogical encoding — by comparing and contrasting examples. “When subjects explicitly compared the analogs and then immediately attempted to solve the target problem in the context of a single experiment, transfer was obtained with significant frequency even without a hint that the analogs and target were related. (Holyoak and Catrambone)”

Finally, I mentioned how this relates to expertise:

  • Experts are distinguished by better-developed cognitive structures. Physics experts, for example, are able to pick up on the underlying structure of a physics problem, while novices focus on surface characteristics.
  • How can we acquire such a cognitive structure? By analogical encoding — comparing and contrasting examples. Contrasting examples fosters a focus on deeper structures.
  • Thus, to accelerate the acquisition of expertise, one should take advantage of analogical encoding.

So, practically speaking, how can you, as an individual, put this into practice? This method, analogical encoding, is both simple and powerful. To acquire a new cognitive structure, gather together a bunch of examples of the concept, and then compare and contrast those examples.

If you would like to improve your calculus skill, you should Google for real-world examples of derivatives or integrals or any concept that you’d like to acquire. Then write them down, and list how they are similar and how they differ.

You can also use these principles to improve your communication and teaching skills. If you want someone to obtain a cognitive structure that you have, illustrate the principle with examples, and then bring their attention to the underlying similarity connecting the examples. In the case of this section, the principle behind all of these examples has been that learning and communication are about the transfer of concepts, which are bundles of examples and can be acquired by contrasting concrete examples.

Now, what was that Thoreau quote, again?

Why I Like Surprises and You Should Too

I love surprises.

Imagine a man — oh, I’ll just pick a name at random, let’s call him James Randi. He’s a staunch materialist. And not the “I like to buy a lot of stuff” kind of materialist, but the sort that believes everything is made out of atoms and quarks (and whatever quarks are made out of) and that magic is physically impossible.

Now, imagine that this man, fed up with arguing with hippies and magical thinkers generally, snaps and declares, “I’ll bet anyone a million dollars that they can’t come into my laboratory and produce obviously supernatural phenomena.”

He declares this in a moment of exasperation, but he gets to thinking: y’know, this is a pretty good idea. A million dollars for the impossible. If someone is psychic, it’s free money for them. And if someone won’t take free money just to demonstrate their claimed power, well, they must be a fraud.


So he gets on the phone with the New York Times and he tells them about his brand new “One Million Dollar Paranormal Challenge”.

In the lab

Let’s fast forward a couple of months, to the point where money-seeking cranks have started slithering out of the Earth like worms in a storm. Imagine that this definitely fictional James Randi is in the lab with one of these worms. This worm, I’ll pick another name at random, let’s call her Theresa Caputo.


And this Theresa, who is fictional and definitely not a real life charlatan, claims that she can communicate with the dead. So James sighs, “Okay, Theresa. Tell me something that only a dead person could know. Something I know and they knew, but you have no way of knowing.”

Here’s where the universe diverges. In one branch, the usual happens. Theresa says something about James’s mother loving the water and boats, while James makes agreeable sounding noises — except he’s leading her on. His mom couldn’t swim and the water terrified her. (She had seen one too many “World’s Deadliest Sharks” specials on Animal Planet.)

But in the other branch, the surprising happens. Theresa details the family dog James had as a child. A fluffy, white contraption named Houdini, who loved milk so much that Randi’s mom would share her bowl of cereal with him after she had finished with it.

And on and on, like the time Randi had tried to grow his first beard and everyone called him Fuzz Aldrin. Or when he asked Sally Banks to prom and she turned him down and it crushed him, but actually she was just a lesbian and how could that be a reflection on him?

She’s saying this stuff, anecdote after anecdote that no one could know, and James grows more and more freaked out, the color of his face shifting from its typical rosy hue to a pale-moon grey, as if someone had opened up a picture of him in Photoshop and slowly moved the saturation from 100% to zero — until finally he bellows, “Okay, stop, enough! I get it.”

Theresa stops for a moment, a pause, silence, and then she chirps, “…but we haven’t even gotten to the mind reading bit yet.”

Contrasting consequences

Okay, now, consider how the lives of each James will unfold over the next couple of months. The James in the first branch will continue going about his life as he’s been doing it. Giving an interview here and there, maybe working on a book about skepticism (dubbed Citation Needed), and occasionally debunking so-called psychics in his lab. Business as usual, you know, the grind.

The other James has just had his belief system wrecked. Assuming that he’s not hallucinating and that this woman’s abilities hold up to further scrutiny, our best models of the universe are just wrong. The supernatural exists! Consciousness continues on after death.

This second James will be tasked with picking up the pieces of his belief system after this intellectual earthquake, one that has not only shaken but toppled his beliefs and proved that the foundations were air all along. He’ll have to ask himself stuff like, “Is this proof of the existence of God? Should I be converting to some religious movement? Which one? If I’m wrong about this, what else am I wrong about?”

And that’s just James’s personal struggle. Consider the far reaching implications such a discovery would have. Proof of the supernatural! This would be a larger scientific discovery than anything before. More than Newton discovering classical mechanics, more than special relativity, even bigger than the ancient Greeks’ discovery that, yes, the universe contains regularities that can be described by simple mathematical equations.

We’ll want to know: how does this woman communicate with the dead? Where are they? What’s the causal mechanism here? Is there some as-of-yet undiscovered physical phenomenon taking place here? Or is a mysterious, inexplicable force somehow fundamental to the universe? Maybe it’s not so much that we don’t know, but that we can’t know.

Plus a billion more mundane questions: Can humans use their psychic powers to communicate faster than light? What about to communicate with the long dead to write history books? How about to make crops grow faster or to find oil? Can every human do this or just one? Can you train this ability? Is it located in a region of the brain? Can psychics predict the stock market?

And, of course, there would be a religious superstorm, a mad rush to claim that we called it, we knew it all along, that this woman was a product of our god. (Like terrorists taking responsibility for an attack, if you will.)

Information content

The value of information is usually defined as the amount someone would pay to know something prior to making a decision, but I like to think of it as the amount your behavior would change if you had the answer. That is, if you were clairvoyant, as if you could somehow plug your brain into the heavens and have always-on access to a line of the best kind of credit: pure truth.

For instance, you might consider a firm doing medical research on new drugs. The economic incentive here is massive: the total revenue of Lipitor alone is something like $141 billion.

What’s the value of information here? Well, what if you knew before doing a bunch of expensive research and development that a certain chemical was going to be bust? The median cost of research and development per drug is something like 1 billion USD. Since 19 out of 20 medications in experimental development fail, the value of knowing ahead of time is going to be worth at least half a billion (and probably a lot more).

This can be converted to degrees, too. Reducing uncertainty from 38 to 27 percent might be worth something like 55+ million.
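As a rough sketch of that arithmetic in Python: the cost and failure rate come from the paragraphs above, while the linear scaling in the second function is an assumption, chosen only because it reproduces the 55 million figure. The function names are made up for illustration.

```python
COST_PER_DRUG = 1_000_000_000  # median R&D cost per drug in USD (from the text)
P_FAIL = 19 / 20               # share of experimental drugs that fail (from the text)


def expected_wasted_spend(cost: float, p_fail: float) -> float:
    """Expected R&D money spent on a candidate that turns out to be a bust."""
    return cost * p_fail


def value_of_partial_information(full_value: float, before: float, after: float) -> float:
    """Assumption: the value scales linearly with the drop in uncertainty."""
    return full_value * (before - after)


upper = expected_wasted_spend(COST_PER_DRUG, P_FAIL)
# Uses the conservative half-billion floor quoted in the text as the full value.
partial = value_of_partial_information(500_000_000, 0.38, 0.27)

print(f"expected wasted spend per candidate: ${upper:,.0f}")
print(f"value of going from 38% to 27% uncertainty: ${partial:,.0f}")
```

The expected wasted spend lands well above the half-billion floor, which is why “at least” is doing honest work in that estimate.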

From this view, surprising information is more valuable than not surprising information: it leads to greater shifts in behavior. For the first James, he’s just found out that his model of the world was right, that psychics really are tricking people and whatever, and he gets to go on with his life as usual. Not too valuable. It maybe shifted his confidence from 99.99 percent to 99.990001 percent.
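One standard way to put a number on “surprising” is Shannon’s surprisal, the negative log of an outcome’s probability. The text doesn’t name it, so treating it as the right measure here is an assumption, and the 99.99 percent prior is simply the figure used above.

```python
import math


def surprisal_bits(probability: float) -> float:
    """Shannon surprisal: bits of information carried by an outcome of this probability."""
    return -math.log2(probability)


p_fraud = 0.9999  # James's confidence that the psychic is a fraud (from the text)

expected_outcome = surprisal_bits(p_fraud)        # she turns out to be a fraud
unexpected_outcome = surprisal_bits(1 - p_fraud)  # she turns out to be the real deal

print(f"fraud confirmed:  {expected_outcome:.6f} bits")
print(f"psychic is real:  {unexpected_outcome:.2f} bits")
```

The outcome James expected carries a tiny fraction of a bit, while the one he didn’t carries more than thirteen.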

The second James has his world — well, at least his model of the world — torched. He found out that he’s been wrong about damn near everything. And what’s the value of this information? “What does it matter?,” you might ask. “Wouldn’t he be happier oblivious?”

Maybe not — consider the ramifications again. What if James had continued to live oblivious to the existence of the supernatural, remaining a staunch atheist, and the rapture comes along and, whoops, no heaven for James.

Or, even more prosaically, consider the Earthly value of information here. James, as a result of his lab interview, might learn how to harness his own psychic powers, and maybe he has some prescient abilities. He can see the future and foresees a typhoon in India. After googling, he realizes that India is one of the world’s largest producers of coconuts, and this typhoon is going to coincide with the harvest. So he buys up coconut futures and sells them for a cool billion when the typhoon hits.

Or, you know, he could totally use that information to save a bunch of lives.

Surprise as an indicator of incorrect models

As you’re probably noticing and, if not, the header should have given it away, surprise indicates that your model of the world is incorrect — that there’s something that you’ve failed to take into account.

I don’t know if you’ve seen the movie Groundhog Day but, basically, Bill Murray repeats the same day over and over again, and he’s the only one who’s aware that it’s happening. There’s a scene where he abuses his near omniscience to duck in behind an untended armored vehicle and steal a bag of money.

He’s able to do this because he can anticipate everything that’s going to happen (he’s lived it before). His model of the world is perfect and, indeed, if you look at the original script, the author intended him to have spent like 10,000 years going through this same day. He was supposed to be godlike. There’s even a line in the movie where Murray says, “Well maybe the real God uses tricks, you know? Maybe he’s not omnipotent. He’s just been around so long he knows everything.”

This is to say that, when you have a perfect model of the world, you can anticipate everything that’s going to happen. Nothing should be surprising. When someone faceplants directly into their wedding cake, you saw it coming.

Surprise is all about the violation of expectations, and if an expectation has been violated, an implicit model has been violated. And that means that you’re wrong about something.

Every surprise indicates something you’ve failed to take into account. Like when I found out that cats are lactose intolerant and shouldn’t be given milk, I was surprised. I had failed to connect some pieces of knowledge in my head, namely, that most animals don’t drink milk past infancy, so why wouldn’t they be lactose intolerant? In this respect, humans are a bit of an anomaly (and it’s a relatively recent anomaly, too, at about 7500 years old.)

Recovering from surprise

Which brings me to my next point, which is that, given surprise indicates something wrong with your model of the world, whenever you’re surprised, you should fix your model.

Ask yourself, “How could I have anticipated this?” and, usually, once you’ve answered that question, you’ve fixed everything.

As an example, I like Moravec’s paradox, which was the discovery that it’s much easier to teach a computer chess than it is to teach a computer to walk or recognize faces, while the reverse is true for humans. In a sense, what’s easy for computers is hard for humans, and vice versa.

Why should this be the case? Well, recall that the prefrontal cortex is a relatively new part of the brain. Evolution has not spent too many computational cycles optimizing our ability to reason, and virtually none on chess (modern chess is only about 550 years old.)

On the other hand, motor skills, sight, object recognition, facial recognition, that sort of stuff — that’s gone through a lot of iterations, something like 410 million years of iterations. So, yeah, that one is going to be a bit harder for humans to reverse engineer and implement on a general purpose computer.

Moravec’s paradox, then, can be anticipated by considering how long, on evolutionary timescales, a “feature” has existed. Duplicating human sight? Probably difficult. Mathematics? Easy, especially stuff like multiplication, division, whatever. (Trickier when you get to, say, proving the Riemann hypothesis.)

The sheer superiority of surprise over other forms of noticing wrongness

Okay, so we’ve covered:

  • Surprises contain more information than the expected.
  • Surprises indicate incorrect models.
  • To fix incorrect models, ask how you could have anticipated a surprise.

Finally, let me point out the sheer superiority of surprise over the alternatives. Say you want to improve your model of the world, what are you going to do? Well, you could try to notice any tiny, nagging doubts you have about something, the sort that your mind quietly brushes over and you never even notice.

This is hard. I’ve spent more than a year meditating daily and I’m still not very good at it, and most of the time I don’t even notice those doubts until afterwards, when I’m like, “Huh, should have noticed that sound was his wooden foot and not rationalized it away as a funny brand of shoe.”

Surprise, on the other hand, demands your attention. You don’t even have to think about it. It turns out that your eyes and attached-frontal-brain-region-plus-amygdala automatically filter out nonsurprising information and direct your visual system towards surprising stuff in your environment. So you don’t even have to make an effort to notice something surprising. You just will.

You could try to enjoy being wrong. Then you’ll naturally seek out opportunities for wrongness and wrongitude, chances to actively test your beliefs. There are people who say they can actually do this.

I suspect these people are just lying. I don’t like being wrong about anything. I have to make a conscious decision when someone disagrees with me to be like, “Wait, maybe they have a point and are objectively right, even though my monkey brain is too worried about status to admit to it.”

Surprise, on the other hand, is easy. Want to know some surprising information? Hell yeah, I want to know some surprising information. And, hopefully, you do, too, now that I’ve told you why I love surprises.

Herbert Simon’s Ant

Here’s a metaphor that comes to me by way of Nobel laureate and Turing award recipient Herbert Simon.

Imagine watching an ant on the beach. Its path looks complicated. It zigs and zags to avoid rocks and twigs. Very reminiscent of complex behavior — what an intelligent ant!

Except an ant is just a simple machine. It wants to return to its nest, so it starts moving in a straight line. When it encounters an object, it zigs to avoid it. Repeat until the destination is reached.

Trying to simulate the path itself would be difficult, but simulating the ant is easy. It’s maybe a half-dozen rules.
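To make that concrete, here is a toy version in Python. The grid, the rock positions, and the name walk_home are all made up for illustration, and this naive ant can end up pacing back and forth behind a wide obstacle, which is why the loop is capped.

```python
def walk_home(start, nest, rocks, max_steps=100):
    """Return the cells visited by an ant that heads for its nest and sidesteps rocks."""
    x, y = start
    path = [(x, y)]
    for _ in range(max_steps):
        if (x, y) == nest:
            break
        # Rule 1: aim one cell closer to the nest along each axis.
        dx = (nest[0] > x) - (nest[0] < x)
        dy = (nest[1] > y) - (nest[1] < y)
        # Rule 2: if the preferred cell is blocked, try progressively worse detours.
        options = [(x + dx, y + dy), (x + dx, y), (x, y + dy),
                   (x, y + 1), (x, y - 1), (x + 1, y), (x - 1, y)]
        free = [c for c in options if c not in rocks and c != (x, y)]
        if not free:
            break  # boxed in on every side we tried
        x, y = free[0]
        path.append((x, y))
    return path


rocks = {(2, 0), (4, 0)}  # hypothetical beach debris
path = walk_home(start=(0, 0), nest=(5, 0), rocks=rocks)
print(path)
```

All of the zigging in the printed path comes from where the rocks happen to sit. The ant’s rules never change.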

The point of this parable is to illustrate the interaction between the environment and perceived complexity. Lots of complex looking things are really the result of the territory, the shape of the beach, and not the agent, in this case, an ant.

But, of course, with this metaphor, I’m not really talking about ants. I’m talking about people. How much of the complexity of human behavior is really the product of the environment?

Consider yesterday’s post. Zach Weinersmith wrote this about writer’s block:

If you can’t write, read more. In my experience, writer’s block is not a condition, but a result. Lots of people seem to think they can play video games 12 hours a day, then one day happen upon a great idea. It doesn’t work that way. You’ve got to put in time on input if you want good output.

Now, what’s so interesting about this? Well, it’s a lot like that ant. Humans can’t just sit and intuit something complicated — we have to go and engage with the complexity of our environment.

Creativity, Fan Fiction, and Compression

I’ve written before about the relationship between creativity and compressibility. To recap, a creative work is one that violates expectations, while a compressible statement is one that’s expected.

For instance, consider two sentences:

  • Where there’s a will, there’s a way.
  • Where there’s a will, there’s a family fighting over it.

I suspect you find the second more creative.

Three more examples of creative sentences:

  • When I was a kid, my parents moved a lot. But I always found them.
  • Dad always said laughter is the best medicine, which is why several of us died of tuberculosis.
  • A girl phoned me the other day and said, “Come on over, there’s nobody home.” I went over. Nobody was home.

Given that less predictable sentences are more creative, and less predictable sentences are less compressible, creative works ought to be less compressible than non-creative ones. And, indeed, I found some evidence for this in a previous experiment.

But that was not too compelling as it compared technical, repetitive works to novels. This time, I decided to compare very creative writing to normal creative writing.


The idea then is to compare the compressibility of amateur creative writing with that of experts. To accomplish this, I took 95 of the top 100 most downloaded works from Project Gutenberg. I figure that these count as very creative works given that they’re still popular now, ~100 years later. For amateur writing, I downloaded 107 fanfiction novels listed as “extraordinary” from fanfiction.net.

I then selected the strongest open source text compression algorithm, as ranked by Matt Mahoney’s compression benchmark: paq8pxd. I ran each work through the strongest level of compression, and then compared the ratio of compressed to uncompressed space for each work.

Analysis and Results

I plotted the data and examined the outliers, which turned out to be compressed files that my script incorrectly grabbed from Project Gutenberg. I removed these from the analysis, and produced this:


Here the red dots are fanfiction novels, while the blue-ish ones are classic works of literature. If the hypothesis were true, we’d expect them to fall into distinct clusters. They don’t.

Comparing compressibility alone produces this:


Again, no clear grouping.

Finally, I applied a Student’s t test to the data, which should tell us if the two data sets are distinguishable mathematically. Based on the graphs, intuition says it won’t, and indeed it doesn’t:

The p-value here is 0.1755, which is not statistically significant. The code and data necessary to reproduce this are available on GitHub.


I must admit a certain amount of disappointment that we weren’t able to distinguish between literature and fanfiction by compressibility. That would have been pretty neat.

So, what does this failure mean? There are at least six hypotheses that get a boost based on this evidence:

  • Creativity and compression are unrelated.
  • A view of humans as compressors is wrong.
  • Human compression algorithms (the mind) and machine compression algorithms are distinct to the point where one cannot act as a proxy for the other.
  • Compression algorithms are still too crude to detect subtle differences.
  • Fanfiction is as creative as literature.

And so on and, of course, it’s possible that I messed up the analysis somewhere.

Of all of these, my preferred explanation is that compression technology (and hardware) are not yet good enough. Consider, again, the difference between a creative and a not-creative sentence:

  • Honesty is the best policy.
  • I want to die peacefully in my sleep, like my grandfather… not screaming and yelling like the passengers in his car.

The first is boring, right? Why? Because we’ve heard it before. It’s compressible — but how’s a compression algorithm supposed to know that? Well, maybe if we trained it on a corpus of the English language, gave it the sort of experience that we have, then it might be able to identify a cliche.

But that’s not how compression works right now. I mean, sure, some have certain models of language, but nothing approaching the memory that a human has, which is where “human as computer compression algorithm” breaks down. Even with the right algorithm — maybe we already know it — the hardware isn’t there.

Scientific American estimates that the brain has a storage capacity of about 2.5 petabytes, which is sort of hand-wavy and I’d bet on the high side, but every estimate I’ve seen puts the brain at more than 4 gigabytes, by at least a couple orders of magnitude. I don’t know of any compressors that use memory anywhere near that, and certainly none that use anything like 2.5 petabytes. At the very least, we’re limited by hardware here.

But don’t just listen to me. Make up your own mind.

Further Reading

  • The idea that kicked off this whole line of inquiry is Jürgen Schmidhuber’s theory of creativity, which I’ve written up. If you prefer, here’s the man himself giving a talk on the subject.
  • To reproduce what I’ve done here, everything is on GitHub. That repository is also a good place to download the top 100 Gutenberg novels in text form, as rate-limiting makes scraping them a multi-day affair.
  • I similarly compared technical writing and creative writing in this post and did find that technical writing was more compressible.
  • For an introduction to data compression algorithms, try this video.
  • Check out the Hutter Prize, which emphasizes the connection between progress in compression and artificial intelligence.
  • For a ranking of compressors, try Matt Mahoney’s large text compression benchmark. He’s also written a data compression tutorial.

The Creative Process Demystified

Jack Kerouac is a liar.

Okay, let me rewind. I don’t know how much experience you’ve had with creative writing types — pale, imaginative creatures — but let me tell you how they talk about Jack Kerouac. They say his name in sort of hushed, reverent tones, and whisper things like, “Can you believe that he wrote On the Road in one sitting?” Like great authors are some sort of gods. We, you and me, on this blog, we know better. There are no gods and his name is Richard Feynman.

Except Kerouac didn’t even write On the Road in one sitting. He spent three years writing pieces of it and, eventually, spent three weeks writing a first draft from that material. He then spent a couple of years revising that draft, which became On the Road. But this doesn’t make as good of a story, so instead Jack told everyone that he wrote it all at once because, as we’ve established, he was a liar.

Now, what’s the significance of this story? The answer is incrementalism: Great works are the result of a process of incremental growth and improvement.

Consider the Christian creation narrative. In the beginning, God created the heavens and the earth. Now, God is an omnipotent, all powerful dude, so presumably the heavens and earth are less impressive than he. The implicit assumption is that you need something complicated to create something else complicated.

We were more or less speculating in the dark with this narrative until Charles Darwin and Alfred Russel Wallace came along with evolution. It turns out that something as complex as the human mind is not a miracle from on high, but the result of millions of years of selective pressure.

Or, to put it another way, simple things grow into complicated things if you expend enough energy on them.1

Creating something is more like evolution than like God creating the heavens and the earth. It’s not a process of flipping a switch, or letting it all fall out of your mind. You have to grow a book or a blog post. Write a rough draft and filter it into something better. Hill-climb until the quality is twice what it was before.

No magic

There is nothing magical behind the creative process. Sure, authors and poets will sometimes wax romantic about writing and play up the mystery, but this is misdirection.

Look, I can do it, too:

Writing is, in its essence, the soul’s interpretation of the signs that are revealed to it. The ability to write well, the gift of a soul, is something innate. One must be born into it. Just as not all men possess the capacity for reading tea leaves or interpreting the whims of the spirit realm, few are born with that devil’s touch that brands one writer.

Except, you know, that’s all bullshit. There’s no magic. You get an idea. You think about the idea. Maybe write an outline. Write a rough draft. Delete a lot. Write another draft. Repeat until good. With a liberal sprinkling of self-loathing and lots of doubt, that’s creativity.

Indeed, to create something good:

  1. Come up with an idea.
  2. Create a rough version.
  3. Refine it until it’s good.

That’s it. That’s how books are written. I mean, sure, there are some specifics, like how to keep everything organized and whatever, but this is the gist of it. Create a prototype and then refine it over and over. That’s incrementalism. That’s what I mean when I say that great works are grown. It’s not magical. It’s algorithmic. Follow these instructions and you’re golden.

Most of the difficulty in creating something worthwhile is not because of the complexity of the process, but rather the difficulty in maintaining effort over time. We get bored and frustrated and quit. The trick to writing or creating something great is figuring out how to tame those tendencies and continue exerting yourself in pursuit of that goal.

1. Evolution, however, is not a race towards greater complexity. Single-celled organisms rule the world.

Creativity, Literature, and Compression

But first, a joke:

I was at a bar last weekend, chatting with this woman. She was decent looking. There was a lull in the conversation, so I say to her, “Hey, I’ve got this talent. I can tell when a woman was born after feeling her breasts.” She doesn’t believe me at first, but after a minute or so, she comes around. “Go on, then,” she says to me. I feel her up a bit before she gets impatient. “Well, when was I born?” she asks. So I tell her: “Yesterday.”

Dissecting and killing the joke

What’s funny about that joke? The surprise. First, there’s the set up. It’s titillating, and listeners start anticipating: this is going somewhere. And then — punchline! Outta nowhere, or at least that’s what it feels like. Cue laughter. This shock, this violation of expectation, is what comedy’s all about.

Here’s another one: “I’m not a member of any organized religion. I’m a Jew.” If the sentence had instead been, “I’m not a member of any organized religion. I’m an atheist,” there would be nothing at all funny about it. George Carlin’s, “If you can’t beat them, arrange to have them beaten,” follows the same pattern.

Brains are sort of anticipation engines and, when you violate those expectations, well, that’s comedy. It’s the difference between something original and something not. If Carlin had said, “If you can’t beat them, join them,” I wouldn’t be talking about him. It’s boring, cliched. It’s not creative.

Creativity is about violating expectations.

Anticipation is compression

Anticipation and prediction are the same thing. When I drop a ball, I anticipate and predict that it will fall to the ground.

Now, let us imagine a program that can take in a few facts about you and then predict with certainty what it is that you’re going to say. In such a case, the machine wouldn’t need to remember anything about you except those few facts. If it needs to know your opinion on something in the future, it can take those facts, run them through its internal predictor, and regenerate your opinions.

You, as a human, sort of already do this. For instance, if I tell you how I lean politically, you might not need to know my stance on anything — you might be able to anticipate it. So instead of storing, “The author’s opinion on Serious Political Topic X,” in long term memory, you could just remember, “The author is a Blue.”

This difference between remembering everything and remembering just a few details is compression. It follows, then, that when you can predict something, you can compress it.
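A quick way to see the link between prediction and compression is to hand a compressor something perfectly predictable and something with no pattern at all. This is a minimal sketch using Python’s zlib module; the sample text and the helper name compressed_fraction are chosen purely for illustration.

```python
import os
import zlib


def compressed_fraction(data: bytes) -> float:
    """Compressed size as a fraction of the original size (smaller = more compressible)."""
    return len(zlib.compress(data, 9)) / len(data)


# Once you've seen the first sentence, you can predict all the rest.
predictable = b"Where there's a will, there's a way. " * 200
# Random bytes: nothing about the past helps you guess what comes next.
unpredictable = os.urandom(len(predictable))

print(f"predictable:   {compressed_fraction(predictable):.3f}")
print(f"unpredictable: {compressed_fraction(unpredictable):.3f}")
```

The repeated cliche shrinks to a sliver of its original size, while the random bytes barely shrink at all and may even grow slightly from the container overhead.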

Given then, that:

  1. Creativity is about violating expectations.
  2. That which can be expected can be compressed.

I would expect that creative things are less compressible than non-creative things. Do creative books compress less than non-creative ones?

That’s what I want to find out.


The idea, then, is to take works that are creative and non-creative, compress both, and observe whether the non-creative books are more compressible. Given the theory fleshed out above, I expect the non-creative works to be more compressible.

Sorting books into creative and non-creative buckets is, by its nature, a subjective task. I attempted to grab from the most obviously creative and non-creative works. In practice, this tended to blur the line between non-creative and boring. The most mind-numbing works, I figure, are the least creative.

Creative works:

  • Alice in Wonderland
  • Godel, Escher, Bach: an Eternal Golden Braid
  • Through the Looking Glass
  • Flatland
  • Beyond Good and Evil
  • Emerson, First Essays
  • A few other popular novels on Project Gutenberg.

Non-creative works:

  • The Berne Convention
  • RFC 4880
  • IRS Publication 557: Tax exempt status for your organization
  • iTunes User Agreement
  • Patent 8,322,614, “System for processing financial transactions in a self-service library terminal”
  • The Affordable Care Act

I took each of these works, ran them through the xz compressor (the strongest general purpose compressor in wide circulation, as far as I know), and then compared the “compressibility” (ratio of compressed to uncompressed size) of the two classes of files. The comparison was done with R.
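The measurement step might look something like the following in Python, whose lzma module reads and writes the same .xz format. This is a sketch, not the script used for the post: the preset (9 with the extreme flag) and the helper names are assumptions, and the ratio is taken as compressed size over original size so that it matches figures like the .3353 reported for Moby Dick below.

```python
import lzma
from pathlib import Path


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size; lower means more compressible."""
    packed = lzma.compress(data, preset=9 | lzma.PRESET_EXTREME)
    return len(packed) / len(data)


def ratios_for(paths):
    """Map each file name to its compression ratio."""
    return {Path(p).name: compression_ratio(Path(p).read_bytes()) for p in paths}


# Stand-in data so the sketch runs without any files on disk.
boilerplate = b"the party of the first part agrees to the terms herein. " * 1000
print(f"{compression_ratio(boilerplate):.4f}")
```

Calling ratios_for on a list of plain-text files, one list per bucket, gives the two sets of ratios that the analysis compares.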

Analysis and Results

Before anything else, I plotted the compressibility of the data using a dotplot, and colored each work by whether it was creative or not. The results are visually striking:


You will notice that the works fall into two distinct clusters. Creative works (black) are less compressible than non-creative works (red), which is what I would expect if my hypothesis is true.

My statistics-fu is weak, but I think the Student’s t-test is the right tool for the job here. This calculates a p-value, the probability of seeing a gap at least this large if the two groups were really the same, which comes out to 0.00001488 or, if the computer could speak, “I’m near certain that the non-creative group is more compressible than the creative group.” (That level of certainty is almost certainly inappropriate, though. In a trial of 10,000 analyses like these, I screw up more than one of them.)
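For the curious, the statistic underneath that test is simple enough to compute by hand. The sketch below implements the classic pooled-variance Student’s t statistic in Python. The ratios are invented for illustration and are not the post’s data, and it stops short of a p-value, which needs the t distribution’s CDF. Note that R’s t.test runs the Welch variant by default unless var.equal = TRUE is passed, so its numbers can differ slightly from this version.

```python
import math
import statistics


def student_t(a, b):
    """Two-sample Student's t statistic (pooled variance) and its degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / df
    std_err = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / std_err, df


# Hypothetical compression ratios, made up for this example.
creative = [0.33, 0.35, 0.31, 0.34]
technical = [0.20, 0.22, 0.19, 0.21]

t, df = student_t(creative, technical)
print(f"t = {t:.2f} with {df} degrees of freedom")
```

A large t relative to its degrees of freedom is what turns into a tiny p-value.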


Let’s dig in a little deeper to the creative works:

This is sorted from most compressible to least, implying that Jekyll and Hyde is the most creative novel of those tested, while Godel, Escher, Bach is the least. I find this unlikely.

Indeed, if I plug in Moby Dick, I get a ratio of .3353, or less compressible than Alice in Wonderland, Breakfast of Champions, and others. Now, I’ve read Moby Dick and it’s a terrible, boring affair. I much prefer Godel, Escher, Bach or Alice in Wonderland. And internet reviewers largely do, too.

So it seems that compressibility can distinguish novels from technical works, but it’s not possible, at least using xz, to separate very creative works from just creative works.


So, the theory predicted that non-creative works would be more compressible than creative ones, and that panned out. This is far from confirmation of the model, but it’s still evidence, and I’m pretty confident that the average novel is less compressible than the average piece of technical work.

It would be much more impressive if this could distinguish more specific degrees of creativity. If I compared some of the novels produced by first time authors (or bad fan-fiction) to those on the Modern Library’s top picks and it found that the Modern Library picks were more creative, well, that’d be neat. (Maybe I will try this out in a future post.) We can imagine such a technique becoming more and more powerful — to the point where it can distinguish between the relative merit of different works by the same author.

The limiting factor here, of course, is the power (or intelligence) of the compression technology. The compression algorithms that we all use are not that complicated. I can feed them sensory input and they won’t compress it down to the laws of physics. Instead, they’re relatively crude-but-effective attempts at deduplication, which means that they’re imperfect measuring sticks for how creative something seems to a human.

For instance, if I feed a compressor a cliche or a clever play on that cliche, the compressor doesn’t have the intellectual context necessary to ding the cliche. If I could, instead, train an algorithm on a huge corpus of English text, of the sort that Google possesses, I’d be able to better construct a compressor that’d evaluate originality.

Even then, there are theoretical limits on this. I could feed a compressor random input, which cannot be compressed, but that doesn’t mean that there’s anything of interest to humans. And we can wonder: how much can words alone, surface level characteristics of text, be representative of creativity? At some point, a sufficiently intelligent compressor must understand the content, too.

In that final sense, humans are the ultimate compressors — at least for now.

Why Do We Think The Way We Do?

I sometimes experience a sort of mental disconnect — a sense of knowing what I’m going to think before I bother to think it. Sort of like an experience of “pure thought” that is followed by a mental translation into words. It happens maybe a couple times a day and I wonder, “Why do I bother thinking at all? At least in words. Why not stick to the stuff of pure thought?”

After realizing this sort of thing, I started to pay more attention to the specifics of my thought processes. Just what, I wondered, is going on in my head? And I realized: Much of my thought is not in words, but images and motion. I’m no longer even certain that most of my thought happens in English — maybe it’s all meaningness that is converted into words when I throw a mental spotlight onto it.

Indeed, research by Linda Silverman reports that some portion of the population thinks exclusively in words (estimated at 25%), another portion strongly prefers imagery (30%) and, like me, most prefer a mix of the two (45%). There may even be sex differences. In one study, female participants reported more vivid mental imagery than their male counterparts.

Why do we bother?

What’s the point of thought? What’s my mind doing all day long?

There are (at least) two levels we can pursue this on. We can consider the differences between humans and, say, chimpanzees, and speculate as to the general nature of intelligence. Why do some of us have more of it than others and what is it good for? Why did intellect evolve?

The other tack we might pursue, the other set of questions, is: Why is there consciousness? Why do we have a “mental space” and self-awareness? Why couldn’t it all happen, you know, elsewhere, like a reflex? What’s the point of these words and images in my head?

The Evolution of Intelligence

There is no clear scientific consensus on why intelligence evolved. Wikipedia is maybe the most useful resource — better than the review article in The Cambridge Handbook of Intelligence — but it only goes so far as to list the different theories, not evaluate the plausibility of each.

However, I am a man on a blog, which means I have free rein to speculate. I’m partial to the notion that intelligence evolved, fundamentally, to be weaponized in disputes against other humans.

Consider the social structure of the violent, yet much beloved, chimpanzee. He lives in a group of some 15 to 125 individuals. He has a definite place in the community’s pecking order — submitting to those more powerful and dominating those less.

Bizarrely, the “top dog” chimpanzee, the leader of the group, is not always the strongest male. Rather, he’s the one who is the most politically suave — forming alliances in a crude chimp analog to House of Cards. Alliances which allow two lesser chimps to dominate a greater chimp.

Given that the most powerful chimpanzee has, more or less, free access to the females of the community, he will pass his genes down to more children than other, less dominant males. This means that political savvy and the chimp equivalent of social skills are of no small reproductive benefit. If we assume that there is a significant correlation between intelligence and the sort of strategic thinking required to become and stay top chimp, then we can begin to see a path through which intelligence might have been selected for.

In such a scenario, then, intelligence starts to look like a sort of arms race. If I can outsmart my fellow chimps, I’m more likely to reproduce and, thus, my genes survive another generation. This would mean that intelligence’s purpose, at least in the sense of what it evolved for, is the manipulation of social hierarchies.

However, if this is the case, it doesn’t seem to align all that well with what we see in modern society. President Obama, the top of our monkey social pyramid, is not the most intelligent man I can name. Terry Tao, Scott Aaronson, and John Baez all come to mind first, and those are just people in my RSS feed.

But maybe that’s too harsh a demand on the theory. I’m not proposing that intelligence perfectly correlates with reproductive success, just that the more intelligent were more likely to reproduce than those of average intelligence, thanks to their throne at the top of the monkey pile.

In that case, presidents fare better. George Bush, who everyone likes to hate on as so dumb, still scored above the 85th percentile on both the math and verbal portions of the SAT, putting him more than a standard deviation above the mean. If we’re willing to concede that most presidents are at least as intelligent as my boy George, this means that the average leader of the free world has at least a standard deviation on average folk.
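As a rough check on that "more than a standard deviation" claim, here is a minimal sketch that converts a percentile into standard deviations. It assumes test scores are approximately normally distributed, which is my assumption for illustration and not something the SAT guarantees. The function name is made up.

```python
from statistics import NormalDist

# Assumption: scores follow a normal distribution, so percentiles
# map onto standard deviations via the inverse CDF.
standard_normal = NormalDist(mu=0.0, sigma=1.0)

def percentile_to_z(percentile: float) -> float:
    """Return how many standard deviations above the mean a percentile sits."""
    return standard_normal.inv_cdf(percentile / 100.0)

z_85 = percentile_to_z(85)
print(f"85th percentile is about {z_85:.2f} standard deviations above the mean")
```

Under that assumption the 85th percentile lands at roughly 1.04 standard deviations, so anything above it clears the one-standard-deviation bar, if only just.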

A separate trouble with the theory is that, currently, intelligence and reproductive success are negatively correlated — the smarter you are, the fewer children you’re likely to have. But this is almost certainly a symptom of modernity — after all, if it held throughout the ages, how could intelligence possibly have evolved in the first place? We exist, so this seems more like a glimpse of the face of that Cthulhu which is modernity.

The utility of intelligence

Here is a famous thought experiment that comes to me by way of Roger Crisp:

You are a soul in heaven waiting to be allocated a life on Earth. It is late Friday afternoon, and you watch anxiously as the supply of available lives dwindles. When your turn comes, the angel in charge offers you a choice between two lives, that of the composer Joseph Haydn and that of an oyster. Besides composing some wonderful music and influencing the evolution of the symphony, Haydn will meet with success and honour in his own lifetime, be cheerful and popular, travel and gain much enjoyment from field sports. The oyster’s life is far less exciting. Though this is rather a sophisticated oyster, its life will consist only of mild sensual pleasure, rather like that experienced by humans when floating very drunk in a warm bath. When you request the life of Haydn, the angel sighs, ‘I’ll never get rid of this oyster life. It’s been hanging around for ages. Look, I’ll offer you a special deal. Haydn will die at the age of seventy-seven. But I’ll make the oyster life as long as you like…

Presumably, you would rather be Haydn than the oyster (and, if not, you’re probably a hedonist and I’d like to party with you). It’s better to be smart than to be dumb. If Pfizer tomorrow comes up with +50 IQ pills, I’ll be first in line.

But why? What’s good about intelligence?

It helps us achieve our goals, whatever those goals may be. Well, except happiness and maybe sex, but I think the sex result is probably a hormone thing. I expect a genius would have little trouble learning pick-up. Case in point: the weird section in Surely You’re Joking, Mr. Feynman where Feynman talks about negging women in bars so that they’ll sleep with him.

Other goals are amenable to the application of intelligence — finding food, housing, becoming president, that sort of thing. Smarter people have an easier time finding information — what to eat, how to lose weight, and so on, and they use this to shape their plans. Smarter people are even more likely to successfully quit smoking.

Yay, intelligence.

But Why Thoughts?

What’s the point of conscious reasoning? The sort of thought that is available to introspection — that which I’m aware of taking place, at least sometimes. Why?

Last night, I was working on one of the puzzles in To Mock a Mockingbird and paying attention to what was going on in my head. A possible solution would spring from the unconscious or as a result of the reasoning process. Then, I would work through the implications of the idea — does it solve the puzzle? What happens in this case? What about if I try this?

Thought seems to be related to this sort of mental simulation, this considering of consequences and verification of intuition. Indeed, we might think of intuitive, unconscious thought as a sort of tennis partner with slower, conscious reasoning — a back and forth. The intuition provides material to the conscious mind and the conscious mind processes that information, which sculpts and corrects the intuition.

Try sitting for a moment and thinking about nothing at all. It’s impossible to maintain for long — you’ll find thoughts popping unbidden into your mind. The control that the self, the “I”, has seems to be related to working with what pops into mental space. I can let go of a thought and wait for something else to occur to me, or I can grab onto that thought and work through the implications of it — sorta leaping from one thought to the next, each a related procession of mental experience, a bit like jumping from one train car to the next in an action film.

That, I think, is the point of human thought, of reason. If I do this, what will happen? If this is true, what are the implications?

Further Reading

People All Think The Same


On May 7th of 1997, Garry Kasparov — the second strongest chess player of all time — was hunched over a chess board. Both of his elbows rested on the table in front of him, with one hand clutching his forehead. His face sported a look of the purest determination.

His opponent felt nothing. It was Deep Blue, a machine built by IBM and, at the time, the most powerful computer chess player in existence.

The game ended in a draw after 56 moves and lasted just over 5 hours, but most notable was Kasparov’s opening strategy. He intended to force Deep Blue out of “The Book” as quickly as possible, in the hope that human ingenuity would have an advantage over the rigid processing of the machine.

Chess and the Book

The earliest recorded chess match dates back to the 10th century, played between a historian and a student. Since then, it’s become a tradition for moves to be recorded — especially if a game has some significance, like a showdown between two strong players. As a consequence, today, students of the game benefit from one of the richest data sets of any game or sport, with sites like ChessBase boasting more than 6 million recorded matches.

These recorded matches are sometimes referred to as “The Book,” as in, “He’s still playing from The Book,” which means that they’re playing a move that has been played before in the history of recorded chess.

It may not be immediately obvious that there exist chess moves that haven’t been played before. Claude Shannon famously estimated the number of possible chess positions at \(10^{43}\). That’s a whole lot more than the number of atoms in an apple, and far more positions than humans can ever hope to play through.

If you wanted, you could go right now and download a chess program, load a few million recorded matches into it, and check whether or not a move has ever been played before by another human being. If you were playing against a friend, you could pull up a computer and compare your own moves to those of other players throughout recorded chess history — a sort of thread linking yourself with someone two hundred years ago.
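That check is simple to sketch. Below, a made-up handful of recorded games stands in for a real database (I'm not loading one here), and every opening sequence that appears in them is indexed. A game counts as "in The Book" for as long as its moves match some recorded sequence. The function names and the toy games are hypothetical, purely for illustration.

```python
from typing import Iterable

Game = tuple[str, ...]  # moves in algebraic notation, e.g. ("e4", "e5", "Nf3")

def build_book(recorded_games: Iterable[Game]) -> set[Game]:
    """Index every opening sequence (prefix) seen in the recorded games."""
    book: set[Game] = set()
    for game in recorded_games:
        for length in range(1, len(game) + 1):
            book.add(game[:length])
    return book

def moves_in_book(book: set[Game], game: Game) -> int:
    """Count how many moves a game stays in The Book before leaving it."""
    for length in range(1, len(game) + 1):
        if game[:length] not in book:
            return length - 1
    return len(game)

# Hypothetical toy data standing in for millions of recorded games.
recorded = [
    ("e4", "e5", "Nf3", "Nc6", "Bb5"),
    ("e4", "c5", "Nf3", "d6"),
    ("d4", "d5", "c4"),
]
book = build_book(recorded)

my_game = ("e4", "e5", "Nf3", "Nc6", "Bc4")
print(moves_in_book(book, my_game))  # 4: the fifth move leaves The Book
```

One simplification to note: this compares move sequences, not board positions, so two games that reach the same position by different move orders would count as different. A real chess database would likely index positions instead.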

Human thought is a lot like chess.

Been There, Thought That

This is a somewhat embarrassing and juvenile admission, but there was an AskReddit 20 days ago, “What is something you want to touch more than anything in the world?” I tried to come up with something creative, couldn’t think of anything good, and thought, “Welllll, guess I’ll go with Scarlett Johansson’s boobs.”

I opened the thread and that was the second most popular response.

Earlier today, Scott posed a question to his readers, “What’s a lifehack that everyone uses?” I came up with a few that I thought were clever — calendars, clocks, hand-washing, prayer and meditation. Except when I scrolled through the comments, people had thought of each of those ideas, along with maybe fifty others.

Another example: during a lull in conversational topics with friends, I sometimes drag problems that I’ve been thinking about into the discussion arena — something like, “What are the most important things for a man to know?” These are problems where I’ve spent maybe four or so hours working on them, but haven’t discovered anything satisfying.

These conversations are eerie. I sit and watch people run through thought processes that I’ve already had, offering suggestions that I’ve already considered, thinking thoughts I’ve already thought.

A “Book” of Thoughts

All of these are cases where someone has either thought something before me, or I’ve thought of something before them. These thoughts have been thought before. This is like “The Book” in chess. Thoughts are moves that have been played before.

Sure, we don’t have a central repository of thoughts in the same way that there is a database of chess matches, but we do have all the words published and indexed by Google on the internet, along with more than 35 million books in the Library of Congress.

If I want to have a novel thought, to produce some actual insight, I’m tasked with trying to think something that a human being has never thought before. This is not too difficult. I can invent a sentence like, “Imagine a giraffe that’s made out of extension cords, except where the plugins are eels with toothy, human-like grins.” This is a novel thought, but that’s about the extent of positive things that one can say about it. It’s useless.

Thinking something novel and useful is harder.

An Algorithm for Thought

The implication, then, is that human thought (or reason) is a structured process. It occurs broadly in the same way across individuals.

Which is not to say that my thoughts and your thoughts are identical. Thanks to a variety of genetic and environmental causes, we run on sorta different hardware, and we each have absorbed a different knowledge base to reason from.

Still, human thought is surprisingly homogeneous. Often I will make an observation to a friend or family member and they will say, “That’s what I was thinking.” This is a clue that we’re all doing the same sort of thing in our heads. Our thoughts are like water, each drop tending to end up coursing through the same canyons.

Further Reading

What Savant Memory Says About The Limits of Memory

Scientific American has published an article on savantism, which rattled a few ideas loose in my head. A savant is roughly defined as someone with cognitive deficiencies — usually on the autism spectrum — who displays superior performance in one area. A savant may be unable to speak or dress without assistance, but able to play the piano. Savantism comes in degrees — one can be a savant by being an average piano player, given that they’re functionally disabled in all other activities.

More interesting, though, are savants who are prodigiously gifted — the sort that display skills that are by any measure incredible, all while experiencing severe disability.

Consider Kim Peek, the real-life inspiration for Dustin Hoffman’s role in Rain Man and, as such, the most famous savant. (Wikipedia calls him a megasavant.) He passed away five years ago. The remarkable thing about Peek was his ability to immediately transfer information from short-term into long-term memory.

Compare Peek with some of the computational models of mind that I’ve built and borrowed from cognitive science on this blog. We can imagine the human mind as a sort of computer that takes in information from the environment, processes it, and then stores it in long-term memory. Notice that there are two distinct components here: a memory store and a reasoning component that operates on and processes mental structures.

In such a model, Peek looks sort of like a machine that has an excellent memory store but limited reasoning capacity. The article provides some evidence for such a view:

Peek’s abnormal brain wiring certainly came at a cost. Though he was able to immediately move new information from short-term memory to long-term memory, there wasn’t much processing going on in between. His adult fluid reasoning ability and verbal comprehension skills were on par with a child of 5, and he could barely understand the meaning in proverbs or metaphors.

Limits of Memory

In the Sherlock Holmes novels, there is a memorable passage where Watson is shocked that Sherlock doesn’t know that the Earth revolves around the sun. When Watson tells him that the Earth does, indeed, revolve around the sun, Sherlock informs him that he’ll try to forget this at once. While Sherlock’s memory problems are probably the result of his copious drug use, he explains it to Watson by means of a metaphor — the mind is a room and if one fills it with junk, one will never be able to find anything.

Does human memory have fixed limits? Is it a hard drive that runs out of space with time? The life of Kim Peek suggests not. From the Scientific American article, “His repertoire included the Bible, the complete works of Shakespeare, U.S. area codes and zip codes, and roughly 12,000 other books.”

To put that into perspective, let’s assume that the average human lifetime is 75 years and that one begins reading in earnest at the age of 10. This gives you 65 years of reading, or at the rate of a book a week, 3391 and a half books at the time of your death — or about a fourth of what Kim Peek had packed in long-term memory. The dude remembered every word — I’m lucky if I recall a vague sense of what the plot of a book was a year later.
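The arithmetic behind that estimate, spelled out as a small sketch. The lifespan, starting age, and book-a-week rate are the assumptions stated above, the 12,000 figure is the one quoted from the article, and the constant names are mine.

```python
LIFESPAN_YEARS = 75
START_READING_AGE = 10
WEEKS_PER_YEAR = 365.25 / 7  # about 52.18, counting leap years
BOOKS_PER_WEEK = 1
PEEK_BOOKS = 12_000  # figure quoted from the Scientific American article

reading_years = LIFESPAN_YEARS - START_READING_AGE
lifetime_books = reading_years * WEEKS_PER_YEAR * BOOKS_PER_WEEK
share_of_peek = lifetime_books / PEEK_BOOKS

print(f"{reading_years} years of reading")
print(f"{lifetime_books:.1f} books in a lifetime")
print(f"{share_of_peek:.0%} of Peek's 12,000")
```

That works out to about 3,391.6 books, or roughly 28% of Peek's total, so "about a fourth" is slightly generous to Peek's competition.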

Savants as Technical Masters

One characteristic that savants, even the prodigiously gifted, share is technical, rote mastery rather than creative performance. Musical savants might be able to memorize and play back a piece of music after one hearing, but be unable to produce anything original. (Maybe originality is the realm of the reasoning component.)

Indeed, the “creative” achievements of most savants are boring. They may be able to recall a nature scene and sketch it from memory, but who cares? That’s what cameras are for. The Scientific American article put it this way:

The paintings that the patients produced were generally realistic or surrealistic without symbolism or abstraction, and the patients approached their art in a compulsive way, repeating the same design many times.

Contrast this with the artwork of schizophrenics (neat example here). Maybe I romanticize mental illness a bit too much, but if there’s one thing that schizophrenics have in spades, it’s symbolism and abstraction, the polar opposite of autism. There is even some evidence that autism and schizophrenia may be opposite sides of the same spectrum.

The Scientific American article likes to tease, however, suggesting that intense technical mastery and prodigious memory may eventually give way to improvisation:

Toward the end of Peek’s life, Peek showed a marked improvement in his engagement with people. He also began playing the piano, made puns, and even started becoming more self-aware. During one presentation at Oxford University, a woman asked him if he was happy, to which he responded: “I’m happy just to look at you.”

Further Reading