The Secretary Problem Explained: Dating Mathematically

marriage-proposal

I was, to put it mildly, something of a mess after my last relationship imploded. I wrote poems and love letters and responded to all of her text messages with two messages and all sorts of other things that make me cringe now and oh god what was I thinking.

I learned a few things, though, like when you tell strangers that your long-term relationship has just been bulldozed as thoroughly as the Romans salted Carthage, they do this sorta Vulcan mind-meld and become super empathy machines. Even older folk, who usually treat me not exactly as a non-person but something sorta like it. At the time, I had this gruff, Russian psychiatrist I’d see once a month, and he was all like, “Been there, man. Have some Diazepam and relax.” — except, you know, he said it in Russian-accented doctorspeak.

This was surprising to me then but isn’t now. Live long enough and you’ll have your heart thrashed about a fair bit, along with the rest of you. Mention heartbreak and everyone has their own private story — maybe more than one. It’s not Vietnam. They’ve been there and they understand.

I sometimes wonder — if I could go back in time, what could I say to comfort my former self? What can you say to someone that will pull them out of the throes of hormone-induced suffering? Probably nothing. The remarkable thing about words is not that they sometimes move people, but that they so seldom do.

Still, I think I’d say something like, “My boy, evolution is a motherfucker and you need a new woman in your life.” He would probably protest that women were the problem and that he’s pretty sure the last thing he needs is another one. Then, I would let out the most condescending sigh imaginable, the sort of sigh that says I-have-unimaginable-wisdom-born-of-experience-and-am-from-the-future, and say, “Not that sort of woman. You need the Queen. You need mathematics –”

“Let me tell you about the secretary problem.”

The Secretary Problem

Consider the plight of John. John’s 25. He lives in Utah and likes country music, hunting, and four wheelers. You probably see where I’m going with this. That’s right, ladies and gentlemen. John is gay.

The recent court decision overturning Utah's ban on gay marriage has him thinking. He’d like to settle down one day — maybe adopt a child with the right man. He has a couple of short-term relationships going on right now, but married to Bruce or Sidney? No way.

How can he guarantee that he snags, if not Mr. Right, at least Mr. Close Enough? He figures he ought to date at least a few different men, and then… what?

Imagine he meets this guy, Jim. They make out at a party, hang out a few times, and realize that they’re kinda already dating, and decide to label it by making it Facebook official. Things progress. John asks Jim to move in with him.

Then there’s a snag. Valentine’s Day rolls around, and John finds himself, at the last minute, at Walmart, looking to pick up some chocolate and cheap Champagne, wondering, “Is this really what love feels like?”

What should John do in such a situation? Should he next Jim and take his chances on the dating market? Or should he settle and settle down?

John’s predicament is an example of the secretary problem — so named because we can imagine the same situation, except instead of a man searching for a husband, it’s a man interviewing potential secretaries. When is the candidate good enough? What’s the stopping criterion?

Formalizing the Secretary Problem

We can abstract away the specifics of John’s plight and formalize the problem. Let’s consider each man that John dates as an integer — the integer representing his “husbandness factor.” Thus, a sequence of lovers like \((Sidney, Bruce, Jim, Todd, Keith, Bruno, Terrence, Cecil, Nigel)\) would translate to the integer sequence \((1, 3, 7, 5, 8, 3, 1, 9, 4)\).

This problem would be trivial — just pick the max element — if it weren’t for two properties.

  1. There’s no look-ahead. When I’m dating any one person, I’m unable to look forward into the future and consider who I’ll date in the future. I have no crystal ball.
  2. There’s no undo. If I date a great girl for a while, but leave her in a misguided attempt to find someone better, there’s a good chance she’ll be unavailable in the future, married to some douche named Trevor who played lacrosse in high school.

We can think of it visually as a machine which is fed a tape of integers. It has two actions: it can either stop or it can consider the next integer. The machine’s objective is to stop on the highest integer.

tape

Real World Examples of the Secretary Problem

At the heart of the secretary problem is conflict. Do I reject the current possibility in hopes of landing something better if I keep looking, or do I stick with what I have?

Examples of the secretary problem:

  • This is the case with dating. I could commit to the woman I’m with right now or I could start texting her best friend.
  • It applies to hiring not just secretaries, but anyone. Is the current candidate the right person for the job or should I hold out for someone better? What if no one comes along?
  • When buying a house — should I put an offer on some house, or should I hope that something better comes along in the future? How many houses ought I look at before deciding?
  • The opposite side of the interviewing problem: should I accept this job offer or should I keep looking?
  • Alligator hunting, at least in Louisiana. Each year, you’re allotted a set number of tags based on the size of your property — that’s the number of alligators you’re allowed to “harvest” under the law. When you stumble across an alligator, you’re forced to decide: should I kill this one or save my tag and hope a bigger one comes along?
  • When selling a house or a car or, well, any big ticket item. When presented with an offer, you’re forced to decide: should I accept this offer or hope something better comes along?
  • Deciding whether or not to buy something at the supermarket. Is this bread cheap enough or should I hope for a sale next week? The same goes for clothing and, well, anything.

Solving the Secretary Problem

The following contains the formal details for solving the secretary problem analytically. It can safely be skimmed.

As a general problem solving strategy, I often find it useful to first come up with a horrible solution to a problem and then iterate from there. I call this the dumbest-thing-that-could-work heuristic.

In the case of the secretary problem, our horrible solution can be the lucky number seven rule: In an integer sequence, always choose the 7th item.

If we follow this rule, we’re essentially picking an integer at random. The probability, then, of picking the best element from an integer sequence of length \(N\) with this rule is \(\frac{1}{N}\).
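As a sanity check, the lucky number seven rule is easy to simulate. A minimal sketch in Python — the setup (treating each tape as a random permutation) and all the numbers are my own choices:

```python
import random

def lucky_seven(tape):
    """The dumbest-thing-that-could-work rule: always pick the 7th element."""
    return tape[6]

random.seed(0)
N, trials = 20, 100_000
wins = 0
for _ in range(trials):
    tape = random.sample(range(N), N)  # a random permutation of 0..N-1
    if lucky_seven(tape) == N - 1:     # did we stop on the maximum?
        wins += 1
print(wins / trials)  # hovers around 1/20 = 0.05
```

As expected, picking a fixed position wins about \(\frac{1}{N}\) of the time.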

To beat this, let’s consider how people go about solving secretary problems in the real world. I don’t know anyone whose dating strategy is, “I’m going to date seven women and pick the seventh one — no matter what.” One would have to be a staunch nihilist to adopt such a strategy.

Instead, the strategy most adults adopt — insofar as they consciously adopt a strategy — is to date around for a while, gain some experience, figure out one’s options, and then choose the next best thing that comes around.

In terms of the secretary problem, such a strategy would be: Scan through the first \( r \) integers and then choose the first option that is greater than any of the integers in \( [1,r] \).
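Translated into code — a minimal sketch of my own, reusing the husband sequence from above — the strategy looks like this:

```python
def cutoff_strategy(tape, r):
    """Skip the first r elements, then take the first one that beats
    everything seen so far (or settle for the last element if none does)."""
    best_seen = max(tape[:r])
    for x in tape[r:]:
        if x > best_seen:
            return x
    return tape[-1]  # no one better came along; forced to settle

# With the husband sequence from above and r = 3, we skip (1, 3, 7)
# and stop at the first later value that beats 7.
print(cutoff_strategy([1, 3, 7, 5, 8, 3, 1, 9, 4], r=3))  # → 8
```

Note that the 9 later in the tape is never reached — exactly the no-look-ahead, no-undo bind described earlier.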

number-line-for-secretary-problem

How does this new strategy compare to our old one? The above image is a prop to help understand the discussion that follows. Assume that \(i\), the greatest integer, occurs at \(n + 1\).

In order for this strategy to return the maximum integer, two conditions must hold:

  1. The maximum integer cannot be contained in \([1,r]\). Our strategy is to scan through \([1,r]\), so if the solution is in \([1,r]\), we necessarily lose. This can also be stated as \(n \geq r\).
  2. Our strategy is going to select the first integer, \(i\), in \([r,N]\) that’s greater than \(max([1,r])\). Given this, there cannot be any integers greater than \(i\) that come after \(i\), otherwise the strategy will lose. Alternatively put, the condition \(max([1,r]) == max([1,n])\) must be true.

Thus, to calculate the effectiveness of our strategy, we need to know the
probability that both of these will hold. For some given \(n\), this is:

$$ \frac{r}{n}\frac{1}{N} $$

\(\frac{1}{N}\) is the probability that \(i\) occurs at position \(n + 1\) (remember, this is the probability for one particular \(n\), not for all \(n\)), while \(\frac{r}{n}\) comes from the second condition — it’s the probability that \(max([1,r]) == max([1,n])\) is true.

To calculate \(P(r)\), the probability of success for a given \(r\) over all possible positions of the maximum rather than one particular \(n\), we sum over \(n \geq r\):

$$ P(r) = \frac{1}{N}\left(\frac{r}{r}+\frac{r}{r+1}+\frac{r}{r+2}+\cdots+\frac{r}{N-1}\right) = \frac{r}{N}\sum_{n=r}^{N-1}\frac{1}{n} $$

This is a Riemann approximation of an integral so we can rewrite it. By letting \(\lim_{N \rightarrow \infty}\frac{r}{N} = x\) and \(\lim_{N \rightarrow \infty}\frac{n}{N} = t\), we get:

$$ P(r) = \lim_{N \rightarrow \infty}\frac{r}{N}\sum_{n=r}^{N-1}\frac{N}{n}\frac{1}{N}=x\int_{x}^{1}\frac{1}{t}dt=-x \ln x $$

Now, we can find the optimal cutoff by solving \(P'(x) = 0\). Plugging the optimal \(x\) back into \(P\), we find the probability of success.

$$ P'(x) = -\ln x - 1 = 0 \Rightarrow x = \frac{1}{e} $$
$$ P(\frac{1}{e}) = \frac{1}{e} \approx .37 $$
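The \(\frac{1}{e}\) result can be checked numerically. Here's a quick Monte Carlo sketch; the pool size and trial count are arbitrary choices of mine:

```python
import math
import random

def success_rate(N, r, trials=200_000):
    """Fraction of random tapes on which the cutoff-r strategy
    stops on the maximum element."""
    wins = 0
    for _ in range(trials):
        tape = random.sample(range(N), N)
        best_seen = max(tape[:r])
        chosen = tape[-1]  # forced to settle for the last candidate
        for x in tape[r:]:
            if x > best_seen:
                chosen = x
                break
        wins += chosen == N - 1
    return wins / trials

random.seed(1)
N = 50
rate = success_rate(N, round(N / math.e))  # cutoff r ≈ 18
print(rate)  # ≈ 0.37, matching the analytic result
```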

What The Math Says

How can this math help John? Well, the optimal solution is for him to estimate how many people he might reasonably date in the future, say \(20\), and plug that into \(\frac{N}{e}\): with \(N = 20\), \(\frac{20}{e} \approx 7\).

This result says that, if John wants to maximize his probability of ending up with the best possible man, he should date 7 men and, then, marry the next man who is better than all of those men.

However, we have sneaked some probably untenable assumptions into our analysis. The typical secretary problem maximizes the chances of landing the best man, and considers all other outcomes equally bad. Most on the dating market are not thinking this way — they want to maximize the probability that they end up with a pretty good spouse. It’s not all or nothing.

Maximizing the Probability of a Good Outcome

Fear not: there’s a modification of the secretary problem that maximizes the probability of finding a high-value husband or wife. I won’t cover the derivation for this flavor of the problem in this post (for technical details, see Bearden 2005), but suffice it to say, the strategy is the same except we use a cutoff of \(\sqrt{N}\) rather than \(\frac{N}{e}\).

Consider dating for the average American. Assuming one wants to settle down by the age of 35, one has the opportunity for somewhere between 7 and 30 sorta serious relationships. Taking the geometric mean, we get about 14. Johannes Kepler famously considered 11 women for his second wife, so this is, at the very least, not absurd.

The square root of 14 is about 4. Thus, according to the math, one should have four kinda serious relationships and then marry the next person that comes along who is better than all of those four.
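To see the difference between the two cutoffs, we can simulate average outcomes rather than best-or-bust wins. This is a sketch under an assumption of my own — husbandness values drawn as a random permutation — and it uses a larger pool than 14 so the gap is easy to see:

```python
import math
import random

def average_outcome(N, r, trials=100_000):
    """Average value landed on (0 = worst match, N - 1 = best)
    by the cutoff-r strategy, over many random dating orders."""
    total = 0
    for _ in range(trials):
        tape = random.sample(range(N), N)
        best_seen = max(tape[:r])
        chosen = tape[-1]  # settle for the last person if no one beats the cutoff
        for x in tape[r:]:
            if x > best_seen:
                chosen = x
                break
        total += chosen
    return total / trials

random.seed(3)
N = 100  # a large pool makes the difference visible
sqrt_cutoff = average_outcome(N, round(math.sqrt(N)))  # r = 10
e_cutoff = average_outcome(N, round(N / math.e))       # r = 37
print(sqrt_cutoff, e_cutoff)  # the sqrt(N) cutoff yields the higher average
```

The \(\sqrt{N}\) cutoff trades away some chance of landing the absolute best for a much better average outcome — which is the whole point of this variant.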

How Human Behavior Compares to the Mathematics

The median number of premarital sexual partners is unclear, with different sources reporting markedly different numbers. I’m inclined to place the number between 1 and 4. Using this number as a rough proxy for the number of kinda serious relationships before marriage, reality conflicts with the results of the secretary problem.

Most people aren’t dating even four other people before marriage. What gives?

At its core, the conflict implies that either the solution to the secretary problem does not apply or that humans are not gathering enough information before getting married.

A number of experimental studies support the second view. When undergraduates are asked to participate in a secretary problem in its pure form — that is, like the tape discussed in the beginning — they almost always stop searching too early.

One might argue that evolution ought to know what it’s doing — especially when it comes to human mating — and that we should have a strong prior that dating is, in some sense, optimal. Such a view ignores that we’re no longer in small tribes of 50 to 200. Humans did not evolve to deal with modern society and the horror that is dating today. A preference for sugar was adaptive 50,000 years ago, but we have since invented the Twinkie.

Indeed, pre-civilized human life probably didn’t look a whole lot like a secretary problem. Back then, one might have had to choose from a half dozen possible mates — mates one had already known for many years. This looks more like a game of pick the maximal element from a set than a bona fide secretary problem.

What Sort of Optimal?

If I use the results of the secretary problem to find a wife, I will almost certainly end up worse off than a strictly rational agent who pursues the same goal, but probably better off than those who have no strategy at all. At the end of the day, the secretary problem is a mathematical abstraction and fails to take into account much of the complexity of, you know, reality.

The secretary problem assumes, for instance, that our only means of finding out about the distribution of potential mates and our preferences for them is via dating. This isn’t remotely true. We can observe the actions of others, introspect, read about human mate preferences, discuss our experiences with friends, and otherwise share information.

It’s also not the case that we’re dating men or women at random. There are a huge number of filters that go into deciding whether or not someone is marriage material. Are we of similar ages and interests? Do we speak the same language? Do I feel any attraction for this person?

A theory of optimal dating would need to take this and much more into account. There are a near unlimited number of paths to strategically choosing who to spend the rest of your life with, and a lot of that strategy consists of things other than choosing. You might try getting fit, earning more money, adopting interesting hobbies, honing social skills, meeting lots of the opposite sex, taking voice acting or improv classes, and so on. An optimal theory of dating would, I have no doubt, emphasize some subset of these skills.

All Together Now

marriage-proposal

  • The secretary problem is the problem of deciding whether to stick with what one has or take one’s chances on something new.
  • Examples of secretary problems include finding a husband or wife, hiring a secretary, and alligator hunting.
  • The solution to the secretary problem suggests that the optimal dating strategy is to estimate the maximum number of people you’re willing to date, \(N\), and then date \(\sqrt{N}\) people and marry the next person who is better than all of those.
  • In laboratory experiments, people often stop searching too soon when solving secretary problems. This suggests that the average person doesn’t date enough people prior to marriage.
  • At the end of the day, the secretary problem is a mathematical abstraction and there is more to finding the “right” person than dating a certain number of people.


Worldbuilding, Worldbuilders, and Mathematics

This week, I was introduced to the hobby of worldbuilding — inventing imaginary places, making maps, elaborating histories. (The platonist in me prefers to think of worldbuilding as the discovery of fictional universes, rather than an act of invention.)

Tolkien

There is a (perhaps apocryphal) tale that J. R. R. Tolkien got into a fight with his publisher over using the words “elves” and “dwarves” instead of “elfs” and “dwarfs”. The publisher argued that the latter was how the dictionary did it. “I know,” Tolkien responded, “I wrote the dictionary.” Tolkien had, in fact, spent several years as an assistant working on the Oxford English Dictionary.

Although best known for his fiction, Tolkien was a linguist at heart, even inventing some eleven fictional languages along with a number of variations on those languages.

The remarkable degree of internal consistency in Tolkien’s language use in The Lord of the Rings is perhaps unsurprising, then. The prefix “mor” in his work translates literally to black and is used consistently — Mordor is black land, Moria is black pit, and Morannon is black gate. The “dor” in Mordor means land. Gondor, as you might expect, means stone land.

Of his languages and Lord of the Rings, Tolkien wrote “The invention of languages is the foundation. The ‘stories’ were made rather to provide a world for the languages than the reverse.”

Foundations

It’s very far away,
It takes about half a day
to get there
if we travel by my, uh, dragonfly.
—Jimi Hendrix, “Spanish Castle Magic”

I have no such patience for languages. I have little interest in learning a new one, except perhaps Lojban, and less still in attempting to invent my own, unless we’re speaking of programming.

But it does suggest a question: Where would I start with discovering a fictional universe? If I wanted to maximize the believability of a fictional universe, on what rock would I put it?

I’m thinking maybe economics. Robin Hanson recently pointed out that the movie Her, while enjoyable, is far from realistic. Humans invent human-level “strong” AI and use it… to chat with.

I doubt that’s how it’s going to play out in the real world. Why bother hiring a human to do any sort of computer work when an AI can do it faster, better, and cheaper? Newspaper reporter? Computer has it covered. Secretary? Computer. Author? Computer. Researcher? Computer. Teacher? Computer. Customer support? Computer.

Talk about a fuck up. In the real world, everything makes a perverted sort of sense. A causes B causes C, ad infinitum. In the real world, physics makes the rules, and we, well, we’re an expression of them. When worldbuilding, there is no physics ensuring that what you write is consistent. It’s up to you.

Discovering a coherent world draws on the same skills that are necessary for understanding our own world. When the movie Gravity was released, it was criticized for fucking up a whole lot of physics — vehicles in impossible orbits, backpacks with unlimited fuel supplies, and indestructible space suits.

That’s all ignoring messy details like relationship mechanics, attraction, and how language works. The movie Ted drives me up the wall — not because there is a magical, talking teddy bear, but because Mark Wahlberg and Mila Kunis’s relationship strikes me as absurd. A chick like that, decent career, and she’s with this unemployed man-child? Ri-ght.

All of which is to suggest that the skills and knowledge necessary to understand this world are the same needed to build your own: economics, game theory, physics, social dynamics, and so on. The converse is true, too: to understand this world, consider the sort of things you’d need to know to build your own. Or, as Feynman put it, “What I cannot create, I do not understand.”

The Fractal Nature of Worldbuilding

Worldbuilders often go out of their way to produce natural looking maps — like by spilling coffee on paper. Here’s an example:

coffee-stain-world

Another technique for generating maps: taking pictures of rusted fire hydrants.

world-building-rusted-fire-hydrant

But there’s a whole branch of mathematics for this sorta thing, fractals! Ever notice the self-similar nature of trees? Each branch looks like a small tree unto itself. Or coastlines — each “crevice” of a coastline itself looks like a coastline. Branches, snowflakes, crystals — all like this.

Indeed, ever since reading Benoit Mandelbrot’s The Fractal Geometry of Nature, I sometimes catch myself seeing fractals superimposed on clouds when I unfocus my eyes.
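Midpoint displacement is a standard way to generate such self-similar shapes; here’s a minimal one-dimensional sketch of my own — a jagged “coastline” height profile where the random jitter halves at each level of recursion:

```python
import random

def midpoint_displacement(left, right, roughness, depth):
    """Recursively build a jagged 'coastline' height profile.
    Each midpoint is nudged by a random amount that halves with
    every level of recursion -- self-similarity in action."""
    if depth == 0:
        return [left, right]
    mid = (left + right) / 2 + random.uniform(-roughness, roughness)
    left_half = midpoint_displacement(left, mid, roughness / 2, depth - 1)
    right_half = midpoint_displacement(mid, right, roughness / 2, depth - 1)
    return left_half + right_half[1:]  # drop the duplicated midpoint

random.seed(4)
coast = midpoint_displacement(0.0, 0.0, roughness=1.0, depth=8)
print(len(coast))  # 2^8 + 1 = 257 points
```

Zoom in on any stretch of the resulting profile and it looks statistically like the whole — the same property the coffee stains and rusted hydrants are exploiting.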

Mathematics and Worldbuilding

Okay, confession: I wasn’t 100% honest with you. While this is my first brush with groups of other worldbuilders, I’ve toyed with the idea in the past — after colliding with Tegmark’s mathematical universe hypothesis and reading Permutation City.

I wasn’t thinking about making maps. I was wondering: How could I model the essentials of emergent behavior? Is it Conway’s game of life — or something simpler? How could I simulate a universe? What do the fundamental laws of our universe look like?

And I started wondering if mathematics wasn’t a sort of world unto itself. A set of axioms with implications of the sort we could never anticipate — implications we are still discovering, who knows what it could lead to? And not one system of axioms, but infinitely many — each with different definitions, objects, theorems, branches, and applications.

In short, I started to think of the work of a mathematician as being a whole lot like worldbuilding. Discovering some object that obeys certain rules of logic and nothing more, and asking, “What does it do? Is such and such true of this thing? How does it behave here?”

I’m reminded of János Bolyai. Of non-euclidean geometry, he wrote, “Out of nothing I have created a strange new universe.”

Does the internet lie? (Hint: Yes.)

lying-on-the-internet

Yesterday, I saw someone spin this very plausible theory about why it’s so repellent when someone brags about their IQ on the internet. (For the record, each time I’ve been tested I’ve been told that I’m “off the charts” and “almost certainly the smartest man that has ever lived” — their words, not mine.)

It went something like, “Well, people who brag about their IQ on the internet are narcissists, who have nothing worth bragging about except their intelligence. That’s why they’re so off-putting.”

Except narcissists aren’t off-putting. Not at first, anyways. According to Back et al., “Narcissism leads to popularity at first sight.” Holtzman and Strube confirm: “A meta-analysis (N > 1000) reveals a small but reliable positive narcissism–attractiveness correlation that approaches the largest known personality–attractiveness correlations.”

The real reason it’s off-putting? It’s probably false. The probability that someone is wrong about their IQ is, I’d estimate, at least one in four. They might have taken some fake online test, “misremembered” their score, or taken it at the age of seven — that sort of thing. Or, you know, they could be lying.

The distribution of pathological liars in the general population is not clear. Wikipedia suggests 1 in 1000 among repeat juvenile offenders, but given the prevalence of other mental illnesses — psychopathy at 1 percent and depression at 7 percent — I expect that’s a lower bound.

So, I guess what I’m saying is, these days, when I come across an unbelievable story on the internet, I try not to believe it.

Statistician on a Plane Joke

Speaking of probability and statistics, there is the story of a statistician who told a friend that he never took airplanes: “I have computed the probability that there will be a bomb on the plane,” he explained, “and although this probability is low, it is still too high for my comfort.” Two weeks later, the friend met the statistician on a plane. “How come you changed your theory?” he asked. “Oh, I didn’t change my theory; it’s just that I subsequently computed the probability that there would simultaneously be two bombs on a plane. This probability is low enough for my comfort. So now I simply carry my own bomb.”
—from To Mock a Mockingbird

Overcoming Writer’s Block: Narrow To Generate Ideas

brain

Heuristic: Focus on concrete categories when generating ideas.

The brain is a stupid lump of fat. I sometimes say to it, “Brain, what ought I write about today?” and Brain goes, “Dunno, boss,” and then shuts off — starts humming some melody and wondering if anything has been posted to The n-Category Cafe lately. It’s like I’m on vacation in the Sahara, and Brain is driving, and I get out of the car to pee, and then Brain just takes off, leaving me twice deserted.

Sometimes I wait, thinking that Brain is running some sort of process, and if I just leave it alone for a bit, it will come up with something — as if Brain is running OS X and after I ask him for an idea that infernal loading beach ball pops up.

Except when I come back to Brain five or ten minutes later and go, “Hey, come up with any ideas yet?” Brain says, “Ideas? What ideas?” and looks at me sorta confused, like he’s not sure how he got here or why.

Or sometimes I sit and try to “let the thoughts flow.” Except instead of flowing, it’s more along the lines of me switching on the faucet and hearing this terrible, mechanical screeching noise, followed by rust-colored sludge — certainly nothing fit for human consumption, not drinkable or readable. This usually triggers a reflection on the fact that brains are just stupid lumps of fat, followed by despair and deletion.

That’s the beautiful reality of the creative process.

The Power of Narrowing

Given that, at any point in time, a person probably has at least one thing worth saying somewhere in their head, the trouble is finding it. That’s the issue with just asking Brain, “What should I write about?” Brain can’t find anything without a clue.

There is a passage from the book Zen and the Art of Motorcycle Maintenance where a student tries to write an essay on The United States. It comes due. The student misses the deadline. “Couldn’t think of anything,” she explains. The narrator tells her, “Narrow it down to the front of one building on the main street of Bozeman. The Opera House. Start with the upper left-hand brick.”

I reinvented this technique a few minutes ago when I realized that it’s a whole lot easier to write about a category like “thumbs” than a category like “anything.” The restriction, in this case, makes the problem easier.

But what if you don’t have any category at all — not even thumbs? Try generating one at random, or generate two and ask “What do these have in common?”

Online Community Building: Why Communities Decay

The first day of September 1993 was the beginning of an eternal September, a calendar month whose days stretched to infinity. Prior to this infamous day, there would be an influx of noobs onto Usenet each September. These were the arriving college freshmen. They were not legion. They were few enough that they could be corralled and assimilated by Usenet veterans.

September 1993, however, was different. It was the day the gates of hell were thrown open and a never-ending torrent of demonspawn descended on Usenet — like locusts, they devoured the community.

These locusts were AOL users. In September of 1993, the company granted Usenet access to their entire user base, which triggered an unending deluge of noobs into the Usenet community. Thus began the September that never ended.

Community Decay Over Time

youtube-comment

There’s a website dedicated to documenting terrible YouTube comments. Its tag line is, “The aim of this website is to document and preserve the most retarded YouTube comments, so that people a hundred years from now can look back and take solace in the fact that the authors of these stupid comments have all since died.” The dude running the website even posts an analysis of why each comment is awful. He’s doing God’s work.

YouTube’s awfulness is so infamous that there’s even a section on Wikipedia documenting it.

Or let’s consider Reddit as an example. Most Redditors agree that the default subreddits are awful, with r/funny, r/atheism and r/politics being the worst offenders. What do these have in common? They’re enormous — r/funny has more than 5 million subscribers.

If Redditors were one of those dolls where you pull a string and the doll repeats a fixed number of phrases, one of those phrases would be, “The default subreddits are terrible — stick to the smaller subreddits.”

We have this pattern, then, where the larger the subreddit, the more it sucks, and YouTube, one of the largest sites on the web, is horrendous. This suggests that the larger an online community grows, the worse it becomes.

community-size-vs-quality

The natural extension: beyond a certain threshold, adding more users reduces site quality.

The Trouble With Large Communities

Why do online communities get worse as they grow? What’s going on?

Consider the vampire bat. At night, it ventures from its cave, along with thousands of other bats, and goes on the hunt. The vampire bat — like some sort of comic-book villain — has evolved a special brain region that enables the detection of hot spots on animals (usually goats, its favorite food). It’s a sort of infrared vision.

The bat must feed every two nights, but doesn’t always manage to find a goat. Instead, it often has to rely on the charity of other bats, who share blood after a successful hunt. The bat then pays this forward — sharing blood with those bats when they’re hungry.

As it turns out, bats are pretty decent game theorists. The hunting-sharing cycle is sorta like an iterated prisoner’s dilemma. If everyone shares, we’re all well off; if everyone is selfish, a lot of good bats will starve; but if everyone else shares except me, great, since I get gallons of blood. To enforce cooperation, the bats play tit for tat — if you share with me, I’ll share with you next time — plus a friendliness clause: share with unknown bats, too.

Such a strategy breaks down when the share-or-not-share decision is not iterated. In a single shot game, prisoners defect — all the bats will be selfish. For blood sharing among bats to occur, they must frequently interact with bats they know.
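The blood-sharing game can be sketched as an iterated prisoner’s dilemma. The payoff numbers below are the textbook defaults, not anything measured on actual bats:

```python
def play(strategy_a, strategy_b, rounds):
    """Iterated prisoner's dilemma. 'C' = share blood, 'D' = hoard it.
    Payoffs: both share 3 each, both hoard 1 each,
    a lone hoarder gets 5 and the bat it exploits gets 0."""
    payoff = {('C', 'C'): (3, 3), ('D', 'D'): (1, 1),
              ('C', 'D'): (0, 5), ('D', 'C'): (5, 0)}
    history_a, history_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(history_b)  # each bat sees the other's past moves
        move_b = strategy_b(history_a)
        pa, pb = payoff[(move_a, move_b)]
        score_a, score_b = score_a + pa, score_b + pb
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b

def tit_for_tat(opponent_history):
    """Friendliness clause: share with strangers; then copy their last move."""
    return 'C' if not opponent_history else opponent_history[-1]

def always_defect(opponent_history):
    return 'D'

print(play(tit_for_tat, tit_for_tat, 10))    # (30, 30)
print(play(tit_for_tat, always_defect, 10))  # (9, 14)
```

Two tit-for-tat players prosper together; against a habitual defector, tit for tat loses only the opening round and then stops being exploited. And in a single round there is no next time to retaliate in, which is why cooperation collapses when the game isn’t repeated.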

Online communities are just like vampire bats sharing blood. I’m nice to my friends, to people I know, because I expect to see them again. That’s sorta what being my friend means. I like you, so I’ll be nice to you, and maybe that feeling of liking is evolution’s way of nudging me with “Hey, you’re going to see this person again. Cooperate!”

I’m not nearly as nice to strangers as I am to people I know. It’s the human condition. Those who claim to treat all beings equally are as naive as a child who tries to catch a rainbow and wear it as a coat.

There is no reason to trust people you will not interact with again in the future. There’s no incentive not to defect. At the end of my last relationship, our interactions became significantly less pleasant as it became more obvious that it was over — I would not have to deal with this person in the future, so why bother going through the motions of kindness?

That’s what happens in large internet communities. The probability that I will interact with any one user ever again on a site like YouTube tends toward zero. I have no real incentive to be polite or to put much effort into anything I say. Even my reputation will remain intact — who’s going to witness it?

In a smaller community, the opposite is true of the incentive structure. I will have to deal with this person again in the future and the stable set of regulars will probably see whatever it is that I do, coloring their opinion of me. Thus, I ought to act kindly and make an effort.

This kindness and effort, these are not calculated responses, not always. Much of it happens outside of conscious deliberation. The same cognitive hardware that evolved to deal with small tribes in the ancestral environment is repurposed for online discussion, and this manifests as emotion and nonconscious behavior — I like people that I see a lot and this pushes me to be more charitable. Or I feel more empathy towards regulars in a community, which affects my actions. And so on.

In summary, then:

  • Communities decay as they grow larger.
  • The development of trust and kindness between two people depends on the probability that they will interact in the future.
  • When communities grow to a certain size, people no longer expect to interact in the future, and thus are more likely to defect — to be petty, mean, aggressive, and to put little effort into their contributions.
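The incentive structure above can be sketched numerically with the iterated prisoner's dilemma and a "shadow of the future": w is the probability of meeting the same person again. The payoff values T=5, R=3, P=1 are the textbook Axelrod numbers, my assumption, not figures from this post.

```python
# Compare two strategies against a partner who reciprocates kindness
# but punishes betrayal with permanent defection (grim trigger).

def cooperate_value(w, R=3):
    """Payoff from cooperating every round against a reciprocating
    partner: R + w*R + w^2*R + ... = R / (1 - w)."""
    return R / (1 - w)

def defect_value(w, T=5, P=1):
    """Payoff from defecting now (T), then suffering mutual
    punishment (P) in every later round."""
    return T + w * P / (1 - w)

for w in (0.1, 0.5, 0.9):
    c, d = cooperate_value(w), defect_value(w)
    best = "cooperate" if c >= d else "defect"
    print(f"w={w}: cooperate={c:.2f}, defect={d:.2f} -> {best}")
```

With these payoffs the crossover sits at w = (T-R)/(T-P) = 0.5: below it defection dominates, the YouTube case; above it cooperation pays, the small-community case.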

What Savant Memory Says About The Limits of Memory

Scientific American has published an article on savantism, which rattled a few ideas loose in my head. A savant is roughly defined as someone with cognitive deficiencies — usually on the autism spectrum — who displays superior performance in one area. A savant may be unable to speak or dress without assistance, but able to play the piano. Savantism comes in degrees — one can be a savant by being an average piano player, given that they’re functionally disabled in all other activities.

More interesting, though, are savants who are prodigiously gifted — the sort that display skills that are by any measure incredible, all while experiencing severe disability.

Consider Kim Peek, the real-life inspiration for Dustin Hoffman’s role in Rain Man and, as such, the most famous savant. (Wikipedia calls him a megasavant.) He passed away five years ago. The remarkable thing about Peek was his ability to immediately transfer information from short-term into long-term memory.

Compare Peek with some of the computational models of mind that I’ve built and borrowed from cognitive science on this blog. We can imagine the human mind as a sort of computer that takes in information from the environment, processes it, and then stores it in long-term memory. Notice that there are two distinct components here: a memory store and a reasoning component that operates on and processes mental structures.

In such a model, Peek looks sort of like a machine that has an excellent memory store but limited reasoning capacity. The article provides some evidence for such a view:

Peek’s abnormal brain wiring certainly came at a cost. Though he was able to immediately move new information from short-term memory to long-term memory, there wasn’t much processing going on in between. His adult fluid reasoning ability and verbal comprehension skills were on par with a child of 5, and he could barely understand the meaning in proverbs or metaphors.

Limits of Memory

In the Sherlock Holmes novels, there is a memorable passage where Watson is shocked that Sherlock doesn’t know that the Earth revolves around the sun. When Watson tells him that the Earth does, indeed, revolve around the sun, Sherlock informs him that he’ll try to forget this at once. While Sherlock’s memory problems are probably the result of his copious drug use, he explains it to Watson by means of a metaphor — the mind is a room and if one fills it with junk, one will never be able to find anything.

Does human memory have fixed limits? Is it a hard drive that runs out of space with time? The life of Kim Peek suggests not. From the Scientific American article, “His repertoire included the Bible, the complete works of Shakespeare, U.S. area codes and zip codes, and roughly 12,000 other books.”

To put that into perspective, let’s assume that the average human lifetime is 75 years and that one begins reading in earnest at the age of 10. This gives you 65 years of reading, or at the rate of a book a week, 3391 and a half books at the time of your death — or about a fourth of what Kim Peek had packed in long-term memory. The dude remembered every word — I’m lucky if I recall a vague sense of what the plot of a book was a year later.
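A quick check of the arithmetic above (75-year lifespan, reading in earnest from age 10, one book per week):

```python
# Weeks of reading equals books read at a book-a-week pace.
years = 75 - 10
books = years * 365.25 / 7
print(f"{books:.1f} books in a lifetime")         # about 3391.6
print(f"{12000 / books:.1f} lifetimes for Peek")  # about 3.5
```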

Savants as Technical Masters

One characteristic that savants — even the prodigiously gifted — share is technical, rote mastery rather than creative performance. Musical savants might be able to memorize and play back a piece of music after one hearing, but unable to produce anything original. (Maybe originality is the realm of the reasoning component.)

Indeed, the “creative” achievements of most savants are boring. They may be able to recall a nature scene and sketch it from memory, but who cares? That’s what cameras are for. The Scientific American article put it this way:

The paintings that the patients produced were generally realistic or surrealistic without symbolism or abstraction, and the patients approached their art in a compulsive way, repeating the same design many times.

Contrast this with the artwork of schizophrenics (neat example here). Maybe I romanticize mental illness a bit too much, but if there’s one thing that schizophrenics have in spades, it’s symbolism and abstraction — the polar opposite of autism. There is even some evidence that autism and schizophrenia may be opposite sides of the same spectrum.

The Scientific American article likes to tease, however, suggesting that intense technical mastery and prodigious memory may eventually give way to improvisation:

Toward the end of Peek’s life, Peek showed a marked improvement in his engagement with people. He also began playing the piano, made puns, and even started becoming more self-aware. During one presentation at Oxford University, a woman asked him if he was happy, to which he responded: “I’m happy just to look at you.”

Effective Study Skills for College Students: “Why?” Questions

Consider two sentences:

  • The llama was made out of watermelon flavored cactus.
  • Policeman doe terminology star inconvenience recruit.

If I asked you to close this web page and then recall both sentences, you’d have an easier time with the first sentence. It has meaning and structure — even if a bit strange. I could make this still harder by adding a third sentence that’s just a jumble of letters. That would be less structured and even harder to recall.

Let’s say you’re reading a textbook, like Skiena’s The Algorithm Design Manual, and you come across the fact that \( \Theta(n \lg n) \) is the best possible worst-case complexity of a comparison sort algorithm. You could commit this to long-term memory as is — it’s true, after all. It would be connected to some other knowledge, like what you already know about sorting algorithms. This seems okay.

But, when doing something like this, you’re missing out on a whole lot of structure. If you forgot the lower bound, you wouldn’t be able to regenerate it from what you already know. It’s connected to other knowledge, but it’s not recomputable. You’re forced to take Skiena’s word for the whole thing.
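As it happens, this particular fact is recomputable. A sketch of the standard decision-tree argument (my addition, not something taken from the textbook chapter itself):

```latex
% A comparison sort must distinguish all $n!$ orderings of its input,
% and $k$ binary comparisons can distinguish at most $2^k$ of them:
\[
  2^k \ge n! \quad\Longrightarrow\quad k \ge \lg(n!)
\]
% Stirling's approximation then recovers the familiar form:
\[
  \lg(n!) = n \lg n - n \lg e + O(\lg n) = \Theta(n \lg n)
\]
```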

How can we absorb more of the structure of a piece of knowledge — to not be content with knowing a fact that someone else has stated, but to be able to recompute it, to solidly place it in our web of knowledge? The answer is the question, “Why?” There is a massive gulf between knowing that something is true and understanding why something is true. Being able to answer that why question makes all the difference — it forces you to absorb and understand deeper structural characteristics.

If I told you that 3 bits can represent 8 different values, you would not be able to answer the question, “How many bits do you need to represent 1729 different values?” But, if you understood why 3 bits can represent 8 values, that sort of question is trivial. It’s the difference between being able to regurgitate facts from Wikipedia and being able to solve novel problems — to understand the not yet seen.
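To make that concrete, here is the why as a two-line computation. The helper `bits_needed` is mine, not something from the post: k bits give \( 2^k \) distinct patterns, so you need the smallest k with \( 2^k \ge n \).

```python
import math

def bits_needed(n):
    # k bits distinguish 2^k values, so find the smallest k
    # with 2^k >= n, which is ceil(log2(n)).
    return math.ceil(math.log2(n))

print(bits_needed(8))     # 3: the fact stated above
print(bits_needed(1729))  # 11, since 2^10 = 1024 < 1729 <= 2048 = 2^11
```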

Asking “Why is this so?” is an easy-to-implement strategy for absorbing a piece of knowledge and connecting it to the rest of your beliefs in such a way that you can answer novel questions in the future.

Further Reading

  • I wrote recently about this whole structure thing in “Compressing Knowledge.”
  • Asking “Why?” while learning is sometimes called elaborative interrogation. There’s a review of its effectiveness, along with that of other learning techniques, here.

Why Psychology Is Not A Science

Doubt is not a pleasant condition, but certainty is absurd.
—Voltaire

I was on Reddit earlier, and an exchange went like this (perfectly illustrating why psychology is not a science):

Bob: Psychology isn’t a science. (downvoted)

Alice: I’m a neuroscientist and while a lot of psychology isn’t very good, you’re just not looking at the right sort of psychology. The media doesn’t report on the right sort of psychology because it’s hard to understand. (upvotes)

Jack: What sort of studies are good studies in psychology?

Larry: One of my favorites is Baumeister’s work on ego depletion — willpower as a fixed resource. It’s a model of what psychology ought to look like. It’s applicable and replicable. (upvotes)

Except, you know, the much-venerated model that is Baumeister’s work on ego depletion — nearing 2000 citations — has failed to replicate a bunch of times, makes little sense from a computational model of mind, and is probably false.

At least most published research isn’t false, right? I have some bad news.

Developing Good Research Skills: Compressing Knowledge

I wrote a couple of days ago about how we can think of humans as agents who take in information from the environment, compress that information, and then store it in long-term memory. I argued that interesting knowledge is knowledge which improves our ability to compress other knowledge — interestingness signals something is an upgrade to our compressor module.

With that in mind, consider what John Baez recently had to say about developing good research skills.

Keep synthesizing what you learn into terser, clearer formulations. The goal of learning is not to memorize vast amounts of data. You need to do serious data compression, and filter out the noise.

John follows this up with an example out of algebraic topology, a field (pun intended) which I know nearly nothing about and could not follow. The gist seems to be, though, that a whole lot of knowledge is a special case of other, more general knowledge (at least in mathematics), and by climbing Mount Abstraction we can compress old knowledge.

Tools for Compressing Knowledge

I have a head full of junk — disconnected facts, half-baked social theories, psychological studies, programming trivia. It would be very nice indeed if this were all organized around some core guiding principles — if it were a beautiful graph of knowledge, spreading out in every direction, tended like a self-organizing garden, refactoring and elaborating itself. How could I go from the current mess to something more like that?

Well, we can imagine a body of knowledge as a literal body, a corpse. We want to figure out the bones of that knowledge, the deep structure, and hang the rest of it — the facts or “flesh” — on it. The trick to getting at such a skeleton, or building your own, is to seek models. Mental processes, for instance, can often be understood in terms of computation — like when I speak of habits as cache lookups.

Then, when new information comes in, one can hang it on an already built skeleton — connect it to what you already know. John suggests comparing and contrasting different phenomena as a means of compressing his own knowledge.

The effectiveness of both of these techniques — connecting and contrasting — may be a side-effect of the benefits of translating knowledge into new, novel forms.

Pennington (1987) compared highly and poorly performing professional programmers. When trying to understand a program, high performers showed a “cross-referencing strategy” characterized by systematic alternations between systematically studying the computer program, translating it to domain terms, and subsequently verifying domain terms back in program terms. In contrast, poorer performers exclusively focused on program terms or on domain terms without building connections between the two “worlds.” Thus, it seems that connecting various domains and relating them to each other is crucial for arriving at a comprehensive understanding of the program and the underlying problem.
—from The Cambridge Handbook of Expertise

If translation is a component of compressing knowledge, a cheap way to implement this would be to transform ideas into words, writing. Consider how much of the scientific process centers around writing — authoring books, papers, taking notes. Is there evidence to suggest that writing facilitates knowledge compression? K. Anders Ericsson’s landmark paper, “The Role of Deliberate Practice in the Acquisition of Expert Performance” supports such a view:

The writing of expert authors on new topics is deliberate and constitutes an extended knowledge-transforming process, quite unlike the less effortful knowledge-telling approach used by novice writers (Scardamalia & Bereiter, 1991). In support of the importance of writing as an activity, Simonton (1988) found that eminent scientists produce a much larger number of publications than other scientists. It is clear from biographies of famous scientists that the time the individual spends thinking, mostly in the context of writing papers and books, appears to be the most relevant as well as demanding activity. Biographies report that famous scientists such as C. Darwin (F. Darwin, 1888), Pavlov (Babkin, 1949), Hans Selye (Selye, 1964), and Skinner (Skinner, 1983) adhered to a rigid daily schedule where the first major activity of each morning involved writing for a couple of hours.

We might suspect that writing is so effective because it forces knowledge to be retrieved and then restructured, sort of like taking iron ore, heating it, and then reworking it. Sounds a lot like compression, huh?

We can understand the benefits of creating and contrasting analogies through this notion of translation. After all, what is an analogy except a mapping — a translation — between two separate phenomena?

What I’m suggesting, then, all together, is that knowledge compression can be understood as a process through which one takes dormant knowledge and transforms it. Among eminent scientists, this transformation has typically taken the form of writing with the intention of revealing the bones of some phenomenon — discovering its skeleton. We need not limit this process of transformation to the written word, though. Transformation happens when translating an idea into mathematics, a computer program, drawing, music, or when attempting to teach it to another. (I know Dan Dennett writes about the benefits of explaining his ideas to bright undergraduates.)

In terms of subjective experience — what it’s like inside our mental workspace — we can think of it as the recall and then reinterpretation of a piece of knowledge. This reinterpretation need not be radical — it could be as simple as connecting two pieces of heretofore separate ideas, like compressibility and beauty, problem solving and graph search, or penalties as costs for implementing certain strategies.

The general, compressed principle, then, is: to compress a piece of knowledge, recall it (drag it into consciousness) and then think about it in some novel way.