Developing Good Research Skills: Compressing Knowledge

I wrote a couple of days ago about how we can think of humans as agents who take in information from the environment, compress that information, and then store it in long-term memory. I argued that interesting knowledge is knowledge which improves our ability to compress other knowledge: interestingness signals that something is an upgrade to our compressor module.

With that in mind, consider what John Baez recently had to say about developing good research skills.

Keep synthesizing what you learn into terser, clearer formulations. The goal of learning is not to memorize vast amounts of data. You need to do serious data compression, and filter out the noise.

John follows this up with an example out of algebraic topology, a field (pun intended) which I know nearly nothing about and could not follow. The gist seems to be, though, that a whole lot of knowledge is a special case of other, more general knowledge (at least in mathematics), and by climbing Mount Abstraction we can compress old knowledge.

Tools for Compressing Knowledge

I have a head full of junk — disconnected facts, half-baked social theories, psychological studies, programming trivia. It would be very nice indeed if this were all organized around some core guiding principles — if it were a beautiful graph of knowledge, spreading out in every direction, tended like a self-organizing garden, refactoring and elaborating itself. How could I go from the current mess to something more like that?

Well, we can imagine a body of knowledge as a literal body, a corpse. We want to figure out the bones of that knowledge, the deep structure, and hang the rest of it — the facts or “flesh” — on it. The trick to getting at such a skeleton, or building your own, is to seek models. Mental processes, for instance, can often be understood in terms of computation — like when I speak of habits as cache lookups.
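To make the "habits as cache lookups" model concrete, here is a minimal sketch in Python. Everything in it is hypothetical and chosen only for illustration: the function name respond, the counter calls, and the example situation. It is not a claim about how habits actually work. It only shows the shape of the analogy: the first encounter with a situation requires deliberate work, and later encounters reuse the stored answer.

```python
from functools import lru_cache

# Hypothetical counter: how many times we had to think a response through.
calls = {"deliberate": 0}

@lru_cache(maxsize=None)
def respond(situation: str) -> str:
    """Work out a response the slow way; lru_cache remembers the result."""
    calls["deliberate"] += 1
    return f"practiced response to {situation}"

respond("morning alarm")  # first encounter: computed deliberately
respond("morning alarm")  # repeat encounter: served from the cache
```

After the two calls, calls["deliberate"] is still 1, because the second call never ran the function body. The model hangs a fact about habits (they feel effortless) on a piece of computational skeleton (cached results skip recomputation).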

Then, when new information comes in, one can hang it on an already built skeleton — connect it to what you already know. John suggests comparing and contrasting different phenomena as a means of compressing his own knowledge.

The effectiveness of both of these techniques, connecting and contrasting, may be a side effect of the benefits of translating knowledge into new forms.

Pennington (1987) compared highly and poorly performing professional programmers. When trying to understand a program, high performers showed a “cross-referencing strategy” characterized by systematic alternations between systematically studying the computer program, translating it to domain terms, and subsequently verifying domain terms back in program terms. In contrast, poorer performers exclusively focused on program terms or on domain terms without building connections between the two “worlds.” Thus, it seems that connecting various domains and relating them to each other is crucial for arriving at a comprehensive understanding of the program and the underlying problem.

—from The Cambridge Handbook of Expertise

If translation is a component of compressing knowledge, a cheap way to implement it would be to transform ideas into words: writing. Consider how much of the scientific process centers on writing: authoring books and papers, taking notes. Is there evidence to suggest that writing facilitates knowledge compression? K. Anders Ericsson’s landmark paper, “The Role of Deliberate Practice in the Acquisition of Expert Performance,” supports such a view:

The writing of expert authors on new topics is deliberate and constitutes an extended knowledge-transforming process, quite unlike the less effortful knowledge-telling approach used by novice writers (Scardamalia & Bereiter, 1991). In support of the importance of writing as an activity, Simonton (1988) found that eminent scientists produce a much larger number of publications than other scientists. It is clear from biographies of famous scientists that the time the individual spends thinking, mostly in the context of writing papers and books, appears to be the most relevant as well as demanding activity. Biographies report that famous scientists such as C. Darwin (F. Darwin, 1888), Pavlov (Babkin, 1949), Hans Selye (Selye, 1964), and Skinner (Skinner, 1983) adhered to a rigid daily schedule where the first major activity of each morning involved writing for a couple of hours.

We might suspect that writing is so effective because it forces knowledge to be retrieved and then restructured, sort of like taking iron ore, heating it, and then reworking it. Sounds a lot like compression, huh?

We can understand the benefits of creating and contrasting analogies through this notion of translation. After all, what is an analogy except a mapping — a translation — between two separate phenomena?

What I’m suggesting then, all together, is that knowledge compression can be understood as a process through which one takes dormant knowledge and transforms it. Among eminent scientists, this transformation has typically taken the form of writing with the intention of revealing the bones of some phenomenon, of discovering its skeleton. We need not limit this process of transformation to the written word, though. Transformation happens when translating an idea into mathematics, a computer program, a drawing, or music, or when attempting to teach it to another. (I know Dan Dennett writes about the benefits of explaining his ideas to bright undergraduates.)

In terms of subjective experience (what it’s like inside our mental workspace), we can think of it as the recall and then reinterpretation of a piece of knowledge. This reinterpretation need not be radical. It could be as simple as connecting two heretofore separate ideas, like compressibility and beauty, problem solving and graph search, or penalties and the costs of implementing certain strategies.

The general, compressed principle, then, is: to compress knowledge, recall that information (drag it into consciousness) and think about it in some novel way.
