Monday, October 22, 2007

Theory as software architecture (or the other way around)

I read Scientific American magazines out of order. I just read the March '06 edition wherein I encountered a piece called "The Limits of Reason", by Gregory Chaitin. Early in his article, Chaitin paraphrases Leibniz as essentially stating that "a theory has to be simpler than the data it explains, otherwise it does not explain anything." Chaitin goes on to say, a bit further on, "Leibniz's insight, cast in modern terms [is that] if a theory is the same size in bits as the data it explains, then it is worthless, because even the most random of data has a theory of that size. A useful theory is a compression of the data; comprehension is compression. You compress things into computer programs, into concise algorithmic descriptions. The simpler the theory, the better you understand something."
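Chaitin's point can be made concrete with a toy experiment of my own (a crude sketch, not anything from his article): highly regular data admits a description far shorter than itself, while random data does not. Here I use zlib as a stand-in for "the shortest program that reproduces the data" -- Kolmogorov complexity proper is uncomputable, so a general-purpose compressor is only a rough proxy, but the contrast still shows up.

```python
import os
import zlib

# "Lawful" data: 1,000 bytes generated by a trivial rule.
structured = b"ab" * 500

# Data with no rule shorter than itself: 1,000 random bytes.
random_data = os.urandom(1000)

# Compressed size approximates "size of the smallest theory of the data".
theory_of_structured = len(zlib.compress(structured, 9))
theory_of_random = len(zlib.compress(random_data, 9))

# The regular data has a tiny "theory"; the random data's "theory"
# is about as big as the data itself -- i.e., it explains nothing.
print(theory_of_structured, "vs", theory_of_random)
```

On a typical run the structured data compresses to a couple of dozen bytes, while the random data stays at roughly its original size.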

These observations struck me as an interesting lens through which to view software development.

We start development with a problem to solve. In top-down and spiral development, we develop a high-level architecture to guide initial development. This architecture is typically an explicit, abstract depiction of the structure of the software we intend to build, along with some specifications of other essential properties, e.g. performance.

This high-level architecture is actually a theory. The theory claims that there exists at least one program with the specified structure (and other essential properties) which solves the initial problem. Of course, this theory (architecture) may or may not be correct. To find out, we attempt, via programming, to discover a representative of the type of program predicted by the theory.

Problem statement -> Theory (architecture) -> Existence proof (program)

If our programming efforts succeed, we prove the theory. If we fail, however, this does not, a priori, invalidate the theory; our empirical search is hardly exhaustive. However, often we discover things in programming that solidly disprove some element of our theory.

On a good day, we use these discoveries to generate a new theory similar to the first one, but accommodating the new facts. With our revised theory in place, we can begin the search for a proof once again. This is, of course, spiral development, and a successful outcome will eventually give us both a high-level architecture/theory and a program/existence proof whose structure and other essential properties are accurately described/predicted by the architecture/theory.

On a bad day, we aren't doing spiral development. We're in a hurry, and so the theory (architecture) is abandoned, and the programming turns into a search for a program whose existence proves a much less stringent theory, to wit: there exists a program which solves the original problem.

Problem statement -> Program

Unfortunately, if we succeed only at generating a program that solves the problem, we have no theory to tell us anything about the structure of the program. The program we generate this way will be a classic example of big-ball-of-mud architecture: everything is connected to everything else, and no compressed description --- no theory, no architecture --- of the program's structure or other essential properties can be derived. This program is highly entropic in both a practical and an information-theoretic sense.

Confronted with such a program and some time to make things right, we might attempt refactoring. In the "architecture as theory" metaphor, refactoring is a heuristic search through program-space for a more-compressible program. Given a compressible program, we can derive an architecture --- a theory --- after-the-fact, as it were.
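To make the "more-compressible program" idea tangible, here's a hypothetical before-and-after of my own (the variable names and logic are invented for illustration): the same behavior written as copy-pasted logic, and again factored through a single function. The refactored version is, quite literally, a shorter description of the same behavior -- the factored-out function is the beginning of a theory.

```python
# Two behaviorally equivalent fragments of source text. The first
# repeats the same logic three times; the second names it once.
duplicated = """
total_a = 0
for x in orders_a:
    total_a += x.price * x.qty
total_b = 0
for x in orders_b:
    total_b += x.price * x.qty
total_c = 0
for x in orders_c:
    total_c += x.price * x.qty
"""

refactored = """
def total(orders):
    return sum(x.price * x.qty for x in orders)

total_a = total(orders_a)
total_b = total(orders_b)
total_c = total(orders_c)
"""

# Refactoring moved us to a shorter -- more compressed -- program.
print(len(duplicated), ">", len(refactored))
```

Real refactoring is of course a search over far subtler transformations than extracting a function, but the direction of the search is the same: toward a program whose regularities have been named, and which therefore admits a short description.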

So why do we care if we have an architecture for a program? If it solves the problem, isn't that good enough?

A theory predicts properties and behaviors of, well, whatever it is the theory describes. Correspondingly, a high-level software architecture predicts the properties and behaviors of the system it describes. This isn't just useful for understanding the software itself --- although that is extraordinarily valuable. It's also useful in reasoning about whether the program really solves the original problem.

In top-down development, we make a theory which explicitly says things about the program we intend to write. But the theory also implicitly encodes a lot of information about the problem we're trying to solve --- or at least, our understanding of the problem.

When we start with a big ball of mud, though, we don't have an implicit encoding of the problem which is smaller than the program itself. So through refactoring, we're not only trying to discover a program for which we can manifest an underlying theory, but we're also deriving a compressed description of the problem this program is solving. This compressed representation gives us a much better chance of detecting mismatches between the problem actually being solved, and the problem we want to solve.

One final thought on the implications of this metaphor: architecture is generating a theory. Coding is searching for a proof of the theory. The more incorrect the initial theory, the longer it will take to iterate to a correct one. This process is at least as much science, in the laboratory research sense, as it is engineering. Is it any wonder it's hard to predict how long a piece of software will take to write?

I'll wrap up with credit where it's due: my thinking in this post has been heavily informed by papers posted by Scott Rosenberg in his "Code Reads" series. If you've read this far without falling asleep, you clearly like this stuff and you should definitely be reading his blog if you're not already.