How to design software


In his classic essay Lisp: Good News, Bad News, How to Win Big, Richard Gabriel outlines two different approaches to designing software, called the MIT approach or the 'Right Thing' and the New Jersey approach or 'Worse is Better,' and describes why he thinks the latter approach to design, while worse on all the metrics he cared about – he was an MIT man himself – had "better survival characteristics."

In his estimation, this was because the MIT approach, which values correctness, completeness, and simplicity and consistency of interface over ease of implementation or rapid design, tended to produce designs that took a very long time to finish, were difficult to implement, and required powerful hardware. The New Jersey designs, by contrast, could come out quickly with just enough of the problem solved to be useful, could evolve rapidly, and, because they optimized for simplicity of implementation above all else, tended to be easy to implement and to run in lots of places.

The effect of this essay has been disastrous in many ways. Instead of heeding his call to action at the end, to figure out the best way to adapt some of the better aspects of the New Jersey approach into the MIT approach, almost everyone took the essay to mean that Worse is Better really was better. They got the idea in their heads that being worse (simpler, but less correct and complete) was a goal in and of itself, not a tradeoff to be made when necessary in order to make something feasible. This created an unthinking dogma in the industry: first-principles thinking was bad, any attempt to come up with powerful fundamental concepts and use them properly was wrong, and any criticism of UNIX, C, or anything else labeled "New Jersey style" was heresy. "Worse is Better" became a culture, and a thought-terminating cliché, because people took "has better (short-term) survival characteristics" to mean "is good," thanks to the myth that the market chooses what's best.


The fact is that this is wrong. While Worse is Better solutions tend to be useful more quickly and better adapted to their contemporary surroundings – to grow like a virus, to use RPG's phrase – they tend not to be built on correct fundamental principles, so when they try to grow and change, to incorporate the things people will need from them in the future (usually by retroactively adopting aspects of the Right Thing solutions they killed), they become horrible shaggy monsters. These 'Worse is Better' solutions, while they might be improved over time once their success is locked in, will never become good. We will never reach the point he predicted where we have a good operating system and programming language, and they're UNIX and C++. This is because skin-deep correctness isn't correctness: anything "better" built on top of fundamentally inimical primitives will be a leaky, half-baked abstraction at best, full of gotchas and exceptions, and the heights we can ultimately reach will still be limited by the ever-greater effort needed to paper over the bad bedrock abstractions.

I think this approach is slowly strangling the software we have access to, and we need something better. Every platform around us has the feel of something slowly rotting, and that shouldn't be how it is. What we need to do is take up RPG's challenge: figure out how to incorporate the best aspects of both approaches into something better. Few people have tried to do this yet, because it's far easier to relax into a philosophy that doesn't require real effort, since it prioritizes simplicity of implementation over all else, than to sit down and work out a philosophy of software design that can get us where we need to go. But we really need to do it.

I don't fully know how to do this, but the first step is to figure out what the essential beneficial properties of the New Jersey method are, and which are accidental properties. Right out of the gate, counterintuitively, I don't think simplicity of implementation actually matters that much to the success of the New Jersey style.

If you look at the world around us today, every single 'Worse is Better' solution has become massively more complex than anything the Right Thing crowd has ever suggested, in part because of the amount of complexity that has had to be slathered on top of the bad core to make everything work. So how complex something is to implement isn't a direct problem: nobody considers it a problem that a second GCC/Clang, GNU, or Linux can't be implemented off the cuff, because no one actually ports or spreads software by reimplementing it. Instead, they port or spread such software by making minimal changes to the existing code, and have for a very long time. The success of a piece of software really isn't meaningfully tied to how easily people can reimplement it, and I don't think it ever truly has been, except perhaps in the exceptional circumstance of UNIX specifically, due to its licensing situation.

Nor is simplicity that important for portability: if you can design your system in terms of a very simple virtual machine, and implement most of the complexity of your system in terms of that virtual machine, all that needs to be ported from place to place is that one very simple program, and the rest of your complexity can follow easily. (This is in fact what the Smalltalk people did.)
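
To make that concrete, here is a minimal sketch, in Python, of the kind of tiny core this implies: a toy stack-machine interpreter with a made-up instruction set, where everything interesting ships as bytecode that runs unchanged anywhere the small loop below has been ported. It illustrates the shape of the idea, not any particular system.

    # A deliberately tiny stack-machine interpreter. The opcodes are invented
    # for this sketch; the point is that the portable surface area is this one
    # small loop, while the rest of the system ships as programs it can run.
    def run(program):
        stack = []
        pc = 0
        while pc < len(program):
            op, *args = program[pc]
            if op == "push":
                stack.append(args[0])
            elif op == "add":
                b, a = stack.pop(), stack.pop()
                stack.append(a + b)
            elif op == "print":
                print(stack.pop())
            else:
                raise ValueError(f"unknown opcode: {op}")
            pc += 1
        return stack

    # The "complexity" lives above the core, as data the core interprets.
    run([("push", 2), ("push", 3), ("add",), ("print",)])  # prints 5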

What actually makes the New Jersey style successful, then? The one consequence of simplicity I wasn't able to dismiss above as unimportant or reproducible by other means: rapid development time. While Worse is Better solutions can often vastly outgrow the complexity of Right Thing solutions, and can often end up less portable as well (while also being less correct and complete!), they don't have to pay those costs up front. They can get out the door very quickly with most of a solution, and then amortize those costs over time.

This, I think – being able to move quickly – sums up the rest of the benefits of the New Jersey approach as well. If you can develop whatever you're making quickly and get it out the door quickly, you don't have to worry about falling behind, being made obsolete, or running out of funding. You can also iterate faster, which gives you a better picture of real-world conditions and what's actually needed.

How can we integrate this into the Right Thing approach, though? I think the best way is to get something out as quickly as possible, but to get the fundamentals right and provide a means for others to expand your quick solution, as easily and naturally as possible, into the complete solution it was always meant to be. In essence, build the 80% solution, but with an explicit eye towards others eventually making it the 100% solution.

Essentially, the life cycle I envision is something like this:

  1. Sit down and figure out what your goals are for your software project.
  2. Figure out the minimal set of features you'd need to completely and correctly achieve those goals. Don't get distracted by secondary things.
  3. Take great care to make sure the system is programmable, extensible, and malleable, so that it can be extended on a first-class basis to do more things, or to improve its completeness in the future. No extensions, add-ons, or piping to external programs – you should be able to hook directly into the internals of the program as needed and redefine or improve anything, shaping it to whatever future need arises. (A rough sketch of what this could look like follows after this list.)
  4. When trying to find this minimal, powerful set of features, don't get caught up in making it pure, perfect, or beautiful. That is the route to never getting anything done. Focus just on providing direct, straightforward, simple, accessible power.
  5. Release this solution. To others, it will look like an 80% solution to the larger problem domain, but those with a good eye will see the correctness and completeness of your solution to the core problem, and more than that, they'll see that unlike New Jersey tools, you've provided malleability and programmability so that it can become a full 100% solution for them. As a result, they'll choose your tool and start building the remaining 20% of the solution.
  6. Wait. Let the community that grows around your software build their own 100% solutions with it. Encourage them to collaborate to create standard 100% solutions for their overlapping problem sets, to make them as complete and rich and featureful as possible, to document them.
  7. Whenever one such solution emerges as the clear winner, as long as it is powerful and complete, integrate it into the core of your original system – not by rewriting it in the host language (if that differs from the language used to extend it), but by distributing it and its documentation with the core, helping keep maintenance going if necessary, taking a bigger role in keeping them updated on future core changes, and advertising it as part of your system. When you accept such a solution, keep a close eye on what the Right Thing is.
  8. Eventually, you'll have a Right Thing 100% solution that's also battle-tested, proven, responds to user needs, and can dynamically change over time.
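
To give a feel for what the malleability in step 3 might mean in practice, here is a small sketch in Python. Everything in it is invented for illustration; the idea is just that the program's own internal steps live in a mutable registry, so code loaded later can redefine or wrap any of them with the same power the original author had, rather than being confined to a fixed plugin interface.

    # A toy illustration of first-class malleability. All names here are
    # invented: the program's internal steps live in a mutable registry, so
    # user code can replace or wrap any of them at runtime.
    registry = {}

    def define(name):
        """Register (or re-register) the implementation of an internal step."""
        def wrapper(fn):
            registry[name] = fn
            return fn
        return wrapper

    def call(name, *args, **kwargs):
        return registry[name](*args, **kwargs)

    @define("render")
    def default_render(text):
        return text.upper()

    # An "extension" is just more code with the same standing as the core:
    # it reaches into the registry and wraps the existing behavior.
    previous_render = registry["render"]

    @define("render")
    def render_with_border(text):
        return f"| {previous_render(text)} |"

    print(call("render", "hello"))  # -> | HELLO |

The specifics don't matter; what matters is that the extension code and the core meet on equal terms, which is what lets the community build the remaining 20% as first-class code.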

The benefit of this is that you amortize the costs of designing and building something complete and correct over time, and distribute them across many different interested groups, so you can get your initial design out as quickly as possible. Keeping the core small, with everything else built on top of it in an interpreted runtime, also has the benefit of automatically making everything as portable as possible.

The trick here, of course, is to strike the balance between a powerful core design that forms the correct foundation for the future and actually getting your design out on time. This matters all the more because, if all this code is going to be built on top of your core program, depending on it very tightly but not managed by you, you really need to get the basics right: you'll never truly be able to change them.

There are ways in which both Common Lisp and Emacs embody this model; I think both of them fared far more poorly than they could have, mostly for reasons of historical accident (in Common Lisp's case, the performance of the computers of the era and places in which it emerged; in Emacs's case, simply not being the kind of tool most people are willing to learn, and having come into existence before modern human interface guidelines).

However, one of the historical accidents that kept Common Lisp from winning out is relevant to the discussion: performance. The New Jersey approach starts by trying to make something performant, whereas the MIT approach doesn't care about performance at all until correctness is done. The former makes for things that are beautifully adapted to their original environment and painfully outdated in the future; the latter leads to things that only become viable in the future, by which point they have been sidelined by evolutions of worse past technology. The solution, I think, is to start with correctness, then incrementally degrade it as much as necessary to achieve reasonable performance, while leaving interfaces or specifications general enough – or setting up an edition system or something like it – so that improvements to correctness and completeness can be made in the future without breaking everyone's code.
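
As one hedged sketch of what such an edition mechanism could look like (the names and the behavioral difference here are made up purely for illustration): callers declare which edition of the interface they were written against, old editions keep their original semantics forever, and correctness improvements land under new editions that callers opt into.

    # A hypothetical edition mechanism. Old callers keep the behavior they
    # were written against; improved behavior ships under a new edition that
    # callers opt into explicitly. The editions and the example are invented.
    def average(values, *, edition="2024"):
        if edition == "2024":
            # Original fast-but-loose behavior: truncating integer division.
            return sum(values) // len(values)
        if edition == "2025":
            # Later correctness improvement, without breaking 2024 callers.
            return sum(values) / len(values)
        raise ValueError(f"unknown edition: {edition}")

    print(average([1, 2, 4]))                  # 2    (old callers unaffected)
    print(average([1, 2, 4], edition="2025"))  # 2.33 (new callers get the fix)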

I really don't know. In the long run, I'm a proponent of the MIT approach. I want it to win. But there is wisdom in the virus-like properties of the New Jersey approach, and we have to figure out how to adopt that – to not let perfect be the enemy of "as good as we can make it right now, and prepared for the future."