Category: Programming and Software

1. A plea for simpler FP
2. Why I'm obsessed with Common Lisp
3. The two types of programmers
4. Formal methods and dependant types are a trap
5. We need to get away from "systems" languages
6. Programming languages are tools for thought
7. Limiting language expressiveness is a technical solution to a social problem (a category error)
8. TODO Structural type systems and multiple dispatch are the future of type systems
9. Why I prefer Fedora Atomic (plus BlueBuild) to Nix OS
10. Hand tools and workbench tools
11. TODO Stallman needs to die
12. How to design software
13. The beauty of Common Lisp
14. What makes a good programmer
15. The rise of "Whatever" Machines
16. Does AI make programmers slower?

1. A plea for simpler FP

Software development is a terribly young industry in comparison to other similar fields like engineering and architecture. In so many ways, we have no idea what we're doing; we can't even agree on basic methodological issues, let alone more complex questions, and we've got almost zero empirical evidence – even meaningful case-studies, let alone proper scientific studies with decent sample sizes and good design – one way or the other for most practical debates (memory safety and static types are really the only things we have empirical evidence in favor of).

Meanwhile the theoretical/academic branch most related to our industry, computer science, has almost totally left us to our own devices to climb the ladder of Platonic abstraction toward ever more pure and simple mathematics, and while concrete instantiations of those abstractions do regularly show up in everyday code, understanding the most abstract category a common thing belongs to is rarely practically useful in everyday life. More generally, the field just seems mostly concerned with problems different from those of day-to-day construction of good software people actually use, such as computability, models of computation, mathematical proofs, and so on. Most academics in computer science aren't even concerned with thinking about how to write software that runs consistently, interacts with users and the external world, has to adapt to unclear or dynamic requirements, or design programming languages in terms of holistic systems of software, practice, and people, preferring to work on software that runs once, with a given set of static inputs, and produces a final output, or never even runs at all.

Even the academic discipline that might come closest to finding something like real engineering practice for software development, formal methods, has tradeoffs that make it impractical or hard to accept for practical programmers (in fact, I'd argue, serious methodological flaws as well).

This has led to the software development industry, desperate for some guidance as to how to approach building software, to fall under the sway of an endless parade of absurd fashions, each pitched in breathless terms as the solution to having reliable, maintainable code, avoiding errors, delivering software on time, and a million and sundry other things. Each of these fashions had some good points – a kernel of truth – but because their ideas were applied dogmatically, their prescriptions treated as sacred, their doctrines treated as fixed ideas, they became more harmful than helpful.

Among all of these, Object-Oriented Programming has probably been the most devastating. The OOP craze had a few crucial features:

It was embraced with religious fervor, an ever growing set of hard-line activists pushing it as the ultimate sacred law and the secret to good software. Anyone who didn't agree was ridiculed as stupid or blind. Eventually, the industry at large was convinced, and OOP ruled as an iron-fisted god-king over the industry, its sacred laws absolute and incontrovertible, taught in schools, written into textbooks and required by managers, specifications, and even regulations.
Its solutions were applied like violence: more was always the answer. If you found it awkward, unwieldy, or ran into problems with its abstractions, either it was because you didn't understand it, or because you weren't using it right, or because you simply hadn't done enough of it and you needed to use more. Escape hatches were non-existent, or treated as Satanic anathema if they did.
It focused on modelling your domain, and the behavior of your program, through restricting you to a highly limited set of concepts that didn't necessarily map easily onto the most natural mental model of all problems or domains. It was this restriction of your applied ontology to this tiny set of entities that was supposed to bring all the benefits; somehow, sucking concepts and reality through this thin straw would prevent errors.
As a result of point (2), in an attempt to achieve things that would have been trivial had point (3) not also been the case, ever more complex and abstract constructs, known as design patterns, were created to compensate. Whole books were written about them. Often there would be claims that using these design patterns, instead of the more direct constructs other languages or methodologies used (such as procedures, first class functions, multiple dispatch, even careful usage of multiple inheritance, to name a few) to express the same concepts, was beneficial in some way, but whatever benefits could be had were vastly outweighed by the added complexity and abstraction.

Nowadays, in the 2020s, we know that OOP doesn't work as a primary methodology – that some of its core concepts were more harmful than helpful (trying to create a rigid is-a hierarchical taxonomy as the ontology of your software is unlikely to work out, because things don't work that way in the real world), that its attempt to rigidly limit the vocabulary software developers could use to conceptualize a problem was detrimental, and that it didn't deliver on its promises of decreasing bugs.

So that's all well and good right? We're safe now. No more dogmatic ideologies will come to take over the software industry, because we've learned our lesson?

I hope so, but I'm worried we haven't.

Everyone is so focused on being relieved that we finally left dogmatic OOP behind, taking with us only the individual constructs and techniques from it that were legitimately useful, writing retrospective articles explaining why it's bad, that they're not on the lookout for any other similar fads that may do equal damage to our industry. But I think I see dust clouds approaching on the horizon, and I don't think it's just the wind. See, this typed pure functional programming thing? I think it could easily turn into another OOP. It bears all the same signs, after all:

Dogmatic advocacy as the solution to all our problems.
Solutions applied like violence. It claims to be able to do things in an ideologically pure way that it actually cannot.
Focused on rigidly limiting the set of things your language can comprehend directly (such as side-effects and mutation), instead of trying to provide a powerful multiparadigm set of concepts for building a useful ontology.
Using an ever-higher tower of abstractions to claw back the features lost due to the previous point, in the process developing a library of extremely abstract and often confusing design patterns that everyone must become familiar with in order to even understand basic code. Whole books are written.

Maybe I'm wrong on this. Maybe, despite the warning signs, this time it's truly different. Maybe pure FP really is the one true ideology that will actually work, and we should all bow our heads.

I don't think so, though. Typed pure functional programing certainly doesn't suffer from the exact same problems OOP did, but I think we're going to have an extremely harsh awakening as an industry if we adopt it wholesale, as a sacred law, the way we did OOP. We're going to wake up after our pure FP bender with a massive hangover and our entire life savings drained for the second time, and we're going to have to slowly pick up the pieces and revamp our text books yet again.

The fundamental reason for this is simple. The core problem in software development is complexity: everything is about figuring out how to reduce accidental complexity and cognitive load to the minimum, and then efficiently and effectively manage the rest. And while typed pure functional programming has useful ideas to offer in reducing some forms of complexity – the dependencies between things, by sequestering mutable state and side effects, and preventing incorrect states from occurring – when taken as an end in itself, purity necessarily excludes being able to represent certain ontologies as well as certain operations like iteration, mutation, state, and side effects directly in the base language. This matters because you really do need things like iteration and mutation fairly often, requiring you to forge through multiple layers of abstraction and syntactic sugar to represent that functionality. Abstractions which do not compose, that leak¹, that are very difficult to explain because they have no concrete referent, and which have a huge number of other drawbacks. This is similar to how, since Java lacks multiple dispatch, it needed to invent the visitor pattern, and since it lacked lambdas, it had to invent the strategy pattern, and so on. Thus, trying to achieve total purity just adds another form of complexity back in through cognitive load and accidental complexity. Some claim Haskell is the "best imperative language" on the back of things like this, but these really have all the same drawbacks (as admitted in that blog post) under the hood, just with nicer looking syntax.

You and I might find concepts like monads, monoids, applicative functors, and so on, easy enough to understand when we're reading a Haskell blog post during lunch break, maybe while sipping coffee if you go for that sort of thing, but the set of abstractions that pure functional programmers need to use to get around purity is not closed – it is continually expanding – and we also need to remember that even if we can understand these abstractions under good conditions, such abstractions ultimately still impose cognitive load on us, still require effort to wade through. Do we really want to be adding insanely complex Haskell type system shenanigans, debates over type hierarchies, complex type inference errors, and pages of lambda calculus-looking type errors on simple imperative code on top of the already-existing complexity of our problem space and the existing code base? Do you really want to be wading through applicative functors at 9 A.M. when prod is down and you just need to fix a bug?

Especially when, in many cases, you don't need to use such abstract concepts to achieve the same goals. You can carefully control mutability – making sure it's demarcated wherever it occurs, and avoiding invisible dependencies between completely different places in your program thanks to mutable references, keeping the flow of data modification on a linear path through your program – without a State monad, via something like Rust's borrowing model and explicit mutability labels. You can make side effects explicit through object capability systems, which work not using abstract category theory types, but with simple straightforward tokens that you pass around and add to your type signatures. You can get rid of boilerplate with Lisp macros, instead of do notation (Haskell does not have macros) or with carefully-chosen simpler language features (Scala implicit parameters would go nicely with object capabilities!). It's not even clear you need typeclasses at all if you have basic functional programming features or generic methods with multiple dispatch, although I like them. Even the need for static types is unclear. Some research suggests that it isn't static types that make your code high quality (in both studies, the most reliable language on that graph is Clojure, which is immutable and functional, but not statically typed, and in the second study there's no statistically significant relation between static types and reliability), and you can get 80 percent of the benefits of dependent types (much more powerful in what they can specify than even Haskell's types) through things like Clojure Spec and Malli (which can look at your specs and automatically generate test values and verify them, intelligently searching through the space to find the minimal example that will violate your specs), or systems like Common Lisp's type system (see also) in SBCL, or Ada, where a fairly powerful but also quite simple and easy to understand type system is available statically, and then more complex things can be specified in the same way as regular types, but are asserted at runtime.

My hope is that things will keep on as they have been, with the pure typed functional programmers standing on the street corners foretelling doom and exhorting those on their way to their day to day jobs to mend their ways and join their religion, and there's a larger chance than with OOP that that will remain the case, since OOP was easier to understand and thus easier for people to get enthusiastic about, but I worry sometimes…

2. Why I'm obsessed with Common Lisp

Although I haven't had much chance to use it yet, one of the languages that I have been utterly obsessed with for years is Common Lisp. This, of course, is not a new phenomenon amongst hackers like me. However, I think it's worth articulating why I personally am interested in Common Lisp, because the programming language landscape has changed drastically since the earliest hacker essays singing its praises were written, which has led to some of the points they make becoming obsolete – or more nuanced – and many more of their points being claimed to be obsolete by those who for one reason or another can't or don't want to use Common Lisp (which is valid by itself, but leads to motivated reasoning).

I think my reasons may also be worth articulating because I come from a different background, and with different preferences, than many of the earlier hackers whose famous essays extolling Lisp have made their rounds for decades on the internet. For one thing, I'm much more familiar with ML-family typed functional programming, and have a strong affinity for it, despite my occasional criticisms of some of that world's excesses. For another, I have a strong affinity for soft-real-time programming like real time graphics and game engine programming, and experience with Rust, so I do actually care about performance and being able to access low-level constructs in my languages, and I don't buy into the idea of the "sufficiently smart compiler" with total religious fervour.

Also, it's worth saying a word about my background here. One of the earliest books I cut my teeth on was Land of Lisp, when I was in my early teens, and then I moved on to Realm of Racket, so I have a long history of on-again-off-again fascination with the Lisp family of programming languages. This biases me in the sense that, while I find visually matching deeply nested sets of parens as hard as anyone else, I don't actually find ignoring those parens for the most part and simply reading Lisp code difficult at all, and I actually find the clarity around which expressions end where and how they're nested that S-expressions provide very helpful. I leave dealing with the actual paren-balancing up to electric-pair-mode and puni-mode.

Now, on to the properties of Common Lisp that I personally find highly unique and fascinating.

2.1. Homoiconicity

Much ink has been spilled on this subject, but I think a lot of the conversation is so muddied in ambiguity and talking past each other that the point gets totally lost in the weeds. Let me try to state the idea as clearly and concisely as I can, specifically with an eye toward explaining how this is different from more traditional languages that may have things like eval and AST manipulation (e.g. Template Haskell), instead of just saying it is and then focusing on singing Lisp's praises.

Your language is homoiconic if and only if:

The syntax of your language for any given snippet of code A is identical to the syntax in your language for representing A as:

A tree of symbols and literals, thus carrying structure and differentiation which…

…/is not/ yet an abstract syntax tree (does not have particular semantics tied to any of the elements in the tree, thus not limiting how the tree can be constructed, deconstructed, or manipulated)

…is represented using one of the most common and well-supported-by-the-standard-library data structures in your language.

Thus a language is not homoiconic simply because it has a string type, because while in that case there is technically a data structure in the language that can represent code in that language with identical syntax, in order to do that, you're having to use the least meaningful, least structured data type of them all, an array of bytes. It's not a tree and it makes no distinction between names and concrete values, so it carries no structure and differentiation, and is thus infinitely harder to work with. Having to use strings for metaprogramming makes you do endless error-prone, tedious string munging with concatenation and regular expressions, instead of clear and clean structural manipulations.

Likewise, a language is not homoiconic simply because it has an AST that can be accessed at runtime or compile time. Every parser generates an abstract syntax tree where each syntactic element in that code is represented by a directly analogous element in the abstract syntax tree, but the syntax in the language for representing that AST is invariably different than the syntax in the language that leads to generating that abstract syntax tree, and an abstract syntax tree is too rigid – it already has specific semantics encoded into it, not just in the names, but in how the tree can be constructed, deconstructed, and manipulated, that are far too rigid for doing some of the things macros can do.

The benefit of homoiconicity is that it allows you to transform programs:

in an abstract, structural way, instead of awkward byte-munging or text-munging,
using the same features and library functions you use every day for other things, meaning both familiarity and wide support, instead of it being a weird edge case thing,
retains flexibility to modify or even ignore the language's normal syntactic constructs and semantics,
in a way where what syntax will produce which data structures and the general structure of programs is extremely clear and unambiguous, which is important for something like metaprogamming.

Note that even Scheme fails this definition on the final criteria: in removing slots from symbols in the name of theoretical purity, Scheme lost any way to store debug information on regular symbols, thus necessitating the introduction of syntax objects, which are a sort of mirror universe of lists, literals, and symbols that do have extra metadata (almost like slots!) attached to them, but which aren't compatible with all of the functions for manipulating lists and symbols and literals in the rest of Racket, necessitating extensive conversions back and forth (because Racket lacks generic programming facilities by default, which also makes it horribly verbose) or their own shadow library.

2.2. Procedural non-hygienic macros

Leading on from the previous section, the most important part of homoiconicity is really what it enables: macros. I'm not going to spend time explaining what Lisp macros are here, if you're on my website I'm sure you either know what they are and how they work, or can research it for yourself. Instead I'm going to spend some time here defending why Common Lisp macros, in particular, are interesting to me.

First of all, while many languages these days have macros, such as Rust and Haskell (via Template Haskell), without homoiconicity and a language that's built from the ground up to readily support them, these attempts are often hamstrung, awkward, and bolted-on, probably better replaced with something that integrates with their type systems like Zig's comptime. For a brief look at what I mean, I recommend checking out these links:

Second of all, while many languages have syntax that makes it easy to build domain-specific languages without needing macros – which is often claimed to be "80% as good" as having macros – these sort of DSL facilities often rely on extremely ugly and involved internal implementations, usually relying on obscure language edge cases and nearly accidental syntax rules that probably weren't really intended to be used that way, as well as a ton of hidden machinery and state, all of which is easy to break between language versions, difficult to understand, and can easily lead to bugs, instead of the extremely simple and straightforward concept of taking in a tree representing code and returning a tree representing the new code to replace the expression that called the macro as in Lisp. Thus even if these facilities can give you 80% of what you want from macros, they add complexity and confusion, and often require understanding a lot more, instead of reducing it, so this doesn't seem like a good 80-20 tradeoff, unlike for instance using clever language syntax features to eliminate the need for advanced category theory.

Then there's often the objection that macros are actually bad to have in a language, because they allow you to construct "mini-languages" with different syntax and semantics, thus making your code harder for other people to understand. I have a few responses to this.

At a most basic level, I think this is ultimately a misunderstanding of how code comprehension works. Any system of abstractions is a new language for talking about that thing – whether that system of abstractions is implemented using heavy-handed OOP, functional programming, procedures and structs, or macros – and any new language for thinking and talking about things requires you to learn how to communicate with it – its own syntax, semantics, and rules for operation, what is and is not valid. I don't see the difference between a language introduced via macros and one introduced via other means, except that macros tend to be a less leaky abstraction; you don't have to think about the underlying language machinery as much. They're all just forms of abstraction. Of course, you should use the right tool for the job – the least powerful construct that will let you efficiently and effectively achieve what you need to do, weighing tradeoffs between repetition, performance, developer ergonomics, complexity, and comprehensibility – so macros shouldn't be used all the time, because they are more powerful than many other language features, but I don't think they're a fundamentally different category of thing.

Furthermore, people manifestly do need to create such domain specific languages, even ones that seem to suspend the traditional semantics of the language, very often. Just because you don't use Lisp and don't have macros doesn't mean you're not free from that need. Wanting to be able to force-mulitiply your coding through metaprogramming, to have the compiler automatically deal with common patterns, or move up and down the ladder of abstraction, or improve your language to allow it to speak in terms of the domain you're dealing with, are often all very necessary things. So you don't avoid metaprogramming – you just end up using awkward hacks like templates, Java compiler plugins, decorators, or stringly-typed eval hacks, to get where you need to go, or a patchwork of different languages poorly duct-taped together, instead of a consistent, elegant, reliable, and generally easy to understand way of doing the same thing. Now you might argue that there's a benefit to a language making metaprogramming painful, in preventing people from doing it too much, but I argue that's a category error.

Another point is that yes, maybe having a more flexible language with a larger domain-specific vocabulary might make bringing new people onto a project harder. That isn't really a problem with Lisp itself, though – it's a problem with its fit for the modern software industry. And why take the modern software industry's method of treating people as interchangeable deskilled cogs to be swapped in and out, as bodies to be thrown en masse at a problem, instead of valuing small teams of dedicated hackers working for long periods of time on a project and in the process becoming domain experts, as an ultimate good? Maybe the fact that Lisp ill-fits the modern software industry is a basis for an indictment of the modern software industry, not Lisp.

Why am I specifically happy that Common Lisp's macros are unhygienic and procedural, though? Aren't Schemes macros better? I'm not actually so sure. For one thing, Common Lisp mostly solves the problem of symbol clashes through its package system – all quoted or quasiquoted symbols are automatically implicitly qualified to the current package, so as long as you define your macros in a separate package from where you use them, no clashes should be possible, and if you want to expose a macro-defined variable to the user code inside the macro, the user has to pass in the symbol they want you to use, just like in Scheme's hygienic macros. Even beyond that, it's relatively easy to avoid any other possible clashes through gensyms. Granted, gensyms are pretty awkward by default, but it isn't hard to define a macro that takes care of most of the problems for you (like a gensym-let) or even a reader macro that automatically gensyms any symbol you prefix it to, and stores that in a macro-local table so references to it with the same reader macro prefix elsewhere in the same macro will refer to the same symbol, but ones in other macros won't (this is how Clojure does things). And hell, as a last resort, the fact that Common Lisp is a Lisp-2 reduces the chance of an accidental collision by 50% and the chance of a cross-intent collision (where you get a type error for calling a variable as a function or vice versa) to 0%.

As to why I don't want Scheme's hygienic macros, my issue with them isn't that you can't do the same thing with them that you can in Common Lisp – such as anaphoric macros – because I'm well aware that you can. The issue is that thanks to the need for hygiene, macro systems in Scheme are much more complex, and have many more intermediate layers of abstraction. In Common Lisp, there's only one kind of macro, and one way to construct them, by default – defmacro – and it's just the basic concept of "a function that runs at compile time that takes some S-expressions, does some things, and then returns some S-expressions", and the full power of the macro system is immediately available at the surface. Meanwhile, Scheme has a whole complex tower of not-quite-orthogonal consecutive abstractions, from syntax-rules, to syntax-case, to using just raw define-syntax and syntax objects directly; and while syntax-rules is simpler and more concise than defmacro, it's not that much simpler and more concise, and it's a lot less powerful; meanwhile using raw define-syntax to get something like defmacro is painfully verbose and awkward, thanks to Scheme's compromised homoiconicity – and that compromised homoiconicity is, in fact, also made necessary by the hygiene (as well as the removal of property lists from symbols).

The reason I don't like this is not just the added number of macros, functions, and semantics needed to learn how to use macros, but the fact that it requires you to climb this ladder of abstractions to get at the core idea and functionality of macros, and experience this sudden discontinuity in the concepts when you go from template based macros to procedural ones, instead of it just all being direct, consistent, simple and laid bare from the start, all for very little gain. I willingly admit it isn't all that complicated in the grand scheme of things, but it's just extra ugly abstraction and concepts interposing themselves between me and the beauty of metaprogramming in Lisp. Moreover, the actual details of how hygiene works are often obscure, complex, and difficult to understand – multiple PhD theses have been forged just out of this subdomain of Scheme implementation, and the details are constantly shifting and getting more complex in most implementations.

Macros aren't just useful for writing domain-specific languages or adding a little syntactic sugar, either. Since they allow you to access the full power of the entire language – the same exact language you know and love, with identical semantics – at compile time, including the ability to change the state of the compiler, load or unload packages, and even run side effects, you can use macros to add entire new features to the type system of the language, or introduce new optimizations to the compiler (even ones from other languages), or even write entire new languages with completely different semantics that compile down past Common Lisp to faster assembly code than regular Common Lisp can produce or adding JIT compilation for array computations for highly parallel computer architectures. In Scheme, this is impossible thanks to the the phase level system, which also adds more complexity still to the macro system, further illustrating my point.

Despite all this talk about how important it is that macros operate on a tree of data, however, there is one more kind of macro that Common Lisp has that Scheme has nothing like: the reader macro. While regular macros, as I described above, operate at compile time, and see your code after it's already been parsed according to the existing S-expression syntax, reader macros see your program when it's still raw text, and can introduce entirely new syntax, breaking the regularity of S-expressions the same way the quote and quasiquote syntaxes in Common Lisp do – because nothing that CL the language does is off limits to you, the programmer! Obviously, the regularity of Common Lisp's syntax, and how directly it corresponds to its underlying data structures is a huge selling point, so it's best to do this rarely and with extreme care, but it's important not to hold regular syntax as some kind of sacred ideal that you never break either – sometimes it really is helpful to do so, and you can do so without sacrificing the benefits of regular syntax on the whole, something the designers of CL clearly saw themselves. Thus, for example, you can allow CL to have syntactically correct JSON literals if you want.

Thanks to all of this, Common Lisp is the ultimate hackable language.

2.3. Image-based development

One of the most compelling, and still to this day completely unique – outside of Smalltalk, anyway – aspects of Common Lisp is that it is image-based. This means that your compiler, interpreter, runtime, program (including dependencies), and parts of your development environment all share the same memory space in your running program, and you can save the running state of all of that to disk and restore it to memory at will, seamlessly – or reproduce it from scratch as needed. Furthermore, Lisp programs (as images) are designed to continue running indefinitely and have your development environment, including your editor and your REPL, connect to them while they're running so you can modify them as they run, live, through dynamic code reloading.

This has several amazing properties:

While many complain about the size of Lisp programs distributed as images – specifically, many Unix grognards complain about it – as we've seen with the rise of Docker, it's actually extremely useful to be able to distribute a program along with the exact dependencies and resources it expects, in the exact same environment that it was created by the developer in. (If you're worried about the repeatability of code produced this way, don't worry, there are build systems that let you specify dependencies and what code should be loaded and so on to generate the base image for your program reproducibly).
Being able to save-lisp-and-die means that you can easily create custom versions of the language, that will load your own custom packages and code when you start them up.
Your REPL and editor have full access to your entire program, as well as the state of the runtime, and the compiler, all in one coherent universe, which allows for incredibly powerful IDE like integration that's much smoother than what even the most advanced modern IDEs can offer and works from first principles instead of being a pile of hacks (such as with LSPs, which are typically full reimplementations of the compilers or interpreters of the languages they're for, since regular compilers aren't suited to language server work, that have to recompile your entire program every time you make a change in order to give you feedback).
You can use the compiler to modify your programs live, as they're running – recompile any function, variable definition, class or struct definition, or anything else, and send it to your in-progress running program and watch it change its behavior in real time. And unlike attempts to do similar things in C# or C++, it isn't a buggy half-working mess tacked onto a language that isn't meant to do it, full of exceptions and "yes, but…" potholes. This is true REPL-oriented programming: not what most call a REPL, which is just a tighter iteration on batch-processing – where you write code, submit it to the computer (at which point you can't change it), wait for it to run to completion, see output, and then write a new version and submit that – but dynamically changing the program as it runs and you see errors or have new ideas. Other languages may have "REPLs", but they don't have this. BEAM languages can come close, but only at the level of granularity of a whole module.

The last example here is the one that's the most important to me. I see most coding as a process of reflective equilibrium with the computer – you may have ideas for what you want to implement and how, but software is often so complex, and computers so alien to our natural ways of thinking even for the most experienced of us, that there are too many unknown unknowns to predict from the outset. Furthermore, most software must interface in some way with the external world, and especially with the needs and psychology of human beings, which means the requirements may not always be totally clear up front, because they're inherently fuzzy and ill-specified, meaning that you may only recognize that you have what you need when you stumble upon it. Therefore, being able to write down an idea quickly, test it out, experiment with it, rapidly changing it and molding it like clay on the potter's wheel, is very important to the development process, and there is literally no environment better suited to that than Common Lisp (or Smalltalk).

2.4. Type system

It might seem strange for me to reference Common Lisp's type system as a reason to be interested in it, since this aspect of the language is not oft-remarked upon even by its biggest fans. However, I think that's a shame, because it's severely underrated, largely because what exactly the type system means to the compiler isn't specified by the ANSI standard, and so varies so much by implementation. Now that Steel Bank Common Lisp has handily risen to the station of the premier, main, most advanced, and most updated free software implementation of CL, the one that everyone supports and writes documentation for, and SBCL also happens to make the strongest and most advanced use of the Common Lisp type system, I think it's worth discussing.

The first thing that's interesting about it is it's magical balance between simplicity, flexibility, and power. If you don't believe me, please go read "Typed Lisp, A Primer", which is truly excellent, and explains Common Lisp's type system from the perspective of a fan of Haskell. Common Lisp's type system has a rich panoply of simple type specifiers (and you can add more, including simple ADTs), bounded numeric types, bounded and polymorphic arrays, polymorphic vectors, bounded strings, the ability to define (some) polymorphic types, sum types, union, intersection, compliment, enumeration, and singleton types, and a lot more. Crucially, though, type specifiers remain just simple symbols and lists for easy compile time manipulation, and it's all pretty straightforward and easy to understand. SBCL can do fairly precise (including adding bounds!) type inference for all of these types, as well as compile time type checking for all of them as well. So right out of the gate, you've got a pretty decent type system to act as guard rails for keeping you out of trouble.

It gets better though, because, as the SBCL User Manual states, not only are types checked at compile time (and used for very powerful compiler optimizations, which is why they were originally added to the ANSI standard), it also treats these type checks as assertions at runtime. The reason this is helpful is because there is actually one part of the Common Lisp type system that can't be checked at compile time: satisfies conditions. Satisfies conditions allow you to check arbitrary requirements using the full power of Common Lisp. This would be essentially impossible to provide statically without dependent types at huge complexity and abstraction cost, so instead, Common Lisp allows you to specify these conditions and just checks them at run time – which it can do with little added complexity – instead. The cool part about this, though, is that these conditions are not assertions, they're still part of your type system. That means there's a lot more you can do with them in theory, with a little macro magic (which I'll get to in a second).

Incredibly, this is essentially the same strategy Ada, one of the languages that is most highly respected for formal assurances and reliability, uses: allow a supremely expressive type system (one of Ada's proudest features is its bounded integers, something CL also has, and can check statically!), and just check what can't easily be checked statically dynamically instead.

Even more interestingly, since types are just regular symbols and lists, and type definitions are essentially macros themselves already, and macros allow compile time computation, code generation, and even interaction with the compiler to reject code or create warnings and so on, it seems that the Common Lisp type system integrates extremely well with its macro system in a way that could, if someone wanted, open the door to some very powerful things.

Of course, there are limitations here due to the way CL was designed. Primarily, slots on classes can't be type-checked statically, as they're not really part of the type declaration system – since CLOS was added later in the second draft of Common Lisp – multimethods can't dispatch on built in types, and there are no traits or parametric polymorphism for classes or non-derived types. The type system also doesn't allow you to specify schemas for general data types such as lists and hashmaps and use them as types, nor do randomized property-based testing based on types like Clojure's spec does. However, as usual, many of these issues can be resolved with a touch of compiler wizardry and macrology:

defstar adds a more convenient type declaration syntax.
Peltadot adds:
- a version of generic methods that can dispatch on built-in types as well as classes, as well as providing inlining, static dispatch if the types of the arguments are known at compile time, and enhancement of the body of the statically dispatched method for particular calls with more specific type information if the provided arguments are more specialized than the types of the actual generic function being called
- a Haskell-style trait system (which could already mostly be done using CLOS and mixins, but now integrates with the full type system and lets you define traits / "mixins" for existing types)
- a powerful system for writing brand new top level parametrically polymorphic types that don't have to reduce to existing types,
- a parametric polymorphism through type templating system that doesn't have to monomorphize,
- extensible coerce
Lisp Interface Library, which provides an interface-passing style version of parametric and ad hoc polymorphism through passing first-class interfaces, like OCaml
Schemata provides schemas, like Clojure's spec, for Common Lisp, that can use Common Lisp types and classes, as well as be used as Common Lisp types, and can automatically create schemas for classes inheriting from the schema-class metaclass, It can also randomly generate data fitting a schema using check-it.
Quid Pro Quo, which is an Eiffel-inspired design by contract system for CLOS methods and classes.
A myriad of Common Lisp testing frameworks, which you can find a thorough and up-to-date comparison of here.
Serapeaum's etypecase-of, ecase-of, match-of, and defunion for lightweight ADTs and exhaustiveness checking based on the existing Common Lisp type system, as well as a Haskell-like type definition macro and a version of the (which lets you specify the type of an expression explicitly, like :: in Haskell) that actually checks the types of things.
cl-parametric-types adds parametric polymorphism to Common Lisp classes, structs, and functions using the C++ template model.
generic-cl wraps many built in Common Lisp functions (e.g. equality predicates and sequence operations) with generic versions you can add methods to, to make the language more consistent. Best used in conjunction with static-dispatch for performance.

And finally, if none of this is enough for you – because of the more fundamental limitations of gradual typing – and you need the strength of a full-blown Haskell-style type system, but don't want to give up the other features of Common Lisp, you can check out Coalton, which is a whole separate Lispy language with a full Haskell-like type system implemented in Common Lisp macros that compiles using those macros straight down to highly optimized low-level Common Lisp that ends up faster than hand-written code, allowing you to gain performance and ML static types while keeping conditions and restarts, macros, image based programming and the dynamic programming environment, and easy interop with the rest of Common Lisp when you want access to more dynamicism (and CLOS).

2.5. Generic programming (CLOS)

Although it's called the Common Lisp Object System, CLOS is your typical object oriented programming system. Instead, it takes the core ideas of object-oriented programming – dynamic dispatch based on the identity of one of the arguments to a method, and encapsulating multiple data slots together under a single identity – and integrates them with its more functional approach, by detaching methods from classes, allowing them to stand on their own as regular functions (called multimethods), just ones that have specialized implementations (methods) for types or classes (new instances of which you can specify anywhere you need to, not just in the definition of a class). This means that you can use methods in a completely syntactically and semantically consistent way with regular functions – not just in call syntax, but also in other ways; for instance, you can pass methods around to higher-order functions. This also has the knock-on effect of effectively freeing you from the Kingdom of Nouns problem other strongly OOP languages face, by allowing operations to be on the same level as classes as first class objects, even operations that dispatch on classes. This is a level of integration between object oriented and (classic) functional programming that very few languages can achieve, which is admirable in my opinion.

Just this idea of having functions that can specialize on the types of their arguments, and can have additional specializations for new types introduced by code the original implementation never even has to know about, having multiple dispatch without tying the function down to being defined in specific blocks or locations or associated with specific typeclasses, is pretty rare. I think this kind of generic programming is extremely powerful, because it allows you to define implementations of the core expected operations used by existing code (even code you may not have control over) for types that code may never have known about before, allowing that code to seamlessly interact with new things without having to hard-code the interaction between everything. For more on this, see this thread and the talk linked in it. Julia's multiple dispatch and type system were heavily inspired by – some would say they're almost identical to – Common Lisp's.

CLOS also expands dramatically on object orientation in so many fascinating ways. For instance, it has an algorithm that can resolve conflicts and eliminate duplicates and linearize class hierarchies using topology, thus allowing it to have generally safe and comprehensible multiple inheritance, allowing patterns such as mixins and even an Entity-Component system to be trivially represented without needing custom code or language features to do so. It also expands the number of arguments methods can dispatch on from just one, the implicit this or self argument in most languages, to the classes (basic types are also mirrored in the classes hierarchy) of all arguments (multiple dispatch). Additionally, instead of just having a concept of calling "super" (which it does have, under the guise of call-next-method), which allows only very limited composition between different implementations of the same method, it also has the ability to specify when a method is declared how it should compose with other versions of the method. For instance, it can:

primary: Act as the "primary" method, of which there must be at least one, replacing any other primary method if it is more specific in the types it specifies it can operate on than the others, or being overridden if there is another more specific method. (This is the default, and works like override in most other languages).
before: Run if its types are applicable before any primary method for this multimethod is called, irrespective of whether its types are more or less specific, and in addition to the primary method. There is actually a stack of these methods, and when the overall multimethod is called, all the ones that apply are selected and sorted in order from least to most specific and they're all run in order before you get to the primary method.
after: Same as above, but runs after.
around: Add a method which is run whenever the multimethod is called with applicable arguments instead of the primary method, with the primary method passed as an additional argument, and gets to conditionally choose how to run the primary method and filter its inputs and outputs. Represented as a concentric series of wrappers around the core of the most specific applicable primary method, that each receive the previous wrapper in turn.
(no term): Or use literally any arbitrary function as a composition method, which then receives all the applicable methods as arguments and decides what to do with them.

This is another aspect of how to solve the diamond problem: if methods can specify how they should compose with other methods preemptively, than method composition can be encapsulated and intelligently handled, instead of things just clashing and overriding in strange ways. This composition doesn't just solve semantic problems either, it unlocks massively powerful new horizons for composable, modular program development.

CLOS has many more features than I've covered here. Far too many, in fact – I'd run out of space. And that's not even to mention the Meta-Object Protocol, which is a pseudo-standard many CL implementations support that expands the power of CLOS even further.

2.6. Ugly

Criticisms of Common Lisp tend to fall into one of four buckets:

2.6.1. It's too big/complex of a language standard

While the first point may have been accurate for its time, the contemporary languages that Guy Steele had to compare Common Lisp to were things like COBOL, Fortran, BASIC, Pascal, Scheme, Ada (of the time), and C – in other words, languages that we would recognize today as woefully lacking in features, syntactic sugar, and standard library functionality, meanwhile Common Lisp is much closer to our modern standards. Standards for language design have simply changed significantly as the decades have rolled on and Moore's law has given us more compute power to play with.

So how does CL stack up now? Let's do a little informal comparison of Common Lisp with modern programming languages. I'll be using the latest standard or draft standard for each language I can access, or a reference for that language and its standard library if a specification is not available, if that reference appears to be sufficiently complete and formal and not in any way inclined toward a tutorial or guide. This source is then converted to plain text using either pandoc or ps2ascii to extract just the content without any typesetting or typesetting commands, and then run through wc to get the statistics shown.

Obviously, statistics for languages that didn't have a formal standard should be taken with a grain of salt, but I tried to be careful to only select references that, to my untrained eye, looked about the same level of detail, completion, and terseness as a specification, and explicitly stated their intention to be such in their opening introduction (the Racket, Python, and Rust references all do so).

Here are the results:

Language	Version Used	Lines	Words	Chars
Common Lisp	dpANSR3	45029	372985	3659804
C#	Draft standard of 9.0	32721	238747	1742402
JavaScript	ECMAScript 2025	52838	292051	2216995
C++	Working Draft of C++26	129735	866312	10366237
Ada	Ada 2022	92065	440991	3804499
Ruby	ISO Ruby 2012	10930	98630	836729
Scheme	R⁶ RS + R⁶ RS-lib	9108	96112	1026925
Rust	Rust Reference + Unsafe Rust Reference	23711	149739	1439574
Java	Java SE 23	31035	260664	2445031
Racket	Racket Reference	132920	620742	7427007
Python	Python Language Reference + Python Standard Library Reference	216310	937237	7113173

It's unfortunate that so few popular, mainstream languages are formally standardized, such that an accurate comparison can be made, but I think this should give you a ballpark idea of the size of Common Lisp relative to the modern world. Undeniably, Common Lisp remains a big language; however, in the context of today's programming language landscape, it isn't unconscionably large. It's only a fourth larger than Java, JavaScript, and an old version of C# (which has added many features in the few versions after the one listed here), smaller than C++ and Ada, and probably smaller than Racket and Python, two languages nobody complains about the size of. And really, it's not surprising that it's half the size of Racket and a third the size of Python: while Common Lisp has a reputation for complexity, and Racket and Python a reputation for simplicity, Common Lisp really has relatively few individual language features – those features are just very powerful, such as CLOS, the condition and restart system, the macro system, and the type system – and a positively anemic standard library by the standards of Racket and Python, while Racket and Python have far more individual language features (just look at the tower of macro abstractions Racket comes with!) and much bigger standard libraries.

In fact, the next Scheme standard – the one intended for practical software engineering and development work as opposed to language research, pedagogy, and embedded applications – will probably be much larger than the ANSI Common Lisp standard (this is even according to the R7RS committee), demonstrating, I think, my point that a language with sufficient built in features to allow for practical use is always going to end up pretty large. I think the size of Racket (and Guile, possibly, although its reference wasn't suitable for this comparison) demonstrates this further: although the core Scheme standard was very small, implementations of Scheme that wanted to be usable for practical programming had to expand massively on it, despite coming from the same community that values smallness and simplicity. As Kent Pitman once said, "large languages make small programs and small languages make large programs."

2.6.2. It can't be implemented in a way that's performant

This is simply false, as we'll see in the section below.

2.6.3. It isn't clean/elegant/orthogonal/beautiful

There are a lot of reasons that a language might be ugly:

it might be poorly thought through, having had insufficient design work go into it;
it might have odious or misguided design principles at its heart;
it might simply have grown organically over time away from the clean ideas at its core, or simply have a beautiful core and then a few award decisions on top due to path dependency;
or it might be ugly due to practical tradeoffs.

Languages that fall into the first two categories are basically unforgivable in my opinion. Such languages include PHP, Go, Java, and Perl. Languages in the third category are usually tolerable, and often in fact quite awesome, because while there is ugliness to them, if you can look past the surface level ugliness – some strange syntactic or semantic decisions here and there – there are beautiful or powerful ideas locked away within (Erlang, OCaml, Prolog). And languages in the fourth category we may gripe and grumble about, but will ultimately be more useful to us than a beautiful language that makes no compromises with practicality.

It is my contention that Common Lisp is in the third and fourth category, not the first two. It is not the Right Thing in an absolute sense, but it is the best combination of the Right Thing (a beautiful core set of ideas and a set of extremely powerful ideas and concepts built on top) and Worse is Better (a practical workhorse for the here-and-now, made of compromise and organic evolution) that exists in the present moment. If a language with all of its capabilities and benefits existed, without any of the warts, I'd switch in a heartbeat, but there isn't yet… and ultimately, I don't think there can be (maybe Jank will prove me wrong, who knows – I'm praying).

Let me put it this way: when you focus on purity, you get an overly abstract, annoying to use for practical work language and obfuscatory culture like Haskell's (stylistic neophilia, awkward tooling, and constant changes, to quote the second link), filled with undocumented, half-implemented libraries for doing abstract manipulations on types and little else, the dead littered remains of someone's PhD thesis.

And when you focus on simplicity and beauty, you get something like Scheme: a standard that is completely minimal, totally orthogonal, utterly beautiful… and so impossible to use productively for practical programs that almost every single implementation – Chicken, Chez, Racket, Guile, etc – has had to reinvent a set of multiparadigm language constructs and standard library batteries from the ground up on top of it, which has resulted in a total Balkanization of the community, leading to a dearth of substantial programs and practical libraries compared to Common Lisp:

awesome-cl (915 entries), or the curated quicklisp package repository (2279 packages)
awesome-racket (204 entries), or the uncurated racket package repository (2195 packages)
awesome-guile (213 entries)

and very few books or documentation on specific applications of Scheme outside the realm of PLT or just "learning Scheme."

Worse, this focus on beauty and smallness has meant that attempts to standardize enough of these features that the language can actually become a usable language for practical programming have run headlong into the community's obsession with simplicity, smallness, and beauty preventing making a language for practical programming, leading to the decades-long fiasco that has been R7RS (including Scheme being officially split into two languages with possibly totally different semantics on some fronts, and then the committee for the large version further splitting).

Meanwhile, in Common Lisp's case, it's ugly because it's a compromise between many very powerful predecessor languages that were all used extensively for serious programming and designed by smart people deeply invested in Lisp, and designed to be the foundation of an entire industry, to make it possible for all of them to continue doing their work and not give up any of the powerful features they actually used in the pursuit of some abstract purity. And while every member of the working group that standardized ANSI Common Lisp came away claiming that if it had just been them they could have made something far more beautiful, I think the resulting language, while ugly, definitely serves its purpose in exemplary fashion. So it has a huge and rich standard library; it is unabashedly multiparadigm – sporting a rich set of concepts with which to write programs – and each paradigm is fully implemented and powerful; every feature is fully rounded out (such as the list comprehensions with the LOOP macro), with all the corner cases accounted for and every peripheral feature thought of and added; and the whole thing is clearly, completely, and unambiguously described in a central reference. This means that while there are many implementations of Common Lisp (SBCL, CCL, Allegro, LispWorks, GCL, CLISP, ABCL, and more), all with their own pros and cons, you can actually port large scale, meaningful programs between them, so the community is much more unified even as they get the benefit of multiple implementations.

Yet, despite all the compromise and focus on practicality and history, the standardizers of Common Lisp seem to have put an unusual amount of effort into doing the Right Thing. Whether that's the package system solving macro hygiene problems, or the homoiconicity, or the use of bignums and rationals by default. Even the particularly ugly compromises have their reasons and defenses.

2.6.4. Common Lisp killed Lisp

This one doesn't require much of a response.

Just like the idea that Lisp died out because it was simply too powerful and flexible for programmers to grasp, or for the industry to adopt, or that its power and flexibility lead to fatal fragmentation, it's a narrative used to patch up the raw, unpleasant reality that history doesn't have neat lessons like "ugly languages are bad" or "programmers are stupid/Lisp was too good for this world" or "powerful languages can't work in The Industry." Ultimately, Lisp died for two reasons:

Lisp's fate was tied to the first AI boom because it was invented by the same people who were pushing the forefront of artificial intelligence, and soon became their favorite language, so that eventually all the research departments using and improving Lisp were dependent on AI work for their funding, and later all the companies selling and supporting Lisp depended on a customer base mostly composed of people doing AI work, and the broader opinion of Lisp was intimately tied to AI instead of the language's own merits, so that when the AI Winter came, it was totally wiped off the face of the map.
When the transition from minicomputers to microcomputers came, stock machines were still far too slow to run Lisp, so specialized Lisp machines that were rare and expensive had to be created to run it, limiting its reach. Meanwhile, languages that ran fast on the horribly limited hardware of the day spread everywhere, and everyone started hacking with them. This led to those faster languages, mostly C, coming to form the basis of modern computing infrastructure, and most programmers being familiar with them, and most pedagogy being oriented around them. As a result, when the day came when Lisp was fast enough – more than fast enough! – it was too late: everything was already written in C, everyone was already familiar with C, and so all anyone wanted was C-like languages. So few people were willing to learn Lisp, and worse, a culture of rationalizing and justifying a desire to not learn Lisp as being the result of inherent problems with Lisp sprung up, further scaring people away from it. So programmers stole a few ideas from Lisp for their C-like languages and moved on, and now it's too late to change.
By the time people had started to come around to some of its ideas it was no longer new. So while it was still very powerful, still largely a superset of the features of similar languages, still the ur-dynamic language from which all others pull features and ideas from, it was also somewhat crufty, somewhat held back by past technical debt and a lack of contributions, and most importantly, just exuded this air of being ancient – it isn't new or hip. All of this just made it very difficult to generate any meaningful pop software culture hype around it.

2.7. Profitably dead

One of the best aspects of Common Lisp is that it's "dead" – it was standardized once, as the ANSI Common Lisp standard, and has not been updated since, nor is it likely ever to be updated again.

I'll let Steve Losh explain the basics of why:

If you're coming from other languages, you're probably used to things breaking when you "upgrade" your language implementation and/or libraries. If you want to run Ruby code you wrote ten years ago on the latest version of Ruby, it's probably going to take some effort to update it. My current day job is in Scala, and if a library's last activity is more than 2 or 3 years old on Github I just assume it won't work without a significant amount of screwing around on my part. The Hamster Wheel of Backwards Incompatibility we deal with every day is a fact of life in most modern languages, though some are certainly better than others.

If you learn Common Lisp, this is usually not the case. In the next section of this post I'll be recommending a book written in 1990. You can run its code, unchanged, in a Common Lisp implementation released last month. After years of jogging on the Hamster Wheel of Backwards Incompatibility I cannot tell you how much of a relief it is to be able to write code and reasonably expect it to still work in twenty years.

Of course, this is only the case for the language itself — if you depend on any libraries there's always the chance they might break when you update them. But I've found the stability of the core language is contagious, and overall the Common Lisp community seems fairly good about maintaining backwards compatibility.

I'll be honest though: there are exceptions. As you learn the language and start using libraries you'll start noticing some library authors who don't bother to document and preserve stable APIs for their libraries, and if staying off the Hamster Wheel is important to you you'll learn to avoid relying on code written by those people as much as possible.

One of the great aspects of this is that sometimes Common Lisp libraries can just be done: they've fixed all the major or relevant bugs, implemented all the features necessary, and the basic operating system or FFI things they rely on aren't going to change out from under them anytime soon, since POSIX and C are also extremely backwards-compatible, nor is the language itself, or the package manager or build system, so they can just… let it be, and you in turn can use it without worrying about checking GitHub vitals or updates breaking anything, or documentation being out of date. Better yet, many of these libraries are older than entire other languages like Python, meaning that they're well developed and well tested. And on a more ironic note, this ecosystem stability is really important for a language that has such a slow moving, small community, since you don't have to worry as much if a library really is dead.

The fact that the language is standardized and hasn't changed since the 1990s might seem like a death sentence, but thanks to Lisp's power and flexibility, it isn't – because Lisp has non hygienic macros, reader macros, compiler macros, the meta-object protocol, CLOS, access to the compiler from within the language, and more, it can simply absorb any feature from any other language. Even better, any feature absorbed this way is just a (usually quite portable) package on top of a fully specified and stable standard with multiple conforming implementations. So the language can both evolve with the times however you need and also keep a permanently stable backwards compatible baseline target that doesn't break your code. And you can compose and mix and match language features, since they're just packages, and do so without messing up the language features used by other packages you want to use – since they can simply not import the macros that you're using, or even import their own without clashing with the features you're using – and there's always a common language under the hood that's big enough and stable enough for practical work all by itself.

Of course, this might bring to mind the dreaded "Lisp Curse", but here's the thing: the person who wrote that essay was a web designer with no documented experience with Lisp at all, and his essay doesn't really capture the realities of the Common Lisp world at all.

While what he says is somewhat accurate for the horribly fractured and confused Scheme world, where every implementation of Scheme is in effect a totally different incompatible language with its own small sliver of the overall community, it doesn't even really hit home there, since within the world of each Scheme dialect, there seems to actually be a clear and concerted effort to rally that sub-community around a core, unified set of batteries and language feature implementations, and the fact that the community is fractured into multiple dialects isn't the result of Scheme being "too powerful," but instead the result of other problems.

Meanwhile, his critique holds even less for Common Lisp: while the power and flexibility of Common Lisp certainly attracts a certain kind of mind (of which I am one) due to the power it gives individuals to achieve their visions alone, and the flexibility it gives them to mold the language they use in the image of their own ideas, preferences, and thought processes – essentially freeing you from the constraints of being stuck with another's design decisions or having to work with others – and that can lead to cultural problems, that doesn't mean the community can't unify around single solutions to problems eventually. In fact, it has: if you look at the Awesome CL repository and the 2024 Common Lisp Survey you'll notice that in almost every category, the community has unified around a single main implementation of something, with maybe one or two major alternatives, and where there is a major runner-up, it isn't just another incompatible but overlapping 80% solution, but has good independent reason to exist. Yes, there is always a long tail of other half-implemented libraries, but that's true in any healthy language ecosystem.

This freedom to experiment widely, to implement each individual's vision of how something should be done, and then to slowly unite on a single implementation of an idea, or a few meaningfully unique and different solutions that solve different needs, is actually a very good thing. Yes, it may lead to a community that does worse on some metrics, but there are reasons to prefer it, too, such as not having to make the choice between a feature being painfully absent from the language for those who need it right now or adding the feature prematurely, before it's been fully developed and thought through (something greater linguistic experimentation can help with as well, through providing more information), and when a feature is centralized on, that doesn't mean it's locked in – it can still be ditched. Compare this with, for instance, the situation with Async Rust.

Ultimately, I don't think the "Lisp Curse" is what killed Lisp. Instead, I think what killed it is a lot simpler and dumber than that – path dependency, accidents of historical context and development, and cultural issues.

The stability ("deadness") of the Common Lisp standard is great for implementations, too – it means that once an implementation conforms to the standard, it really needs very little work done on it, mostly just small bug-fixes and upgrades to stay able to run on modern hardware and OSs. There isn't a new edition every year, or even every five years, to bring along major new changes to the language that you have to keep up with; as long as you implement this standard once, you'll be good to go, a valid implementation anyone can use. That's why so many people were able to use Clozure CL even for the many years that it was unmaintained. That's also how new implementations of Common Lisp such as SICL and CLASP have a chance at actually working out. Of course, there are other pseudo-standards like MOP (from The Art of the Meta-Object Protocol), CLTL2 (from Common Lisp: the Language, 2nd Edition, which introduces environment reflection capabilities), and threading, but at least there's a fairly large, powerful, flexible, and practical foundational language for programmers to build on that they can be confident will transfer between implementations. Moreover, there are usually libraries (such as cl-environments, bordeaux-threads, closer-mop, uiop, cffi, and so on) that paper over implementation-specific deficiencies or incompatibilities in these pseudo-standards. To see what conformance looks like across the CL ecosystem, you can look here.

The benefit of having multiple implementations should be obvious. While obviously SBCL outshines the rest by a mile in terms of completeness, active maintenance, and most especially performance, and thus the vast majority of Common Lisp users use SBCL, the worst that does is put Common Lisp in the exact same position as any other non-standardized language with only one implementation. And in reality, other implementations shine in different ways:

ABCL lets you use CL on the JVM and get great interop with Java.
CLASP lets you use CL on top of the LLVM for seamless C++ interop.
CLISP is very fast at numeric code.
CCL compiles extremely fast and has great error messages, so many people use it in conjunction with SBCL.
ECL is very easy to embed in C programs (like Lua) and can also compile to C.
SICL is intended to be the cleanest and most correct Common Lisp implementation, fully implemented in idiomatic Common Lisp code.
LispWorks has a Lisp Machine-inspired GUI development environment fully implemented in Common Lisp and inside their Lisp image that puts even SLIME to shame, a portable GUI toolkit, and support for unusual platforms like Android, as well as having incredible commercial support.
Allegro is extremely fast, has commercial support, advanced symbolic AI and LLM support, and a powerful server and database.

Such diverse yet compatible implementations would be possible if the language standard was constantly changing and difficult to keep up with. (See LuaJIT vs Lua, or Clojure vs ClojureScript vs Jank, or CPython vs PyPy).

2.8. Performance and systems programming

Traditionally, in the world of programming, you have two choices: if you want to write something that requires performance, than you have to use a systems programming language, accepting worse development experience and speed, longer compile times, and either far less power, or far greater complexity, as well as either far less safety, or much greater difficulty in writing programs (to satisfy static verification systems); and if you don't care as much about performance, then your entire world opens up, with a huge panoply of pleasant, powerful, safe, and dynamic languages with fast development times, that are good for experimentation, to choose from.

The problem with this dichotomy is not only that it's simply unpleasant for those who care about performance, and annoying for those who want to use higher level languages but don't want to have to deal with the performance penalties (or annoying for users of projects written in those higher level languages), is that often you need both properties in the same project. For some parts, you need to buckle down and optimize those tight inner loops into oblivion,a nd for some parts, you really want the higher level programming constructs and faster turn around time. When faced with a problem like that, your only two choices are really to buckle down and pick one horn of the dilemma, or to try to embed a language, like maybe Lua. The problem here being, of course, that now you've got to decide which parts of your program go in which language, constantly spend time building interfaces between them and solving compatibility issues, and occasionally move pieces of your project back and forth between the sides. Worse still, most embeddable languages are only designed for lightweight scripting, more on the R7RS-small side of the language spectrum, which means that you can't actually use them too extensively without running into issues – and more complete dynamic languages are larger and slower and harder to embed.

Common Lisp largely resolves this dilemma by offering a language that is higher level, more powerful, more dynamic, and better for experimentation than basically any other, that can also seamlessly deal with low level optimizations through features like:

marking functions to be inlined,
using type declarations to eliminate dynamic dispatch and indirection,
adjusting compilation speed and safety settings (can be done on a function-by-function level),
selectively turning off late binding (in SBCL),
manually JIT compile code by calling out to the compiler at runtime,
view the raw assembly of your code to see how the compiler is optimizing it,
inline assembly (in SBCL),
SIMD support (in SBCL),
arena allocation (in SBCL),
access to pointers through generalized references,
avoid heap allocation entirely via mmap,
specify custom compile-time inline replacements for regular functions,
low-level high performance threading primitives,
fine-grained control over when allocation occurs through:
1. forcing values within a scope to be allocated on the stack,
2. the ability to return multiple values without allocating memory,
3. faster low-level arrays, vectors, strings, and bit vectors as well as regular ones,
4. unboxed types (such as fixnums, floats, double-floats, etc),
5. and explicitly non-consing (destructive) list operations,
an extremely powerful LINQ-like DSL for iteration over various data structures that allows the compiler to generated optimized low-level looping code under the hood unlike mapcar and friends,
the ability to create complex objects at load time and use them like literals at run time,
a built-in facility for inspecting memory usage, composition, and management (sort of like a built-in Valgrind),
stand-alone executable delivery,
jump ("goto") instructions (lexically scoped, so they're safer to use than classic GOTOs),
1. dead code elimination (in SBCL),
and a lot more I'm probably not even aware of.

Of course, memory usage and garbage collection pauses will always be a possible problem if you don't want to have to go full arena allocator, but you can get around this by manually triggering the GC every tick of your program if there's extra time left, using object pools, and using non-standard tricks like temporarily pausing the GC (something SBCL allows you to do).

All of which allows Common Lisp to be pretty damn fucking fast for a language that can also seamlessly transition to extremely high level and dynamic outside of the hot loop, and provides an unmatched development experience. It's pretty consistently only around 4x off C/C++ on most benchmarks, and it can be optimized to be just as fast as the same algorithm in C in some cases. It can even occasionally beat C++'s performance with clever use of the tools given:

We develop tiny SQL-like query engine from scratch in Common Lisp which performs on par with a cheating C++ version and handily outruns even more cheating Go version.

This is possible because CL compilers are competent, blazing quick and can be programmatically evoked at runtime over arbitrary just-in-time generated functions. Generation and native compilation of specialized for any concrete query code is deferred right back to query (run) time. Where other languages must pre-compile recursive interpreter, CL compiles down a single if-condition.

As for code generation we have the full power of the language plus whatever we've additionally defined, we show off arguably the most powerful Common Lisp Object System (CLOS) in use. This combined with the fact that generating code in Lisp is isomorphic to writing "regular" code, makes the solution so simple, concise and elegant, it's difficult to imagine it does the same thing as those unsung geniuses writing low-level optimizing compilers in all those powerless non-Lisp languages.

There are also a fair number of high performance libraries for various purposes available for Common Lisp as a result of all this, such as numcl, lparallel and lfarm, Petalisp, cl-async, woo, stmx, and Sento. And thanks to its flexibility and how introspectable the compiler is, new performance optimizations can be added to Common Lisp trivially, such as:

specialized-function, which offers Julia-like JIT compilation of type-specific versions of generic methods when they're called;
static-dispatch, which statically dispatches calls to generic methods where the types of the arguments are known at compile time;
loopus, which can do (the README isn't helpful)
- type inference and specialization,
- loop invariant code hoisting,
- common subexpression and dead code elimination,
- automatic vectorization;
tree-shaker, which can reduce executable sizes by up to 30%;
memory-regions, which offers manual memory management;
and trivial-garbage, which offers weak hash tables and vectors, as well as access to finalizers, which are handlers that run when an object is garbage collected, which are more useful than they sound,

to name a few.

That's not it, either – if Common Lisp native libraries aren't enough – as is often the case in performance sensitive work or systems programming, where the vast majority of existing work has been done in C and most major, powerful, fast, and well-maintained libraries are written in C – CL also has excellent C FFI capabilities through three libraries:

CFFI, a Common Lisp library that provides a way to link to and call C shared libraries in an implementation-independent way (using each implementation's specific methods for doing so under the hood, but providing a consistent interface with consistent semantics). You still need to explicitly write bindings for each and every function you want to be able to access from Lisp manually, essentially writing a second header file. You also need to manually convert back and forth between Lisp and C types, and although an extensive set of foreign types are provided to represent C types on the Lisp side. Moreover, you need to be very careful about how you manage memory at the border between Lisp and C. However, this level of bindings is already superior to what most dynamic languages with managed runtimes can provide – the Java Native Interface requires a custom (and verbose) wrapper to be written on the Java side, the Go FFI is slow and awkward to use (1 2 3), and while Python isn't that bad, it requires you to use a special Python script to compile a C extension for your interpreter to interface other Python code with C, which is flimsy and doesn't generalize to other implementations like PyPy, as well as being awkward.
cl-autowrap is able to parse header files Automatically generate function bindings for Lisp, allowing you to use C functions without needing to manually write bindings. It can also generate thin and performant lightweight wrappers for C structs and unions and various other sea types so you don't need to manually write wrappers for those. It will also generate recursive accessors and so on. It also annotates all of the things that generates with full-type information from the C side. This is already better than anything Rust really has. Rust has Auto-CXX, but it doesn't seem to work as well.
cffi-object, which automatically wraps CFFI pointer types in structs that then automatically free the equivalent C memory when the garbage collector frees that list struct using finalizers. This essentially automates What you typically do in rust, which is similar, wrapping C pointers in rust structs that have a finalizer that frees the C memory when the rust struct is destroyed by RAII. It's just done for you automatically and integrates with a full garbage collector instead of just RAII. So it's a fuck ton better.

To get a sense for the performance of C FFI in Common Lisp, we can take a look at the FFI overhead benchmark.

While they haven't updated their list to show SBCL, they did include an SBCL implementation, so I cloned the project, got SBCL and Go working in a container, and got these times:

Common Lisp via SBCL:
934
934

go:
21186
21053

This implies that SBCL is around 21 times faster than Go at interfacing with native code. For a more complete comparison, let's find the performance scaling factor between my computer and the computer the results in the README were gotten on using the common denominator – Go – and then scaling the SBCL results by that factor and placing them in the list. That ends up looking like this:

Language	time in ms
luajit	891
julia	894
c	1182
cpp	1182
zig	1191
d-ldc2	1191
rust	1193
haskell	1197
nim	1330
d	1330
ocamlopt	1634
sbcl	1674
v	1779
csharp-mono	2697
ocamlc	4299
java7	4469
java8	4505
node	9163
wren	14519
node-scoped	15425
dart	31265
elixir	23852
go	37975
dart-scoped	61906

That's the fastest managed language, aside from Haskell! I'd say that's plenty fast enough for most tasks. Now, this is using sb-alien instead of the CFFI library, but CFFI is built on top of sb-alien but treats everything as a void pointer for greater performance, so raw sb-alien, used properly, is actually supposed to be slightly slower than CFFI because it tries to be safer, so I think this is a fair comparison.

All this means that while it won't be used for hard-real time or embedded programming (due to the memory footprint) any time soon, it's almost certainly good enough for the vast majority of applications, including games, if you put in some elbow grease, and much faster than similarly dynamic and high level languages like Ruby and Python. And of course being able to use such a highly dynamic language for performance oriented code has a lot of benefits.

(The only language that beats Common Lisp in the "dynamic, high level, but fast" world is Julia (by a factor of 2 or so), but it does so through mandatory just in time compilation that's slow to start and warm up and difficult to control or predict, a highly managed runtime, and generally preferring to be automatic over giving you fine grained control over optimization, all of which adds up to prioritizing throughput over latency, meaning that its performance is applicable in fewer cases than CL's. CL gives you the low level tools to decide what kind of performance you need.)

This has a few benefits for me:

You successfully write a much wider array of software with it, including games.
You can worry about performance much later than with something like Python, you can progressively improve performance from easy-to-write prototype to industrial-grade performance beast, and it'll take a lot longer for you to hit a hard performance wall where you're forced to rewrite.
In theory, I think it is acceptable to give up some performance in the pursuit of malleable systems (see also).

2.9. The condition system

The last feature of Common Lisp that really intrigues me is also the one that stands out the most, as a very singular, unique, and unmatched feature in today's language landscape. Who knows, maybe in another 30 years it will start percolating in half-hearted, poorly-imitated fits and starts into other languages the way macros have been, but for now, no other language has anything like it.

This feature is the Common Lisp condition system. At first glance, the condition system might seem like just another exception system – if you run into an error, you throw an exception, which makes a non-local exit to the lowest level wrapping expression that indicates it can catch that type of exception, at which point the code can decide what to do. However, what's unique about Common Lisp conditions is that they don't unwind the stack.

What this means is that while, by the time an exception is caught (or makes it to the top level) in a language like Java or Python, it's essentially too late to meaningfully recover without restarting some coarse-grained chunk of the process, like a whole function call, and often far too late to actually try to figure out exactly what went wrong, in Common Lisp, when you catch a condition, all of the information necessary – the call stack, local variables and function definitions currently available, even what part of the current expression the execution was in when the condition was signaled – is all intact. Combine this with the runtime malleability, ability to generate new code and execute it on the fly, and access to the compiler, and your language gains with conditions an incredibly powerful tool for essentially being self-repairing: detecting errors and fixing them when they occur.

This also means that when a condition reaches the top level, instead of the program irretrievably crashing, like it would in another language, it can just drop into an interactive menu asking the user or programmer what they want to do to recover from the condition that was thrown. This allows you to literally open a REPL at the exact location and point in time when the condition happened, so that you can explore, experiment by running different code, or redefine whatever types, functions, classes, or variables you need to; or you can ask it to literally replace the expression that signalled a condition with a different expression and just rerun that innermost, most specific part of the program and continue on from there; or any combination thereof. This actually came in handy during the first mission of NASA's New Millennium program, the flight of Deep Space 1, as Ron Garret tells it:

The Remote Agent software, running on a custom port of Harlequin Common Lisp, flew aboard Deep Space 1 (DS1), the first mission of NASA's New Millennium program. Remote Agent controlled DS1 for two days in May of 1999. During that time we were able to debug and fix a race condition that had not shown up during ground testing. (Debugging a program running on a $100M piece of hardware that is 100 million miles away is an interesting experience. Having a read-eval-print loop running on the spacecraft proved invaluable in finding and fixing the problem. The story of the Remote Agent bug is an interesting one in and of itself.)

The Remote Agent was subsequently named "NASA Software of the Year".

This might sound like the capabilities of a debugger to you, and so it might not seem that novel, but what you have to understand is that this means that Common Lisp essentially has an extremely powerful debugger built into the language runtime, that is active at all times without a significant performance penalty, that can also be programmatically controlled to debug or recover errors dynamically. This adds a whole new dimension to what can be done to recover from errors, similar to some of the impressive dynamic error recovery feats Erlang can perform, but with even more surgical precision.

2.10. Other Lisps

I won't speak too much on this here, since I have a rant about Racket in the works. Instead, I recommend you read these:

Common Lisp versus Racket (from a Lisper's perspective)
Common Lisp versus Clojure (from a Lisper's perspective), Common Lisp from a Clojuran perspective, part 1 and part 2
Switching from Common Lisp to Julia, response, a response to one particular point, A Lisper's first impression of Julia.
- My own notes: due to the lack of multiple inheritance in Julia's type system, you can't do mixins, which are the natural way to get composable interface-like polymorphism in languages based on type hierarchies and multiple dispatch, since they allow you to attach names to particular protocols/sets of behavior. The lack of this causes severe problems in being able to specify, as a function, what behavior you expect from a type in a way that isn't either too specific, too general, or too broad, and causes issues for being able to specify, as a type, to be able to specify clearly what you actually provide. See more here. I think this and the fact that all functions are generic, instead of it being an explicit API choice to make a function generic, plus the lack of aspect-oriented programming (before, after, around method combinations) is why Julia has such serious correctness and reliability issues.
- Lack of homoiconicity.
- It is so fucking fast though.

As well as this essay, which makes some good points about Common Lisp's magic being in its holistic system, not in its pieces, although I'm not totally a fan of the tone (since I respect languages like Clojure):

Selling Lisp By the Pound

Another huge benefit to Common Lisp is that it has so many excellent, nay, legendary books written about it, that will also expand your mind and teach you a ton about programming itself in the process. Scheme has SICP and HtDP, which are far better than most of the options Common Lisp has, to be fair, but CL has them beat in number and variety: LISP: 3rd Edition, PAIP, On Lisp, Let Over Lambda, ANSI Common Lisp, The Art of the Meta-Object Protocol, Recursive Functions of Symbolic Expressions and Their Computation by Machine, Common LISP: A Gentle Introduction to Symbolic Computation, CLtL2…

2.11. Conclusion

So, while none of the features that Common Lisp has are today as totally unique as they once were, as various languages over time have slowly adopted bits and pieces of them, none of those languages has the same intersection of all of these features, or any of them in as complete and integrated a form. And in some cases, such as homoiconicity, the condition system, and image-based development, I don't think they ever will, because the familiarity tradeoffs are too great for modern languages to make if they hope to gain traction. So even if the list of features that are totally unique to Common Lisp has dwindled, I still think it has a lot to offer.

It is, in this sense, the lost Atlantis of programming languages: it invented so many ideas long before their day would come for the rest of the world, but those ideas were lost to the accidents of history, and the rest of the programming language civilizations are left finding artifacts of that lost civilization and reverse engineering them in bits and pieces, but never so great and powerful as they once were.

Another way to look at it, in view of the fact that Common Lisp is not a beautiful pearl of programming language design like Scheme or Smalltalk, is that it's a terrifying, hideous, cthonic monster that you can barely stand to look at, branded with strange symbols and chanting strange abbreviations, which you can summon from the depths of ancient lost history to grant you untold power to manipulate the fabric of reality.

Just beware: it may drive you mad in the process!

If you're interested in Common Lisp, I recommend you use Emacs with SLY as your development environment, SBCL as your implementation, CIEL as your starter pack (includes more batteries for Common Lisp), the Common Lisp Nova Spec and the Common Lisp Technical Reference as your docs.

3. The two types of programmers

I think there are really two types of programmers in this world: programmers who program because they enjoy constrained logic puzzles like the kind you would find in To Mock a Mockingbird, and programmers who enjoy making the computer do cool things. The kind of programmer who gets really deep into type-level programming and formal methods and so on tends to be the former, and the kind of programmer who wants to pragmatically adopt those things, if necessary, but generally opt for the simplest solution that allows them to express what they need to, even if it sacrifices an acceptable (context dependent) level of correctness, or who is willing to move certain checks to runtime instead of compile time to avoid complexity (so contracts and specs instead of dependent types), is the latter.

This is because, when you're programming at the type level, you're often presented with a lot of these really gnarly and interesting puzzles, where you have an extremely limited set of transformations you can apply, and extremely harsh constraints, and it's all heavily abstracted, where you've also got, along the way, to figure out how to encode all sorts of invariants in deeply abstract mathematical terms, which all requires thinking at several levels of abstraction greater than the actual thing you're trying to create, and often requires the same kinds of intuitive leaps as proofs. So they're deeply attractive to the first kind of programmer. And of course, if you're attracted to proofs and abstract mathematical reasoning, perfectly, verifiably, provably correct properties are what are most interesting, feel the most clean and Right and assured, to you. So you're going to want to maximize the amount of proof and verification in the code you write, everywhere you can, and evangelize it to others, both because to you it's just obvious – isn't it just correct engineering, the Right Thing, to write software that's proven correct as much as we can with current knowledge and technology right now? Isn't everything else irresponsible? – and because it's inherently fun, inherently rewarding, for you to do.

Meanwhile, what the second kind of programmer keeps in mind is that types don't exist at runtime. Unless you've completely lost it and are just running programs at compile time – in which case, what's the point? – types don't actually do anything, ever. They don't run, because they don't describe the behavior you want the computer to perform, they only describe generalized and abstract invariants you want the actual behavior to uphold. So the more time you spend on working out logic puzzles in the type system, the less you spend actually writing code that actually directly does anything meaningful to the people who want to use your programs. So if your ultimate goal, the one you keep in mind at all times, is that in the end you want your program to do something useful, then you'll realize that there is actually a tradeoff to highly abstract type system finagling and formal methods and so on. Because while those things do tend to make your program more likely to do the correct thing, but they don't do anything themselves, and usually have a high cognitive load overhead, so the more time you spend on them, the less time and brain space you have to spend on actually making your program do something, and the more time you're spending on checking whether hypothetical code actually does something. So yes, if your goal is to make a computer do something useful, then you're going to need some kind of verification to make sure it actually does the right thing – otherwise you're up shit creek, because you've certainly made the computer do something, but it isn't clear how useful that "something" is, exactly. But the second kind of programmer wants to maximize the amount of time they spend writing code that will actually run and do something useful, so, since working with verification comes at the cost of time and energy and mental space for writing running code, they'll try to minimize doing type system and formal verification stuff as much as possible, with the only factor pushing that mount above zero being the degree to which verified correctness is helpful or necessary for the specific problem at hand.

4. Formal methods and dependant types are a trap

4.1. The First Trap

A response to the previous section might be that there doesn't have to be a distinction between formal methods/proofs, or writing a lot of code at the type level, and writing code that gets shit done. After all, there are things like Algebra Driven Design, where you specify your data types and function signatures to such an minute degree of detail that you can sort of derive the actual program by rote from your types.

The problem with this approach is that is that it doesn't actually solve the problem. Yes, you've essentially turned types into behavior, but in the process you've had to specify your types at such a greater degree of detail and complexity, with so much specific behavior encoded into them, that you've turned your types into code and just pushed the problem back another step.

Think about it: code itself is merely a "specification" of the behavior you want out of your computer. No matter what kind of unexpected bug you have in your code, it's ultimately a problem of your code, as a specification, not being what you imagined it was – incomplete or self-contradictory in some way you missed when you wrote it, not because the computer is mysteriously "disobeying your orders" or something. As we all know, it always does exactly what you tell it to. So code is a formal specification of behavior. Code is just a specification that needs to be fractal in its complexity in order to encompass and contain every behavior needed and every edge case possible, because it is designed to be run on things that are, quite literally, as stupid as rocks.

So the only reason that types and specifications help is that they merely allow you to express a specification for behavior at a higher level of abstraction than your code specifies that behavior – it lets you put things in simpler general terms as a set of declarative constraints. But abstraction is not actually necessarily more amenable to human reasoning at all than particularity: abstraction is actually much harder for humans to reason about. It's just that abstraction lets you compress things so that you can fit more in your memory and get thereby a higher overview of things, whereas a lower level of abstraction is often much easier to reason about on the particulars, but can get much harder to fit in your head. So there's already a penalty for trying to gain easier reasoning properties through abstraction. So I don't think there's anything inherently easier to reason bout in type systems or declarative specifications or whatever. Types are only useful because they are simpler and smaller, and thus easier to reason about, than your code, so you can use them to double-check your code. They're like a checksum that way. But you can still get lost in types, or get them wrong, or put a bug in your type system, or struggle to reason about the type level invariants.

So in fact the more complex you make your types – especially the more specific behavior and invariants you encode into them, as in dependent types – the more the properties of types that make them a useful sanity check will evaporate, until types are no better (or different) from the code that you were trying to check, because they're encoding just as much behavior, and are just as complex and sprawling to keep track of, and you've just pushed your problem another step back, because the type-level language itself will need a type checker. As Simon Pyton-Jones, creator of Haskell, says:

There are programs you can write which can't be typed by a particular type system but which nevertheless don't "go wrong" at runtime, which is the gold standard - don't segfault, don't add integers to characters. They're just fine.

I think to try to specify all that a program should do, you get specifications that are themselves so complicated that you're not longer confident that they say what you intended.

And this isn't something you can really avoid in dependently typed languages: in them, your sanity-checking system (your type system) is your entire, full, term-level programming language, except the entire thing is lifted up one level of abstraction because now it operates on types as values instead of concrete values! This means that the upper bound of the easily accessible complexity of your type system is now the same as your code. And if you're the first type of programmer, then you're going to want to prove all the behavior of your code correct, which means a nearly 1:1 correspondence between your code and your types, which means you're going to always be asymptotically approaching dependently typed languages, and using those types to their maximal amount.

4.2. The Second Trap

The second problem with formal methods and dependent types is that they don't take into account the real world nature of programming in most cases.

First, most programming is not done in a domain where the requirements and design goals are well understood enough in advance that a specification can be written up front. The actual human beings who are asking for the software to be built won't be able to just sit down and create a perfect idea of what they want the end product to be in their heads. They're going to want to give an initial direction, and then iterate in conjunction with prototypes produced by the software development team. Formal methods are deeply hostile to this. They want all of the basic properties and features of the system provided up-front, so that you can either derive a verified program, or verify and existing program, using them; moreover, they'll want a complete and internally-consistent design from the start, when we're not even going to have that to begin with.

And any attempt to try to get the customers to plan out everything in advance – as some particularly dogmatic FM people might be tempted to argue, with shades of Djikstra – is likely to lead to disaster: not only will they ask for the wrong things, things they don't want or need, the process will likely take so long that their actually requirements will have changed by the end – or the entire project could've become obsolete by that time. This is not even to mention the fact that if customers, or even programmers, are given free reign to design what they think is needed up front, they're likely not going to be able to accurately estimate what will be possible to implement, and so they'll end up with a giant stack of things you can't possibly actually do, instead of going out and prototyping each feature quickly and getting a sense for the problem.

Second, as programs develop, our knowledge of the specific domain, and how to solve problems within it, evolves. This means our program's ontology, including the data structures, functions, and software architecture it uses to model the domain, will need to change drastically many times over the course of development before they reach their final form. Programming isn't just rote, mindless construction work. It's often an exercise in design and applied ontology, and problem domains are often nuanced and complex enough that that work can't be done going in – there are too many unknown unknowns. Programmers need to be free to experiment and explore and prototype during the development of a piece of software. Any static type system will significantly hinder this kind of code, but more rigid ones, or anything that requires code to be perfectly correct by construction, will be worse for this.

Third, requirements change over time, even after the first release of a piece of software. If the way your software is written requires it to be this perfect, beautiful piece of logic, then it is going to be incredibly brittle and non-extensible if any changes need to be introduced that fall outside of the narrow set of requirements that have already been predicted. Any choice in how you model things with types and formal methods essentially makes strong assumptions about what the future requirements of the software can be allowed to be, locking off whole swaths of the future.

4.3. Conclusion

Of course, the point I'm not trying to make here is that, as a result, we shouldn't use formal methods, or even depedendent types, at all. What I'm suggesting is merely that formal methods and powerful type systems such as dependent types have significant tradeoffs, and that as a result they shouldn't be viewed as "just good software development practice" or "the inevitable future" or as an "inherent good," and should not be applied like religion where more abstraction and complexity in your types is viewed as an inherently good thing. Instead they should be used carefully, on a case-by-case basis, when merited by the software's domain, and most people should consider 80/20 alternatives such as randomized property-based testing, gradual typing and schemas, deterministic simulation testing, design by contract, simple ML-like type systems used responsibly, and lightweight model checking when necessary with things like TLA+ or Alloy.

Notice that a lot of my suggestions here work by verifying that some code you've actually written is correct after the fact through checking, instead of proving it correct by construction as formal methods and type systems do. This is because one of the most important things about problems in fields such as mathematics, computer science, and programming, is that merely checking that a solution is correct is a lot easier to do, both computationally and mentally, than constructing a provably correct solution.

Moreover, checking doesn't require any kind of meddling with the interior implementation or even architecture of a software component – it can treat components, subsystems, or even whole systems as complete black boxes, which you are free to add extra features to, rearchitect, or optimize at will, while continuing to verify whatever properties you thought were important enough to test, preserving the flexibility necessary for most software projects. While this approach is of course less powerful, with randomized property based testing, deterministic simulation testing, and design by contract you can be thorough enough to get 80% of the benefit of proof by construction, and crucially, with a much lower cognitive and labor burden. This is essentially the crucial distinction between the kinds of methods I endorse, which either only require very basic and easy proof by construction (such as an ML-family type system) or focus on checking things, when compared to dependent types and most formal methods, which operate by full proof by construction.

5. We need to get away from "systems" languages

There's been a lot of discussion over the years of how static languages effect the experience of programming. The summary is that high performance, static, compiled, typed languages come at the cost of being able to have a tight feedback loop and freely experiment as you're writing code, which is typically a bad thing because programming is not an act of mere construction following a specification, but an act of design, where we're feeling out how things should be executed and constructed as we go, because there are far too many unknown unknowns prior to actually trying something out, and our brains aren't good at simulating computers, and any attempt to plan a piece of software up front that's detailed enough to remove the need for exploration and experimentation in how we implement something will just end up being code we need dynamic experimentation to design anyway.

I'm not here to talk about that, though. What I'm here to talk about is the fact that the use of static, compiled, high performance programming languages effects how users experience software, not just how developers experience writing it. Such software, by its very nature, is a closed black box to users – difficult to modify, difficult to inspect and understand, difficult to compile. It may offer extension frameworks you can hang off it, but unless so much of the software is implemented in the extension system, and the extension system is so powerful and extensive, and so intimately involved in everything, that the program essentially becomes a somewhat specialized language runtime and development environment more than an application for a particular purpose – e.g., the Emacs and web browser strategy – the program will always remain opaque and limited from the perspective of the user. This leads to all the shortcomings of UNIX and most other modern software, the shortcomings that drive people like me that want integration and flexibility and powerful programmability and malleable environments to Emacs (or to browsers!).

If we had environments that were written in a language more like Common Lisp, which offers runtime dynamicism, malleability, power, and access to low level concepts like pointers and manual memory management, and the ability to do high performance optimizations when necessary, we could have the extensibility and malleability of things like Emacs for our whole operating system, everywhere. And development would be much faster and more pleasant, too.

So why don't we?

Generally the argument that people make is that Common Lisp isn't fast enough to write operating systems and drivers and browsers and shells in. That was once true, twenty or thirty years ago, when most operating systems we use today were being created, and that's why we're locked into this statically compiled program black box hellscape, with separately compiled processes communicating via pipes and IPC instead of just calling each other's functions and passing data structures, because you can't compile programs in a way that would allow that level of interaction, and you couldn't get away with not compiling your programs. But I don't think that's true anymore. Our computational budget has increased incomprehensibly since the late 1990s, to the point where what once was considered a horribly slow and inefficient language like Common Lisp is actually one of the fastest languages out there, and can readily be used for high performance work.

You might still argue that Lisp is too slow – that we shouldn't accept that 2x performance hit versus C in our most basic bedrock layers, because that would make everything else too slow, especially since Moore's law is largely dead, but I don't think that's actually a good argument – I think it's penny wise and pound foolish. Because what's happened in reality is that we've written all those tools in C or C++ or similar for performance, but then, because they're terrible, awful, insufficiently reliable and extensible abstractions, we've written new layers of abstractions on top of them just to get away from them, like Electron and web browsers and IDEs and so on. And those abstractions aren't free: not only do they have a performance tax just by virtue of the many extra levels of indirection involved, but they're often written using languages that are both slower and less dynamic in a way that a user could use than Common Lisp and similar languages. This means that in the end, we're probably worse off, performance-wise, than if we just implemented a good set of bedrock abstractions from the start, even if they were a bit slower, and then had to build fewer layers of abstraction and indirection on top just to escape bad abstractions. What if we spent more of the wondrous performance budget of our computers on creating better abstractions in safer, more dynamic, and more powerful languages from the ground up, instead of gaining a ton of performance budget and then wasting it all later trying to claw back safety, dynamicism, and power later on?

Crucially, I want to say that I don't believe in the myth of the "sufficiently advanced compiler" – if your programming language's paradigm and fundamental mode of operation is completely contrary to how computers actually compute – for example, if it's pure, immutable, and lazy – and it can't express more directly-mapped computations without further layers of abstraction and indirection – such as refs in OCaml or monads (which are boxed types) in Haskell – then you're never going to be able to make it fast. Nor do I think that any amount of performance loss is acceptable to achieve the goals of a dynamic, malleable computing environment – this is a tradeoff, since no one will use a computing environment that's too slow – so you need to have a language that's fast enough but also dynamic. As far as I know, only Common Lisp meets these requirements, and maybe eventually something like jank – although how fast exactly a language for what I'm proposing would need to be is an open question.

6. Programming languages are tools for thought

Written language is a tool for thought.

When thoughts are in your head, they exist as a shifting cloud of ideas and connections, never fully locked down to a static web of meaning and ideas, and with different parts of the larger thought and context continuously shifting in and out of your focus as you analyze different aspects of the problem. This is because human working memory is generally too small, and too unreliable in particular details, and to prone to cognitive dissonance, to nail down a precise set of ideas and connections, expressed in specific detail, and hold them perfectly still, while still having cognitive space left over for manipulating them in some way.

When you're writing your thoughts down, however, the page is acting as an external brain: you express your thoughts on the page and then instead of having to hold them in your working memory in order to analyze them, it holds them for you, ensuring clearer recall and more mental space for actual consideration.

Of course, we can't just directly dump our thoughts onto a page. We have to use a language of some kind. This entails a few things. First of all, we have to perform the act of "collapsing the wave function": deciding which specific ideas and connections, which specific pieces of context and larger thoughts, are relevant, and at what level of detail. Second, we have to express those things in terms of the language we've chosen to note them down with: choosing what words, what grammar, what structure, what order, and so on. The act of expressing these thoughts in a language can bring even sharper focus and clarity to what precisely we mean and how it's arranged, and how our logic flows.

This can then make our understanding and manipulation of our own ideas clearer.

I think the same is true for programming languages. They, too, are a language for expressing ideas about behavior, operations, categories and ontologies, relations, and abstractions. And they, too, are written down using an "external brain." The only difference between a programming language and something like mathematics, or even natural language, is that it is designed to be even more precise and unambiguous, enough that a computer can execute it.

This might seem like a handicap for the expressivity of a programming language compared to other formal languages to those who are used to seeing code in traditional languages like Java, but if you've seen good high level code in Lisp, Scheme, Haskell, or APL, then you'll know that they can be just as beautiful and comprehensible (and concise, depending on your taste in languages – I prefer lots of full words, like in Scheme and Lisp, over terse point-free programming like in Haskell or APL) for expressing ideas as any other formal notation.

Moreover, a powerful (so that you can express any abstraction and mental model you need), high level (so that you're freed from accidental complexity in your expressions), symbolic (so that you have a way of representing unique concepts – essentially "proper nouns" – in a terse language level way, and doing symbolic manipulation for things like mathematics and logic) or logical (so that you can speak declaratively about the problem space and constraints) computational language can actually be a far more efficient vehicle for expressing these things than either mathematics or natural language.

This is due to the fact that programming languages are executable. If the measure of truly understanding something is being able to do it in all general situations, and teach it to others, then being able to write working code to represent an idea represents the strongest form of understanding of all: being able to write a description of the idea that is so precise, yet so general, that it can actually teach a computer to do it, and it actually works. For example, Sussman, Wisdom, and Mayer's Structure and Interpretation of Classical Mechanics is a graduate physics textbook expressed entirely in terms of generic programming in Scheme, instead of mathematics. In the Preface, they state why:

Classical mechanics is deceptively simple. It is surprisingly easy to get the right answer with fallacious reasoning or without real understanding. Traditional mathematical notation contributes to this problem. Symbols have ambiguous meanings that depend on context, and often even change within a given context. … [Therefore] [w]e require that our mathematical notations be explicit and precise enough that they can be interpreted automatically, as by a computer. As a consequence of this requirement the formulas and equations that appear in the text stand on their own. They have clear meaning, independent of the informal context. …

Computational algorithms are used to communicate precisely some of the methods used in the analysis of dynamical phenomena. … Computation requires us to be precise about the representation of mechanical and geometric notions as computational objects and permits us to represent explicitly the algorithms for manipulating these objects. Also, once formalized as a procedure, a mathematical idea becomes a tool that can be used directly to compute results.

Active exploration on the part of the student is an essential part of the learning experience. … That the mathematics is precise enough to be interpreted automatically allows active exploration to be extended to it. The requirement that the computer be able to interpret any expression provides strict and immediate feedback as to whether the expression is correctly formulated. Experience demonstrates that interaction with the computer in this way uncovers and corrects many deficiencies in understanding.

Use of programming languages this way is not limited to a few academics, either. Programmers in the industry really do it too: all the knowledge that goes into any reasonably sized program is going to be far too much for anyone to actually hold in our heads, especially over long periods of time. And comments, while they can help, violate the principle of a single source of truth: they can end up encoding misconceptions about what the code actually does or the knowledge it encapsulates, or they can get out of sync. So when we write code, we have to employ the same dynamic with it that we have with writing, relying on it as a sort of cybernetic extension of our minds, holding our thoughts about whatever we're programming as we do other things, relying on the mental models and knowledge encoded in it to think with.

This is totally unconscious, most of the time: once you've integrated well into a codebase, the surrounding code with its accompanying ontology and knowledge just automatically filters into and structures your thinking about whatever you're writing or reading. And whenever you're writing code, you will tend to structure it and organize it to match your unspoken ontology of the problem, and encode your knowledge about the behavior required and the other behavior and properties of the surrounding system.

This means a few things.

Even if you don't use them often, or even most of the time, your language needs to be able to express powerful abstractions when necessary. Limiting the range of abstractions your programming language can express, or the level of abstraction of those concepts, is like giving your programmers brain damage. They will no longer be able to think as well, because they'll have gaps in their external minds where concepts that might be useful for modelling the world should go.
Conversely, using overcomplicated ways of expressing ideas is just defeating the purpose of expressing them. You want to express ideas as clearly as you can, just like in writing.
You want, as much as possible, a language that can express and manipulate whatever the primary elements of the ontology you're modeling are as first-class entities. But importantly, you don't want the mechanisms that let you talk about those entities to be highly abstract – you want them to be concrete, specific, like symbols in Lisp, not monads in Haskell, because speaking all in abstractions is not a good way to think. Humans tend to get tangled up in abstractions when they're not combined with concrete details.
Your language should be highly multiparadigm – as long as a decent level of orthogonality is maintained – because not only do you need powerful concepts and high level abstractions to be available when necessary, just like they are in people's heads when they're thinking, but having different ways of expressing concepts and abstractions is also important. This is because different problems are most amenable to different kinds of ontologies – for instance, a system with a ton of stateful components interacting is probably best represented by objects – and different people think about things in different ways. Being able to express things in a diverse way is important.

7. Limiting language expressiveness is a technical solution to a social problem (a category error)

The common objection to the idea that programming languages should be powerful enough to express almost any idea or abstraction is that this will result in certain programmers on a team – who are either much smarter than everyone else, overestimate their own intelligence, or are simply undisciplined – writing code that nobody can understand using all sorts of unnecessarily advanced features. Thus, they suggest, the correct solution is to dumb down a programming language, to hobble it, cutting out power and abstractions and concepts left and right to prevent bad programmers from getting their hands in the cookie jar.

The downsides to this approach are:

You often ultimately end up needing these features anyway, and so they slowly creep back into the language by the back door, often in a more complex and ad hoc way than they would have been if they'd been included from the start. Just look at the development of Go (the recent introduction of generics and iterators) and Java (where do I even start) over time, or the way Java programmers liberally sprinkle compiler plugins and decorators everywhere.
It makes your programmers dumber.
The consequence of your language not having a concept or abstraction when it's really needed or particularly natural for a given problem or domain is that you end up writing more code, that uses awkward and ill-fitting, counterintuitive ontologies, and can't communicate tacit knowledge well because it's so caught up in low level details and getting around the language, so that the resulting code is often far harder to understand than it needs to be.

However, all of these problems will probably be a smaller issue and/or occur less frequently than undisciplined programmers doing ill-advised tricks on the company codebase. So we do need to figure out a way to handle those programmers while avoiding these problems. How?

Luckily, there is a solution: these downsides to the language-level technological approach to solving the problem of undisciplined programmers are actually symptoms that can lead us to a better solution. They're symptoms of the fact that this language level solution is actually a category error: applying a technological solution to a social problem. The solution to undisciplined programmers is to discipline them, not to try to eliminate the need to discipline them in the first place, because that solution is like chopping down a tree in order to trim it – it gets at the root of the problem, sure, but it's too rigid, too absolute, too far-reaching for the actual category of the problem it's trying to solve.

How might we go about disciplining the undisciplined programmers, so that they don't use unnecessarily powerful features? In my opinion, the core of the process should be code reviews. The whole problem with using overly powerful features in your language, after all, is that you might produce code that your peers can't easily understand – and by extension, that future you probably won't be able to understand either. So before you can merge any code, why not ensure that a certain number of your peers have to sign off on it, and that they have to actually read and review your code for over complication before they do so? Why not set up a linter that flags the use of any features the project team as a whole has decided are likely to be misused, and directs the attention of the code reviewers to those locations in the code, and if they can't immediately see the need of whatever construct you're using, or can't immediately understand it, or if there is even one workable alternative to using that feature that doesn't have significant other downsides, then you have to go back and rewrite it using a simpler feature? Perhaps we could even have linters that automatically suggest using less complex features, when that can be ascertained, and that can be put into the CI system?

The benefit of this is that there's more flexibility – if a feature or abstraction is truly need, it can still be used – but also that it essentially "magically" adapts to the level of experience, technical expertise, intelligence, and preferences of your team, instead of having to use the coarse-grained proxy of a language with limited features. That's the benefit of using a social solution to a social problem: it tightens up the gap between the desired outcome (your team members being able to understand what you've written) and the metric/process used to achieve it (actually just asking your team members), and allows for the necessary level of flexibility.

An objection to this might be that the peers one relies on in code review processes might be anti-intellectual, themselves undisciplined, or blindly following some cargo cult software practice (such as Clean Code), such that the process becomes a farcical waste of time. However, again the solution to that is not to accept that programmers will always on average be bad and dumb as some immutable state of the world that we're just helpless to do anything about, and use absolute technological solutions to solve the problem. The solution, once again, is to use social solutions to social problems – namely, invest in your programmers. Give them mentoring, on the job training, set aside a portion of their paid time to learn and improve, perhaps with textbooks or even courses you offer them, things like that. Try to foster an environment where programmers want to stay working on a project for a very long time and become experts in that particular project – maybe offer pensions if they work somewhere long enough? – so that all the developers can get to know each other and form a common understanding about how to write and design code. Why in the hell are we as an industry optimizing for a revolving door of strangers unfamiliar with a code base and domain and unfamiliar with each other, who we assume are all under-trained, because we don't invest in their training at all, to be doing all of our huge, complex, and long term projects? It's absurd! (Well, until you remember this is capitalism we're talking about – don't want those workers to get too much leverage!)

8. TODO Structural type systems and multiple dispatch are the future of type systems

9. Why I prefer Fedora Atomic (plus BlueBuild) to Nix OS

All changes are automatically tracked and versioned when using Fedora Atomic via layers or BlueBuild modules. With Nix, all your state is in one place, yes, but it can still easily become a gigantic mess of state that's difficult to untangle, since it's an actual codebase that you have to maintain over time. You can of course layer a version control system (probably git) on top, but then you have to remember to uphold one-change-per-commit and so on.
Changing and rebuilding your image (as opposed to just layers) takes place on a server, air-gapped from your PC. This means:
- It's much more difficult for malicious software to effect your system.
- It's easier to insulate your system from possibly broken changes since there's a CI element.
- It's easier to use your computer and build the next version of your image asynchronously.
The core image uses a mostly traditional file layout, it's just that everything outside of /var and /etc is immutable, as opposed to the insane complete remodelling Nix has going on. This means things are much more likely to just work on a Fedora Atomic system, whereas Nix OS systems require all packages to be custom-patched to work, and this can cause bugs, lack of availability of packages, and so on.
The Fedora Atomic file system is pretty much human readable and human navigable, although of course not perfect, since it follows the standard UNIX filesystem layout. The Nix OS file system is totally opaque to human beings.
You can just inherit from upstream images, thus automatically getting updates from them as well, instead of having to copy and paste configuration manually.
Although less theoretically pure, Fedora Atomic's use of simple composable YAML and Bash scripts makes what everything does very clean and easy to understand, whereas Nix OS is a lazy purely functional language, which is a pain to work with. Guix of course solves this, but Guix has other problems.
Nix also has a serious cultural problem.

10. Hand tools and workbench tools

It might seem strange that I am a big proponent of malleable systems like Emacs and Lisp, and also of Fedora Atomic and GNOME. Someone who disagreed with me on one of these preference might even be tempted to accuse me of hypocrisy, or otherwise use one preference to undermine the other. However, I'd argue it's perfectly consistent due to a crucial distinction I talk about in my GNOME Is Not 'Mobile-First' essay: the split between hand tools and workbench tools (I didn't use exactly these terms in the original section, but I think these are better ones).

Hand tools are the tools you actually open your computer every day to use. They're things like your productivity system (calendar, to-do app, project manager, what have you), note taking system, text editor, email client, web browser, IDE, word processor, your drawing software, image editor, video editor, 3D modelling software, animation studio, and so on. These are the tools that you actually use to do the work you consider central, whether that's for your job or a hobby that you care about. These are the ones that are "in your hand" all the time, the ones that you really care about.

Then there's workbench tools. These represent all the infrastructure that gets you to the point where you can easily and effectively use the power tools. It's the plumbing in the background. Things like Wi-Fi, Bluetooth, file management, window management, cross-device syncing, VPNs, battery management, process management, display management and graphics, drivers, the hardware, and so on. You might think of these as the things you usually put on your desk or workbench while you're using the really important hand tools, that you pick up occasionally to do something secondary, in a supporting role to your main tools. You might also think of these as your workbench or toolbox itself.

Generally, for me, hand tools are best when they're powerful and flexible, because I don't want to have to constantly switch to other tools – with inconsistent interfaces and capabilities, and coarse-grained integration – to do related tasks. These tools also really need to be configurable, so that I can mold the tool to my individual needs and workflow over time, grow with the tool. The power comes with a steep learning curve, the flexibility and customizability come with the possibility for unreliability and error, and the customizability opens up a huge new hole to sink time into, but these drawbacks are worth it for the power I gain, because the whole point of using my computer is to access these tools, and they're deeply important to me and I'm going to spend a lot of time using them for complex and advanced tasks in the long term. Emacs and Lisp fit this description for me.

Meanwhile, workbench tools are at their best when they're simple, maximally reliable, intuitive, and distraction-free, even when that sacrifices power, completeness, flexibility, and customizability to some degree. This is because the tasks I do with these tools are totally secondary to what I actually want to get done, so I don't want to be forced to spend time learning and maintaining them, and I simply won't use any extra more powerful features even if they are there, so them being there just adds complexity and unreliability; and moreover, I don't want to have to worry about getting sucked into some customization or perfectionism rabbit hole when trying to use the tool. If I open my computer to program and I want to listen to music on my Bluetooth earbuds while I do it, the last thing I want is to have to fiddle with my Bluetooth daemon or something. I just want my workbench to fade away, to be the invisible background for the tools I actually care about. GNOME and Fedora Atomic fit this description for me.

This doesn't seem to be only my opinion, either – for instance, many Emacs users who highly value its flexibility use it on a Mac, probably precisely because they want this stable, reliable core to operate on top of.

Now, I don't think this dichotomy is absolute: some workbench tools, depending on which hand tools you use, are so intertwined with your use of hand tools that you really need them to become hand tools too. For instance process management, file management, and version control happen so often when you're a programmer that you really want them to be hand tools too – but in my opinion, when this happens, what you should do is to integrate them deeply into your existing hand tools, so that you don't really expand the surface area of the drawbacks of your hand tools, while still increasing the surface area of their benefits, and maximizing integration between tasks you often do together and intertwine, instead of adding more hand tools on top. In fact, merging as many of your hand tools into one environment as possible is usually a good thing for the same reason. This is why I integrate so much into Emacs. This is also why IDEs exist. I call tools that need to be integrated like this "secondary hand tools."

Now, does this mean that I think the way that Emacs plus Fedora Atomic and GNOME achieve my goals perfectly? No, I'm not deluded. My ideal system would have these properties:

Composed of a core operating system image (containing the entire set of infrastructure and workbench tools) that:
1. Is immutable at runtime.
2. Is updated atomically, through swapping out the core image.
3. Can be modified arbitrarily at run time through layering changes on top, where each change is transactional, version controlled, kept separate (only shadowing the core underneath), and reversible.
4. Has a way to specify in detail how to build the image with custom modifications, preferably byte-for-byte reproducibly, which allows inheritance from upstream images.
5. Has an easy path to transition from run-time layered changes to changes that are part of your image specification.
6. Is built offline, while the image is not in use, and automatically checked to make sure it will work, and then and only then provided as an update to the running image.
7. When a new upstream image is available, has all your changes rebuilt on top of the upstream image, checked, and provided as an update.
8. Can easily be built either locally on your computer and then swapped in or on your own server or in the cloud.
9. Is built from a system that fundamentally understands and is intended to work with images.
10. Can be easily reset to factory settings, removing all layered changes, at runtime.
11. No changes to the core image should effect the user's file system.
12. All the workbench tools are maximally reliable and cover all the basic use-cases, but are as simple and intuitive to use as possible, even if that leaves out some edge-case features. Something like the GNOME DE, but run by Transient menus as well as point-and-click.
…which provides a runtime userland image that is composed of a highly integrated, creation-first computing environment that combines all your hand tools into one deeply integrated thing, and provides secondary hand tools so deeply integrated, and so conveniently accessible at any point, with the rest of the hand tools in the computing environment that you barely even notice they're there or separate from the primary hand tools.
Which has the concept of "local" changes (to secondary configurations, or for installing small tools or programming toolchains) that aren't even modelled as layers on the core image, but instead as something else, maybe user-local and localized to a specific part of the fileystem, not effected by resetting the image.
Has a deep level of communication and integration between the core image and the runtime userland.

I think Emacs gets very close to this in some respects – you can't actually modify the core sources, even the Lisp ones, you can only add modifications dynamically on top at runtime, and then easily migrate those modifications to your init file to persist them, but you can even then still always remove all these modifications by starting Emacs with -Q, and it provides a deeply malleable, creation-first computing environment – and Fedora Atomic with BlueBuild gets close in others – actually tracking, versioning, and making individually reversible the runtime changes, providing a coherent way to make changes to the specification of the image instead of just things that have to get re-applied at start time every time, providing off-computer image building and checking for the built images, and actually providing all that's necessary for an OS in the image, providing atomic updates – and Lisp gets some others – an easy way to go from runtime changes to full modification of the image with save-lisp-and-die – but none of them have all of it quite yet. I think any modern stab at a Lisp OS would have to get this right from the start, to avoid turning into a totally unreliable ball of mud.

11. TODO Stallman needs to die

12. How to design software

In his classic essay Lisp: Good News, Bad News, How to Win Big, Richard Gabriel outlines two different approaches to designing software, called the MIT approach or the 'Right Thing' and the New Jersey approach or 'Worse is Better,' and describes why he thinks the latter approach to design, while worse on all the metrics he cared about – he was an MIT man himself – had "better survival characteristics."

In his estimation, this was because while the MIT approach, valuing correctness, completeness, and interface simplicity and consistency over ease of implementation or rapid design, tended to take a very long time to design, be difficult to implement, and require powerful hardware, the New Jersey designs tended to be able to come out quickly with just enough of the problem solved to be useful, evolve rapidly, and because they optimized for developer simplicity above all else, tended to be easy to implement and run in lots of places.

The result of this essay has been disastrous in many ways. Instead of listening to his call to action at the end, to try to figure out the best way to adapt some of the better aspects of the New Jersey approach into the MIT approach, almost everyone took this essay to mean that Worse is Better really was better. They got the idea in their heads that being worse (simpler, but less correct and complete) was a goal in and of itself, not a tradeoff to be made when necessary in order to make something feasible. This created an unthinking dogma in the industry that first-principles thinking was bad, that any attempt to come up with powerful fundamental concepts and use them right was wrong, and any criticism of UNIX, C, and anything else that was labeled as "New Jersey style" was heresy. "Worse is Better" became a culture, and a thought-terminating cliche, because people took "has better (short-term) survival characteristics" to mean "is good," due to the myth that the market chooses what's best.

The fact is that this is wrong. While Worse is Better solutions tend to be useful more quickly and better adapted to their contemporary surroundings – to grow like a virus, to use RPG's phrase – those solutions tend not to be built on correct fundamental principles in such a way that when they try to grow and change, to incorporate things people need from them in the future (usually retroactively adopting aspects of the Right Thing solutions they killed), they tend to become horrible shaggy monsters. These 'Worse is Better' solutions, while they might be improved over time once their success is locked in, will never become good. We will never reach the point he predicted where we have a good operating system and programming language, and they're UNIX and C++. This is because skin-deep correctness isn't correct: anything "better" built on top of fundamentally inimical primitives will be a leaky, half-baked abstraction at best that is full of gotchas and exceptions, and the heights we can ultimately reach will still be limited by the need to endure exponentially more effort to paper over the bad bedrock abstractions.

I think this approach is slowly strangling the software we have access to, and we need something better. Every platform around us has the feel of something slowly rotting, and that shouldn't be how it is. What we need to do is take up RPG's challenge: figure out how to incorporate the best aspects of both approaches into something better. Few people have tried to do this yet, because it's far easier just to relax into a philosophy that doesn't require real effort – since it prioritizes simplicity of implementation over all else – than to actually sit down and figure out a philosophy of software design that can get us where we need to go, but we really need to do it.

I don't fully know how to do this, but the first step is to figure out what the essential beneficial properties of the New Jersey method are, and which are accidental properties. Right out of the gate, counterintuitively, I don't think simplicity of implementation actually matters that much to the success of the New Jersey style.

If you look at the world around us today, every single 'Worse is Better' solution has become massively more complex than literally anything the Right Thing crowd has ever suggested, in part because of the amount of complexity that has needed to be slathered on top of the bad core to make everything work, so how complex something is to implement isn't a direct problem, and nobody finds the fact that a whole second GCC/Clang or GNU or Linux aren't implementable off the cuff a problem, because no one actually ports or spreads software by doing that. Instead, they port or spread such software by making minimal changes to the existing code, and have for a very long time. The success of a piece of software really isn't meaningfully tied to how easily people can reimplement it, and I don't think it truly ever has been, except perhaps in the exceptional circumstance of UNIX specifically, and nothing else, due to its licensing situation.

Nor is simplicity that important for portability: if you can design your system in terms of a very simple virtual machine, and implement most of the complexity of your system in terms of that virtual machine, all that needs to be ported from place to place is that one very simple program, and the rest of your complexity can follow easily. (This is in fact what the Smalltalk people did.)

What actually makes it successful then? The one consequence of simplicity that I wasn't able to dismiss as unimportant or reproducible by other means above: rapid development time. While Worse is Better solutions can often vastly outgrow the complexity of Right Thing solutions, and can often end up being less portable as well (while also being less correct and complete!), they don't have to pay those costs up front. They can get out the door very quickly with most of a solution, and then amortize those costs over time.

This, I think – being able to move quickly – sums up the rest of the benefits of the New Jersey approach as well. If you can get develop whatever you're doing quickly and get it out the door quickly, then you don't have to worry about falling behind or being made obsolete, or running out of funding. You can also develop and iterate faster, which will give you a better idea of the actual real world conditions and what's actually needed.

How can we integrate this into the Right Thing approach, though? I think the best way to do it is basically to try to get something out as quickly as possible, but to get those fundamentals right and provide a means for others to, as easily and naturally as possible, expand your quick solution to become the complete solution it was always meant to be. In essence, built the 80% solution, but with an explicit eye towards others eventually making it the 100% solution.

Essentially, the life cycle I envision is something like this:

Sit down and figure out what your goals are for your software project.
Figure out the minimal set of features you'd need to complete and correctly solve those goals. Don't get distracted with secondary things.
Take great care to make sure the system is programmable, extensible, and malleable, so that it can be extended on a first-class basis to do more things, or improve the completeness of the system in the future. No extensions, add-ons, or piping to external programs – you should be able to directly hook into the internals of the program as needed and redefine or improve anything, shape it to whatever future need there is.
When trying to find this minimal, powerful set of options, don't get caught up in it being pure/perfect/beautiful. This is the route to never getting anything done. Focus just on direct, straightforward, simple, accessible power to do things with.
Release this solution. To others, it will look like an 80% solution to the larger problem domain, but those with a good eye will see the correctness and completeness of your solution to the core problem, and more than that, they'll see that unlike New Jersey tools, you've provided malleability and programmability so that it can become a full 100% solution for them. As a result, they'll choose your tool and start building the remaining 20% of the solution.
Wait. Let the community that grows around your software build their own 100% solutions with it. Encourage them to collaborate to create standard 100% solutions for their overlapping problem sets, to make them as complete and rich and featureful as possible, to document them.
Whenever one such solution emerges as the clear winner, as long as it is powerful and complete, integrate it into the core of your original system – not by rewriting it in the host language (if it's different from the language used to extend it) but just by distributing it and its documentation with the core, helping keep maintenance going if necessary, taking a bigger role in keeping them updated on future core changes, and advertising it as part of your system. When you accept this solution, try to keep a close eye on what the Right Thing is.
Eventually, you'll have a Right Thing 100% solution that's also battle-tested, proven, responds to user needs, and can dynamically change over time.

The benefit of this is that you amortize the costs of designing and building something complete and correct over time, and also distribute them across many different interested groups, so you can get your initial design out as quickly as possible. The core being small and everything else being built on top in an interpreted runtime on top also has the benefit of automatically making everything as portable as possible.

The trick here, of course, is to strike that balance between a powerful core design that forms the correct foundations for the future, and actually being able to get your design out on time. Especially since, if all this code is going to be built on top of your core program, depending on it very tightly but not managed by you, you really need to get the basics right, because you'll never really be able to change them.

There are ways in which both Common Lisp and Emacs embody this model; I think both of them did a lot more poorly than they could have for mostly historical accident reasons (in Common Lisp's case, the performance of computers in the era it emerged in and where it happened to emerge, and in Emacs's case, simply not being the type of tool most are willing to learn and it just happened to come into existence prior to modern human interface guidelines.)

However, one of the historical accident reasons that Common Lisp didn't win out is relevant to the discussion: performance. the New Jersey approach starts with trying to make something performant, whereas the MIT approach doesn't care at all about performance until after correctness is done. The former makes for things that are beautifully adapted to their original environment and painfully outdated in the future, the latter leads to things that only become viable in the future, but by then have been sidelined by evolutions of worse past technology. The solution, I think, is to start with correctness, but then incrementally degrade it as much as necessary to achieve reasonable performance, while leaving interfaces or specifications general enough – or setting up an edition system or something – so that improvements to correctness and completeness can be made in the future without breaking everyone's code.

I really don't know. In the long run, I'm a proponent of the MIT approach. I want it to win. But there is wisdom in the virus-like properties of the New Jersey approach, and we have to figure out how to adopt that – to not let perfect be the enemy of "as good as we can make it right now, and prepared for the future."

13. The beauty of Common Lisp

It's often said that Common Lisp is ugly. I don't necessarily disagree, but in this short essay I want to briefly defend the beauty of Common Lisp, precisely in the places where we usually find it most ugly.

13.1. Lisp-2

One of the reasons most often cited for Common Lisp being ugly is that it is a Lisp-2 – namely, a Lisp where symbols don't have only one universal meaning, but two separate meanings, depending on where they appear in a form: at the car of a form, where whatever value the symbol holds will be executed as a function with the rest of the form as arguments, the symbol's function-value is used to determine what it means; anywhere else in a form (or when the symbol is by itself), the symbol is looked up in the lexical and dynamic scope to determine its value as a variable.

There are two reasons this is often found to be ugly:

Theoretical reasons: the distinction between function and variable values is found to be arbitrary – an unfortunate exposal of an underlying optimization that the user of the language shouldn't have to see. If functions are first-class values, just like any other value in the language, as they should be in a functional language like Lisp, then we should just store functions as regular values in the value cell of a symbol, and just look in the value cell of a symbol when we want to use it as a function. This seems more consistent.
Ergonomic reasons: this two namespace system makes higher order functions more awkward, because you have to worry about the distinction between the two values of a symbol, and while values can't ever be assigned to the function cell, you can regularly get function's assigned to the "wrong" cell while using higher order functions, which discourages functional programming, because then if you accidentally get a function storied in the variable, instead of (my-variable) you have to do (funcall my-variable), and if you want to get the function value that's stored in a symbol properly, instead of (another-function a-function) you have to do (another-function #'a-function).

However, I think there's a certain beauty to making this distinction: since the semantics of a symbol – the way in which it will be actually used – differ based on where it appears in a form, it makes sense for its sense to change to something applicable to how it will be used as well, just like how a word in English that appears in a verb position won't use the same definition as it would if it appeared in a subject or object position, but a different one, because it will be used differently. So when a symbol in a Lisp-2 is in a position where it will be used as a function, its meaning as a function is used, and when it appears as a variable, its meaning as a variable is applied. This also helps guarantee that we're always calling a valid function, since calling some other value is never valid, even in Lisp-1s, since only functions can be assigned to the function value of a symbol.

It's not just that it makes sense in an abstract way, either – the commonly cited benefit of Lisp-2s falls right out of this. Just as, in a natural language that didn't allow words to have different senses depending on where a word appeared, you'd need double the words, in a language where each symbol can only have one meaning, even if it appears in a totally different place and will be used in a totally different way, you suddenly have to worry about name collisions even in totally different semantic contexts from where the other use of the name exists, which forces you to think more about things and use tons of abbreviations and synonyms. Why would you want to have to deal with that for the relatively-rare case where you actually do what to use a variable as a function?

We can demonstrate that having names mean different things in different contexts (where the meaning will be used differently) is useful by just thinking about other examples, as well. For instance, imagine if you could accidentally clobber a module name with a variable name! Or, think of all the minor annoyances that could be resolved in many languages if variables didn't have to share a namespace with types and keywords, so that you could name a variable "class" or "int" if you wanted to. Even in Scheme, the poster child for Lisp-1s, module and macro names live in a different namespace than variable names to avoid clashes!

There are many other technical reasons that one might prefer a Lisp-2, at least at the implementation level, but one can always argue that those technical implementation issues shouldn't be exposed to the user, so I don't personally find them quite as interesting for their own sake as the aforementioned considerations.

(Plus, with sufficiently powerful macros, all language changes are possible)

Does this mean I think Lisp-2s are actually better? No, not really. But it's worth thinking about.

13.2. Multiple equality operators

One of the other things people seem to find most ugly about Common Lisp is the fact that it has multiple equality and copy functions, instead of having a single generic one. The reasons behind this are solid, however: what kind of equality or copy operation the programmer wants depends on the semantic intent behind a type, which can't necessarily be determined from the type itself. Thus if only one copy and equality operator can be provided, the choices are either providing only one type of equality/copying and leaving it totally up to the programmer to implement the rest, or providing generic ones that try to guess what the programmer wants based on the type and hope you get it right, and if it isn't the right test, they're just out of luck. (For a more detailed discussion of this, see this post by Kent Pitman.)

This could of course be solved by copious usage of newtypes and custom algebraic datatypes, for which you implement custom equality and copying methods, which seems to be how ML-family language programmers solve it, but this has its own downsides. First of all because overly structured data adds verbosity and lacks flexibility; and secondarily because this requires you to implement your own equality checks over and over again; and finally because it means that you can't make contextual decisions on what sort of equality check to make without introducing your own zoo of equality checking predicates.

Instead, what Common Lisp decided to do was take all of the copying and quality operators that all the different dialects of Lisp had discovered were useful and ended up using in the real world, and formalize them into a strict hierarchy of equality checking, from most to least strict, so that programmers themselves can choose up front what kind of operation they want to use in each case.

13.3. The ultimate postmodern programming language

Notes on Postmodern Programming, which I otherwise enjoy, claims that Common Lisp is a modernist programming language. While there are certainly ways in which that's true, namely that it enforces S-expressions as the universal syntax, and the fact that it's standardized, I believe that these things actually only exist to facilitate Common Lisp's perfection as a postmodern programming language. Here are the reasons I think it qualifies as postmodern:

The language has no modernist narrative into which all programming must be fit. Instead it comes with a rich set of capabilities for solving the problem using any set of concepts that will apply, often richer and better capabilities than most languages that choose only that one capability. See for instance the co-existence of low level and high level, of procedural, functional, and object-oriented, and of dynamic typing with gradual static typing.
The language is literally unmatched in its ability to allow you to mold it to the problem domain. It does not enforce any limitations – even S-expression syntax, thanks to reader macros – on you at all. You can grow and change and develop it to express any kind of way of talking about a problem at hand that you need.
The language is pluralist: all the ways of molding the language to your needs and thought processes and specific context are just language constructs that can be package namespaced, imported as libraries, mixed and matched, aliased, turned off and on. Whatever modifications you use in one part of a program need not apply to another part, and the dialect that another programmer uses in a library you depend on uses does not effect you, nor does the features anyone wants, because they don't have to be added to the language standard, those who want them can have them and those who don't don't have to worry about them.
The language was not designed up front, absent experience and response to human needs, in the abstract pursuit of beauty and perfection, like Scheme was. Instead it was designed as a way to encode the practical, real world industry experience and needs of real people doing real things. As such its design itself doesn't reflect the sort of modernist attempt to construct and control reality of most languages, but a more postmodernist "writing down of people's experiences."
The language standard only allows the language to be more pluralist than it would otherwise be: everyone can use their own favorite language extensions and domain specific languages and so on, but there is always a common, well understood bedrock beneath it all that everyone can always rely on and communicate using, which means everying is only more interoperable, which is needed for healthy bustling pluralism instead of dead fragmentation and subsequent stagnation. This also allows multiple implementations of the language to flourish as, at least ostensible, equals, where most languages are limited to only one implementation, and all those implementations can live in different places and make different tradeoffs.
Likewise, the simplicity and uniformity of the S-expression syntax is not only just a suggestion – something that can be broken of if you so desire – but is what directly enables the language to be as malleable as it is.

13.4. Cons cells

Xah Lee often complains that the cons cell is a fundamental mistake of Lisp. The Racket language hides it from the student version of their language, referring to it as merely an "accident of history." Clojure doesn't have cons cells at all.

From one point of view, this makes sense – all they are is a building block for linked lists, namely a struct holding a pointer to a piece of data, and a pointer to the next piece of data – so exposing them directly to the user to the level that Lisp does is simply unnecessary. Yes, linked lists themselves are deeply useful data structures for a functional language, but the building block they're implemented out of doesn't need to be exposed, and the equivalence of right-nested cons cells and a list means that one can seemingly (for those who don't understand these data structures, usually those coming form other languages) "magically" turn into the other without warning.

However, I actually think this is a slightly misguided way of looking at it. Cons cells are not "just a linked list building block" that have gotten accidentally exposed. The cons cell actually does some pretty nice things! Sure, other languages have better options now, but the key is that you shouldn't get rid of cons cells until you actually include those other nice things. And at least one of these doesn't actually have an alternative.

By having the concept of cons cells natively in the language means that you can actually manipulate how linked lists are structure directly, such as swapping out a new head or tail for a list. This means you can essentially use lists as a persistent data structure if you want to. Clojure of course has superseded this by making everything persistent automatically, with no work on the part of the programmer, but other languages don't have that.
Cons cells are also just useful data structure building blocks in general, since you can actually control where the pointers in it point, and lists are composed of them and you have first-class access to them even in lists, so you can take lists and build new data structures out of them. You can make binary trees with them, without the added layer of indirection of using lists. You can also use them to build circular lists and queues, which would be much more difficult in a high level language without the ability to unzip linked lists and modify where their pointers are pointing.
Being able to represent the general idea of "these two (pointed-to) objects are related to each other" in an efficient way is very useful. In this sense, though, they're like inferior tuples.

13.5. Property lists

One of the things that Scheme specifically removed from Lisp was the idea that every symbol has a property list attached to it. This was done because the idea that mutable data could live on something like a symbol, which was supposed to be this atomic self-evaluating data structure, seemed like an impurity to the creators of Scheme.

I don't think this is the case. Being able to attach properties to symbols means that you have an automatic way to create, essentially, a proper-named entity referred to at a first class level in your source code that can have metadata attached to it. While that's not going to be useful to most programs, it seems like exactly the kind of thing that would make using a programming language as a tool for thought more useful. Instead of having to create a hash map registry of entities and then use a getter to access them, you could just refer to things by name, directly, and have meaningful information attached to that.

14. What makes a good programmer

Disclaimer: this is just based on my personal experience programming, and knowing one or two other decent programmers, as well as a few people who bounced off programming entirely. This is not based on extensive industry experience, nor any kind of science. If you're interested in the opinions of someone with much more exprience "in the industry" than I, I recommend checking out "The 10x Programmer Myth" which, despite the clickbait headline, doesn't deny the fact that there are some programmers that are much more productive than others, but just seeks to explain why that may be the case, and some confounding factors that may make people think someone is more productive than they really are.

To be a good programmer, in my opinion:

You have to be the kind of person that enjoys thinking about the complex technical details and minutiae of a topic…
…while at the same time being able to keep the bigger picture in mind, not getting lost in technical details while forgetting what your overall goals are, or what the actual practical consequences or real world applicability of those technical details are…
…and also having a sense of good taste, style, craft, beauty, and symmetry.
You can't just think of what you're creating as a one-and-done object like a scientist, mathematician, or formal methods person might think about their code; you need to be able to think about how your code will need to change over time, or how the complex environment will change around it over time, and write something with the flexibility to deal with that.
You need to be able to think in terms of preconditions, postconditions, invariants, and unknowns, but also understand that not everything can be known, and how to pragmatically focus on acting on what is known and prepare for the unknown, instead of trying to know everything up front.
You have to be the kind of person that, when presented with a problem or mistake in their or other people's code, no matter how small, annoying, or minor, sees it in one of three ways:
1. As a chance to improve – improve the code, improve your understanding of things, or improve your knowledge of things to avoid
2. As an affront to the quality and craftsmanship of what you've created that you become obsessed with needing to buff out
3. As an interesting or funny puzzle to solve
You also need to be relentlessly, if grimly or ironically, optimistic: you should be able to see ruling out explanations for a bug, or getting a different error message, and other similar kinds of failure, as forms of progress.
You also have to have the sort of mind that's extremely good at applied ontology: figuring out how best to describe a domain, a behavior, a problem, an idea, or whatever, in a way that can get to some kind of useful core and set of relations to other ideas, without getting bogged down in trying to find absolute eternal essences or perfect hierarchical taxonomies.
You also have to be very good at breaking down a problem into smaller problems, using various strategies.
You also have to be the sort of person that can really enjoy and become invested in somewhat abstract and immaterial objects and things and can get extremely interested in and enjoy reading very highly technical articles and/or studies.

Many of these are personality traits that will be difficult to change once they're locked in during childhood based on your culture, surroundings, parenting, etc, but many of them are skills that absolutely can be improved dramatically with regular, focused practice. Being smart will help you get good at these things faster, and be better at them in the very long run, but it can also blind you to the forest in favor of the trees or help you get lost in abstractions and categories without wisdom and other personality traits, and that doesn't mean someone who isn't as smart can't learn these things and get good enough at them with some extra time or effort.

15. The rise of "Whatever" Machines

Rewrite: refactor to remove all the border properties. If this results in args being merged with an empty :style map, then remove the styled-args variable entirely and have elem-wrapper use args directly instead

I've increasingly used my built-in LLM editor integration to essentially replace emacs macros and regular expression transformations with a syntactically and semantically aware fuzzy transformation system, because it's just infinitely more robust — yes, LLMs are stochastic, but so is my ability to make a very complex regex do what I want the first time — and it requires less finger contortions and looking shit up.

And before you complain, I'm actually extremely good at regular expressions compared to the average programmer, and I've taken my time to learn them. However:

they're just not suited to some tasks:
- sometimes it takes a very long time to find that creative side-step that allows a regex to represent some transformation you want
- sometimes doing it requires using obscure or non-standard features that aren't in my immediate memory
- they can often get very complex, which can lead to a lot of time wasted debugging them
- sometimes what you want to do can't be represented by them at all, but needs a combination of regexes, elisp, macros, and manual editing, so then you've got to load all that into your memory and deal with the impedence mismatches between each and the individual time sink each represents
sometimes it actually genuinely isn't worth learning the Nth intricacy of something, or even if you know it, spending the time remembering the specifics and debugging it when it goes wrong
they're hard to type
there are a million different incompatible versions of regular expressions

For all of these reasons, it can often be a very sound strategic choice, even if you care deeply about your craft as a whole, not to get sucked into the rabbit hole of using them. And whereas usually the answer to that was just to make all the manual edits by hand, LLMs save you that.

So this is not a question of "not wanting to learn", in the abstract, as if I'm lazy and don't care about anything, as it is often disingenuously framed. It's that sometimes it isn't worth learning some particular minutia: it's a distraction from your overall goal — and that overall goal can actually be improving at your craft or making a good thing for people! We have to stop fetishizing Work for the sake of Work in this Puritan way.

Also, this just saved me so much pain and heartache. I went an entire day trying to solve it on my own, and it helped me solve it in like an hour:

transcript

It's the world's best rubber duck, because instead of saying nothing, it can summarize and rephrase what you've said before, add new ideas and insights that can be wrong, but at least jog you off the one track you're stuck in and send you in new directions to explore, and can — extremely fast and pretty accurately — search the web and extract useful nuggets of information on topics from places I at least would never have thought to look or spotted them (like offhand React Native dev comments in a github issue).

It's incredible to me how extremely useful this is, compared to the twisted reality people like this portray based on one or two desultory, hostile attempts to use these things, where they purposefully refuse to learn how to use them, and their vague impressions from bad AI applications. It feels extremely disingenuous, because it relies on a set of assumptions, namely:

a machine must be perfect and make no mistakes to be useful
that our usage of even deterministic machines is, as a cybernetic feedback system, itself deterministic in achieving even sub-goals of the original aim
that it must be an all-or-nothing thing, where either you use the tool to do everything, and no longer care about your craft, or you care about writing every single class and utility function in your codebase equally and maximally
that if it's bad at one thing — or implemented badly in various places — then it must be universally bad
that a lot of our work isn't just rote work that can be enhanced by machines that mostly just do pattern matching and remixing
that it can't be useful at all if it isn't as good or powerful as the most insane hypsters claim it is (that e.g. semantically-aware text transformation using fuzzy natural language instructions isn't a massive breakthrough on its own)

None of these assumptions of which apply to anything else they look at.

16. Does AI make programmers slower?

Recently, a new study has been released, Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity:

Despite widespread adoption, the impact of AI tools on software development in the wild remains understudied. We conduct a randomized controlled trial (RCT) to understand how AI tools at the February-June 2025 frontier affect the productivity of experienced open-source developers. 16 developers with moderate AI experience complete 246 tasks in mature projects on which they have an average of 5 years of prior experience. Each task is randomly assigned to allow or disallow usage of early 2025 AI tools. When AI tools are allowed, developers primarily use Cursor Pro, a popular code editor, and Claude 3.5/3.7 Sonnet. Before starting tasks, developers forecast that allowing AI will reduce completion time by 24%. After completing the study, developers estimate that allowing AI reduced completion time by 20%. Surprisingly, we find that allowing AI actually increases completion time by 19%–AI tooling slowed developers down. To understand this result, we collect and evaluate evidence for 20 properties of our setting that a priori could contribute to the observed slowdown effect–for example, the size and quality standards of projects, or prior developer experience with AI tooling. Although the influence of experimental artifacts cannot be entirely ruled out, the robustness of the slowdown effect across our analyses suggests it is unlikely to primarily be a function of our experimental design.

This seems pretty solid, right? Not only is it a nail in the coffin for the idea that AI can meaningfully increase developer productivity for experienced developers and large scale projects — which is typically the most impactful area of the industry, as greenfield projects aren't common, and when they are, aren't as profitable, and newer developers are already less productive — but it lends credence to a belief that many anti-AI advocates have been beginning to spout, which is that developer estimates of their own productivity are not only generally innacurate (true) but should be uniquely and specifically distrusted and or outright rejected in the context of AI usage.

This is how it's been reported in the media:

Anti-AI advocate extraordinaire Gary Marcus frames it along similar lines:

BREAKING NEWS: AI coding may not be helping as much as you think

These are all decent headlines, assuming that the study's own assessment of its results are correct. They use words like "may," and at least one of them remembers to specify that it's specifically experienced developers that are under consideration. The problem is that the study… doesn't exactly find what it claims to find.

To see why, go read this article, in full. It's written by Cat Hicks, a PhD psychologist who specializes in studying software developers, who is very much against "tech bro" culture, so it's not like she has a vested interest in buying what OpenAI and co. are selling. Really, go read it, it's a very good post. I'll wait. Hell, don't even come back here, it rather speaks for itself.

But anyway, to summarize what she points out, in order of importance:

Post-study surveys asked developers to estimate AI's average impact, while pre-task assessments were done on a per-task basis, meaning that the claims of developers being wrong about their productivity are based on wholly separate metrics, where their estimates are artificially subject to biasing issues such as memory and recency bias.
When accounting for task heterogeneity, AI showed speedups in some areas and slowdowns in others, contradicting the idea that even senior developers were consistently slowed down by AI across the board.
The study claimed developers' initial predictions about AI's impact were inaccurate, but developers' qualitative reports did actually match the actual measured times when those times were split up by general task type as mentioned in the previous point, since their estimates often mentioned those differences in their estimates of whether they were sped up or not.
Developers were not told how much, or even whether at all, to use AI — just that they could use it. This violates a core invarient necessary for an RCT.
"[T]he intervals around the slowdown effect are large, while on the negative side, still quite close to overlapping 0%"
Only 44% of developers had prior experience using the AI tool (Cursor), and no training was provided on its use.

(Annoyingly, the counter-argument to this point is "well what if being trained to use the tool makes you slower without it, so it would artificially look like a speed boost" — yeah, that's the problem, because this study is using within-subject testing, we can't tell the difference between those scenarios. Maybe it's a bad study? Also, isn't it awfully convenient to assume that for this one tool, you don't need to be trained to use it to become productive with it, and training with it is some kind of irreversible effect that suddenly makes it impossible to assess your productivity with it?)
The study's tasks were highly heterogeneous, drawn from developers' personal work queues, all from highly heterogenous repositories/codebases.
Developers chose which tasks to perform and their order (even doing multiple tasks at the same time!), introducing potential skewing and order effects.
The design mixed AI-assisted and manual tasks for the same developers, deviating from a true randomized controlled trial (RCT).
The way the study and questions were framed verbally, and the introduction of monetary rewards, introduce possible skewing factors into developer estimates of AI's probable productivity.

What the actual study shows, when they control for all the variables their main most-referenced graph didn't, looks a lot more like what people like Simon Willison are saying:

https://lh7-rt.googleusercontent.com/docsz/AD_4nXeW8zy5o2gkRrWigTXlDAexF4cZv_lf5kRUaeZ6trayYW1pxfgvC3gymtKsLM3_vihkXAaxCU6RwZIYhccHyQaaPB2FFEQ2cdl_nZJaRlkkVw7f_x6Nnqdsqp4QO7AQjdsp_Pfd?key=HYycTz68z328S1ovjHk8Lw

https://lh7-rt.googleusercontent.com/docsz/AD_4nXcvbcHMP6rdAD10Aov46BnUhd2KrPhqpUg2qHJpbLAqzW-2rmb4PfdCYoN-Molo6DagYfOEbR9EmxCxQJdygk2TibydsuFX6ULyQUXXr38sCEDRjdZKgbNtGVm1x2Em8wbZa7_b?key=HYycTz68z328S1ovjHk8Lw

Ultimately, this only tells people who have been sane about AI what we already know, but it looks to anti-AI advocates like it confirms their beliefs, and that's annoying.

Footnotes:

In that, if you mess up some imperative-looking, seemingly simple do notation in Haskell, you're going to get probably a page or two of dense type system notation about abstract type algebra.