Stop Writing Dead Programs

Jack Rusher

Original video.

My talk today is Stop Writing Dead Programs. This is sort of the thesis statement for the talk, even though it’s 40 years old: this Seymour Papert quote saying that we’re still digging ourselves into a kind of a pit by continuing to preserve practices that have no rational basis beyond being historical.

A strong runner up was this quote, which captures the essence of what we should be trying to do when creating new languages:

“The Self system attempts to integrate intellectual and non-intellectual aspects of programming to create an overall experience. The language semantics, user interface, and implementation each help create this integrated experience.” (source)

I will start with a somewhat personal journey in technology. I’m going to ask you for some feedback at some different places, so first off – by applause – how many of you know what this is? Okay, okay, that’s actually more than I expected. Now, how many of you actually used one of these? Okay, so what I can say is that I am part of the last generation of people who were forced to use punch cards at school. I still had to write Fortran programs with punch cards, and this thing is a card punch. It’s like a keyboard, except when you press the keys you’re actually making holes in a piece of paper, and then you feed them into this thing in the back, and the pieces of paper look like this. So each one of these vertical columns is basically a byte, and you’re stabbing through the different bits of the byte to indicate what letter it is. If you look at the top left corner, you see Z(1) = Y + W(1). This is one line of code – a card is one line of code. Something to notice about this card: it’s 80 columns wide. We’re going to come back to that later.

Some commenters were confused that we still used punched cards in the 1980s, when display terminals already existed. This was in the context of a required class for engineering students, to prepare them for the possibility that they would encounter punch cards in the wild. Most of us never did, beyond this one class. The design dates from 1928 – this is a Hollerith punch card, the same one used ever since.

Now, what does a program look like if you’re programming like this? It looks like this: it’s a deck. Now notice the rubber band. When you’re doing this, you live in terminal fear that you will drop the deck of cards – re-sorting them is a terrible experience. That long diagonal stripe there is so that the person who made this particular deck could put it back together without having to look at every single line in the process. And the words written on the top of the deck indicate where different subroutines are located within this program.

Now, to give you a sense of how long these programs can get, this picture (forgive me, it’s a low quality picture) shows the actual reader I used, and in the front there is an actual program I wrote – the one in the lower right hand corner, a Fortran program to simulate rocket flight, because my particular school had a connection to NASA and we did a lot of rocket-y things. Right, so can you imagine how long it took me to punch all these and put them in there? What we would do is give them to a system operator, who would feed them into a computer.

In this case the computer I personally used was this one. This is a VAX-11/780. This machine cost nearly a million dollars, and had 16 megabytes – that’s megabytes – of RAM, ran at 5 megahertz – that’s megahertz! This thing in front of me here is thousands of times more powerful than the machine that I was using then – that the whole campus was using to do these kinds of things then – and what would the output look like that came from sending this enormous deck of cards in?

Well, it would come out on a line printer that looks like this. And you wouldn’t get it right away – an operator would give it to you later. Note the vintage haircuts; the fellow in the middle there is the actual operator who was handing me these outputs, and he’s the person who gave me these photos of this equipment. So this process, as you can imagine, was hard, but it was hard in a dumb way. Some things are hard because they have to be, and I really support the idea of overcoming challenges and doing hard things, but this was hard for reasons that had nothing to do with the actual problem you were trying to solve.

Like, something with a rocket and a simulation, and you’re thinking about not dropping your punch card deck, and it’s taking you forever to find out what happened. So, it really hinges on your ability to emulate the computer in your head, because the computer’s not going to help you in any way – there’s not an editor, there’s nothing – and that in turn hinges on working memory, which is something that is not very well distributed among humans. There were a small number of us for whom this whole thing came pretty naturally, and we were treated as, like, special people – as kind of high priests with magical powers – and this is how we came to think of ourselves, right, that we’re special because we can make it work.

But the truth is we were less priests like this than we were monks like this – hitting ourselves in the head. Right, but the problem is – as Peter Harkins mentions here – that programmers have this tendency, once they master something hard (often pointlessly hard), to feel proud of themselves for having done it and to perpetuate the hard nonsense, rather than making it easy.

And I’m going to argue that a lot of what we still do today is very much like what I was doing on that old VAX. For one thing, there’s a lot of batch processing going on, and what’s wrong with batch processing? Hella long feedback loops. It’s no good – it takes you forever. It took me 45 minutes to find out what a one-card change would do in the printout that I would get back, because that was the loop. You’re thinking: well, it’s not like that for us, right? We’re not poking holes in paper cards – we have display terminals!

But how many of you guys have compile cycles that can take 45 minutes? Famously, the Go team wrote Go because they were so angry about waiting for an hour to see what was going to happen with some C++ code they were running on some horrible giant Google codebase. Maybe you want to deploy your stuff and see if it works, because we’re all running web apps now. So do you, like, stuff it in a Docker container, and then ship it out to the cloud and wait for a CI job? How long does that take? Two hours for this guy! I mean, why do we tolerate this? This is crazy! Docker shouldn’t exist. It exists only because everything else is so terribly complicated that they added another layer of complexity to make it work. It’s like they thought: if deployment is bad, we should make development bad too. It’s just… it’s not good.

So, what kind of things do we inherit from this way of thinking about the world? We get funny ideas built into programming about time and state. Ideas like: there should be a compile/run cycle. This is a terrible idea, but it’s an ancient one – that you’re going to compile the thing, get an artifact, and run the artifact over there, and that those two things are completely different phases of your process. There’s going to be linear execution – most programming languages assume that there’s only one thread and you’re going to run straight through from the beginning to the end; that your program is going to start up from a blank state and then run to termination. Now, how many programs that we actually write do that? We’ll revisit that in a moment.

This really only works if your program is some kind of input/output transformer. So there’s no runtime introspection, because runtime is happening over there and your actual work is happening over here, and you just have to kind of guess from what happened, how it might be related to your code, and if there’s a bug – well, sorry, failures just halt your program. You get maybe a core dump, or you get a log message somewhere with a stack trace in it.

Now, what kind of programs do we really write? Mostly long-lived servers. I’ve got server processes with uptimes of a thousand days. They don’t work the same way /usr/bin/sort works, and I don’t want a process that’s optimized for writing that. We also write GUI programs, which are even more intense than this. You’ve got all of these different kinds of input coming into the program – it’s talking to the keyboard, it’s talking to the mouse, it’s talking to the network; if it’s Zoom, it’s talking to the camera and the microphone – it’s crazy.

So this approach to programming just doesn’t work well for the things we actually build. It also infected programming language theory. So, if the program is a static artifact, what does that mean? It means we’re mostly going to concentrate on algebraics, so we’re going to talk about syntax and semantics and very little else. There’s going to be no concern really for pragmatics – and what I mean here by pragmatics is what it’s actually like to interact with your programming environment, and this leads to mathematics envy and a real fixation on theorem proving.

So, to give an example of what happens when people actually concentrate on a part of programming and make progress, we’re going to take a quick tour through syntax and semantics. We’re going to do a simple transformation here: we’ve got 1 through 4, and we want it to be 2 through 5. We want it to be relatively general. I’ve written some example programs that do this in a variety of programming languages. The first one here is in ARM64 machine language, because my laptop happens to run this processor now. As you can plainly see from this code, it starts off… oh wait! Does everyone here understand ARM64?

Okay, all right, it’s a little easier if I do this, so you can see where the instructions are within these different words. This is a cool instruction set. It’s not like x86, where the instructions are all different lengths; in ARM64, they’re all the same length because it’s RISC. But we’ll do it in assembly language – it’ll be easier, right. So we’ll start with this label here, add one, and we’ve got the signature of what it would be as a C program after that.

What am I actually doing when I write this program? Well, the first thing I’m doing is moving things from registers onto the stack. Why am I doing this? I’m doing this because the ABI says I have to. No other reason. It’s nothing to do with my problem. And then I want to call malloc, because I have to allocate some memory to return the new, you know, array, with the new stuff in it. So what I have to do… I’m doing crazy things. Look down here – you see the registers are all called with X names? That’s because the 64-bit registers have X names, but I get down here to set up for malloc and now I’m using W names. Why? Well, I just have to know that I have to do something special if it’s a 32-bit number – the W name is the lower 32 bits of the same register – and it’ll mask off 32 of the bits and still work great.

Now I have to stuff things in these registers. I have to multiply one of the variables. Do I use a multiply for that? No, I do it with a bit shifting operation, because that’s what’s faster on this processor. And then I call malloc, and I get back what I want. Great. Now, I want a loop. This is what a loop looks like. Notice we’re on the second page, and all I’m doing is incrementing some numbers.

So, I come through and I do a comparison. Okay, is this register that I put this value into zero? If it’s less/equal, then I jump to return. You can’t see return – it’s on another page. There’s a third page. So, I move zero into this other register and I go through here and bang bang… I’m not going to bore you with the whole thing. I’m bored just talking about it. Imagine how I felt writing it! And then at the end I have to do the reverse of the things I did at the beginning, to set everything back into the registers from the stack where I saved them. Why? Because I have to have the right return address to give this thing back. I have to do this like a voodoo incantation, because it’s what the processor wants. Nothing to do with the problem I’m trying to solve.

How can we do it better? Hey look – it’s C! This is exactly the same program, in many fewer lines of code. However, it has a load of problems that have nothing to do with what I’m trying to accomplish as well. For one, I have to pass two things: I have to pass the length of the array separately from the array. Why? Because there’s no sequence type in C. Great work guys! 😆 So then, from there, I want to return this value – this modified sequence. And what do I have to do?

Well, I had to do this in assembly too, but this is crazy: I have to allocate memory, hand it back, and then hope that the other guy is going to free that memory later. This has nothing to do with what I’m trying to accomplish. I want to increment each of these numbers, and I do it with a for loop that counts from one to the length of the array. Is counting to the length of the array relevant? Right, no. No, this is not relevant. In fact, essentially one line of this whole thing – the actual increment – is the only thing that actually matters.

On the other hand, I can compliment C as a portable assembly language, because you see I don’t have to do the stack nonsense by hand, and instead of telling it that it’s four bytes wide I can actually use sizeof to know that – but that’s about the only way it’s really an improvement. Now let’s look at [Lisp](https://en.wikipedia.org/wiki/Lisp_(programming_language)). Note that Lisp is about 10 years older than C. Here I have a sequence abstraction: I have four numbers and I can use a higher order function to go over it and add one to each of them. This is a tremendous improvement, achieved by going back in time.

But we can do better. We can do better than this notation. We can go to Haskell. So, in Haskell what do we have? This is really lovely: we auto-curry the (+ 1), and we get a function that adds one. This is getting pretty concise. Can anybody here quickly name for me a language in which this exact operation is even more concise? I’ll give you a moment. I hear [APL](https://en.wikipedia.org/wiki/APL_(programming_language)), and indeed, APL! So here we have rank polymorphism. I have a single number – a scalar – and I have a set of numbers. Note that there’s no stupid junk: I don’t have to put commas between everything, and I don’t have to wrap anything in any special delimiters or anything of this nature. I just say add one to these numbers, and I get what I was after.

So if we start from the assembly language and come to the APL – which is, again, about eight years older than C – we find that syntax and semantics can take us a long way. But there are other things that we care about where no one has put in this much effort, and one of those things is state and time. Almost no programming language does anything to help us manage state over time from multiple sources. There are some notable exceptions, and I will talk about them now. So, Clojure: because Rich Hickey really cared about concurrency, he included immutable data structures. So now you don’t have everyone constantly banging on the same things and crushing each other’s data. This is very helpful.

What else? He’s got atoms. These are synchronized mutable boxes with functional update semantics. Everybody uses these. These are great. He also has a full Software Transactional Memory implementation that frankly nobody uses, but it’s still great – it just has a more complicated API, and the lesson from this is probably: if you want people to do the right thing, you have to give them an API simple enough that they really will. Then on top of this we have core.async. Now, I have less nice things to say about core.async. I like Communicating Sequential Processes the way everybody else does, but this is implemented as a macro, and as a consequence, when it compiles your CSP code you end up with something that you can’t really look into anymore.

Like, you can’t ask a channel how many things are in that channel. You can’t really know much about what’s happening there. And I would say that on the JVM, I agree with what Rich said the year before he created core.async, which is that you should probably just use the built-in concurrent queues. Now, in ClojureScript, of course, these things were more useful, because everyone was trapped in callback hell. We’ll see what happens moving on, now that we have async/await in JavaScript. Moving on to another implementation of CSP: Go. Go actually did something good here. (I’m not going to say much else that’s great about Go. The Go team includes several all-time great programmers, and I respect them all, but I do feel that they had a chance to be more ambitious than they were with Go, which – with the weight of their reputations and the might of Google behind it – could have shifted the culture in a better direction.) They built a fantastic runtime for this stuff. It’s really lightweight, and it does a great job.

The bad news is that Go is a completely static language, so even though you should be able to go in and ask all of these questions during runtime while you’re developing, from within your editor, like a civilized person, you can’t. You end up with a static artifact. Well, that’s a bummer. Okay. And I would say, actually, before I move on, that anytime you have this kind of abstraction where you have a bunch of threads running – when you have processes doing things – you really want ps and you really want kill. And, unfortunately, neither Go nor Clojure can provide these, because their runtimes don’t believe in them. The JVM runtime thinks that if you kill a thread you’re going to leak some resources, and that the resources you leak may include locks that you need to free up other threads running elsewhere, so they’ve just forbidden the whole thing. And in Go you have to send it a message, open a separate channel, blah blah blah.

Erlang, on the other hand, gets almost everything right in this area. They’ve implemented the actor model, and they’ve done it in a way where you have a live interactive runtime, and because they’re using shared-nothing state and supervision trees, you can kill anything anytime and your system will just keep running. This is fantastic. This is great. Why doesn’t everything work like this? It also comes with introspection tools, like Observer, that should make anyone using any other platform to build long-running server things fairly jealous. Now, when I say this, I’m not telling you that you should use Erlang. What I’m telling you is that whatever you use should be at least as good as Erlang at doing this, and if you’re developing a new language – for God’s sake – please take note.

I can talk now about something that I worked on with my colleague Matt Huebert, and which I particularly like: a hack in ClojureScript that we call cells, which takes spreadsheet-like dataflow and adds it to ClojureScript. The cells project was Matt’s baby – he did almost all the coding; I worked with him as a mentor because I had already implemented a number of dataflow systems. It resulted in a paper that was delivered at the PX16 workshop at ECOOP in Rome in 2016. You’ve got things like this, right: you say, here’s an interval, every 300 milliseconds give me another random integer, and it does. And then you can have another thing refer to that – in this case consing them on – and now we build a history of all the random integers that have happened. What else can you do? Well, you can refer to that, and you can (take 10) with normal Clojure semantics, and then map that out as a bar chart. What do you get? A nice graph – a graph that moves in real time. Or we can move on to this: we added sort of Bret Victor-style scrubbers into it so that you could do these kinds of things. I’ll show you instead of telling you, because it’s obvious if you look at it what’s going on here. We did this partially to show people that you really can just program with systems that have all those features that Bret was demoing. The source code’s still out there – anybody who wants to do that can do that.

We moved on from that to maria.cloud, which takes all of that code we wrote for cells and turns it into a notebook. Maria was a joint project of Matt, Dave Liepmann, and myself. We wanted a good teaching environment that requires no installfest for ClojureBridge, so we actually did this for learners. Take a look at this: it’s a computational notebook. It has the cells, it gives you much better error messages than default Clojure, and so on. We used this to teach, and it was a great experience. Currently – this year – thanks to Clojurists Together, we have some additional funding to bring it up to date and keep it running. I encourage everybody to check it out. The last thing here on this list is the propagators.

The propagators come from Sussman. This is Sussman’s project from around the same time that actors were happening and Alan Kay was first getting interested in Smalltalk – a really fertile scene at MIT in the early 70s. It was actually the project that he originally hired Richard Stallman, of the GNU project, to work on as a grad student, and he later did some additional work with Alexey Radul, which expanded the whole thing. I can’t tell you all about it here – there’s just too much to say – but I can tell you there was a fantastic talk at the 2011 Strange Loop called We Really Don’t Know How to Compute!, and I recommend that you watch it when you get out of Strange Loop. Just go home and watch that talk. It’s amazing. A side note is that the propagator model was used by one of the grad students at MIT at the time to make the very first spreadsheet: VisiCalc was based on this model. This is a really useful abstraction that everyone should know about. It’s dataflow based, it does truth maintenance, and it keeps the provenance of where all of the conclusions of the truth maintenance system came from, which means it’s probably going to be very valuable for explainable AI later.

There are a number of other approaches I really like, but which I didn’t have time to get into here. FrTime, from the Racket community, is great. In terms of formalisms for reasoning about this sort of thing, I really like the Π-calculus.

We’ll move to another area where there’s been even less progress. Now we’re getting to the absolute nadir of progress, and that’s program representation. Let’s look at that punch card again: 80 columns, there it is. Now look at this. This is the output of a teletype. Notice that it is fixed width and approximately 80 columns. Notice that the fonts are all fixed width. This is the teletype in question. It looks like it should be in a museum, and it should be in a museum, and – in fact – it is in a museum. Next we’ve got this: this is the terminal on which I did a lot of hacking on that VAX that you saw earlier (when I wasn’t forced to use punch cards), and a lot of that was in languages like VAX Pascal – yeah – but also Bliss, which was pretty cool. You’ll notice that this is a VT100 terminal.

And all of you are using machines today that have terminal emulators that pretend to be this terminal; that’s why they have VT100 escape codes, because those escape codes first shipped on this terminal. Now we’ll move on to another terminal.

This is the one that I used when I was doing all of my early Unix hacking back in the 80s. This is called an ADM-3A.

Now, by applause, how many of you use an editor that has vi key bindings? Come on! Yeah, all right, yeah. So then you might be interested in the keyboard of the ADM-3A, which was the one that Bill Joy had at home to connect to school through a modem while he was writing vi. So here it is. Note the arrows on h-j-k-l. They are there because those are the ASCII control codes that move the roller and the printhead on the old teletype that you saw a moment ago – you’d hit control plus those keys to control a teletype. We used to use CTRL-h to back up over printed characters and then type a sequence of dashes as a strikethrough on these old printers, and we used the same trick on display terminals to make fancy spinning cursors. The ADM-3A happened to have the arrows on those keys, so he used them.

Look where the control key is. For all you Unix people, it’s right next to the a. To this day, on this supercomputer here, I bind the caps lock key to control because it makes my life easier on the Unix machine that it is. Look up there, where the escape key is, by the q. That’s why we use escape to get into command mode in vi, because it was easily accessible.

Now scan across the top row, just right of the 0. What’s that? The unshifted character on that key is the :. That’s why it does what it does in vi – because it was right there, no shift required. And now the last one, for all the Unix people in the audience: in the upper right hand corner there’s a key where, when you hit control and that key, it would clear the screen and take the cursor to the home position. If you did not hit control, but instead hit shift, you got the ~. Notice that tilde is right under home. If you’re wondering why your home directory is tilde-whatever-username, it’s because of this keyboard. Now here is Terminal.app on my mega supercomputer. Notice 80 columns of fixed width type.

Notice that when I look at the processes they have ttys – that stands for teletype. This machine is cosplaying as a PDP-11.

Now, whenever I get exercised about this and talk about it, somebody sends me this blog post from Graydon Hoare, in which he says he’ll bet on text. He makes good arguments. I love text. I use text every day. Text is good! The thing about it, though, is that the people who send me this in support of text always mean text like this – text like it came out of a teletype – and never text like Newton’s Principia, never text like this from Wolfgang Weingart. That is, these people don’t even know what text is capable of! They’re denying the possibilities of the medium! This is how I feel about that:

💩

I’ve promised Alex I will not say anything profane during this talk, so you will be seeing this 💩 emoji again.

The reason I disagree with this position is because the visual cortex exists, okay? So this guy, this adorable little fella, he branched off from our lineage about 60 million years ago. Note the little touchy fingers and the giant eyes, just like we have. We’ve had a long time with the visual cortex. It is very powerful. It is like a GPU accelerated supercomputer of the brain, whereas the part that takes in the words is like a very serial, slow, single-thread CPU, and I will give you all a demonstration right now. Take a look at this teletype-compatible text of this data and tell me if any sort of pattern emerges. Do you see anything interesting? Here it is plotted X/Y. Your brain knew this was a dinosaur before you knew that your brain knew this was a dinosaur.

This dataset is Alberto Cairo’s Datasaurus. That is how powerful the visual cortex is, and there are loads of people who have spent literally hundreds of years getting very good at this. Data visualization. If I gave you a table talking about the troop strength of Napoleon’s March to and from Moscow, you’d get kind of a picture. But if you look at it like this, you know what kind of tragedy it was. You can see right away. This was 175 years ago, and we’re still doing paper tape. Graphic designers – they know something. They know a few things.

For instance, they know that these are all channels. These different things – point, line, plane, organization, asymmetry – are all channels that get directly to our brain, and there is no need to eschew these forms of representation when we’re talking about program representation. I recommend that everyone in this audience who hasn’t already done so go get this 100 year old book from Kandinsky and get a sense of what’s possible. Here’s one of his students working on some notation. Look how cool that is! Come on!

All right, so another thing with text is that it’s really bad at representing graphs with cycles, and our world is full of graphs with cycles. Here’s a Clojure notation idea of the taxonomy of animals, including us and that cute little tarsier. And it works fine, because it’s a tree, and trees are something text is good at – they can be expressed as containment, in a single acyclic pass. Now this sucks to write down as text: this is the Krebs cycle. Hopefully all of you learned this at school; if not, maybe read up on it. If you imagine trying to explain this with paragraphs of text, you would never get anywhere. Our doctors would all fail. We would all be dead.

So instead, we draw a picture. We should be able to draw pictures when we’re coding as well. Here’s the Periodic Table of the Elements. Look how beautiful this is. This is 1976. We’ve got all these channels working together to tell us things about all these different elements, and how they interact with each other.

Another area that we’ve pretty much ignored is pragmatics. What I mean by that – I’m borrowing it from linguistics, because we’ve borrowed syntax and semantics from linguistics – is the study of the relationship between a language and the users of that language, and I’m using it here to talk about programming environments. Specifically, I want to talk about interactive programming, which is, I think, the only kind of programming we should really be doing. Some people call it live coding, mainly in the art community, and this is when you code with what Dan Ingalls refers to as liveness. It is the opposite of batch processing: there is a programming environment, and the environment and the program are combined during development.

So what does this do for us? Well, there’s no compile and run cycle – you’re compiling inside your running program, so you no longer have that feedback loop. It doesn’t start with a blank slate and run to termination; instead, all of your program state is still there while you’re working on it. This means that you can debug, add things, and find out what’s going on, all while your program is running. Of course, there’s runtime introspection, and failures don’t halt the program – they give you some kind of option to fix what’s happening and continue. This combination of attributes, I would say, is most of what makes spreadsheets so productive. And it gives you these incredibly short feedback loops, of which we’ll now have some examples. If you’re compiling some code, say, in Common Lisp, you can compile the code and disassemble it and see exactly what you got. The program is alive right now, and I’m asking questions of that runtime. I look at this and I say: okay, 36 bytes – that’s too much – so I’ll go through and add some optimizations, recompile… 16 bytes, which is about as many instructions as I want to spend on this.

Now, I know a bunch of you are probably allergic to S-expressions. Here’s Julia. You can do exactly the same thing in Julia. Look at this: you get the native code back for the thing that you just made, and you can change it while it’s running.

A lesser form of livecoding is embodied in PHP. We could spend an hour discussing all the weird, inconsistent things about that language, but I’d argue that the short feedback loops it provides are why so much of the internet still runs on it today (Facebook, Wikipedia, Tumblr, Slack, Etsy, WordPress, &c). Now, what about types? This is where half of you storm off in anger. So, I’m going to show you this tweet. I wouldn’t be quite this uncharitable, but I broadly agree with this position. It’s a lot of fun. I have been programming for 45 years. I have shipped OCaml. I have shipped Haskell. I love Haskell, actually – I think it’s great. But I would say that over those many decades, I have not really seen programs in these languages have any fewer defects than programs in any other programming language that I use, modulo the ones with really bad memory allocation behavior. There has been considerable empirical study of this question, and there has been no evidence.

(I was going to do a little literature review here to show that development speed claims for dynamic languages and code quality/maintenance claims for static languages appear to have no empirical evidence, but Dan Luu has already done a great job of that, so I’ll just link to his page on the topic:

“[U]nder the specific set of circumstances described in the studies, any effect, if it exists at all, is small. […] If the strongest statement you can make for your position is that there’s no empirical evidence against the position, that’s not much of a position.” In short, the choice doesn’t seem to matter. So if you like programming in those languages, that’s great! I encourage you to do it! You should program in whatever you enjoy, but you shouldn’t pretend that you have a moral high ground because you’ve chosen this particular language. And I would say, really, that if what you care about is systems that are highly fault tolerant, you should be using something like Erlang over something like Haskell, because the facilities Erlang provides are more likely to give you working programs. )

Imagine that you were about to take a transatlantic flight. If some engineers from the company that built the aircraft told you that they had not tested the engines, but had proven them correct by construction, would you board the plane? I most certainly would not. Real engineering involves testing the components of a system and using them within their tolerances, along with backup systems in case of failure. Erlang’s supervision trees resemble what we would do for critical systems in other engineering disciplines.

None of this is to say that static types are bad, or useless, or anything like that. The point is that they, like everything else, have limitations. If I’d had more time, I would have talked about how gradual typing (e.g. Typed Racket, TypeScript, &c) is likely an important part of future languages, because that approach allows you to defer your proofs until they can pay for themselves. You can throw rotten fruit at me later. You can find me in the hallway track to tell me how wrong I am.

So, I’ve said all that, but I’ll also show you probably the most beautiful piece of [code] that I’ve ever seen. Like, the best source code in the world. And that’s McIlroy’s Power Serious, which happens to be written in Haskell. This is a mutually recursive definition of the power series of sine and cosine in two lines of code. I want to cry when I look at this because of how beautiful it is. But that has nothing to do with software engineering. Do you understand what I’m saying? That’s a different question. The beauty of the language is not always what gets you to where you need to go. I will make an exception here for model checkers, because protocols are super hard! It’s a good idea to try to verify them. I’ve used Coq and Teapot [for example] for these kinds of things in the past, and some systems do have such a high cost of failure that it makes sense to use them. If you’re doing some kind of, you know, horrible cryptocurrency thing, where you’re likely to lose a billion dollars worth of SomethingCoin™, then, yeah, you maybe want to use some kind of verifier to make sure you’re not going to screw it up. But, that said, space probes written in Lisp and FORTH have been debugged while off world.

(Had I had more time, I would have done an entire series of slides on FORTH. It’s a tiny language that combines interactive development, expressive metaprogramming, and tremendous machine sympathy. I’ve shipped embedded systems, bootloaders, and other close-to-the-metal software in FORTH.)

If they had proven their programs correct by construction – in fact, they did prove their programs correct by construction, but there was still human error! – shipped them into space, and then found out their spec was wrong, they would have just had some dead junk on Mars. But what these teams had was the ability to fix things while they were running on space probes. I think that’s actually more valuable. Again, throw the rotten fruit later. Meet me in the hallway track. I would say overall that part of this is because programming is actually a design discipline. It – oh, we’re losing somebody – somebody’s leaving now, probably out of anger about static types.

As a design discipline, you find that you will figure out what you’re building as you build it. You don’t actually know when you start, even if you think you do, so it’s important that we build buggy approximations on the way, and I think it’s not the best use of your time to prove theorems about code that you’re going to throw away anyway.

In addition, the spec is always wrong! It doesn’t matter where you got it, or who said it, the only complete spec for any non-trivial system is the source code of the system itself. We learn through iteration, and when the spec’s right, it’s still wrong! Because the software will change tomorrow. All software is continuous change. The spec today is not the spec tomorrow.

Which leads me to say that overall, debuggability is in my opinion more important than correctness by construction. So let’s talk about debugging! I would say that actually most programming is debugging. What do we spend our time doing these days? Well, we’re spending a lot of time with other people’s libraries. We’re dealing with API endpoints. We’re dealing with huge legacy code bases, and we’re spending all our time like this robot detective, trying to find out what’s actually happening in the code. And we do that with exploratory programming, because it reduces the amount of suffering involved.

So, for example, in a dead coding language, I have to run a separate debugger, load in the program, run it, set a breakpoint, and get it here. Now, if I’ve had a fault in production, this is not actually so helpful to me. Maybe I have a core dump, and the core dump has some information that I could use, but it doesn’t show me the state of things while it’s running. Now here’s some Common Lisp. Look: I set this variable, I inspect it, and on the bottom I see its value. This is valuable to me. I like this. And here we have a way to look at a whole set of nested data structures graphically. We can actually see things – note in particular the complex double float at the bottom, which shows you a geometric interpretation.

(This object inspector is called Clouseau. You can see a video about it here. This is amazing! This is also 1980s technology. You should be ashamed if you’re using a programming language that doesn’t give you this at run time.)

Speaking of programming languages that do give you this at runtime, here is a modern version in Clojure. Here’s somebody doing a Datalog query and getting back some information and graphing it as they go. I will say that Clojure is slightly less good at this than Common Lisp, at present, in part because the Common Lisp Object System (CLOS) makes it particularly easy to have good presentations for different kinds of things, but at least it’s in the right direction.

As we talk about this, one of the things about these kinds of programming languages, like Lisp, is that you have an editor and you’re evaluating forms – all the Clojure programmers here are going to know this right off – and those forms are being added to the runtime as you go. And this is great. It’s a fantastic way to build up a program, but there’s a real problem with it, which is that if you delete some of that code, the thing that you evaluated earlier is still in the runtime. So it would be great if there were a way to know what is current, rather than having, say, a text file that grows gradually out of sync with the running system.

And that’s called Smalltalk, which has been around since at least the 70s. So this is the Smalltalk object browser. We’re looking at Dijkstra’s algorithm – specifically at backtracking in the shortest path algorithm – and if I change this, I know I changed it. I know what’s happening. If I delete this method, the method is gone. It’s no longer visible. So there is a direct correspondence between what I’m doing, what the system knows, and what I’m seeing in front of me, and this is very powerful. And here we have Glamorous Toolkit. This is Tudor Gîrba and feenk’s thing. They embrace this philosophy completely. They have built an enormous suite of visualizations that allow you to find out things about your program while it’s running.

We should all take inspiration from this. This is an ancient tradition – Smalltalkers and Lispers building their own tools as they go to understand their own codebases – and they have pushed the pedal all the way to the floor. They’re rushing forward into the future, and we should follow them. Another thing that is very useful in these situations is error handling. If your error handling is ‘the program stops’, then it’s pretty hard to recover. But in a Common Lisp program like this – this is an incredibly stupid toy example – I have a version function. I have not actually evaluated the function yet. I’m going to try to call it. So, what’s going to happen? Well, the CL people here know what’s going to happen: it’s going to pop up the condition handler.

So this is something that – programming in Clojure – I actually really miss from Common Lisp. It comes up, and I have options here. I can type in the value of a specific function – ‘hey, call this one instead’ – for the missing function. I can try again, which, if I don’t change anything, will just give me the same condition handler. Or, I can change the state of the running image and then try again. So, for example, if I go down and evaluate the function so that it’s now defined and hit retry, it just works. This is pretty amazing. We should all expect this from our programming environments.
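Most languages can’t resume from the point of failure the way CL’s condition system can, but the shape of the interaction can at least be sketched. Here’s a crude Python approximation – all the names here are hypothetical – in which the handler repairs the running environment and retries, instead of unwinding and giving up:

```python
def with_retry_restart(thunk, repair):
    """A crude sketch of a Common Lisp RETRY restart: on a missing
    definition, let the caller fix the running image, then try again."""
    try:
        return thunk()
    except NameError:
        repair()        # e.g. supply the definition that was missing
        return thunk()  # retry; with the image repaired, it just works

def call_version():
    return version()    # `version` is not defined yet: this raises NameError

def define_version():
    globals()["version"] = lambda: "1.0.0"

print(with_retry_restart(call_version, define_version))  # prints 1.0.0
```

Note the real thing is strictly stronger: a CL restart resumes at the failure point without unwinding the stack, whereas this sketch has to re-run the whole thunk from the top.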

Again, when I talk about Smalltalk and Lisp, people say ‘well, I don’t want to use Smalltalk or Lisp’. I’m not telling you to use Smalltalk or Lisp. I’m telling you that you should have programming languages that are at least as good as Smalltalk and Lisp. Some people, when I show them all this interactive stuff, are like, ‘Well, what if I just had a real fast compiler, man? I could just change something and hit a key and then –’ Well, we’re back to that 💩 again, because even with a fast compiler you still have all the problems of the blank slate/run-to-termination style. Data science workloads take a long time to initialize. You might have a big data load, and you don’t want to have to do that every single time you make a change to your code. And the data science people know this! This is why R is interactive. This is why we have notebooks for Python and other languages: because they know it’s crazy to work the other way. Also, GUI state – oh my word! It can be incredibly tedious to click your way back down to some sub-sub-menu so that you can get to the part where the problem is. You want to just keep it right where it is, go in and see what’s happening behind the scenes, and fix it while it’s running.

Someone came up to me after the talk and described a situation where he was working on a big, fancy commercial video game. He had to play the same section of the game for 30 minutes to get back to where the error occurred each time. 😱

Also, you should be able to attach to long-running servers and debug them while they’re in production. This is actually good! It’s scary to people who are easily frightened, but it is very powerful.
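Even in batch-oriented languages you can claw back a little liveness at the moment of failure. In Python, for instance, the frames of a crashed call are still alive for as long as you hold the traceback, locals and all – far more informative than a line number in a log, and it’s the same machinery `pdb.post_mortem` builds on. A minimal sketch:

```python
import sys

def risky(x):
    y = x * 2
    raise ValueError("boom")  # something went wrong mid-computation

try:
    risky(21)
except ValueError:
    tb = sys.exc_info()[2]
    while tb.tb_next:          # walk to the innermost frame, where it failed
        tb = tb.tb_next
    locals_at_failure = dict(tb.tb_frame.f_locals)

# The state at the moment of failure is still inspectable
print(locals_at_failure)  # prints {'x': 21, 'y': 42}
```

It’s still post-mortem rather than live – the process has unwound, and you can’t fix the definition and resume – but it’s a taste of why access to running state matters.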

I’ll say after all of this about interactive programming, about escaping batch mode, that almost all programming today is still batch mode. And how do we feel about that? I kind of feel like Licklider did. Licklider funded almost all of the work that created the world we live in today, and Engelbart built half of it, and one of the things that Licklider said that I found – I just love the phrase – is ‘getting into position to think’. That is, all of the ceremony that you have to go through to get ready to do your work should go away, and that was their whole mission in the 60s. We almost got there, but then we have languages like C++. I could say a lot of mean things about C++, but I used to work at the same facility that Bjarne did, and I kind of know him a little bit, so I’m not going to do that. Instead, I’m just going to quote Ken Thompson.

(This is a really funny situation, because I worked [using] some of the early C++ compilers; I was excited about the idea of having decent abstractions in a low-level language that I could use [at work]. But I will say that it was never great, and that it has gotten worse over time, paradoxically by adding good features to the language. If you keep adding every feature that you possibly want, you end up with a language that is not in any way principled. There is no way to reason about it. It has too much junk in it. And if you’d like to see this happening in real time to another language, I recommend that you read what’s going on in TC39 with JavaScript, where they are adding every possible feature and muddying an already difficult language further.

In all fairness, TC39 is in a terrible position. They can’t remove features from the language because there’s such a large corpus already in the world. At the same time, the language has a bunch of ergonomic problems that they want to fix. I wish they had frozen a primitive version of JS and added a marker at the beginning of scripts to switch out which language is used, much in the way #lang does in Racket.)

So, what about Go? Well, I admire the runtime – the goroutines, the garbage collector – but it’s really another punch-card-compatible compile/run language. It also shares with C++ the problem that it’s not a great library language, because if you want to write a library in Go and then use it from, say, a C program, you have to bring in the entire Go runtime, which is a couple [of megabytes]. That’s mostly not what I want.

So what about Rust? Well, it’s a nice thing that Rust is a good library language. I like that about it. But it’s also a huge missed opportunity in terms of interactive programming. They just went straight for the punch cards again. And it’s a super complicated language, so when you’re trying to figure out which of the 40 different memory-management keywords to use to tell it how to do its thing, it would be nice if you could explore that interactively instead of going through a compile/test cycle. And another way that I feel about it – I have to quote Deech here – is that some people hate stop-the-world GC; I really hate stop-the-world type checkers. If it’s going to take me an hour to compile my thing, I just want to give up. I’m going to become a carpenter or something. In this family of languages, I’ll say that Zig is more to my taste. I actually like Zig more than I like Rust. This will anger all of the Rustaceans. I apologize, but it is true.

But, Zig people – for goodness sake – why is there no interactive story there either? You’ve got this nice little language that has multi-stage compilation. It could learn a lot from Lisp, but it just sort of ignores all that and goes straight to the 1970s or before.

So what do future directions that don’t suck look like? Well, I’ll give you some examples that explore some of the underexplored areas I’ve been talking about. This is a structure editor for Racket, which is a dialect of Scheme, built by a fellow called Andrew Blinn, and it’s still Racket underneath. That is, it’s still a lot of parentheses – it’s still S-expressions – but when you’re editing it, you have this completely different feeling where you’re modifying this living structure, and it’s quite colorful and beautiful – probably garish, for some of you – but I like it. And I recommend having a peek at how that works, and comparing it to how you’re editing code now.

Another example that I think will be more accessible to this audience is this one from Leif Anderson. This is also Racket, and this is a define using pattern matching for a red-black tree balancing algorithm. It is an ancient practice of many years to document gnarly code like this with a comment block over it, but that has a couple of problems: (1) the comment block is ugly, and its meaning is not completely obvious; and (2) it can grow out of sync with the code itself. So Leif has made this fine thing that reads the code and produces these diagrams, and you can switch the diagram view on or off. If we want to talk about self-documenting code, I would say something like this – which can actually show you what the code does – is better than what most things do.

In the same vein, we’ve got this piece. This is called Data Rabbit. Data Rabbit is a crazy data visualization thing written in Clojure. Each one of these little blocks that are connected by these tubes is actually a little piece of Clojure code, and they can do data visualization, they can do refinement, they can do all of these nice things. I’m not a huge, you know, box and arrow programming language guy, but I think that Ryan has done great work here and that everybody should take a look at it.

There’s also Clerk. I’m a bit biased here. This is something I’ve been working on for the last year with the team at Nextjournal, but I think it is actually very good, so I’m going to tell you a little something about it. This is what it looks like when you’re working with Clerk. You’ve got whatever editor you want on one side, and then you’ve got a view onto the contents of the namespace you’re working on off to the side. This has some special properties. It means, for one thing, that you can put these notebooks into version control. You can ship these notebooks. These can be libraries that you use. You don’t have this separation between your notebook code and your production code. They can be the same thing, and it encourages a kind of literate programming approach where every comment block along the way is interpreted as markdown, with LaTeX and other features. It’s a very nice way to work. I encourage the Clojure people here to check it out. It is of no use to you if you’re not a Clojure person, because it’s very Clojure-specific. And I’ll show you a couple of other screenshots. Here we’re doing some data science – that’s my emacs on the right-hand side – and I’m able to do all of the things, like pretty printing data structures and inspecting them, and then sending things over and seeing them in Clerk. It is a very cozy way to work. There’s also, for instance, this example where in around six lines of code I do a query for some bioinformatic information that shows me what drugs affect what genes that are known to be correlated with what diseases, so we can see what drugs might be interesting targets for genetic disorders of differing types. Twenty years ago, if you had told people they’d be able to do a single query like this and find these kinds of things out, they would have looked at you like you had two heads, but here it is, and it’s hardly any code at all.
Or this, which is a port of Sussman’s Structure and Interpretation of Classical Mechanics library into Clojure that you can use inside of Clerk.

This is very nice work by Sam Ritchie. In addition to porting the libraries, he’s working on an open edition of Sussman’s textbooks using Clojure. And then [you can] do things with physics – real things. This is emulating a chaotic system, and you can actually – you can’t see on this – but you can actually grab sliders and move them around and change the state of the system in real time. It’ll show you what’s happening.

Or this. Martin here in the front row wrote this. This is an example of Rule 30, which is a cellular automaton, and he’s written a viewer for it, so instead of looking at 1s and 0s, you can actually see the thing he’s working on. And the amount of code this takes is almost none. This is a regular expression dictionary that I wrote. One of the nice things about Clerk is that you have all the groovy visualization [and] interactive things that come from having a browser, but you also have all the power of Clojure running on the JVM on the other side. So you can do things like talk to a database on the file system, which is a revelation compared to what you can normally do with a browser. With this kind of thing you can do rapid application development. You can do all kinds of things, and I will add that Clerk actually improves on the execution semantics that you normally get with emacs and Clojure. This is inside baseball for the Clojure people – sorry, everybody else – but that thing I was talking about, where you can add things to the running image and then delete the code, so the text no longer matches the image, and maybe you save your program and it doesn’t work the next time you start it? Clerk will not use things that you’ve removed from the file. It actually reports that, so you get errors when you have gotten your text out of sync with your running image.
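For reference, Rule 30 really is almost no code: each new cell is `left XOR (center OR right)`. A minimal Python sketch, printing generations as text (Martin’s Clerk viewer renders the same data graphically):

```python
def rule30_step(cells):
    """One generation of Rule 30: new cell = left XOR (center OR right),
    with cells beyond the edges treated as 0."""
    n = len(cells)
    left = lambda i: cells[i - 1] if i > 0 else 0
    right = lambda i: cells[i + 1] if i < n - 1 else 0
    return [left(i) ^ (cells[i] | right(i)) for i in range(n)]

# Start from a single live cell and print a few generations
cells = [0] * 31
cells[15] = 1
for _ in range(8):
    print("".join("█" if c else "·" for c in cells))
    cells = rule30_step(cells)
```

The viewer is the interesting part, of course: the point of the example is that turning those 1s and 0s into something you can see costs almost nothing in a live environment.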

Now, obviously, I have a huge Lisp bias. I happen to love Lisp, but it’s not just Lisps. There are other people doing good things. This is called Hazel. This is from Cyrus Omar’s team. You see those little question marks after the function there? This is an OCaml- or Elm-like language, and they do something called typed holes, where they’re running their type inference interactively and using it for what is, in my opinion, its strongest purpose: improving the user interface. So here, when you go to put something into one of these typed holes, it knows what type it’s going to be, and it’s going to give you hints, and it’s going to help you do it, and they’ve taken that to build this nice student interface. If you’re going to teach students through design recipes that involve type-based thinking, then you should have a thing like this that actually helps them in some way, and the one they’ve made is very good. I recommend reading the papers.

[Cyrus] has a student called David Moon who has made this. This is called Tylr. I can’t really show you this in a good way without [many videos], so I recommend that you go to David Moon’s Twitter, scroll through, and look at some of these things. It’s got a beautiful structure editing component that prevents you from screwing up your code syntactically while you’re working on it, and gives you advice based on type information. And here is my absolute favorite from Cyrus’s group. This is also by David Moon, who did the structure editor, and Andrew Blinn, who did the nice editor for Scheme that we saw at the beginning of this section. Here we have, again, an OCaml- or Elm-like language, but you can put these little widgets in. These are called livelits, with the syntactical affordance here [that] they begin with a dollar sign. He’s got some data here, and the data is shown as a data frame. It’s actually a convenient, nice-to-edit thing, and it’s in-line with the source code. This is a way to have more expressive source code, by overlaying different views onto it. You can also see there’s a slider in there, and the slider is [live]: the rest of the values are immediately recomputed when the slider slides, in a data-flow kind of way. This is a great project. I hope they do more of it.

Here’s something a little crazier. This is Enso. Enso is groovy because it is a functional programming language that has two representations. It is projectional, so it is not just this kind of lines-between-boxes thing. It’s lines between boxes, and then you can flip it over and see the code that corresponds to those boxes. You can edit either side, and it keeps both in sync. And now we’ll go on to our last example from this section, which is also the craziest one.

And that is Hest, by Ivan Reese. Here we’re computing factorial, but we’re doing it with animation, so we see these values flowing through the system and splitting based on criteria that are specified in the code, and we’re working up to a higher and higher factorial. I look at this, and I don’t say ‘yeah, that’s how I want to program; I want to spend every day in this thing’. But what I’ve learned – if nothing else – over the very long career that I’ve had, is that if you see something that looks completely insane and a little bit like outsider art, you’re probably looking at something that has good ideas. So, whether or not we ever want to work like this, we shouldn’t ignore it. This was my last example for today.

I had to stop because I was already slightly over time, but there are a number of other systems that I would like to have mentioned:

In this talk, I stayed away from artistic livecoding systems because many programmers can’t see themselves in what artists are doing. However, I would be remiss not to show you these systems:

I livecode most of my own artwork in Clojure and Scheme. I have some thank yous to do. First, I’d like to thank Alex for inviting me to give this talk. I’d like to thank Nextjournal for sponsoring my work, including the writing of this talk. And I would like to thank all of you for watching! Thank you very much!