The Unreasonable Effectiveness of Dynamic Typing for Practical Programs

Okay.

Your software has been deployed. It’s in production. Your program is a type-theoretical time bomb. What could go wrong? Anything could go wrong. Everything could go wrong. Will it be broken verification of malformed user input? Yet another JVM security alert? Perhaps one of the hundreds of to-dos you should have fixed before you deployed. That change you did on Friday didn’t break any tests, but it didn’t get reviewed either, did it? Will it be a cosmic ray dumping a few gigaelectronvolts of energy into a RAM chip?

No, it might just be a type error.

My name is Robert Smallshire. I work for and part-own a consultancy called Sixty North, based here in Oslo and Stavanger. We do consulting and training in various languages. I’ve been programming for nearly 30 years now. These days I mostly use Python, C++ and various .NET languages including C# and F#, and I’ve of course experimented in various contexts with a whole range of languages with different characteristics. And I’ve become very interested in the characteristics of those languages and how they affect the productivity of what we do, and in how much evidence there is for those effects. There’s an awful lot of hearsay in this industry about what works well, lots of very strong opinions that can be very, very well articulated, and frankly precious little actual evidence of what’s going on. So I’d like to, if not answer those questions in depth today, at least move the debate forward a little bit and at least frame the kinds of questions we need to be asking when we’re choosing the technologies that we use.

So I’m curious to know how many of the people in this room think that dynamic languages, or rather programs written in dynamic languages, are just a little bit less reliable than those written in static languages.

Okay, so most people think they’re a little bit less reliable. How many people here think they are perhaps half as reliable, with, say, twice as many defects? Hardly any of you. So somewhere in there is some measure of how much more reliable statically checked, type-checked programs are than dynamic ones. And the academic part of our movement in particular, of computing and software engineering, has focused very much on static typing, static checking, proving programs to be correct or otherwise. And we hear lots of very interesting terms: dependent types, existential types, phantom types, inferred types. Who here knows what all of those terms mean? One person, right? So most of these things are, I would argue, going to be forever beyond the reach of most people who are engaged in programming.

And I think that’s probably going to be true forever. But we have people advocating the use of very sophisticated type systems, which have been very thoroughly studied, very much as this young scientist here is very thoroughly studying this apple. But ultimately, the practitioners amongst us have to choose. We have to choose the apple. Do we choose the dynamic apple or the static apple? And on what basis do we make that decision?

Unfortunately, there’s a lot of people out there who are saying, well, actually, the dynamic apple isn’t really an apple. It’s a very tasty chocolate cake that’s going to taste fantastic and just be wonderful to consume compared to this rather sour, statically typed, Granny Smith apple that’s been imported from somewhere else.

Of course, the static typing people are going to say, well, this is fine. Chocolate cake’s really nice, but it’s going to make your bum bigger. And we don’t like that. We worry about it. We’re insecure. Fear, uncertainty and doubt. I think that’s rife in this industry.

And I think one of the challenges we have is that technology and science in much of what we do are actually completely orthogonal. There’s an awful lot of technology development going on and almost no actual science going on, even though we call it computer science, right? I have several degrees in a real science, a natural science, and I can tell you that what we do has nothing at all to do with science by and large. Nothing. They’re completely orthogonal concepts. And of course, science is about performing experiments. It’s about collecting data. It’s about analyzing that data and then figuring out what it all means.

So if we wanted to know whether dynamically typed languages were actually better than statically typed languages, we would want to go out and perform some experiments, which none of us do. In spite of the fact that this is a trillion dollar industry, nobody is prepared to put up a few thousand dollars to pay for some experiments. We’d rather just carry on regardless, assuming that what we’re doing is the right thing, because it’s what we believe, even though we have no evidence for that. And there’s a word for that, it’s called religion.

Not really where we want to be, perhaps, for most of us here.

And so there’s a pretty basic process we can go through for figuring out whether something is any good or worthwhile as a pursuit. We can come up with a hypothesis or a claim we would like to test. So the claim could be that statically typed languages produce programs which are significantly more reliable, and therefore significantly cheaper to build and maintain, than dynamically typed programs. That’s a pretty clear claim. And then we could go ahead and do some experiments or collect some evidence from the natural world of us writing programs. We can reason about what we find, and then we can come up with some kind of explanation for what’s going on, and then, of course, go around again and use that to make more hypotheses or predictions about how the world works. And this is not a novel process, as I’m sure you’re all aware. You’re all very well educated. This is just the scientific method, and it’s been going on for a few hundred years now, and it works, except that we don’t use it. This is the one thing that does work that we know works.

Reliably.

So why do we forget about it?

So today, I’m going to talk about four things. I want to talk about type systems and their qualities. Some people here are probably experts in type systems. I am not. I will say that. I am a practitioner by and large. I need to understand enough of this to make technology choices. I don’t need to be able to design type systems. We’re going to look at the effectiveness of types and whether they really make a difference, with a quick survey of the evidence out there, which is scant. Then we’re going to explore very briefly the argument about types versus tests, one I’m sure you’re all familiar with. Why do we need types if we have tests? And we all have tests anyway, don’t we? Of course we do.

And then we’ll round off by looking at the economics of types. Is it actually worth the effort? And how much effort is it anyway?

So, Robin Milner, who was one of the people who independently discovered the Hindley-Milner type inference algorithm, says in the abstract to one of his very famous papers, about a thing called Algorithm W, which is about type inference, that well-typed programs cannot go wrong.

This is actually not true at all. Well-typed programs go wrong all the time, but it’s a nice claim to put in an abstract. And we shouldn’t speak ill of the dead either, so I will stop now.

But type systems have lots of different characteristics, many more than I have here. People have tried to organize them in various ways. And we can argue for the rest of the week, actually, about the definitions of most of these terms and whether they are actually meaningful at all. Some of them are probably more meaningful than others. But going from left to right, we have the axis of strong and weak typing, and then going clockwise around, we have dynamic and static typing, which is mostly what I want to talk about today. Mandatory and optional typing: are we required to provide types or do we have the choice? Structural and nominal typing, something that’s come to the fore more recently with some of the languages coming into play now. Manifest or inferred typing: is the language able to reliably infer, statically, type information that we leave out? And simply, typed or untyped languages. There are actually untyped languages out there, which we all depend on.

So let’s just quickly review each of these things because, as I said, the terminology is perhaps a little vague. So I’m going to give you my definitions by example, which is something we can hopefully all understand. Untyped versus typed. On the left, we have an untyped language. Let’s play a game. Which language is this?

This is ARM assembler. I was going to put 6502 assembler up there, but then I realized that a lot of people are probably too young for that. And there are some other untyped languages, mostly assembly or very close to assembly, like BCPL, which you never hear about these days, which I think stands for Before C Programming Language.

And then on the right, we have a typed language, which is Ruby. Good. I chose this particular fragment because it’s quite interesting: it doesn’t explicitly name any types, but all the types are there in the example. I don’t have a kind of double laser pointer, so I’m going to go this way. So even though there are no types named here, obviously this is a list, or an array in Ruby speak. This is obviously an integer, this is a string, this is a float. So the types are there. The syntax of the language defines what those types are, even though we don’t name them. So Ruby is typed. Whereas in the assembler, of course, we have no idea what’s in those registers. They could be integers. They’re 32-bit registers on ARM, so they could be four ASCII characters. They could be one UTF-32 code point. They could be a pointer to something. Who knows? We can’t tell by reading the program without having some contextual information about what the program’s doing.
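The point about registers can be sketched in a few lines of Python. This is only an illustration, not the slide's code: the byte values are arbitrary, and the variable names are made up for the example. The same four bytes read equally well as an integer or as four ASCII characters, and nothing in the bytes says which reading is intended, whereas the literals at the end carry their types without any type being named.

```python
# Four raw bytes, standing in for the contents of a 32-bit register.
# The values are arbitrary and chosen only for illustration.
raw = bytes([0x41, 0x42, 0x43, 0x44])

# Reading 1: a little-endian unsigned integer.
as_integer = int.from_bytes(raw, "little")

# Reading 2: four ASCII characters.
as_ascii = raw.decode("ascii")

# Reading 3: a register holding the value 0x41, read as one UTF-32 code point.
one_code_point = (0x41).to_bytes(4, "little").decode("utf-32-le")

# In a typed language the literals themselves carry types, named or not.
literal_types = [type(v).__name__ for v in ([1, 2], 42, "hello", 3.14)]

print(as_integer, as_ascii, one_code_point, literal_types)
```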

Static and dynamic languages. On the left we have C++, very good, and on the right we have Python.

I just stripped all the crap out of the C++ program to turn it into a Python program. So on the, yeah, that’s actually valid Python. That’s fine. I’ll not reject that. I’m very lazy. I just took this and stripped all the other stuff out. You can use semicolons in Python.

Although it’s not cool to use semicolons. I will admit that. So obviously here with the i iterator in the C++, we have to declare the type fully, which is actually a fairly complex type, as with the constant double, and the reference type here, which is a reference to some sort of sensor that we deal with. Whereas in the dynamically typed language, we don’t need to specify up front what type these references are. They are just references to objects. So Python has types. This is definitely a string. This is definitely an iterator when the program is running, but the references are not typed. They’re untyped.

So, strong versus weak typing. On the left we have Python. And on the right, we have JavaScript. Right? These are equivalent programs, again, just a simple function that adds two arguments, a and b. And on the left, we get a type error because we can’t add the string five to the integer three, whereas JavaScript is quite happy to coerce the integer three into a string.
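Both points can be tried directly in Python. The following is a minimal sketch rather than the code from the slides; the function name add follows the talk, while the other names are invented for the example. It shows a name being rebound to objects of different types, and Python refusing to add a string to an integer.

```python
def add(a, b):
    return a + b

# A name is just a reference: it can be rebound to objects of different types.
thing = "hello"
first_type = type(thing).__name__      # the object is a str
thing = iter(thing)
is_iterator = hasattr(thing, "__next__")  # now the same name refers to an iterator

# The objects themselves are strongly typed: there is no silent coercion.
try:
    add("5", 3)
    outcome = "no error"
except TypeError:
    outcome = "TypeError"

print(first_type, is_iterator, outcome, add(5, 3), add("5", "3"))
```

Matching types are fine in both cases: two integers add to 8 and two strings concatenate to "53". Only the mixed call is rejected, and only at run time.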

Then we have nominal and structural typing.

So the language on the left is C, good. And on the right, OCaml, exactly. There’s a good hint at the bottom of the slide, if you... And I guess what’s interesting here is that in the C code, or C++, as someone pointed out, these two structures are identical. They have completely identical structures, but different names. And I cannot use objects of the two structures interchangeably, because they have different names, without casting away the type, at which point we’ve just given up, really. But we do that all the time in C. Whereas in OCaml, we are creating an object here, and we’re describing what the layout of that object looks like, with the value and two methods, but we’re not having to give that object a name. It’s just an object that has those characteristics. The name is not important. Structural typing is more about the shape of the object or its contents.

Now, the next one’s easy. On the left, we have Java, and on the right we have C#, which is a mild improvement. And it seems to have a slightly better naming culture for some reason. I don’t know why that is.

But the point here is that in Java, we have to spell out the type of the reference s, even though it’s obvious from reading the code. And obviously, the compiler could easily figure out what the type of the object is going to be. Whereas C# will infer the type of s. And type inference in C# is simple, not to say simplistic, but some of the other languages listed at the bottom here, F#, Haskell, Scala, etc., have much more sophisticated type inference systems.

And then finally we have the kind of not-having-to-choose option, which isn’t really a dichotomy as I’ve set it up here. It’s really more another point on the dynamic-to-static axis, where we have optional typing. So I’ve used TypeScript here. TypeScript is essentially JavaScript with optional typing. We have two functions. So we have a function add, which we can pass almost anything, and JavaScript will try its best at runtime to make that work, or we can specify some types for the arguments and for the return type. And that will be statically checked.

Whereas in the C or C++ on the right, we have to specify all the types for the return type and the arguments up front, so the typing is mandatory.

And then there’s another type system, which is embedded into all of the languages which use any of those other type systems, and we rely on this all the time, and we’re very evil people for doing so, which is the string. All of these languages support strings, and we’re all guilty of shoving the most incredibly bad things into strings at the wrong time, aren’t we? Who has ever stored any of these things in a string? A date, a URL, a file system path, a time, a UUID, practically anything else that you can shove in a string? I can make a good argument that the entire kind of basis of modern software architecture is shoving things into strings and sending them through HTTP requests. That’s what we do, isn’t it?

So all software architecture is nothing to do with strongly typed programming. It’s all stringly typed programming. And that’s what we do all the time.

And you laugh, but we’ve all done it, and we continue to do it, and we will continue to do it, because it’s very flexible. And it turns out that stringly typed programming, even though it seems kind of evil from a type theory point of view, is incredibly useful, which is why we continue to do it.
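A small Python sketch of what a date in a string gives up, and what it keeps. The function name parse_booking and both date strings are invented for this illustration. As a string, an impossible date is as acceptable as a real one; the problem only surfaces when something finally tries to interpret it.

```python
from datetime import date

def parse_booking(day_text):
    """Turn a date held in a string into a real date object (hypothetical helper)."""
    return date.fromisoformat(day_text)

good = "2014-09-10"   # arbitrary example value
bad = "2014-02-30"    # looks like a date, but there is no 30 February

# As strings, both values are equally acceptable.
both_are_strings = isinstance(good, str) and isinstance(bad, str)

parsed = parse_booking(good)

try:
    parse_booking(bad)
    bad_result = "accepted"
except ValueError:
    bad_result = "rejected"

print(both_are_strings, parsed, bad_result)
```

The flexibility is real too: either string could be sent through an HTTP request unchanged, which is presumably why the habit persists.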

And of course you can plot all of these languages in a kind of five-dimensional space, or however many dimensions of characteristics you’re interested in. But at some point, you still have to choose. How do you choose what tool to use? Different languages occupy different corners of this space.

And today I wanted to talk about a type system that you’d all be familiar with. And I couldn’t really think of one. And then I thought, well, hang on, when you were at school... Everyone went to school, right? You all went to school. And you all did some science at some point, I’m sure. And in science, you had to do a thing called dimensional analysis, right? And even if you don’t know what dimensional analysis is, I’m sure you’ll recognize the examples. It’s things like this. Right? You all remember doing this. So we have a quantity, resistivity, which is measured in ohms per meter. And we multiply that by a distance, maybe the length of a wire in meters. And we can easily calculate the resistance of the length of wire, which is in ohms. And this is really easy because the reciprocal meter here is canceled out by the meter here and we end up with just ohms. Simple. This is a type system, right? It’s one you’re taught in school.
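The cancellation described here is mechanical enough to sketch in Python. The Quantity class, the quantity helper and the numbers 0.5 and 10.0 are all assumptions made up for this example; nothing like it appears in the talk. Each value carries unit exponents, multiplication adds the exponents, and any unit whose exponent reaches zero drops out.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Quantity:
    value: float
    units: tuple  # sorted (unit name, exponent) pairs, e.g. (("m", -1), ("ohm", 1))

    def __mul__(self, other):
        # Multiplying quantities adds the exponents of their units.
        exponents = dict(self.units)
        for name, power in other.units:
            exponents[name] = exponents.get(name, 0) + power
        # Units whose exponents sum to zero cancel out.
        remaining = tuple(sorted((n, p) for n, p in exponents.items() if p != 0))
        return Quantity(self.value * other.value, remaining)

def quantity(value, **units):
    """Build a Quantity from keyword exponents, e.g. quantity(2.0, m=1)."""
    return Quantity(value, tuple(sorted(units.items())))

ohms_per_meter = quantity(0.5, ohm=1, m=-1)   # illustrative value
wire_length = quantity(10.0, m=1)             # illustrative value

resistance = ohms_per_meter * wire_length     # the meters cancel, leaving ohms
print(resistance)
```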

Is it useful?

Maybe it’s useful. So here’s another example, getting a bit more sophisticated. Hopefully you remember enough physics for this. So force times distance is energy. I’m going to use SI, so any Americans here, you’ll just have to be confused for a little while.

So force is in newtons, which boils down to kilogram meters per second squared. Distance, of course, is in meters, which results in energy. So we all know that if we push something with a given force over a given distance, that requires a certain amount of energy, which is in kilogram meters squared per second squared, which is, of course, joules. You all knew that. I didn’t. I had to look that up because I’d forgotten. I knew it once. This is great. Nice. Joules is just shorthand for this, because this is just awkward to write.

But then we can also do force times distance, same as at the top: newtons times meters is Newton meters, a different quantity, torque. So here we have equal dimensionality, equal type (this is a type system), and inequivalent quantities. What’s going on here?

How can this possibly be?

Right? Is all that dimensional analysis you did at school useless? There is. We’re going to get to that.

So it turns out that some contemporary programming languages have quite good support for physical units, measures built into the type system. So F# is one, perhaps the best example of that. So you can go to tryfsharp.org, I think it is. I can’t read the URL from here. And if your computer supports Silverlight, which is fewer and fewer computers these days, you can get F# up and running, and you can begin writing some code. And F# supports type annotations which allow us to explain physical units of measure to the type system. It’s really a very sophisticated thing. And so what we’re going to do here is just put up a few types, some of the ones we just dealt with: kilograms, meters, and seconds. And we use this annotation, Measure, to tell F# that we are trying to build some types that represent physical quantities.

And then we’re going to bake a cake.

So we put in some... a kilogram of sugar. You can see I can annotate these float values in angle brackets with kilograms. How many eggs should go into a cake? I’ve chosen three. Is that right? I don’t know. You don’t know.

And you can see F# is very strictly typed. It’s even going to complain that we’re multiplying an int by a float. It doesn’t even know how to do that. And you see that the error here is that int does not match the type float. It’s complaining about the multiplication operator. And so I have to go ahead and insert an explicit conversion to float here.

And we can run this and you’ll see at the bottom there that F# is able to infer that the mass of the cake is in kilograms. This is fantastic. It knows that a kilogram plus a kilogram plus a kilogram times a scalar is in kilograms. Great. We know that cakes are weighed, or more correctly have mass, in kilograms. That’s fine. The type system is sufficiently advanced that we can actually explain derived units to it. So I’m now going to explain to it what a newton is and what a joule is.

This is quite clever for your type system to do this.

And now we’re going to do an example where we push a truck along as I described.

For that we need a truck.

You can see we have the force on the back of the truck, F truck, and then we’re going to push it through a distance, D truck: 30 meters with a force of 20,000 newtons. That’s... what’s that, two tons roughly, of force? And we can work out the energy required to do that by multiplying force times distance. Simple stuff.

And you’ll see that F# is able to infer that the energy value is in Newton meters there: float, angle brackets, Newton meters. And that’s kind of wrong, really. It’s not strictly wrong, but it’s not helpful, because we said the energy is in joules. And even though I’ve described to F# that energy is in joules, it’s not able to figure out whether I really mean joules or Newton meters, different quantities, even though they have the same dimensionality.

You know, I’m panicking here because I thought this timer was counting up and it’s counting down.

Okay.

So now we’re going to do the other example, using some torque. So we need a spanner to apply some torque, maybe one to undo one of the wheel nuts on the truck.

So we apply force to the end of the spanner, and that, multiplied by the length of the spanner, gives us the torque on the nut in Newton meters. Okay, so we can set this up in F#. The force on the spanner is 200 newtons. The length of the spanner is 0.2 meters, about this long.

And then we can ask F# to figure out the torque on the spanner.

And if we just go forward one here, we can infer the type of torque spanner, which it now correctly, if you like, gets in Newton meters. Fantastic. Very nice.

I mean, you can imagine how if you’re trying to land a space probe on another planet, this kind of thing might be useful.

And we can run this and we get the correct answer at the bottom there. Torque spanner is a float with a dimension of Newton meters, with a value of 40 Newton meters. Fantastic.

Now we are going to create a value WTF. You know what that stands for. And we’re going to ask F# to add the energy of the truck to the torque of the spanner, right? Because they’re both in Newton meters, because that’s equivalent to joules. And then we’re going to ask F# to infer the type of WTF.

Right. This is what should happen. At this point, the entire universe is shut down. It’s the end of nature, the end of time. There should be no more anything, let alone software. Right? You cannot add energy and torque. Right. Even though the dimensions are compatible, they’re just not in any way equivalent quantities. But…

Back in the real world, back in our buggy universe, F# is happy to add these quantities together completely nonsensically and come up with a value in Newton meters of 600,040. This is just wrong, right? So here we have a very sophisticated type system, perhaps one of the more sophisticated type systems out there that you can easily get hold of and use for actual commercial code today, and it isn’t watertight, right? Because physics isn’t watertight, right? So there are theoretical limits to what type systems can do. And so given that, we should think carefully about whether the benefits of type systems really help us when modeling the physical world or modeling some enterprise business process that is really important to you.
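The same outcome is easy to reproduce outside F#. What follows is a Python sketch, not the F# from the demo, and the helper names times and plus are invented for it. A checker that compares only dimensions, here exponents of kilogram, meter and second, has no grounds to reject the addition. With the figures quoted earlier (20,000 newtons over 30 meters, and 200 newtons on a 0.2 meter spanner) the sum comes to 600,040.

```python
# Dimensions as exponents of (kilogram, meter, second).
NEWTON = (1, 1, -2)
METER = (0, 1, 0)
KILOGRAM = (1, 0, 0)

def times(a, b):
    """Multiply two (value, dimension) pairs: values multiply, exponents add."""
    (x, dim_x), (y, dim_y) = a, b
    return (x * y, tuple(p + q for p, q in zip(dim_x, dim_y)))

def plus(a, b):
    """Add two (value, dimension) pairs, insisting only that dimensions match."""
    (x, dim_x), (y, dim_y) = a, b
    if dim_x != dim_y:
        raise TypeError(f"cannot add {dim_x} to {dim_y}")
    return (x + y, dim_x)

energy_truck = times((20000.0, NEWTON), (30.0, METER))   # work done pushing the truck
torque_spanner = times((200.0, NEWTON), (0.2, METER))    # torque on the wheel nut

# Energy and torque share the dimension kg m^2 s^-2, so the check passes.
wtf = plus(energy_truck, torque_spanner)
print(wtf)

# A genuinely different dimension is still rejected.
try:
    plus(energy_truck, (1.0, KILOGRAM))
    mass_outcome = "accepted"
except TypeError:
    mass_outcome = "rejected"
```

The checker does its job on the mass, and is simply blind to the difference between energy and torque.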

If you’re interested in this, I’m not going to go into the detail. There’s a wonderful short paper. It’s only two pages, called On the Concept of Dimension. And I guess the key quote is where he says he has doubts about the usefulness of the concept, even though it’s widely used. And I’m beginning to feel the same about static type checking. It’s useful in toy cases and for enforcing not very interesting constraints on the behavior of your program. But for anything vaguely interesting, relevant or difficult, it just doesn’t cut the mustard. Good paper.

Moving on, let’s look at the effectiveness of types then.

Where is the science in computer science? I’ve already had a bit of a dig at this, and I’m just going to twist the knife a bit now.

Another wonderful paper: An Experiment About Static and Dynamic Type Systems, by Stefan Hanenberg. There’s hardly any knowledge about the impact of static type systems on development time or resulting quality for a piece of software. There are a lot of people out there advocating that we must use static type systems to have reliable software. Everybody in this room thinks that statically typed software is more reliable, perhaps only a bit, than dynamically typed software.

This guy did a really good experiment. He actually tried to take out most of the variables by designing a programming language in two variants, one that was statically typed and one that was dynamically typed. And then he had a bunch of people solve the same problem using the language. He took experience out of the equation because it was a completely novel language, right? So it’s quite a well-controlled experiment. Actual science, I would argue. And he figured out how long it took for people to solve a problem.

And here is one of the results from the paper. There are many interesting results in this paper. But the development time for the dynamically typed languages... you’re familiar with these box plot things, right? I forget exactly how they work, and my pointer is not working. So this is the median, and this is the lower quartile, the upper quartile, and the maximum and minimum, I think, times. So you can see that the development time for the dynamic languages is quite significantly shorter than it is for the statically typed languages on this particular task. And you could argue that his task was well suited for dynamic languages. But honestly, I don’t think he was setting out to prove this case one way or the other. He just wanted to know.
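For anyone else who has forgotten how a box plot works, the five numbers it draws can be computed with Python's standard library. The times below are invented purely for illustration and are not data from the paper.

```python
import statistics

# Invented development times in hours, for illustration only (not the paper's data).
times_hours = [4.0, 5.5, 6.0, 6.5, 7.0, 8.0, 9.5, 11.0, 15.0]

# quantiles(..., n=4) returns the three cut points between the four quarters.
lower_quartile, median, upper_quartile = statistics.quantiles(times_hours, n=4)

summary = {
    "minimum": min(times_hours),
    "lower quartile": lower_quartile,
    "median": median,
    "upper quartile": upper_quartile,
    "maximum": max(times_hours),
}

for label, value in summary.items():
    print(f"{label}: {value}")
```

The box spans the lower to the upper quartile with a line at the median, and in the simplest form of the plot the whiskers run out to the minimum and maximum.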

And I think what’s really fascinating about this is his finding about the debug time. When he looked at the debug time for the exceptions that could have been caught by a type checker (so you run the program, it’s broken, you then have to go and fix it, like the type error we saw right at the beginning), that took a lot less time to fix than it took people to design the types while they were writing the program to prevent this problem ever happening. Right? So if you can, in some sense, afford for your program to fail at all, which we all can because all our programs do, and we’re all still here, we’re all still gainfully employed, right? It’s actually much cheaper to go and fix these things afterwards than it is to put lots of effort in up front designing a very robust type system to do the job.

So that was his research. This is my research. Although it is perhaps scientific in spirit, I would not claim that it is a rigorous experiment, but it’s interesting to look at. So I spent some time analysing issues in GitHub, which is great because there’s just vast quantities of code in lots of different languages, and all the issue trackers are in there, and you can look at what’s going on. Fantastic resource. And I looked at programs in a whole bunch of different languages: C++, Java, C#, Ruby, JavaScript, Python, F#, Clojure, Haskell. And just one quick question. I don’t have a slide on this, but which of these do you think has the lowest defect rate per line of code? Come on, Haskell. There must be some Haskell people in here who are going to shout. Haskell! Ruby!

You wouldn’t think so, but C++ has the lowest rate. Why is that? Because right at the bottom of the stack, right, of everything we do, there’s some C++, right? And most of the C++ projects have been wrapped and exposed to these other languages, right? So they’re all getting used. The C++ projects that I sampled are getting used from all of these different environments. Whereas if you write something in Ruby, only Ruby people can use it. So the C++ code is more reliable just because it gets vastly more use than anything else. If you sit on one of these islands like JVM world or CLR world, you’re on your own. You’re only with other JVM people. Did I just say that at JavaZone?

But I looked at 1.7 million repositories, 15 million files, 3.6 million issues. I wrote some software. These days, this would be called Big Data. I don’t like that. It’s just statistics. I wrote some software to analyze all of these things, and I also manually went in and read lots of issue reports to ensure that my statistics were reasonably valid, which I believe they are. And just quickly, the kind of things I was looking at. So I’ll use a Python example. We look for type errors. These are the kind of things that can go wrong in dynamic programs, where we add some values together which don’t match.

We look at some attribute errors. So I’m trying to call an append method on a string here. You’re not allowed to do that in Python, because strings are immutable, you can’t append to them.
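These two kinds of failure are easy to reproduce. The sketch below is not the code from the slides; the helper classify and the two small functions are invented for the example. Each function contains the mistake described, and the helper reports which exception Python raises when it is called.

```python
def classify(action):
    """Run a callable and report which exception, if any, it raised."""
    try:
        action()
    except Exception as exc:
        return type(exc).__name__
    return "ok"

def add_mismatched():
    return 3 + "5"            # adding an int to a str

def append_to_string():
    text = "hello"
    text.append("!")          # str objects have no append method

results = [classify(add_mismatched), classify(append_to_string)]
print(results)
```

Note that both functions are defined without complaint. The errors only appear when the offending line actually runs.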

And I thought there was another one, yes. And I looked at name errors, which aren’t really type errors, but they’re a nice example of the kind of thing that screws up in dynamic languages, which is where we just refer to a reference or a value which doesn’t yet exist. So I looked at those for Python, and I looked at the equivalents for Ruby, and the class cast exceptions in Clojure, and all these different things that can go wrong in the dynamic languages, and condensed it all down. And for Python, this is the error rate. So out of 670,000 issues in GitHub on Python projects, only 8,400 are actual type errors. Another 8,000 are attribute errors and a few thousand are name errors. If you look at the proportions here, they’re really tiny in terms of the total errors.

So hopefully, if you are working in a static language, the cost of working in the static language exceeds the cost of fixing just a handful of bugs that you would expect to find in a dynamic language. And this is sampling a very, very large amount of code. Mostly on GitHub, but I would argue written by people who kind of know what they’re doing.

So 2% of reported GitHub issues for JavaScript, Clojure, Python, and Ruby in aggregate, that’s the dynamic languages, are type errors, or arguably type errors. So for every 100 issues you have, only two are really type errors. So these two blue dots are what your fancy type system is going to save you from. It’s not going to help you with the other 98 things that people don’t like about your software.
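As a rough check, the Python figures quoted a moment ago can be turned into percentages. This uses only the numbers stated in the talk; the count of name errors was given only as a few thousand, so it is left out rather than guessed.

```python
python_issues = 670_000
type_errors = 8_400
attribute_errors = 8_000

type_share = 100 * type_errors / python_issues
attribute_share = 100 * attribute_errors / python_issues
combined_share = type_share + attribute_share

print(f"type errors:      {type_share:.2f}% of issues")
print(f"attribute errors: {attribute_share:.2f}% of issues")
print(f"combined:         {combined_share:.2f}% of issues")
```

That comes to roughly one and a quarter percent each, or about two and a half percent together, which is in line with the aggregate figure of around 2%.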

Just to verify this, I checked with a large commercial Python code base I’ve worked with here in Oslo, a few hundred thousand lines of Python. And the error rate there, in a project that’s got very rigorous unit testing and functional testing and lots of code review as well, very good development practices, I would argue, is even lower. I think in the whole code base there was one name error that had been found and reported, and so it’s around 1%. Very low error rates.

And so another study, moving on from my study, is this one, which is quite old now, back in 2000, but I’m going to put it in because a lot of people probably aren’t familiar with it. And this guy was setting out to compare scripting languages and compiled languages, which I think these days is not a useful distinction and not really a phrase we would use. We tend to talk about dynamic languages and other languages. But it turns out that the scripting languages in his study correspond exactly to what I would call dynamic languages. So it’s interesting to look at. Again, he had a bunch of people implement solutions to the same problem in different languages. And he looked at program reliability and found that there’s essentially no difference depending on what language you’re using. All the programs were pretty reliable. Of course, he had a standardized test suite to apply across all of these programs. He found no clear advantage for static over dynamic languages, or indeed vice versa. No difference at all.

But then we look at the programming effort that was needed. And Tcl, Rexx, Python, and Perl all required significantly less effort to solve the problem than Java, C++, and C. Okay. And that actually correlates very well with program length. All of those programs are much shorter. And in this paper, he goes on to explain how essentially productivity of developers is constant per line of code irrespective of what language they’re working in. It’s kind of interesting that if you work in a language that’s more expressive, you get more done in fewer lines of code. Of course, these days it’d be really interesting to repeat this study with Scala and F#, some of the other very expressive, concise languages that allow us to express a lot in relatively few lines of code.

I guess another thing I should point out is that the variance of the Java solutions is enormous. Some people have very good solutions, but there are quite a lot of really awful solutions.

And I guess my point here is that this is just a handful of studies. I’ve shown you two published studies and one kind of ad hoc study by me. And I can come up with maybe five other published studies on this, written by not very well-known academics in obscure universities you’ve never heard of. Essentially, no research has been done on this. And it’s incredible, because it’s so important.

So quickly, we don’t have long left now. I wanted to talk about types and tests. This is a topic you often hear about: should we be doing test-driven development or type-driven development? Should we be focusing on designing types that constrain the problem? I’ve already shown you that even with very sophisticated types, you can still expose yourself to problems. And I thought I’d, actually, I’ll just do the quote here. I’m sure many of you have heard this before: “Beware of bugs in the above code; I have only proved it correct, not tried it.” And I think unit testing, and the approach to testing we have these days, has essentially solved this problem. We’re very, very good at testing now, relative to how we were a decade or more ago.

Very good. So I thought I’d draw up a list of the unique capabilities that types have that tests don’t, when it comes to verifying your programs.

Here it is.

That’s it.

There’s nothing that types can do that your tests cannot do, right? And believe me, when you’ve designed a really, really clever type that enforces some difficult constraint on the behavior of your program, and you compare the readability of that type declaration with the readability of the five tests that would replace it, you would take the tests every time. From a maintenance point of view, much easier to deal with because tests are so much more expressive than nearly all the type systems we deal with. I told you this section would be short. This is the entirety of it. I have nothing further to say. Okay, let’s finish off on the economics of types. Is static type design actually a good investment?

Grady Booch here, one of the people responsible for inflicting UML on the world. But he’s quite a quotable character. “The function of good software is to make complex things appear simple.” Who can disagree with that? And of course the world is a very, very complicated place. And arguably it’s kind of dynamically typed as well. I’ve spent a lot of time working in Norway in the oil industry. And, you know, the oil industry is very simple. People go out, they drill wells into the ground, and oil comes out. And there are whole papers, inches thick. And I’m not kidding, there is one called What is a Well? Right?

And this is written by data modeling people who are trying to understand what these things are. And of course, every oil company has a different view on what a well is. And if you try to design a type system to model that, I mean, it would just take all eternity. You would never finish. And it wouldn’t actually be very useful. So what do we do? We stick it all in a string. Essentially, we have a JSON structure. We attach some strings to it. If an oil company wants their own ID for that well, fine. They can have it. Nobody else needs to know about it. It’s all dynamically typed. Very, very effective compared to writing enormous documents and then trying to translate that into a very rigid schema or a very rigid type system.
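As a rough illustration of the “JSON structure with some strings attached” idea, here is a minimal Python sketch. The record layout, the `attach` helper and the `acme_oil.well_id` key are all hypothetical, invented for this example; nothing here reflects an actual oil industry schema.

```python
import json

# Hypothetical well record: only a tiny core is agreed on. Everything else
# is a free-form string attribute that any company may attach.
well = {
    "name": "Example Well 1",
    "attributes": {},
}


def attach(record, key, value):
    """Attach an arbitrary attribute to a record, stored as a string."""
    record["attributes"][key] = str(value)
    return record


# One company adds its own identifier; nobody else needs to know about it.
attach(well, "acme_oil.well_id", "W-000123")

# Round-trip through JSON, as if the record had been sent to another system.
decoded = json.loads(json.dumps(well))

print(decoded["name"])
print(decoded["attributes"].get("acme_oil.well_id"))
# A consumer asking for a key it cares about simply gets a default if absent.
print(decoded["attributes"].get("other_co.well_id", "<not set>"))
```

Adding the new identifier needed no schema change and no change to any consumer that does not look for it, which is the flexibility being described.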

And of course, as developers or maybe senior developers or architects or even CTOs, you know, we have a lot of pulls on our time, lots of decisions or requirements coming our way. These are the kind of things you get. Maximize return on investment. Who’s heard that?

Value for money, from your customers. Your managers just care about cost. The users have a small voice in this: usability, features, reliability. Testers want a spec so they know what it’s supposed to do.

And marketing just want you to deliver the thing quickly. And what are we saying? You’ve heard this before, but there is a time and a place for talking about type theory. And it is not actually when you’re trying to run a business, most of the time, I think.

We have all of these different things we have to worry about when we’re selecting a language. Does it have useful libraries for what we’re trying to do? I wouldn’t do scientific computing in Java because there are no supporting libraries for that. Just the wrong choice.

Can I get people who know this language? How does it modularize? Is it reliable? What’s the type system? There’s a kind of small question in there. Is it portable? How much experience do I have in it? There are many, many things that should be important when you’re choosing a technology. And the type system, frankly, is probably not one of them, or shouldn’t be. Are the costs of static typing offset by that few percent of defects? Or is the agility, the ability to move quickly, the velocity you can get by delivering quicker, as in the study we looked at, more important to your business than the robustness? You’re going to have a few bugs anyway. Who cares whether they’re going to be type errors? Nobody.

Oh, God. I wrote this.

Right? This is actual C++.

Right? Because I wrote this maybe two or three years ago now. And I was trying to do something. And I’m not kidding, this is one line of Python, right? I wanted to pass an arbitrary number of arguments of arbitrary type to a factory function that was going to create something. And these were the things that were going to get passed to the constructor of the thing that had been loaded by a plug-in, the subtype. And you can do this statically to a good degree in C++. But you have to do this. I’ll admit that this would be marginally better in C++11 with variadic templates, if you can find both of the people who know how they work, right? But what this is doing is really very, very simple. And you can see here, you’re only looking at a bit of it, right? This entire file is type declarations. There’s no actual behavior here. This is a header file from C++. This code does nothing except explain to the compiler what I would like to do, without actually explaining how to do it. It’s extremely complicated and completely unmaintainable.

And, I don’t know, I have mixed feelings about this. At the same time, I’m very proud of it and deeply ashamed of it.
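For comparison, the Python side of that claim might look something like the sketch below. This is not the code from the project described; the registry, the `create` function and the `Sensor` class are hypothetical stand-ins for a plug-in mechanism. The “one line” is the body of `create`, which forwards any number of arguments of any type to whatever constructor was registered.

```python
# Hypothetical plug-in registry: maps a name to whatever class a plug-in supplied.
registry = {}


def register(name, cls):
    """Record a plug-in class under a name."""
    registry[name] = cls


def create(name, *args, **kwargs):
    """Forward arbitrary positional and keyword arguments to the constructor."""
    return registry[name](*args, **kwargs)


class Sensor:
    """Stand-in for a subtype loaded from a plug-in."""

    def __init__(self, label, rate_hz=1.0):
        self.label = label
        self.rate_hz = rate_hz


register("sensor", Sensor)

# The factory knows nothing about Sensor's constructor signature.
s = create("sensor", "pressure", rate_hz=50.0)
print(s.label, s.rate_hz)
```

The trade-off is the one the talk is about: a wrong argument list is only discovered when `create` is actually called, so this relies on tests rather than the compiler.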

And another problem with static types is they’re very anti-modular. So it’s just a simple thing here. You have a booking application which wants to use a booking service. Just one made-up function call here, make-booking, which returns some object that supports the IBooking interface. Fine. But now your booking application needs to know all about IBooking, right? You need to transport that definition somehow from one to the other.

The type of IBooking isn’t very helpful, because it only tells you whether the booking service and the booking application will physically fit together. It doesn’t tell you that they’re going to work together. The values you’re passing could be complete garbage, right? It just tells you whether they fit together. Yeah?

In my house, I could, well, this doesn’t work in Norway, but in the UK, right, I could attach the gas pipe into my house to the water taps, right? Faucets. They will fit together. Is this a good idea? No. Right? Don’t do it. So interfaces are a very weak specification of whether pieces work together properly. And there’s a nice definition in the GOOS book, at the bottom there, which says that we really need to deal with protocols: describe somehow whether two components work together, not just that they fit together. And this is a hard problem, and you normally end up resorting to documentation to do that. And of course, that’s what we do now, isn’t it?
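The difference between “fits together” and “works together” can be sketched in Python. Everything below is hypothetical: the `IBooking` methods, the two implementations and the `works_together` check are made up for illustration and are not the contents of the slide.

```python
from typing import Protocol


class IBooking(Protocol):
    """Structural interface: anything with these methods 'fits'."""

    def nights(self) -> int: ...

    def total_price(self) -> float: ...


class GoodBooking:
    def __init__(self, nights: int, nightly_rate: float) -> None:
        self._nights = nights
        self._rate = nightly_rate

    def nights(self) -> int:
        return self._nights

    def total_price(self) -> float:
        return self._nights * self._rate


class GarbageBooking:
    """Satisfies IBooking just as well, yet the values are nonsense."""

    def nights(self) -> int:
        return -3

    def total_price(self) -> float:
        return -1.0


def works_together(booking: IBooking) -> bool:
    """A behavioural check that the interface alone does not express."""
    return booking.nights() > 0 and booking.total_price() >= 0


print(works_together(GoodBooking(2, 100.0)))
print(works_together(GarbageBooking()))
```

Both classes are acceptable wherever an `IBooking` is expected; only the behavioural check, in practice a test or a documented protocol, tells them apart.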

This is our modern architecture, right? Everything very loosely coupled. Everything is stringly typed programming over some HTTP request, lots of crap crammed in some JSON thing. We have to share knowledge on each end about how to compose the JSON and how to get the data out of it that we want. And how do we share that information? We have an HTML page which has actual documentation in it that describes the protocol. So the types are actually so useless that we’ve rejected them in a large number of cases, because they cause things to become very tightly bound together. Tight coupling is really bad for building large systems. Types get in the way, so we’ve abandoned them. I think perhaps the main use of types, I’m going to show my age a little bit now, is that they’re probably a good hiring pre-filter for people who are really smart.

If you can have a really good type theory conversation with someone in the interview, then maybe they’re smarter than the other guy who couldn’t have that conversation. I think that’s the main use of types.

Right? It’s people who can think very logically. And of course, they’re the kind of people we want to hire, mostly, as programmers. And frankly, I think there are much bigger problems in the world. You’ve all seen much worse examples of JavaScript than this. I know none of you write JavaScript like this, but I think a lot of the programming styles we are exploring today, or having to explore today, will have a much worse impact on the maintainability of our software in years to come than the fact that they are dynamically typed. This is a simple example, small enough to fit on my slide, but we’ve all seen callback hell, you know, the callback pyramid where these things are nested 15 levels deep.

So, just a few minutes left now. Static type checking marginally assists program reliability by mechanically verifying very few program properties, most of which aren’t very interesting. Great. But at the same time, it comes at the expense of complexity. Remember that C++? Statically correct, but just awful. It comes at the cost of brittleness. You know, if somebody wants to add another field to an object because some oil company has a new identifier, they could just add it, right? Or they could make us change all of our software so it supports it.

And it leads to a tendency to monoliths. All of the large monolithic systems I’ve worked on have been statically typed. Right? None of the dynamic systems I’ve worked on have turned into those monoliths, because it’s much easier to have things loosely coupled. So I think if you’re building programs at scale, which many of us are, you know, more than a few tens of thousands of lines of code, then these architectural characteristics become very important. So I don’t think we should worry too much about the type errors that are going to happen. Because something’s going to go wrong. And the thing that goes wrong is probably actually not going to be a type error. It’s going to be something else. It’s going to be one of the other things that we forgot while we were building our program. And when the type error does happen, it’s going to be very easy to fix. So maybe the bomb will never actually go off at all.

As I said, 1% of issues. Is that something we should really be getting so excited about? I don’t think so. It’s not very much, compared to all the other things out there. While we have this kind of Japanese theme, I thought we would have some Zen poetry. I rather like this: “Living plants are flexible and tender; the dead are brittle and dry. The rigid and stiff will become broken; the soft and yielding will overcome.”

And whatever you do, please, I’ve seen this time and time again now: don’t try to write your own dynamic language and embed it in your statically typed system. I have seen this. I think projects just reach some critical size in a statically typed language, and it’s: oh, we need dynamic types, we need a meta-object system, we need good reflection, and the language that we’re using isn’t quite up to it. Let’s build our own interpreter. I’ve worked on systems that have multiple embedded interpreters, each one homegrown, because different people don’t know about each other’s interpreters. Don’t do it. Well, there are two solutions to this. Either just use a dynamic language in the first place. Just do it. It’s probably going to be fine. Or use an off-the-shelf interpreter. Embed one. Python is a nice example. Lua is another nice example. Just take one that works and use it in your context. Just don’t do this. It creates such a mess.

So I will finish there. We have a few minutes for questions.

Tool support. I mean, the IDE, for instance, or any development tool. Right. It can help you because it knows, because there’s static typing.

That’s true. I mean, as I said, I’ve worked in dotnet a lot, and, you know, I love working in C-sharp with ReSharper in there and IntelliSense and all these things. It’s really nice. I also do a lot of work in Python. And you’re right, I don’t feel the tool is helping me as much. But the outcome just isn’t significantly different. I still manage to make programs more quickly in Python than I do in C-sharp.

It’s about the evidence thing. Maybe it’s one of these things about, you feel as if you’re going more slowly in Python, or you feel as if you’re going quicker in Java because of the tool. It feels like the tool’s helping you, but maybe it isn’t. What the tool is helping you do is create vast amounts of boilerplate you never needed.

I agree with your sentiment. And, you know, I came into this five years ago. I was very, very strongly in favor of static typing. And my experience has led me to believe that it’s not all it’s cracked up to be and gets in the way at least as much as it helps, if not more.

Yes. Would you still favor dynamic typing for controlling nuclear reactors?

That’s a really good question. And to be honest, if it’s my job to control a nuclear reactor, no. I’d probably want to write it in Haskell. Right? So it depends on the context. The reality, though, is that most of us aren’t controlling nuclear reactors. Most of us are writing software that allows people to fill in their timesheets.

Right? You know, give up. Compared to the nuclear reactor scenario, that’s what we’re doing. So, frankly, most of the problems we’re solving just are not that important, right? Not like safety-type issues, you know. I mean, I think Airbus flight control software is written in C++, right?

Do you really want dereferencing of null pointers in your Airbus? No, you don’t. But, you know, they still manage to do it because of very good testing and good review processes. So I hope you’re all going home on a Boeing.

I haven’t done that research, but I think unit tests are much easier to write than any static types. And my personal experience, and I believe most of the experience of people here, is that testing is very effective.

So I would not stand up here and make the argument that we should just give up on unit testing because it costs so much time, because I really do believe it helps us understand the problem in a very immediate way that’s very close to the kind of solution we’re looking for.

Yes. Very interesting. You have context where the whole type debates are platforms like Android, where it’s very easy to use the Java. Yeah. A lot of dynamic stuff ends up operating quite slowly. Yeah. So I’ve noticed particularly from Google and Square, they’re making a lot more applications that are even more than just simply type verification. Yeah. Structural validation. All of these things that if these, like, pre-compiler validation don’t occur, the programs don’t see. Right. reads.

To the way that answer used. Not saying that this necessarily applies to like server side app, you know, badder thing. It is. And I mean, performance is…

Yeah, I think it was essentially a question about the cost of types, sorry, the cost of dynamic typing with respect to performance and, I guess with that, power consumption on mobile devices. Is that fair? Yeah. I would agree. I think it’s still true that most of the statically typed languages are faster and therefore more power efficient than the dynamically typed equivalents. Obviously, that’s changing. We’ve seen that when people put a lot of effort into dynamic language runtimes, I mean, V8 from Google is the obvious example, we can get very close to static performance. And it very much depends on the nature of your problem. Still, I mean, there are always going to be problems that are a better fit for dynamic languages than they are for static languages.

So, as I said, what I don’t want you to take away from this talk is the notion that you should always use dynamic languages, right? What I’m saying is you really need to think about the cost-benefit trade-off of the type system, because I would argue that the type system is going to cost you time at some point. And you need to figure out whether that time is a good investment because it’s going to stop a meltdown in your local nuclear reactor. Right, in that case, I would argue it is a good investment. Or if it’s on a mobile device. Right, but not if it’s some line-of-business enterprise application that’s spending most of its time sitting around waiting for the next HTTP request. Yeah. Yeah, exactly. Okay, I think we’re out of time.

So I’ll hang around for more questions later if anyone wants to grab me or have a discussion about this or, in fact, anything at all. Thank you very much.