functional-programming-doesnt-work

My initial interest in functional programming was because it seemed so perverse.

At the time, I was the classic self-taught programmer, having learned BASIC and then 6502 assembly language so I could implement my own game designs. I picked up the August 1985 issue of Byte magazine to read about the then-new Amiga. It also happened to be the issue on declarative languages, featuring a reprint of Backus’s famous Turing Award Lecture and a tutorial on Hope, among other articles.

This was all pretty crazy stuff for an Atari 800 game coder to be reading about. I understood some parts, completely missed vast swaths of others, but one key point caught my imagination: programming without modifiable variables. How could that possibly work? I couldn’t write even the smallest game without storing values to memory. It appealed to me for its impossibility, much in the way that I had heard machine language was too difficult for most people to approach. But while I had pored over assembly language listings of games in magazines, and learned to write my own as a result, there wasn’t such direct applicability for functional programming. It made me wonder, but I didn’t use it.

Many years later when I first worked through tutorials for Haskell, Standard ML, and eventually Erlang, it was to figure out how programming without modifying variables could work. In the small, it’s pretty easy. Much of what seemed weird back in 1985 had become commonplace: garbage collection, using complex data structures without worrying about memory layout, languages with much less bookkeeping than C or Pascal. But that “no destructive updates” thing was–and still is–tricky.

I suppose it’s completely obvious to point out that there have been tens of thousands of video games written using an imperative programming style, and maybe a handful–maybe even just a couple of fingers worth–of games written in a purely functional manner. Sure, there have been games written in Lisp and some games written by language dilettantes fond of Objective Caml, but they never turn out to be programmed in a functional style. You can write imperative code in those languages easily enough. And the reason for going down that road is simple: it’s not at all clear how to write many types of complex applications in functional languages.

Usually I can work through the data dependencies, and often I find that there’s an underlying simplicity to the functional approach. But for other applications…well, they can turn into puzzles. Where I can typically slog through a messy solution in C, the purely functional solution either eludes me or takes some puzzling to figure out. In those cases I feel like I’m fighting the system, and I realize why it’s the road less traveled. Don’t believe it? Think that functional purity is always the road to righteousness? Here’s an easy example.

I wrote a semi-successful Mac game a while back called Bumbler. At its heart it was your standard sprite-based game: lots of independent objects running some behavioral code and interacting with each other. That kind of code looks easy to write in a purely functional way. An ant, represented as a coordinate, marches across the screen in a straight line and is deleted when it hits the opposite screen edge. That’s easy to see as a function. One small clod of data goes in, another comes out.

But the behaviors and interactions can be a lot more tangled than this. You could have an insect that chases other insects, so you’ve got to pass in a list of existing entities to it. You can have an insect that affects spawn rates of other other insects, but of course you can’t modify those rates directly so you’ve got to return that data somehow. You can have an insect that latches onto eggs and turns them into something else, so now there’s a behavioral function that needs to reach into the list of entities and make modifications, but of you’re not allowed to do that. You can have an insect that modifies the physical environment (that is, the background of the game) and spawns other insects. And each of these is messier than it sounds, because there are so many counters and thresholds and limiters being managed and sounds being played in all kinds of situations, that the data flow isn’t clean by any means.

What’s interesting is that it would be trivial to write this in C. Some incrementing, some conditions, direct calls to sound playing routines and insect spawning functions, reading and writing from a pool of global counters and state variables. For a purely functional approach, I’m sure the data flow could be puzzled out…assuming that everything was all perfectly planned and all the behaviors were defined ahead of time. It’s much more difficult to take a pure movement function and say “okay, what I’d like is for this object to gravitationally influence other objects once it has bounced off of the screen edges three times.” Doable, yes. As directly implementable as the C equivalent? No way.

That’s one option: to admit that functional programming is the wrong paradigm for some types of problems. Fair enough. I’d put money on that. But it also may be that almost no one has been thinking about problems like this, that functional programming attracts purists and enamored students. In the game example above, some of the issues are solvable, they just need different approaches. Other issues I don’t know how to solve, or at least I don’t have solutions that are as straightforward as writing sequential C. And there you go…I’m admitting that functional programming is awkward in some cases. It’s also extremely useful in others.

Follow-Up

Admitting that functional programming can be awkward drew a much bigger audience than I expected, so here’s some insight into why I wrote it, plus some responses to specific comments.

I started learning some functional programming languages in 1999, because I was looking for a more pleasant way to deal with complex programming tasks. I eventually decided to focus on Erlang (the reasons for which are probably worthy of an entire entry), and after a while I found I was not only using Erlang for some tasks I would have previously used Perl for (and truth be told, I still use Perl sometimes); I was able to approach problems that would have been just plain nasty in C. But I also found that some tasks were surprisingly hard in Erlang, clearly harder than banging out an imperative solution.

Video games are good manifestation of a difficult problem to approach functionally: lots of tangled interactions between actors. I periodically search for information on video games written in functional languages, and I always get the same type of results. There’s much gushing about how wonderful functional programming is, how games are complex, and how the two are a great match. Eight years ago I even did this myself, and I keep running into it as a cited source about the topic. Then there’s Functional Reactive Programming, but the demos are always bouncing balls and Pong and Space Invaders–which are trivial to write in any language–and it’s not at all clear if it scales up to arbitrary projects. There are also a handful of games written in procedural or object-oriented styles in homebrew Lisp variants, and this is often equated with “game in a functional language.”

My conclusion is that there’s very little work in existence or being done on how to cleanly approach video game-like problems in functional languages. And that’s okay; it’s unexplored (or at least undocumented) territory. I wrote “Admitting…” because of my own experiences writing game-like code in Erlang. I don’t think the overall point was Erlang-specific, and it would apply to pure subsets of ML, Scheme, etc. It wasn’t meant as a cry for help or a way of venting frustration. If you’re writing games in functional languages, I’d love to hear from you!

All you have to do is pass the world state to each function and return a new state.

True, yes, but…yuck. It can be clunky in a language with single-assignment. And what is this really gaining you over C?

All you have to do is make entities be functions of time. You pass in a time and the position and state of that entity at that time are returned.

For a bouncing ball or spinning cube, yes, that’s easy. But is there a closed-form solution for a game where entities can collide with each other and react to what the player is doing?

You used the word “modify” which shows you don’t understand how functional programming works.

If a spider does something that causes the spawn rate of ants to change, then of course the old rate isn’t modified to contain the new value. But conceptually the system needs to know that the new rate is what matters, so somehow that data needs to propagate up out of a function in such a way that it gets passed to the insect spawning function from that point on. I was using “modify” in that regard.

Functional Programming Doesn’t Work (and what to do about it)

Original Article

After spending a long time in the functional programming world, and using Erlang as my go-to language for tricky problems, I’ve finally concluded that purely functional programming isn’t worth it. It’s not a failure because of soft issues such as marketing, but because the further you go down the purely functional road the more mental overhead is involved in writing complex programs. That sounds like a description of programming in general–problems get much more difficult to solve as they increase in scope–but it’s much lower-level and specific than that. The kicker is that what’s often a tremendous puzzle in Erlang (or Haskell) turns into straightforward code in Python or Perl or even C.

Imagine you’ve implemented a large program in a purely functional way. All the data is properly threaded in and out of functions, and there are no truly destructive updates to speak of. Now pick the two lowest-level and most isolated functions in the entire codebase. They’re used all over the place, but are never called from the same modules. Now make these dependent on each other: function A behaves differently depending on the number of times function B has been called and vice-versa.

In C, this is easy! It can be done quickly and cleanly by adding some global variables. In purely functional code, this is somewhere between a major rearchitecting of the data flow and hopeless.

A second example: It’s a common compilation technique for C and other imperative languages to convert programs to single-assignment form. That is, where variables are initialized and never changed. It’s easy to mechanically convert a series of destructive updates into what’s essentially pure code. Here’s a simple statement:

if (a > 0) {
   a++;
}

In single-assignment form a new variable is introduced to avoid modifying an existing variable, and the result is rather Erlangy:

if (a > 0) {
   a1 = a + 1;
} else {
   a1 = a;
}

The latter is cleaner in that you know variables won’t change. They’re not variables at all, but names for values. But writing the latter directly can be awkward. Depending on where you are in the code, the current value of whatever “a” represents has different names. Inserting a statement in the middle requires inventing new names for things, and you need to make sure you’re referencing the right version. (There’s more room for error now: you don’t just say “a,” but the name of the value you want in the current chain of calculations.)

In both of these examples imperative code is actually an optimization of the functional code. You could pass a global state in and out of every function in your program, but why not make that implicit? You could go through the pain of trying to write in single-assignment form directly, but as there’s a mechanical translation from one to the other, why not use the form that’s easier to write in?

At this point I should make it clear: functional programming is useful and important. Remember, it was developed as a way to make code easier to reason about and to avoid “spaghetti memory updates.” The line between “imperative” and “functional” is blurry. If a Haskell program contains a BASIC-like domain specific language which is also written in Haskell, is the overall program functional or imperative? Does it matter?

For me, what has worked out is to go down the purely functional path as much as possible, but fall back on imperative techniques when too much code pressure has built up. Some cases of this are well-known and accepted, such as random number generation (where the seed is modified behind the scenes), and most any kind of I/O (where the position in the file is managed for you).

Learning how to find similar pressure relief valves in your own code takes practice.

One bit of advice I can offer is that going for the obvious solution of moving core data structures from functional to imperative code may not be the best approach. In the Pac-Man example from Purely Functional Retrogames, it’s completely doable to write that old game in a purely functional style. The dependencies can be worked out; the data flow isn’t really that bad. It still may be a messy endeavor, with lots of little bits of data to keep track of, and selectively moving parts out of the purely functional world will result in more manageable code. Now the obvious target is either the state of Pac-Man himself or the ghosts, but those are part of the core data flow of the program. Make those globally accessible and modifiable and all of a sudden a large part of the code has shifted from functional to imperative…and that wasn’t the goal.

A better approach is to look for small, stateful, bits of data that get used in a variety of places, not just on the main data flow path. A good candidate in this example is the current game time (a.k.a. the number of elapsed frames). There’s a clear precedent that time/date functions, such as Erlang’s now(), cover up a bit of state, and that’s what makes them useful. Another possibility is the score. It’s a simple value that gets updated in a variety of situations. Making it a true global counter removes a whole layer of data threading, and it’s simple: just have a function to add to the score counter and another function to retrieve the current value. No reason to add extra complexity just to dodge having a single global variable, something that a C / Python / Lua / Ruby programmer wouldn’t even blink at.

Follow-Up

Not surprisingly, Functional Programming Doesn’t Work (and what to do about it) started some lively discussion. There were two interesting “you’re crazy” camps:

The first mistakenly thought that I was proposing fixing problems via a judicious use of randomly updated global variables, so every program turns into potential fodder for the “Daily WTF.”

The second, and really, the folks in this camp need to put some effort into being less predictable, was that I’m completely misunderstanding the nature of functional programming, and if I did understand it then I’d realize the true importance of keeping things pure.

My real position is this: 100% pure functional programing doesn’t work. Even 98% pure functional programming doesn’t work. But if the slider between functional purity and 1980s BASIC-style imperative messiness is kicked down a few notches–say to 85%–then it really does work. You get all the advantages of functional programming, but without the extreme mental effort and unmaintainability that increases as you get closer and closer to perfectly pure.

That 100% purity doesn’t work should only be news to a couple of isolated idealists. Of the millions of non-trivial programs ever written–every application, every game, every embedded system–there are, what, maybe six that are written in a purely functional style? Don’t push me or I’ll disallow compilers for functional languages from that list, and then it’s all hopeless.

“Functional Programming Doesn’t Work” was intended to be optimistic. It does work, but you have to ease up on hardliner positions in order to get the benefits.

Back to the Basics of Functional Programming

I have been accused of taking the long way around to obvious conclusions. Fair enough. But to me it’s not the conclusion so much as tracking the path that leads there, so perhaps I need to be more verbose and not go for a minimalist writing style. We shall see.

The modern functional programming world can be a daunting place. All this talk of the lambda calculus. Monads. A peculiar obsession with currying, even though it is really little more than a special case shortcut that saves a bit of finger typing at the expense of being hard to explain. And type systems. I’m going to remain neutral on the static vs. dynamic typing argument, but there’s no denying that papers on type systems tend to be hardcore reading.

Functional programming is actually a whole lot simpler than any of this lets on. It’s as if the theoreticians figured out functional programming long ago, and needed to come up with new twists to keep themselves amused and to keep the field challenging and mysterious. So where did functional programming come from? I won’t even try to give a definitive history, but I can see the path that led to it looking like a good idea.

When I first learned Pascal (the only languages I knew previously were BASIC and 6502 assembly), there was a fixation with parameter passing in the textbooks I read and classes I took. In a procedure heading like this:

function max(a: integer; b: integer): integer;

“a” and “b” are formal parameters. If called with max(1,2), then 1 and 2 are the actual parameters. All very silly, and one of those cases where the trouble of additional terminology takes something mindlessly simple and makes it cumbersome. Half of my high school programming class was hung up on this for a good two weeks.

But then there’s more: parameters can be passed by value or by reference. As in C, you can pass a structure by value, even if that structure is 10K in size, and the entire structure will be copied to the stack. And that’s usually not a good idea, so by reference is the preferred method in this case… except that data passed by reference might be changed behind the scenes by any function you pass it to. Later languages, such as Ada, got all fancy with multiple types of “by reference” parameters: parameters that were read-only, parameters that were write-only (that is, were assumed to be overwritten by a function), and parameters that could be both read from and written to. All that extra syntax just to reduce the number of cases where a parameter could be stomped all over by a function, causing a global side effect.

One thing Wirth got completely right in Pascal is that “by reference” parameters don’t turn into pointers at the language level. They’re the same as the references that eventually made it into C++. Introduce full pointers into a language, especially with pointer arithmetic, and now things are really scary. Not only can data structures be modified by any function via reference parameters, but any piece of code can potentially reach out into random data space and tromp other variables in the system. And data structures can contain pointers into other data structures and all bets are off at that point. Any small snippet of code involving pointers can completely change the state of the program, and there’s no compile-time analysis that can keep things under control.

There’s a simple way out of the situation: Don’t allow functions to modify data at all. With that rule in place, it makes no difference if parameters are passed by value or by reference, so the compiler can use whatever is most efficient (usually by value for atomic, primitive types and by reference for structured types). Rather shockingly, this works. It’s theoretically possible to write any program without modifying data.

The problem here is how to program in a purely functional manner, and this has gotten surprisingly short shrift in the functional programming community. Yes, types provide more information about intent and can be used to catch a certain class of errors at compile time. Higher-order functions are convenient. Currying is a neat trick. Monads allow I/O and other real-world nastiness to fit into a functional framework. Oh does mergesort look pretty in Haskell. I shudder to think of how tedious it was operating on binary trees in Pascal, yet the Erlang version is breathtakingly trivial.

But ask someone how to write Pac-Man–to choose a hopelessly dated video game–in a purely functional manner. Pac-Man affects the ghosts and the ghosts affect Pac-Man; can most newcomers to FP puzzle out how to do this without destructive updates? Or take just about any large, complex C++ program for that matter. It’s doable, but requires techniques that aren’t well documented, and it’s not like there are many large functional programs that can be used as examples (especially if you remove compilers for functional programming languages from consideration). Monads, types, currying… they’re useful, but, in a way, dodges. The most basic principle of writing code without destructive updates is the tricky part.

Don’t Structure Data All The Way Down

Let’s write some functions to operate on circles, where a circle is a defined by a 2D center point and a radius. In Erlang we’ve got some options for how to represent a circle:

{X, Y, R}                     % raw tuple
{circle, X, Y, R}             % tagged tuple
#circle{x = X, y = Y, r = R}  % icky record

Hmmm…why is a circle represented as a structure, but a point is unwrapped, so to speak, into two values? Attempt #2:

{{X,Y}, R}                    % raw tuples
{circle, {point,X,Y}, R}      % tagged tuples
...                           % gonna stop with records

Now let’s write a function to compute the area of a circle, using this new representation:

area({circle, {point,_X,_Y}, R}) ->
   math:pi() * R * R.

Simple enough. But take a few steps back and look this. First, we’re not actually making use of the structure of the data in area. We’re just destructuring it to get the radius. And to do that destructuring, there’s a bunch of code generated for this function: to verify the parameter is a tuple of size 3, to verify that the first element is the atom circle, to verify the second element is a tuple of size 3 with the atom point as the first element, and to extract the radius. Then there’s a trivial bit of math and we’ve got an answer.

Now suppose we want to find the area of a circle of radius 17.4. We’ve got a nice function all set to go…sort of. We need the radius to be part of a circle, so we could try this:

area({circle, {point,0,0}, 17.4})

Kind of messy. What about a function to build a circle for us? Then we could do this:

area(make_circle(0, 0, 17.4))

We could also have a shorter version of make_circle that only takes a radius, defaulting the center point to 0,0. Okay, stop, we’re engineering ourselves to death. All we need is a simple function to compute the area of a circle:

area(R) ->
   math:pi() * R * R.

Resist the urge to wrap it into an abstract data type or an object. Keep it raw and unstructured and simple. If you want structure, add it one layer up, don’t make it part of the foundation. In fact, I’d go so far as to say that if you pass a record-like data structure to a function and any of the elements in that structure aren’t being used, then you should be operating on a simpler set of values and not a data structure. Keep the data flow obvious.

Admitting that Functional Programming Can Be Awkward

Original Article

Follow-Up

Functional Programming Doesn’t Work (and what to do about it)

Original Article

Follow-Up

Back to the Basics of Functional Programming

Don’t Structure Data All The Way Down