Two Vexing Problems in Functional Programming

js8 · on April 1, 2022

Not sure if the author will read this, but there is a simple answer to both problems.

In functional programming, you typically do not manipulate state directly, instead you compose functions that manipulate that state, into more complex functions. (In Haskell, for example, this composition is often done through monads.) Eventually, you get a function that represents the whole program.

This is somewhat dual to imperative programming, and is kind of a shift of the frame of reference. We could make an analogy with an assembly line in a factory. Imperative programming is like looking standing at the factory floor, every machine (procedure) receives a thing (input), does something with it, and outputs it. Functional programming is like being together with the thing as it is manipulated through the factory, in its own frame of reference. So instead of passing the thing (data) around, you see passing of the machinery that modifies that thing (functions), in the opposite direction.

jayd16 · on April 1, 2022

That's fine from a philosophical sense, but doesn't it make consecutive reads extremely heavy? You're throwing away the efficiency of in place updates at the language level, no?

throwaway81523 · on April 2, 2022

> That's fine from a philosophical sense, but doesn't it make consecutive reads extremely heavy? You're throwing away the efficiency of in place updates at the language level, no?

You use special data structures to deal with that. There's a great book about them, "Purely Functional Data Structures", by Jeff Okasaki. You generally don't get O(1) update like for traditional arrays, but you can usually get O(log n), like you would get for a database table with B-tree indexes.

zrkrlc · on April 2, 2022

Minor nitpick, but it's actually *Chris Okasaki

closeparen · on April 1, 2022

This is why Clojure’s big thing is to also, at the language level, emphasize copy-on-write data structures.

mumblemumble · on April 1, 2022

But Clojure also has a habit of implementing core functionality that needs to be fast in imperative Java.

The thing about copy-on-write is that it's only a good solution when writes are infrequent. My sense is that Clojure largely deals with this by simply not targeting business domains where frequent writes to mutable data structures are particularly necessary.

Which hints at my own opinion on how to deal with this Gordian knot: just acknowledge that there is no universally best programming paradigm. Some problems are, as a practical matter, best modeled in an imperative manner. That's fine. And I don't think it should be personally threatening to us fans of functional programming. John Backus openly discussed it in the paper where he originally proposed the paradigm, and I personally prefer, for moral reasons, his suggestion for how to solve it. In a world where everyone's trying to build the best spork, I think I'd rather have one good spoon and one good fork.

Sinidir · on April 1, 2022

Creation of new versions of datastructures are done in log32(N) time in clojure. The performance even for frequent updates is fine.

closeparen · on April 1, 2022

All programming paradigms are at some level implemented in terms of assembly.

harperlee · on April 2, 2022

Yes, but paradigms can be different enough that an intelligent compiler fails to find the same optimal assembly for two solutions for the same problem. Thus the paradigm in which you express the solution matters. Otherwise we would just need to express the problem, and a trivial solution, and the compiler would work the rest (the idea behind declarative programming!)

lupire · on April 1, 2022

Clojure's other big thing is "transient" local mutable state within a function, like Haskell's ST (State Transformer) monad (the more efficient alternative to the pure immutable state monad)

lupire · on April 1, 2022

"in-place updates" are handled by the compiler and garbage collector, using liveness analysis, not by the programmer managing memory explicitly.

saeranv · on April 1, 2022

Yes, although I suppose even in languages where this isn't mitigated (i.e. Clojure example below), you can offset that inefficiency with the parallelism enabled by functional programming.

zodiac · on April 1, 2022

Using the interpreter example in the article, would it be accurate to say that the author's inefficient "each instruction copies the existing bytes into a new array, change the byte of interest, return this new byte array, and abandon the old array (which would eventually be garbage-collected)" is "manipulating the state directly"?

vnorilo · on April 1, 2022

Yes, and also not a data structure that is suitable for the task. If you want to keep every version of the buffer as a value (which has nice properties like versioning) you would need a data structure that shares most of its structure with the previous versions. Like a persistent radix-n tree. Slower than mutable buffer for sure, but not pathological.

zodiac · on April 1, 2022

Got it. I'm familiar with persistent data structures but I was more curious as to the "Haskell approach" of doing the same problem (which, I assume, doesn't use persistent data structures per se).

Follow-up question: if you wanted to implement the interpreter (as in the author's article) in idiomatic Haskell, would using the State monad be one solution? (I assume that's what you meant by "compose functions that manipulate that state"). Then, if you translated that Haskell directly into Racket (something like: use the exact same functions, just delete the type annotations), what would be the run-time of the solution?

(Not a trick question, I really don't know the answers. I've "understood" how the State, Reader, Writer work in the sense of reading the code, checking the types, and confirming that they do what they claim, but didn't think about how a compiler would compile them before)

dllthomas · on April 1, 2022

Translating naively, in Haskell I might use https://hackage.haskell.org/package/containers-0.6.5.1/docs/..., which does use sharing to reduce copying.

tromp · on April 1, 2022

A simple Brankfuck interpreter in Haskell using state monads may be found at https://tromp.github.io/cl/Binary_lambda_calculus.html#Brain...

dllthomas · on April 2, 2022

Yeah, for something where changes are local like that, using two stacks for your tape makes a lot of sense.

vnorilo · on April 1, 2022

Not sure there's any benefit in using the state monad for an interpreter.

My first thought is to model a cycle of the interpreter as a function of old state into new state.

I guess the state monad could be used to thread the state dataflow through the interpreter, but it would not make the copy-big-buffer any more efficient. Code using the state monad is still pure and uses immutable data.

Now if you have to use a mutable buffer (only one fits in memory?) then you could use techniques like the IO monad to deal with it.

Disclaimer: Am more Clojure than Haskell person.

ChadNauseam · on April 1, 2022

Writing efficient interpreters in a functional style is not easy. I write interpreters in OCaml for a living and we had to switch to a mutation-heavy style for performance and memory usage reasons. It can be done though.

When you think about it, any dependently typed language has to have an interpreter inside it, because you can put function applications inside a type and the typechecker will probably need to evaluate the application at some point. The go-to technique is to use Normalization by Evaluation [0], where you convert the AST version of a function in the interpreted language to an actual function in the host language, and piggyback off the host language's (presumably very fast) function application.

[0]: https://en.wikipedia.org/wiki/Normalisation_by_evaluation

Quekid5 · on April 1, 2022

I suspect you already know this, but other readers might be interested to know that in Haskell you can also use the ST monad which can do local (actual) mutation which is guaranteed to not escape the function's scope.

(Of course you can only access the actual local state and not do I/O or anything like that.)

vnorilo · on April 1, 2022

Sure, but there are many kinds of interpreters!

I was involved with one that wasn't really demanding in terms of computation, but was distributed in a way that required a lot of speculation and rollbacks. The functional approach was pretty sweet.

Highest speed bytecode interpreter is actually one of the few loops where I prefer assembly to C, as the instruction mix, branch prediction and even the use of stack can be weird from the PoV of a modern optimizing compiler.

agumonkey · on April 1, 2022

Isn't there a way to derive efficient/in-place interpretation from pure total recursive functions ?

reading lisp in small pieces, the author migrates interpreters from naive recursive eval to bytecode. It felt there was an underlying model to extract.

deterministic · on April 5, 2022

I am also in the compiler/interpreter business. Can you tell me more about what you are working on? It sounds interesting.

lupire · on April 1, 2022

No, its not manipulating the state, which is why it is inefficient.

An example of what parent was talking about is instead of looking over a list 10 times calling different functions that build a new result list each time, you compose those inti one big function that visits each element once.

ChadNauseam · on April 1, 2022

> In functional programming, you typically do not manipulate state directly, instead you compose functions that manipulate that state, into more complex functions. (In Haskell, for example, this composition is often done through monads.) Eventually, you get a function that represents the whole program.

A bit more detail. In Haskell this is implemented very elegantly with list fusion. If you write

    map (\x -> x+1) mylist

You'll map over a list and add one to every element. this allocates a whole new list where all the elements are increased by one [0]. Now let's say we take the output of that map and map over it again, but this time we multiply every element by two:

    map (\x -> x*2) (map (+1) mylist)

Naively implemented that would copy allocate two lists, one where all the numbers are incremented and another where all the incremented numbers are multiplied by two. Any good dev will know that's performance poison. So the Haskell compiler implements "list fusion" – it sees that you don't use the intermediate list for anything, and rewrites your code so it's equivalent (in semantics and performance) to:

    map (\x -> (x+1) * 2) mylist)

(For the compiler devs in here, this optimization is commonly known as deforestation.) This leads to huge speedups in many cases. But there's a problem. If elsewhere you use the result of `map (\x -> x+1) mylist` for anything besides mapping over it exactly once, this optimization is impossible! So list fusion has a reputation for being fragile – the compiler has to know with 100% certainty that you aren't going to use the list for anything else, and sometimes innocent-seeming changes (like moving the intermediate map into another module) can break it.

The solution the Haskell community finds promising is to be able to give a tag to value, like a type only not quite, that says "I promise to use this value exactly once". If you use it twice or zero times, the compiler will yell at you. The compiler is still on the hook for doing the analysis of what's used once and what's used multiple times, but now the programmer and the compiler can be guaranteed to be on the same page.

As for the other issue mentioned in the original post, of modifying a subelement of a tree: this is a well-known problem and there are many solutions. If the tree is only used once, the same optimization as list fusion can be applied to mutate the list in place (although the "you must use the value only once" restriction doesn't help quite as much as you'd think it would). The more common solution, that doesn't depend on compiler optimization at all, is to use a data structure that supports this efficiently. For example, if you have a tree and you want to change one leaf on the tree, the only thing you really need to copy is the part of the tree that's actually different now - for the unchanged branches, the new tree just has a pointer to branches of the old tree, so they can be (and are) reused without copying. That's why it's very common to see maps implemented in functional languages using trees, instead of hashmaps. With a hashmap, it's much harder to get around the need to copy the whole thing when you just want to change one part.

[0]: Well, it might do that that once it gets around to it, laziness etc., but let's ignore that for now.

munificent · on April 1, 2022

> The solution the Haskell community finds promising is to be able to give a tag to value, like a type only not quite, that says "I promise to use this value exactly once". If you use it twice or zero times, the compiler will yell at you.

I suspect the contortions required to fit your code into that affine type style will likely lead to code that is just as hard to understand and maintain as the equivalent imperative code (though with better static checking, which might be nice).

ChadNauseam · on April 1, 2022

It's surprisingly convenient when you get used to it. A very large proportion of values are only used once, so if a function only uses a value once you can mark it as being linear in that argument and get the relevant guarantees for very little mental overhead. You can still pass any value you want to the function, the only restriction is that the function must consume the value exactly once. It's tricky to write performant functional code, but personally I find linearity tagging a small price to pay for salvation (see this article I wrote, posted here: https://news.ycombinator.com/item?id=30762281 ). (Note that GHC currently doesn't use linearity for any performance optimizations, this is just a promising possible route.)

One tidbit you might find interesting. Linear logic requires you use the value exactly once, but if you relax that requirement to "once or zero times" you get affine logic. (Values that don't have the any restriction on how much you can use them are called "exponentials".) With both linear and affine logic, you can pass an exponential to a function that's linear or affine in its argument – the only requirement is that the function uses the value in a linear or affine way. That's not quite the same as what Rust does, which I do find a bit annoying. What rust does is called "uniquenes typing", where a function can always do whatever it wants with any values it gets, but it can mark its parameters as "unique" and the caller has to make sure that any value passed as a parameter to that function is never used anywhere else. This is arguably the more useful of the two, because it means that you can mutate any argument marked as unique and no one can tell, but if you design a language around that you get Rust and I find Rust a bit less pleasant to program in than Haskell.

munificent · on April 1, 2022

Interesting! This reminds of me of what it feels like to write const-correct code in C++.

lupire · on April 1, 2022

It's like unique_ptr<T> and move() semantics in C++.

lupire · on April 1, 2022

> the caller has to make sure that any value passed as a parameter to that function is never used anywhere else.

you mean the compiler, right?

ChadNauseam · on April 2, 2022

The compiler checks that you don't write code where the caller disobeys that restriction :p

dahfizz · on April 1, 2022

> For example, if you have a tree and you want to change one leaf on the tree, the only thing you really need to copy is the part of the tree that's actually different now - for the unchanged branches, the new tree just has a pointer to branches of the old tree, so they can be (and are) reused without copying.

This still only works if a child has one / a known-ahead-of-time number of parents. If you need to update an object that N objects point to, you need to update all N references.

pyrale · on April 1, 2022

It just doesn't really happen that much in functional programs, to be honest.

The very concept of n objects pointing to the same, commonly mutated data is something that can be adressed easily by having that data contained by a parent structure. Since functional programs don't think in terms of "methods" within each object trying to access data, but in terms of external function manipulating all the data you need, the only change is the way you'll pass your data to your functions.

voxl · on April 1, 2022

Ah yes, monadic programming, liked by no one but championed by those who probably have never really had to do it.

OCaml does state without monads the way it does for a reason, because working with monads is a massive headache.

pyrale · on April 1, 2022

You work with monads all day long, you just don't think about them as such.

There is no such thing as "monadic programing", even in Haskell. The thing making haskellers harp about monads is that the std lib went a bit far in making sure that every fitting data structure implements the monad interface, which includes IO.

dgreensp · on April 1, 2022

These problems have lots of other well-known solutions.

For large objects: Persistent data structures with path-copying. Batching small mutations (and using mutation internally in a way that doesn’t leak out). Using the type system somehow to distinguish cases where copying is not necessary (experimental in certain new languages). Structuring your program around a “data store” that is mutable.

For the “references” problem: Explicit references, like by ID. Or mutable “atoms” as in Clojure, which are maybe sort of like the “mediators” in the article.

tomp · on April 1, 2022

Persistent data structures are very hard. It's basically impossible to cleanly combine them. E.g. if you have an immutable hashtable that contains immutable lists, and you want to modify a list - how do you know if the inner list changed or not (you don't want to be creating new hashtables if the elements don't change).

They work in the small, but don't truly scale.

nlitened · on April 1, 2022

> you want to modify a list - how do you know if the inner list changed or not

You know for sure that inner list did _not_ change, because it is immutable.

Someone else might have a reference to an updated hashtable with an updated inner list, but the one you hold will never change—isn’t that the whole idea of immutable data structures?

tomp · on April 1, 2022

No, I mean "update" in the persistent sense - create a new list, with some (or none) elements changed/added/removed.

e.g. say you have a method remove_all_in_values that takes an int x and a hashmap of list of int, and returns a new hashmap of list of int, with x removed from each element of the hashmap.

Obviously you'd use something like List.filter for the inner operation, but most functional programming languages give you no indication whether the result of filter is the same list (i.e. element is not present) or a new list (i.e. some elements were removed).

So how do you know if you create a new hashtable or return the old one? One solution is complex, the other is wasteful.

codebje · on April 1, 2022

No language will tell you at compile time whether an arbitrary list filter operation will have an effect or not. If it can, it's because the list filter isn't arbitrary, it's probably always a no-op.

The generalisation is that you have some type parent with a way to select a type child, plus a function that modifies a child, you want to lift your child -> child function to a parent -> parent function.

Optics help a lot with this class of problem.

Because you're in the world of persistent data structures, and you're doing a modify operation, there is no avoiding the 'modify hashtable' part of 'modify an element within a hashtable'. You always create a 'new' hashtable.

If it's common that your update operation has no effect it may be worth switching to a function like child -> Modified child for Modified a = NoOp | Mutated a, so the hashtable modify function can determine if it can skip its own update. A list filter is a trivial thing to write, and one that can return NoOp if no elements are filtered is no less efficient.

nlitened · on April 2, 2022

I see what you mean now.

I think most persistent data structures would (for most common operations) return exactly themselves in case when no modification happened. So you can very cheaply compare for equality, and that would either resolve to simple one-op comparison of references (to see if they are equal or not), or, in the worst case, comparison of some minor amount of diverging sub-elements (since persistent data structures do extensive structural sharing by definition).

In this sense, linked list is a “bad” persistent collection for your use case, but a persistent set instead would work wonderfully — if nothing changed, you’d get the same reference to a collection as input.

I am pretty sure it either would work this way, or is very easy to optimize for this use case, in Clojure.

zelphirkalt · on April 1, 2022

Isn't the idea of persistent data structures, that they may be more expensive than their imperative versions, but try to reduce the expense to a minimum? So by using a persistent data structure, one should already have taken that potential additional expense into consideration. The wasteful solution should not be that much more wasteful, if we assume, that there is an OK-performing persistent data structure for the use-case. Otherwise I guess one should not choose a persistent data structure. But then you run into problems, when you want to make use of multiple cores to throw at the problem.

fulafel · on April 3, 2022

In FP I think composability and interchangeability of these higher order functions help, in case the supplied filter did not work like you want: You could write your own filter function variant that guarantees to return the same list if the predicate function returned true for each element.

joshlemer · on April 2, 2022

You can pretty efficiently do this in Clojure

    (update-in
     {:foo [{:bar ["a"], :baz ["b" "c" "d"]}]}
     [:foo 0 :baz 1]
     clojure.string/capitalize)
    
    
    ;; => {:foo [{:bar ["a"], :baz ["b" "C" "d"]}]}

tomp · on April 2, 2022

This only works if you know what to edit up-front.

If you're processing a tree-like structure via recursion (e.g. type-checking an AST implemented with persistent immutable structures in a compiler) it's no longer easy.

mrkeen · on April 1, 2022

> how do you know if the inner list changed or not

This is a vital question that virtually every programming language overlooks.

If you are using a mainstream language, you do not know if they were modified.

If you use a language which has a type system to segregate mutable operations from immutable operations, then you know if they were modified.

> They work in the small, but don't truly scale.

Go even bigger then. Work with the kind of data that won't fit in memory. Put your data on disk - you can't insert into the middle of a 1GB+ file.

nyanpasu64 · on April 2, 2022

I can't stand the standard update-based interface of persistent data structures either. https://lib.rs/crates/im wraps persistent structures in a standard vector interface, with instant clones (only copying a pointer to refcounted data) but slower assignments (which may copy data internally if other references exist), which is effectively equivalent to CoW. A persistent update is instead written as a clone followed by in-place assignment, and you can mutate in-place (which copies if there are other references to the data being modified). The nice part is that only IndexMut detaches the underlying data from other references.

I haven't used this crate yet, but if I have a need for persistent data structures I'd definitely try this since it's more accessible.

joshlemer · on April 2, 2022

I think a lot of the pain in Scala/Rust etc immutable data structures comes down to static typing. In Clojure for instance you can very ergonomically do deeply nested persistent updates.

    (update-in
     {:foo [{:bar ["a"] :baz ["b" "c" "d"]}]}
     [:foo 0 :baz 1]
     clojure.string/capitalize)

    => {:foo [{:bar ["a"], :baz ["b" "C" "d"]}]}

deterministic · on April 5, 2022

In Haskell you can you Lenses to easily manipulate sub-trees, individual values, or a selection of values in tree like data structures. It is very powerful.

fulafel · on April 2, 2022

In Clojure the default associative data structure is a persistent map and cloning it with 0 changed nodes is basically a no-op.

UncleMeat · on April 1, 2022

Persistent data structures exist and work very well but they still remain way slower than alternatives that don't support persistence. This is especially bad if you want the data structure to be shared across multiple threads. Just iterating over a tree now requires taking and releasing oodles of locks. Add in the GC overhead and you've got a meaningful problem.

Yes, you aren't constructing an entirely new 30k byte array on each mutation. But you are still adding a lot of overhead to your program.

dwohnitmok · on April 1, 2022

> Persistent data structures exist and work very well but they still remain way slower than alternatives that don't support persistence.

Depending on your definition of "way" this is true. A lot of persistent vector-like structures can have quite good performance, but if your runtime is dominated by iteration and random updating, you will still get a speed up by using raw arrays, i.e. guaranteed contiguous blocks of memory.

> This is especially bad if you want the data structure to be shared across multiple threads. Just iterating over a tree now requires taking and releasing oodles of locks.

This is not. The whole point of immutable data structures and concurrency is that iterating over them does not require any locks because the structure won't change! So concurrent reads require absolutely no new code and certainly no locks. For concurrent writes, the usual strategy is a single CAS-like operation at the top-level to swap out the old immutable structure for a new one (e.g. Clojure atoms, Java AtomicReferences, Haskell IORefs, etc.). But that's a single thing for the whole structure, not every level.

In general concurrent update in the functional programming world is handled by CAS not locks.

UncleMeat · on April 2, 2022

CAS still has a lock in actual memory.

dwohnitmok · on April 4, 2022

Not necessarily. Depends on the underlying hardware implementation (with different answers for e.g. x86 vs ARM) since software CAS usually is implemented as a single CPU instruction.

This is also stretching the definition of "lock" quite a bit because usually CAS is used to implement lock-free and wait-free concurrency.

But regardless at most, even in hardware, it's a single top-level lock, and sometimes not even that. Not one along every level of a data structure.

tomp · on April 1, 2022

> This is especially bad if you want the data structure to be shared across multiple threads. Just iterating over a tree now requires taking and releasing oodles of locks.

Um, what?

Persistent data structures don't require locks at all (because they're immutable), in fact using them in a multithreaded way is 100% safe and thus much easier than their mutable (and ordinarily more efficient) counterparts.

UncleMeat · on April 2, 2022

Refcounted gc means you need locking on refcounts or you can lose your tree during iteration.

tomp · on April 2, 2022

Can you provide a list of reference-counted functional languages?

memco · on April 1, 2022

I really enjoyed this presentation on how roc-lang is being designed to handle optimization using opportunistic in-place mutation: https://yewtu.be/watch?v=vzfy4EKwG_Y&t=1328.

karmakaze · on April 1, 2022

I was thinking along the same lines. Is an "atom" in Clojure like STM where the structure/reference is changed in a highly controlled manner?

Sinidir · on April 1, 2022

atoms are containers that require no coordination so its a simple Compare and swap. There are Refs that require transactions and change of multiple Refs is coordinated in these transactions. WIth that Clojure has an implementation of Software Transaction Memory.

nightski · on April 1, 2022

You should not approach problems in functional languages with imperative techniques. It's fitting a square peg in a round hole. The approaches & data structures aren't always the same.

In my experience with FP you start with more of a top down approach in contrast to the bottom up approach discussed in the blog post.

There are many good resources on this. I've referenced a few below. Of course there are many more.

[1] https://www.amazon.com/Pearls-Functional-Algorithm-Design-Ri...

[2] https://www.amazon.com/Purely-Functional-Data-Structures-Oka...

[3] https://www.amazon.com/Algorithm-Design-Haskell-Richard-Bird...

[4] https://www.amazon.com/Structure-Interpretation-Computer-Pro...

dehrmann · on April 1, 2022

> You should not approach problems in functional languages with imperative techniques. It's fitting a square peg in a round hole.

This might be part of the problem with FP. Every popular CPU architecture is imperative.

nightski · on April 1, 2022

I was more talking about how you approach solving problems, not the execution model. You can still get tight imperative loops in FP, it's just approached differently than in a traditionally imperative language.

Everything is a tradeoff. FP gives you a lot of flexibility in how you trade space and time of the execution of your program. It's also not a magic wand, just a really powerful tool.

chriswarbo · on April 1, 2022

I find this unconvincing. Electronic circuits are well described as a Cartesian Closed Category; and hence by lambda calculus (i.e. we can give semantics to functional programs by interpreting them as descriptions of circuits; as opposed to the "usual" interpretations of x86_64 machine code, or symbolic values, etc.)

Imperative systems are "fitting a square peg in a round hole" by introducing hacks like 'clock signals', 'pipeline flushes', etc.

hither_shores · on April 2, 2022

The category with linear circuits as its morphisms, labelled sets of electronic terminals as its objects, and laying circuits in parallel as the monoidal product (which is the usual notion of the category of circuits; perhaps you have something more sophisticated in mind, but if so you should specify that) is not cartesian closed.

One of the many ways to demonstrate this: all linear circuits have duals (voltage <-> current, series <-> parallel, and so on), but in a cartesian closed category only the terminal object is dualizable.

What you're probably thinking of is the fact that this category is compact closed. The corresponding internal language is going to be some sort of process calculus (I think the pi calculus is too strong, but not sure), not the lambda calculus.

imtringued · on April 1, 2022

No, that's the easy part. Computation with FP is very easy. The problems start the moment you have to do persistence and communication. Those aren't part of any pure functional paradigm.

TuringTest · on April 1, 2022

> Every popular CPU architecture is imperative.

That's arguable, given that every (deterministic) imperative computation can be expressed as a pure functional program. "Imperative" is more of a viewpoint that the developer is using to understand the behavior of the computer, rather than a property of the machine itself.

A pure functional analysis will describe a computation on that "imperative" architecture as an inmutable, monotonic sequence of all the states that the CPU goes through over time; or maybe, if you know enough PL theory, as a chain of rewritings of a program-long expression that is updated, one term at a time, when each expression is replaced by the result of a function as it consumes its input parameters.

And no viewpoint is inherently better than the other (except that the functional version can be analyzed in a natural way with mathematical tools, which is often difficult with the imperative description).

kortex · on April 1, 2022

Nope, unless you do some serious contortions and basically say that the universe is a lambda calculus that operates on each moment in time to produce a new view of reality.

Computers that do anything useful have all manner of I/O. Highly oversimplified in modern architecture, but you can think of I/O as a bunch of memory cells that are DMA'd by external forces. Literally the exact opposite of functional, IO registers are the epitome of global mutable state. Then you have interrupts and the CPU operates in response to these events.

The viewpoint of "computers are mutable state" is inherently better because of parsimony - it requires less degrees of indirection to explain what is happening. Just because you could shoehorn reality into a physical model, does not make it equally valid. You basically need to break it down to the Schrodinger equation level to have a purely functional model when even idealized gates are involved because propagation delay impacts things like how fast you can DMA.

TuringTest · on April 1, 2022

> Just because you could shoehorn reality into a physical model, does not make it equally valid.

Hey, physicists do it all the time. Having mathematical representations of how things work in the real world often turn out to give you complex devices that you wouldn't get by intuition alone. Thanks to doing it, we have Special Relativity and Schrodinger equations, that bring you advantages as GPS and insanely-high-speed networks.

There's nothing wrong with modelling interrupts and mutable state as static mathematical entities a.k.a. functions; and if you think it is, you don't know enough about functional programming - in fact, doing it allows for advanced tactics like automatic verification and very high scale generation of electronic devices.

Those less degrees of indirection may make it simple to build some kinds of programs; but they also introduce complexity in modelling the possible states of the system when you allow for arbitrary, unchecked side effects. So no, that representation is not inherently better - only most suited for some tasks than others; but the converse is also true for functional models of computation. If you think it is always better, it's because you're not taking all possible scenarios and utilities into account.

(Besides, as I already said elsewhere, "computers are mutable state" is not incompatible with functional representation; as mutable state can be represented with it). It seems that you're confusing mutable, wich is a property of computational devices, with 'imperative' which is a style of writing source code and defining relations between statements in your written program.

> Literally the exact opposite of functional, IO registers are the epitome of global mutable state.

Yup, you definitively don't know what functional programming is if you think you can't have global mutable state in it. In fact, every top-level Haskell program is defined as a global mutable IO monad...

whimsicalism · on April 1, 2022

Not really arguable, IMO.

Just because both lambda calculus and assembly are Turing complete does not mean that CPU architecture isn't clearly imperative.

TuringTest · on April 1, 2022

How do you define "imperative" in a way that precludes a functional description of your definition? I'd say that if you're able to do that, you'd have found a contradiction in Turing completeness and win you a Turing award.

Also, this: https://news.ycombinator.com/item?id=30883863

whimsicalism · on April 2, 2022

Just because every program in C is also expressable in Haskell does not mean that C is Haskell or that C is a functional programming language.

The same is true of CPUs, which at their base level execute instructions line-by-line to move and load things in and out of stateful registers as well as doing simple mathematics.

Not going to keep arguing, as I can see from your other comments that you are going to try to hold this line, and also your reasoning doesn't really make sense.

Everyone has heard of Turing completeness. It does not imply that all distinctions between the semantics of languages and the structure of hardware thus are collapsed. It means that you can write an equivalent program using different semantics or different hardware.

TuringTest · on April 2, 2022

> It does not imply that all distinctions between the semantics of languages and the structure of hardware thus are collapsed. It means that you can write an equivalent program using different semantics or different hardware.

No, it means that every computable program or device can be described through functional semantics, and therefore that very same program running on that same hardware can be described mathematically with them (not an 'equivalent' program, but the same program with the same runtime behaviour, just described in low level with a different format).

That's a core principle of computer science that predates computers themselves, and people arguing about the benefits of the imperative style tend to be unaware of this fact, making their arguments look weak.

dahfizz · on April 1, 2022

You're definitely stretching here. I could just as easily argue that "all of main memory" is the output of any function. Therefore every function is a pure function, and C is a function language. Using your definition, the concept of functional programming is meaningless.

Main memory is state. Registers and cache are state. The entire purpose of the CPU is to mutate state.

nightski · on April 1, 2022

No, I don't believe you could argue C is purely functional by framing memory as the output of the function because C doesn't treat it as such. Memory in C can be changed even if it is not listed in the inputs or outputs of the function.

He's not really stretching the definition, that's just the lens through which FP looks at the world.

Of course there is state. FP doesn't prohibit state, it just makes it explicit and referentially transparent. But this is in the semantics, not the execution model of the program.

dahfizz · on April 1, 2022

> Memory in C can be changed even if it is not listed in the inputs or outputs of the function.

If we are going to nitpick syntax, nothing about `mov eax ebx` lists main memory as output. In fact, mov doesn't even have an 'output'.

If you want to model mov as a function with the implicit output of "all of main memory", then you can do the same with any arbitrary C function. If you can't make that leap for C, then you can't make that leap for assembly.

nightski · on April 1, 2022

It's not syntax, it is semantics. There is a lot to programming language theory/design and this is all formalized. C does not have the semantics you are describing in the formal specifications (C99, etc..). Could they have developed a version of C with these semantics? Possibly. Rust was a step in that direction with it's borrow checker.

dahfizz · on April 1, 2022

Yes, and my whole point is that assembly does not have those semantics either. The computer is an imperative machine.

TuringTest · on April 1, 2022

The computer is not an imperative machine. It's a physical machine that you describe with an imperative representation in your mind. Can you tell the difference, or are you unaware of the map-territory distinction going on?

Functional programmers are simply choosing to use a different style of map, but we both represent the same reality.

dehrmann · on April 2, 2022

According to Wikipedia, "the hardware implementation of almost all computers is imperative."

https://en.wikipedia.org/wiki/Imperative_programming

TuringTest · on April 2, 2022

[Citation needed]

rramadass · on April 2, 2022

I think anybody who wants to understand Functional Programming should start with Lambda Calculus.

I highly recommend An Introduction to Functional Programming Through Lambda Calculus by Greg Michaelson.

You can then see how to express Functional Techniques in your language of choice (i.e. one you already know well eg; Functional Programming in C++ by Ivan Cukic) before learning a new Functional Language. This is to avoid the confusion in syntax/semantic expression of unfamiliar concepts which is a bit too much for a beginning student to handle simultaneously.

exdsq · on April 1, 2022

Thanks for these links, commenting to find them later.

Jtsummers · on April 1, 2022

If you select the timestamp of a comment (also when you reply to it) you can favorite it. This is much cleaner than leaving scattered "to find them later" comments around this site (in fact, I'm pretty sure it was added precisely to avoid those kinds of comments). You can also favorite submissions themselves. Favorites can be found in your profile.

bckr · on April 1, 2022

you can click the timestamp and just keep the URL somewhere, like your browser's bookmarks

exdsq · on April 1, 2022

Cool, didn’t know that! Cheers

Barrin92 · on April 1, 2022

Personally I never really understood the benefit of thinking in functional terms about things that are conceptually more naturally thought of as persistent objects.

For example when the author picks "parents" or "children" or in general maybe any entity of any kind, I don't understand what functional thinking gets me. Sure instead of manipulating mutable objects I can think of every entity being 'one thing' at 'one moment in time' and I do a gazillion state transitions that all return new things and I have never mutated anything, but not only is that bad from a performance standpoint, I never understood how that's clearer or easier, it just seems like a tortured metaphor.

Monads and other functional tools are being brought up in this thread as well but I always had the same problem. Sure, I can stuff my operations into a box, say it's the "state box" and then everything is neat but for what purpose? I still remember when I was in university I rewrote a tetris clone I had made into Haskell. It took a lot of time and weird error messages, but at the end of the day translating all my imperative or object like procedures into functional paradigms didn't make the program any easier to understand.

The places where I think functional programming is natural is for something like accounting, data processing or version control systems. Where I really do want to think of every state as its own, immutable little world, where keeping track of changes pays off, and so on. In a lot of domains it to me seems ill-suited.

valenterry · on April 1, 2022

Because it reduces the number of things you have to keep in your head and the uncertainty about what you don't know.

Using monads and a "state box", it's not enough to make a change to the context of the box, you also have to pass the box onto the next entity/process that wants to change it, or nothing will effectively change (other than burning cpu cycles).

This also means that it suddenly becomes easy to find out what things happened to the box content before, because all you have to do is check what the one who was passing the box to you was doing with it.

If your program is strictly linear than using state boxes is merely syntactic overhead. But as soon as that's not the case anymore, it suddenly becomes important.

I remember very well the time when I had to deal with bugs where I was operating on data that looked different than I expected and I had a hard time to figure out how it came to that.

But even worse are bugs from concurrency and also the "magic" and workarounds for dealing with it when "state boxes" are not available or known to the developer. Now you look at a line of code while knowing that stuff is running somewhere else, impacting what you are working on in a potentially uncontrolled manner.

I find that very unproductive.

zmgsabst · on April 1, 2022

I’ve encountered three benefits to functional programming:

- immutability and function application perspective “play nice” with distributed applications, both for distributing work and idempotence

- in a similar vein, the “everything persists, apply functions” perspective works nicely in data science for finance, where you need auditable calculations

- and similarly, but more broadly, if you’re implementing a workflow engine, then the functional perspective is a clear winner: you’re literally writing a program to manipulate and apply functions!

Friends have told me a fourth:

- composition and lenses are nice for UI

kortex · on April 1, 2022

You are absolutely right that there are domains that are just inherently more stateful and not easily leant to a functional style. What I've found so far on my functional journey is that it can be handy to lean into this aspect of "punishing" certain problems. It kind of forces you to think more explicitly about mutation and when you actually want to invoke it. Thus you end up writing a lot in terms of transactions and hypotheticals - "if I get a POST /users request, what SQL queries do I emit?"

You have execution engines (a web framework and sql engine) that encap a block of pure function. There's no side effects (save maybe logging) whatsover inside your mini-program.

FP patterns like monads let you compose blocks of execution so you can have bigger mini-programs that are more ergonomic to write.

SQL query builders are (I think) a kind of Monad. You can compose selections, filters, groups, etc without ever executing, building up a big "what if" into a transaction. No side effects until you execute, it's pure data in data out. Which means for example you can write true unit tests without mutating a "real" db by composing transactions together.

FP falls apart when you have to reason about "nuts and bolts" of actual computation models, IO, network, time, RNGs, DMA, gpus, large data arrays. FP acolytes say "just let the computer/compiler do it" eg write in FP and compile to optimized imperative machine code, but someone still has to write drivers at the end of the day.

mirceal · on April 1, 2022

immutability is amazing. you can make assumptions about the data that you have received, it's friendly to caching, it's friendly to crashes, it's very easy to reason about.

it's even more amazing when you enter a territory of multithreading/multi processor.

dahfizz · on April 1, 2022

But the real world is mutable. The authors example of a GUI program is a good one: the user wants to change the state of the program, and expects the new state to be reflected in the GUI. You can model this in fp, but it feels like I'm standing on my head. The user wants to modify a glyph, and the program should modify that glyph. It's the simplest way of modeling what is happening.

zelphirkalt · on April 1, 2022

At some point you need to break the paradigm to do anything useful, be it some IO, or whatever. Working in a FP way for as long as appropriate will make large parts of your program better testable and avoid large amounts of unwelcome surprises. If at some point you got some result, which you want to let affect the real world, then you can still limit mutation to that place, where you got the result. I would avoid that for as long as possible, to have larger parts of my program be easy to reason about.

GUI has been one of the strongholds of mutation-encouraging paradigms though. It should be said. I would use it as an example, where one probably might need mutation for performance reason and for the reason, that I find it a bit too overheady to create a new widget for every small change. I don't know enough about how some languages try to solve this and stay declarative or functional, to comment on that.

mirceal · on April 1, 2022

this is me being cheeky, but how do you know that the real world is mutable?

if you believe in super-determinism and that everything progresses based on predetermined laws you can take the view that space+time already exists in all its past and future and that nothing happens.

or, quantum mechanics in the multiple world interpretations. you can say that the state of the world literally splits whenever there is an event.

i guess the metapoint here is that just because something seems mutable does not mean it’s really mutable and that mutable and immutable only make sense at certain levels of matter organization (are atoms mutable? what about photons? what about quarks?)

adamkl · on April 2, 2022

Late to the party here, but what you propose is not being cheeky.

Rich Hickey (Clojure creator) himself has a similar, interesting, and compelling take on time and mutability [0]. This perspective is the underlying basis of Datomic.

[0] - https://youtu.be/ScEPu1cs4l0

mirceal · on April 2, 2022

i found that people are more open to this idea if they believe i am kidding.

joshlemer · on April 2, 2022

Rich Hickey made a great talk on this in 2009 https://youtu.be/ScEPu1cs4l0

An other way to think about it is that the programs we write do not literally simulate entities (users, other domain objects like books, students, professors, classes, etc). Our software simulates the record keeping devices that store /protect information regarding these entities. We are most of the time writing information processing systems, not simulation systems and information accumulates, rather than updates in places. Like, e.g. when there's a new president elected, we don't all go out with our erasers and remove all references to change sentences from "The President is Donald Trump" to say "The President is Joe Biden", instead we accumulate new facts without remembering the old.

joshlemer · on April 2, 2022

*without forgetting the old.

Traubenfuchs · on April 2, 2022

> Personally I never really understood the benefit of thinking in functional terms about things that are conceptually more naturally thought of as persistent objects.

Like the vast majority of software developers world wide, who just want to get things done and/or earn money.

The simple truth is: A minority prefers purely functional programming. The vast majority prefers procedural programming with functional elements.

For some reason, those who prefer pure functional programming often try to convince everyone else their way is the best way in an obnoxious way.

colanderman · on April 1, 2022

Relational systems (e.g. SQL) solve the parent-child problem by making references explicit -- instead, keys are used to reference children in a well-known collection.

That is, if you're trying to emulate sharing of mutable structures (i.e. pointers) in a functional language -- that is, reflecting changes to children across multiple parents -- model the pointers explicitly (as integer keys) and put the shared structures in a map, which is necessary to interpret a parent in full.

This seems to be what the article is getting at with its "mediator" concept, albeit in very abstract terms. The author seems to find this distasteful, wanting to "hide" these relationships, but I disagree -- the difference between value and reference semantics is very important; making the difference explicit is a good thing.

gavinray · on April 1, 2022

The Koka research language has something it calls "FBIP" (Functional, but in-place)

Where you write functional/immutable methods and the compiler deterministically can convert it to in-place mutating operations for performance:

https://koka-lang.github.io/koka/doc/book.html#sec-fbip

  "With Perceus reuse analysis we can write algorithms that dynamically adapt to use in-place mutation when possible (and use copying when used persistently). Importantly, you can rely on this optimization happening, e.g. see the match patterns and pair them to same-sized constructors in each branch."

  "This style of programming leads to a new paradigm that we call FBIP: “functional but in place”. Just like tail-call optimization lets us describe loops in terms of regular function calls, reuse analysis lets us describe in-place mutating imperative algorithms in a purely functional way (and get persistence as well)."

pbiggar · on April 1, 2022

I've been coding in functional languages for 10 years, and they have problems, but these are not them.

The first problem is solved in every functional language I've used (clojure, F#, OCaml) by immutable data structures (aka purely functional data structures, which is also the name of the de facto book on the topic, which is excellent but also you don't need to know any of this to use them. [1]). I'm surprised to see this critique without a mention of this.

The second problem seems to be due to trying to do OO in FP. The thing is you don't have a glyph, you have a whole system which contains glyphs. So you can't do an update on a glyph because it won't update the system.

This is a common frustration when you're used to OO and can be really annoying in FP but it's also part of the design modelling that FP forces you to do: the problem was that he was modeling in the small and not the large.

The solution is not the 1, 2, or 3 from the article (1 and 2 are bad, 3 is not as bad but not FP), but is straightforward. When you want to update a glyph, you use a function like `updateGlyph(pathToGlyph, functionToExecuteOnGlyph)`. That's at the top-level, and it allows you compose your lower level `updateGlyph` functions.

[1] https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.64...

zodiac · on April 1, 2022

I love the Purely Functional Data Structures book (so many cool data structures and ideas!) but does really "solve" it? For e.g. I double checked and the PFDS for an array (the author's example) would be a skew-binary random access list with O(log N) cost for random access, whereas an "imperative random access array" is O(1). Of course, the skew-binary random access list has persistence (you get to keep references to old versions of the data if you wish), while the "imperative random access array" does not, but the author's use-case sounds like he doesn't need persistence. So there is still some gap between this particular approach and the "non-functional" one.

lostcolony · on April 1, 2022

Yes and no.

Ultimately, is the Big O scale going to be larger for certain cases? Yes. Is it going to matter? Maybe. Maybe not. The point the original comment is making is that the author seems unaware these data structures even exist; the fact he was having O(n) operations and found them to be too slow tells us nothing about the reasonableness of the O(logn) option for his use case. He mentioned using a 30k byte array; that's the difference between 30k operations, and 15. Cutting it from 15 to 1 is probably not going to be that meaningful; certainly not compared with the first jump.

zodiac · on April 1, 2022

That's a good point.

I wonder if the problem is that efficient persistent arrays aren't in racket's standard library (there are some third-party libraries implementing them though). Since racket isn't a "pure FP" language, they include cons-arrays and mutable vectors, and I imagine they felt that there wasn't a need to include efficient persistent arrays in addition to those.

lostcolony · on April 1, 2022

Sort of - consider when Lisp originated. The heritage/age comes from a time before the research in persistent data structures happened, or even the issues with mutability were fully known. Because of that, idiomatic Lisp tends to be mutable, and no one is writing a new Lisp who isn't already familiar with another one. That's not to say immutability isn't worthwhile there, or that no Lisp standard libraries support it, but mutability tends to be the default assumption from what I've seen.

phyrex · on April 1, 2022

I don’t know about that; clojure was specifically created with persistent data structures in mind, and it’s definitely as much a lisp as racket is

lostcolony · on April 1, 2022

I inserted 'tends' everywhere intentionally.

pbiggar · on April 1, 2022

In addition, the O(logN) only comes into play if you are truly accessing only one item. If you're accessing more you are either reading the whole list (O(N) in total, O(1) per item), or a sublist (in which case it's O(logN) once to get/update the k item sublist and O(k) to operate on it).

If you are accessing just one item, the O(logN) cost is likely dominated by some other cost (eg the user clicking a button, making a web request, etc).

TuringTest · on April 1, 2022

The thing is, functional and imperative programming are mathematically equivalent - to the point that you can express imperative code in a pure functional language, if that's what you really need. If you want a O(1) random access in a functional language, you can formally define it; the problem with this is that you may end up embedding a whole computer in your program to do it, if your program heavily depends on side effects like those in the article. So, if you really need O(1) mutability for a large part of your program, maybe functional programming is not the tool for the job? Being too tied to a paradigm prevents you from finding the best tool at hand.

Functional programs are best used when you want referential transparency[1] as your problem domain benefits from it. If you want to represent mutable state in a functional, transparent way you use a subset of temporal logic[2] - each function has an additional state of the world input parameter, and return an additional updated state of the world value with all the mutated objects in it. The State monad is a version of this with increased legibility (as long as you don't combine many different sources of mutation).

If your needs for mutation are limited to a few objects in each module, you represent those as streams (the generalized version of futures which are able to be repeatedly updated with new values); a stream represents the history of all values that a mutable variable takes during the program execution. This is the basis of the functional reactive paradigm[3] on which modern web frameworks like React and Vue are converging, as it happens that modeling asynchronous state mutations of multiple loosely connected elements is easier in the functional paradigm than in the imperative.

[1] https://en.wikipedia.org/wiki/Referential_transparency

[2] https://en.wikipedia.org/wiki/Temporal_logic

[3] https://en.wikipedia.org/wiki/Functional_reactive_programmin...

zodiac · on April 1, 2022

If you used the State monad for the author's interpreter example, and you did it in Racket (your "state of the world" is a cons-list), what would be the run-time of the resulting interpreter? Similar question for streams (although I'm not that familiar with streams, so I don't know how they would apply here, or if they can be implemented without changes to the base language)

TuringTest · on April 1, 2022

A stream is in essence the same as the Observer pattern; every function that depends on the values of a mutable variable must use accessor methods to extract the most recent value when they need it, or pass it 'hook' callback functions that represent the next steps to take on new values.

So, in principle, it should have the same performance as the 'Children can issue notification events' method in the article, since it's using what in essence is the same structure (just with a much nicer syntax, better adapted to functional programs).

The advantage of using functional streams instead of OOP observable callbacks is that it tends to be easier to write the program's large structure as compositions of functions over streams, i.e. it is easy to build modular functions without the bugs introduced by side effects.

I learned this style of programming through the online 'Paradigms'[1] course by Peter Van Roy, which you can take for free if you don't want credits. It uses the multi-paradigm language Oz to explain the essence of each paradigm and the differences between them. The course distills the primitives of different programming languages in terms of how they see state (see Van Roy's diagram in [2]), and how deciding to incorporate and combine some of those primitives and not others gives rise to the different programming styles. I found it enlightening, in the way that programmers' koans refer to.

[1] https://www.edx.org/es/course/paradigms-of-computer-programm...

[2] https://en.wikipedia.org/wiki/Programming_paradigm#Overview

Jtsummers · on April 1, 2022

There are cost tradeoffs involved so you have to determine based on your application whether they solve the problem or not.

In a case like this you have several options:

1. Stick with mutation, reuse the original array and don't worry about being purely functional. This will probably be the optimal solution (vice explicitly creating a copy and mutating just the relevant bit). However, if you need the original for some reason then this is a non-solution.

2. Imitate the purely functional style with explicit copies, and accept the massive memory and performance costs that carries. You get to keep the cheaper read cost.

3. Use a better data structure, but losing (in particular) read performance, but largely mitigating the creation and "mutation" costs. (in quotes because the original is, from the perspective of the user, unchanged.)

Assuming you're in a language that doesn't provide (3) for free, and that (1) is a non-solution, you need to profile your system to decide between which of (2) and (3) is appropriate. With a large enough data structure or frequent enough updates, (3) is probably what you will want. If your data structure is small or updates are rare, then you can stick with (2). And if you have no idea which one will occur in the real-world (some users tend towards (2) being better and others towards (3)) you have to make an explicit choice on which one to provide better performance for.

jonnycat · on April 1, 2022

The a-ha moment for me was that functional programming - in most real world applications - is less about "no state" and "no side effects", and more about "properly managed state" and "properly managed side effects". This is one of the things I really like about Elixir/Erlang/OTP - of course your application needs to manage state, but it's well contained and well managed, avoiding a lot of common pitfalls.

z5h · on April 1, 2022

Purely Functional Datastructures (for example) introduces efficient operations on “large” objects. Next, as Out of the Tarpit suggested in 2006, use a relational model (developed in 1969). Hardly seems vexing. Yes, maybe not as efficient as other ways to compute.

dboreham · on April 1, 2022

Interesting thread. Although most of the comments are "you're holding it wrong", I think the author brings up good points, which are roughly along the lines of : the FP language tutorials don't tell you how to write real programs. Now, granted the problems cited will show up at some level of scale in non-FP programming -- you can't just rummage around in some huge nested data structure inside a C++ server application for example because you'll end up holding dangling pointers among other things. However, perhaps you can get much further with non-FP languages by following the basic tutorials.

Also interesting to note that modern UI application frameworks such as React/Redux don't mutate state in-place. They tend to generate some kind of mutation object that is "played" against the in-memory stored application state. Probably similar to the FP-kosher solutions suggested here.

isaacimagine · on April 1, 2022

There are a number of patterns that have arisen over the years to deal with these problems. For the large-object problem, look into persistent data structures or the 'Functional but in Place' paradigm (compiler optimizes modification on a copied unshared array to a mutation).

For the parent-child problem, typed handles, continuation-based state propagation, and reifying state updates for a declarative interpreter may be used.

Monads and effects are common in strongly typed functional languages, but have less bearing on lisp, which usually embraces dynamism to some extent.

PaulHoule · on April 1, 2022

The "big object" problem is solved by the persistent data structures in Clojure

https://clojure.org/reference/data_structures

These are a little slower than conventional data structures (e.g. Java Collections) but are much more efficient than copying the whole structure every time you want to "change" it. The garbage collector picks up behind you so you're not wasting unbounded space on unreachable versions of the data structures.

CornCobs · on April 1, 2022

I think that copying an entire array just to modify one bit doesn't really demonstrate the flaw of FP as much as it is just bad programming. The problem isn't really with the paradigm, per se, rather it is the understanding of each problem domain.

I hate to be pedantic, and I know this sounds like a no true scotsman argument, but I really feel FP discussions always devolve into this because people have so many different levels of understanding of what FP is supposed to mean.

I think that many people who espouse FP like the "aesthetic" of FP. They like that map/filter/reduce look "cleaner" than nesting ifs, breaks and continues in loops. They love how pattern matching on Discriminated Unions is so much clearer than imperative switch statements or OOP-style polymorphism, where to see the behavior of a single function you need to look through 5 different files. This is the kind of evangelising that stops at "cute little values" and toy examples.

But (subjective opinion here) I think what the more experienced FP proponents are trying to promote are in fact just sane, good programming practices made explicit, but they call it FP too (thus "no true scotsman").

For example, for the OP's bf problem, is copying a 30k byte array for every operation a good idea? Of course not! Is isolating the "stateful" operations (operations that modify this byte array) to small sections of code, possibly even gathering all the modifications that need to be made and only applying them after the function returns, rather than littering stateful operations throughout the code, so that it becomes much easier to figure out when and why some state changes, a good, "FP" idea? Hell yeah! Is changing your data structure to a persistent data structure so you can retain your "op: Buffer -> Buffer" mental model while preserving more performance? Possibly so! (Though I think for such a problem the mutable buffer is the obviously correct data structure; a persistent DS doesn't give much benefit since each operation takes the whole array and returns the whole array before the next operation is executed)

In conclusion I think focusing on the principles of good programming are much more important than focusing on the external appearances of each paradigm. You can architect clean, loosely coupled components with little shared state in OOP as well as in FP. We ought to focus on teaching that.

mirceal · on April 1, 2022

strongly agree with you. it should be about the principle and not about a poor understanding and application of the said principle. if the principle is: data should be immutable, data should be immutable. that does not mean copy things over until we throw our hands in the air and say "performance". you also don't mix the semantics of a function call with the underlying representation of the data you are using (if you are using ADT)

parksy · on April 1, 2022

Tangential but what is the most successful and most complex example of a fully FP-based application where I can view the source-code today? I feel like I still don't quite grasp the essence of the conflict between FP and whatever it seems to be opposing. I also feel like I use what a lot of FP people claim as FP in my day-to-day.

I think if I can see some code beyond the old map/reduce toy examples it might click. Not wanting to be confrontational, and I know that this forum isn't here specifically to educate me personally, I just genuinely have had problems understanding what FP is "supposed" to be that any good programmer doesn't already do when possible, regardless of language or whatever.

I'm sure it will click and I'll look dumb but yeah. Isn't loose coupling and strong coherence kind of covering this already? Is it just down to doing map/reduce/filter to avoid mutating some variable and instead get a new set of results? Is mutation the charm?

I genuinely really feel I'm missing out on something important here because I have older views on programming, I've struggled to get a good explanation. Do I have to go back to school again? I'm not against that but yeah. I'm sorry if this sounds ignorant, but I kind of am.

deterministic · on April 5, 2022

I used to be in your shoes. What helped me was learning Haskell and watching all YouTube videos by the Clojure creator (what’s his name again?) It was hard for me to get it (old dog learning new tricks). But I persisted and suddenly the light bulb went off and it all became super simple and easy. It was a hard journey but worth it. I am a much better programmer today, in any language, because of it.

rramadass · on April 2, 2022

https://news.ycombinator.com/item?id=30878534#30888405

skybrian · on April 1, 2022

This article explains the problems pretty well, but they are old problems and functional programmers have various solutions that the author seems unaware of.

I'm not a Haskell programmer but I do know about monads and have heard of lenses.

charcircuit · on April 1, 2022

Lenses don't help. They just make copying and updating large data structures cleaner to write. They don't solve the performance problem of copying everything.

WkndTriathlete · on April 1, 2022

Incorrect. Optics implementations can ensure updating a node in a large balanced tree is O(log n) instead of O(n), trivially, and I'm sure there are sufficiently clever collections with optics that would enable much better performance than the trivial optimizations I can think of.

The same holds true for large structure updates.

charcircuit · on April 3, 2022

Not all data structures are a large balanced tree and a O(1) update is going to be faster than O(log n) updates. There is still going to be a significant performance hit from having to recreate a whole node instead of just updating a single value. While yes you can say large data structures are like a big node of a tree, having to copy all of the different records from the old one to a new one is going to be considerably slower than just modifying a single record.

It does help, but it's still going to be significantly slower than just modifying a single record in a data structure. It is a still a problem that will be slowing down your program.

bts · on April 1, 2022

In the Haskell world, folks have solutions to both of these problems:

The "large-object problem" can be tackled in a principled fashion using strict state threads (aka the ST monad: https://wiki.haskell.org/Monad/ST) or using vanilla mutable (IORef) references.

The "parent-child problem" is well-addressed by lenses, also known as functional references. They are basically composable getters and setters that allow you to read or update deep into nested structures.

SinParadise · on April 1, 2022

I don't fully understand how parent-child problem is addressed by lenses. How does lens helps to address the problem that a reference update is akin to changes an external state?

My understanding is that lens helps to address the large-object problem.

marcosdumay · on April 1, 2022

You mean, Haskell developers ditch the functional idiom and use a more imperative one when they need it for performance? And yes, they do, and it works nicely. But it's not solving any problem with the functional paradigm.

Anyway, none of them is a problem with the semantics of FP. They are a problem of insufficient optimization. So they will probably get solved at some point.

travisathougies · on April 1, 2022

ST is not 'imperative'. The monadic interface can look imperative sure, but then you can also use Applicative and friends and that's back to being functional. Of course, I prefer to think of what you call 'imperative' programming as function composition where >>= is the reverse of the 'normal' composition operator. Indeed, sequential imperative commands and function composition are isomorphic.

marcosdumay · on April 1, 2022

Hum... You declare an state and go modifying it with an action after another. Looks quite imperative to me. The fact that it returns an interpreter instead of values is meaningless if it doesn't change the way you think about the program.

If you personally prefers to read it as the declaration of a program that given an state modifies it step by step, you can claim it's functional. But you are alone on that, the language even has specialized syntax to push people into reading the former, and anything minimally complex all but requires it.

TuringTest · on April 1, 2022

travisathougies is not alone on that. Every one of us functional programmers worth their salt will consider a sequence of transformations from one state to the next as a functional program, as long as there are no side effects outside the current function that may transform state in ways that aren't declared in the types of the input and return values (which would break the referential transparency).

Functional Reactive programming (i.e. combining functions over streams of mutating objects) is essentially that, and it's a pure functional paradigm, quite popular in front-end web development (and I hear it's even creeping its way into the back-end).

mrkeen · on April 1, 2022

The benefit is same-input -> same-output.

marcosdumay · on April 1, 2022

Ok, let's get specific.

Here is an STM function: orElse :: STM a -> STM a -> STM a

It returns the first argument if there isn't any synchronicity problem, or the second if some other execution line conflicts with the first.

What exactly do you mean by "same input -> same output" and how exactly does a programmer thinks about that function on this way when composing it into an STM interpreter?

travisathougies · on April 1, 2022

I think you should give the guy above the benefit of the doubt and restrict yourself to ST. IO and STM are different beasts, and still they're not imperative. The issue with IO and STM is global state. Mutation of global state can be done imperatively, as many imperative languages do. It can also be done functionally using monads to hide the affine types in a nice wrapper class in the absence of proper affine types or using proper affine type systems. Haskell choose the former. Other languages, like clean, choose the latter.

Same input -> same output is a nice property of referential integrity, which is a common thing we have in functional languages, but it is actually not due to the functional nature. Imperative languages can trivially be made to have this property by disallowing mutation of variables not in the immediate scope and restrict IO side effects (and maybe a few other straightforwards restrictions depending on the language in question).

With that out of the way. Same input -> same output holds for `orElse :: STM a -> STM a -> STM a`. However, you really have to expand out the types.

Here is the definition of STM (from https://hackage.haskell.org/package/base-4.15.0.0/docs/GHC-C...):

> newtype STM = STM (State# RealWorld -> (# State# RealWorld, a #))

So, `orElse` should behave the same as a function with this type:

> (State# RealWorld -> (# State# RealWorld, a #)) -> (State# RealWorld -> (# State# RealWorld, a #)) -> State# RealWorld -> (# State# RealWorld, a #)

Now, the definition of 'same input -> same output' is clear. Assuming the global state (State# RealWorld) is the same, then the output state will be the same. Actually, this is not a property of all languages, as most languages allow arbitrary side effects in any code. Haskell does not. This is the key property that makes STM easy in Haskell and difficult in other languages. Note that the 'State# RealWorld' is always handled linearly, so there is no possible way the same 'value' could be passed in twice. Again, hiding this behind a monad enforces that this value is handled linearly, since Haskell doesn't have linear types (or at least didn't when STM, ST, and IO were written).

mrkeen · on April 1, 2022

I was referring to the ST monad which bts and travisathougies were talking about, not STM.

I mean that when you run ST code twice with the same input, you will get the same output. E.g., if I write (runST f) + (runST f), that's the same as 2 * (runST f), etc.

As for thinking, it means I don't use step-through debuggers ever, because I don't need the whole program to be in a certain state to watch what happens. I just grab the function, grab the input, and run it in isolation.

marcosdumay · on April 1, 2022

Ouch, I read STM every time I looked at it.

None of what I said applies to ST.

blablablerg · on April 1, 2022

Yep it it is called declarative.

danidiaz · on April 1, 2022

> Consistent with my loyalty to the functional-programming idiom, I designed my initial bf implementation so that each instruction would copy the existing bytes into a new array, change the byte of interest, return this new byte array, and abandon the old array (which would eventually be garbage-collected).

This would be a fun task to attempt with the new linear features of Haskell https://hackage.haskell.org/package/linear-base-0.1.0/docs/D...

galaxyLogic · on April 1, 2022

I believe mutable data-structures per se are not a bad thing. What is bad is Unobservable State Mutation.

When you assign a field into a data-structure it is bad because later on you will see the changed data-structure and try to use it but it has a wrong value which causes an error in your program - maybe not a crash just an error in the output. And then it's really hard to say what caused that error because you have no visibility into who or where did that erroneous assignment happen.

If instead you could forbid direct assignments and could write a "watcher" (a.k.a "setter") for all data-modifications, you could observe how your program-state evolves during its execution, and you could write assertions that prevent wrong type of data-manipulation from happening.

One such language is Smalltalk. You can only modify data by sending a "message" to an object to be modified, and the object can only respond by executing one of its methods. You can then write assertions in those methods to halt in debugger or write to console if you see the program write a specific value or type of value into your data-structure.

Even on the rather primitive level of Arrays you could easily modify their #at:put: -method in Smalltalk and halt or log the stack-trace if somebody tries to store the number 42 into the array. :-)

In the end the problem is understanding what your program does in terms of its state, how the state evolves. Why? Because: Modifying state in essence modifies your program. And to understand your program you must understand all evolving versions of it. Like, if you want to understand a 'caterpillar' you must also understand its next mutated state 'butterfly'.

FP makes it easier to understand the state-evolution because new state is just the result of some calculation which is easy (?) to understand by understanding each "pure" function in your program and reasoning what their combined result is.

But if you can understand the state-evolution of your program by imperative means that is fine too. That is why many suggest using a database to keep the state. A database is mutable by definition. But it makes it easy to observe how your state evolves by observing the UPDATE and DELETE and INSERT calls made into your database.

throwaway81523 · on April 2, 2022

Article is overall unpersuasive. The first part is silly: copying a whole array instead of using a functional data structure such as a red-black tree is just asking for a huge slowdown. The second part doesn't give enough detail to really make its case. Author's solution was a big tree with mutable nodes, but there's a cautionary tale in an old article[0] by Ramsey and Dias about using a zipper[1] as a functional representation for the control node graph of a C-- compiler.

[0] https://doi.org/10.1016/j.entcs.2005.11.042

[1] https://en.wikipedia.org/wiki/Zipper_(data_structure)

dkarl · on April 1, 2022

> Indeed, Perlis’s quip gestures toward the ultimate tension within all programming languages: how to connect the gossamer realm of human ideas to the grubby reality of a computing machine. Arguably the whole point of programming languages is to insulate us from those details, and allow us to express ourselves naturally. But the farther we drift from reality of the machine, the less efficiently we can use its resources.

I would call it challenge, rather than tension, and I don't think expressing ourselves naturally is the only goal. We also want to express a solution that is tractable, that we can understand and have confidence in. "Natural" imperative programming can become intractable when mutating state; the goal of functional programming is to find a different way that is more tractable for humans, without worrying about if it feels "natural." If there's one thing programming (and mathematics!) has shown us, it's that anything we can do will eventually feel natural if we do it enough.

(We produced multiple generations of programmers for whom C-style for loops were the most natural thing in the world, almost the benchmark for what felt natural in a programming language. The "map" function was weird-ass shit to them. I know because the experienced programmers I worked with called it weird-ass shit when I discovered it in Python as a wee junior in 2000 and wondered why Java and C++ didn't have it.)

I don't think functional programmers consistently write code that is more tractable than imperative code, not yet. (I think some of them aren't trying, honestly.) But I think functional programming languages will evolve to better support that goal. It has only been in the last 10-20 years that the majority of working software engineers have fully recognized the difficulty of managing shared mutable state, generating widespread demand for functional programming languages for practical applications like web services, data processing, user interfaces, and so on. The trade-offs for those applications are different from the trade-offs for academic research, and the trade-offs for most industry programmers are different from the trade-offs for researchers who are immersed in the mathematical foundations. The academics have shaped functional languages for decades, but working software engineers will have more and more influence in the future.

mrkeen · on April 1, 2022

Agree with most, especially that C-style for-loops used to be the most natural thing.

I'm more pessimistic about:

> It has only been in the last 10-20 years that the majority of working software engineers have fully recognized the difficulty of managing shared mutable state

I suspect that the number of practicing programmers who don't worry about shared mutable state is growing much faster than the number who do worry. To quote Uncle Bob: "the industry as a whole will remain dominated by novices, and exist in a state of perpetual immaturity"

evilotto · on April 1, 2022

I rather like tcl's solution: copy on write, with reference counting. Values are immutable, so if you to alter a value in place, it gets copied; but iff nothing else is pointing to the same value it can get modified in place because the end result is indistinguishable. This does lead to the occasional need to explictly unshare things for performance[1], at least until the compiler gets smart enough to prove that the value is unshared.

[1] https://wiki.tcl-lang.org/page/K#c2a6014c2d129837889d8a8000d...

prophesi · on April 1, 2022

Glad to see Cunningham’s Law working its magic. The article was a good read, and the HN comments section has a lot of good alternative solutions to the proposed problems.

TuringTest · on April 1, 2022

For anyone interested in programming paradigms, I strongly recommend the online 'Paradigms'[1] course by Peter Van Roy, which you can take for free if you don't want to get certified. It uses the multi-paradigm language Oz to explain the essence of each paradigm and the differences between them. (Learning that new language makes sense in this case, as it prevents you from being distracted by the idioms of languages you already know).

The course distills the primitives of different programming languages in terms of how they see state (see Van Roy's diagram in [2]), and how deciding to incorporate and combine some of those primitives and not others gives rise to the different programming styles. I found it enlightening, in the way that programmers' koans refer to.

[1] https://www.edx.org/es/course/paradigms-of-computer-programm...

[2] https://en.wikipedia.org/wiki/Programming_paradigm#Overview

(This is a repost of a comment I made below, so that readers who don't read the whole thread can still see it).

nunez · on April 1, 2022

> Did this implementation work? Yes. Did it run fast? No. It was glacial. After noodling with the code for a while, I had my galaxy-brain moment: maybe the incessant duplication and deletion of 30K data arrays had something to do with it?

I thought OCaml handled this problem with copy-on-write and/or fancy pointer math. Is there more to it?

hurril · on April 1, 2022

Making a huge array of values to in-place edit by copy is just... drinking water with the glass upside down. Why would you do that?

Either do a nice imperative for-loop as that is an excellent and imperative way of solving the problem (which I don't know what it is.)

Or solve it in any of the normal functional ways. Pick the style that you and your team likes.

twic · on April 1, 2022

Another fun one is the multiple index problem:

http://pepijndevos.nl/2012/08/12/the-multiple-index-problem....

ljloisnflwef · on April 1, 2022

Other options:

* Use linear / affine / session types / move semantics

* Describe an interface, then weave side effects through it, such as the State Monad.

* Use lenses

mirceal · on April 1, 2022

> To be clear, there’s nothing inherently superior about functional programming. (Despite the fact that these self-contained functions are also known as pure functions.) It is not the way computers work at a hardware level.

Hah. Weird way of approaching it but okay. Here's the thing: it is exactly how computers work at hardware level. If you look at an electronic circuit you and a program written in a functional way, the equivalence is almost 1:1. The circuits are self contained and you compose them to generate a new output from the inputs that are fed in.

dahfizz · on April 1, 2022

That works for simple circuits. Actual computers have oodles of state - we usually call it "main memory". Assembly is a language that can only do one thing: mutate this state.

mrkeen · on April 1, 2022

But it's a pain to write out assembly, so we go via LLVM and SSA.

In SSA, rather than modelling your program using 16 mutable registers (imperative folk love talking about "the real world" or "real computer"), we use an infinite number of immutable registers.

Infinite registers won't fit in a computer (we only have 16), and not being able to overwrite registers must kill performance right?

dahfizz · on April 1, 2022

I'm not sure what your point is. Yes, it is possible to build "functional" layers of abstraction on top of real hardware. The hardware itself is not "functional", that's why you need layers of abstraction.

deterministic · on April 5, 2022

The hardware is also not C style imperative. C/Haskell/Clojure/… are all abstractions that gets mapped to a very complicated extremely parallel hardware state machine. None of the main stream programming languages maps to real hardware. Which is a good thing! We want to use an abstraction that helps us getting the job done faster/better. It does not matter how much the abstraction you use maps to real hardware. As long as it gets the job done who cares?

mrkeen · on April 1, 2022

> I'm not sure what your point is.

I'm pointing out that this is a terrible argument and needs to die: "It is not the way computers work at a hardware level"

> it is possible to build "functional" layers of abstraction on top of real hardware.

It's not just possible; I don't see any alternatives gaining traction anytime soon.

mirceal · on April 1, 2022

that’s the point. more complex circuits are made from simple circuits. principle still applies. if you think about it, most problems with writing code stem from state management.

dahfizz · on April 1, 2022

> more complex circuits are made from simple circuits. principle still applies.

And humans are made out of carbon, but you are never going to understand human emotion by studying carbon atoms really hard.

Assembly is the interface to the hardware. Assembly is imperitive. The abstraction that assembly exposes is a machine with global, mutable state. That's the way the hardware works.

mirceal · on April 2, 2022

> And humans are made out of carbon, but you are never going to understand human emotion by studying carbon atoms really hard.

here is the problem with that line of reasoning: human emotion is difficult to define, highly subjective. it does not mean anything unless you are human and try to understand the emotions.

also, you probably need to focus more on K and Na ions and electrons if you are going down the emotional rabbit hole. Carbon is interesting, but apart from providing the skeleton for various things like dna, proteins etc i think it’s not where you’d start

sfvisser · on April 1, 2022

> Functional programming, however, suggests we should create and destroy these structures willy-nilly.

It really saddens me that some people completely miss the point of what FP is about.

How did this happen?

mrkeen · on April 1, 2022

Two reasons:

1) Uncle Bob's "perpetual state of immaturity" [1]

2) Non-FP languages and "hybrid" languages do a bad enough job of FP to stop people from seeking out more. E.g.

FP lang: Doesn't have null. Uses Maybe in the rare cases its useful. The benefit is not having null. Imp lang: Keeps null. Adds Optional so now there's two ways not to have a value.

FP lang: Functions are lambdas are first class. Imp lang: Separate Function object. Lambdas can't throw checked exceptions or update variables in the method's scope.

FP lang: Understands when mutation can and cannot happen. Can share data and fuse code efficiently. Imp lang: Cannot assume underlying data doesn't change. Programmer needs to make defensive copies.

FP lang: map (+1) [1,2,3] Imp lang: Arrays.asList(1,2,3).stream().map(x -> x + 1).collect(Collectors.toList())

If I grew up with the kind of FP that made it into mainstream languages I'd probably be against it too.

[1] https://blog.cleancoder.com/uncle-bob/2014/06/20/MyLawn.html

zodiac · on April 1, 2022

I think the author somewhat explained why FP "suggests" that. In his explanation, it's because when you read tutorials that operate on "numbers and strings and simple lists" (as the author puts it), you do create and destroy those structures willy-nilly.

So when you operate on a large array (as in his example), a natural approach is to also create and destroy it.

Of course after you gain experience you learn not to do that, and techniques that let you not do that, but I don't think he is missing the point.

j-pb · on April 1, 2022

I found that a fast immutable in memory database like Datascript solves both problems quite nicely.

ARandomerDude · on April 1, 2022

Came here to say this. The author wasn't actually describing problems with functional programming, he was describing his troubles due to naive implementations.

Naive implementations will always, by definition, not be great. But instead of spending the time, energy, and money on crafting a custom solution each time, use off-the-shelf libraries that do the job. DataScript, Storm, etc. are great for this.

rini17 · on April 1, 2022

Common Lisp solves the big structure problem using "setf" form. It also generally has both destructive/inplace and nondestructive structure manipulation versions of practically all functions in standard library.

It solves parent-child problem by having symbols as built-in data type. They can be used exactly as the "mutable mediator" describes. And beyond, such as you can generate a code that uses these symbols and have it natively compiled - at runtime.

pmarreck · on April 4, 2022

I'm confused; has this person actually never heard of "purely functional data structures"?

zelphirkalt · on April 1, 2022

I am surprised, that there is no mention of concurrency or running things in parallel in the article. That is one of the areas, where you profit the most from avoiding mutating state. Was there any parallelization effort for the example project "Archetype"? If any, how did that go? What were the problems with parallelizing to make things run faster?

I think there might also be another idea to avoid too heavy updating in case of the many many functional updates to the 30K data, but it depends on how Archetype works and what it does. It might be possible to not apply some updates immediately, but group them together with other updates or simplify multiple subsequent updates into one update in clever ways. This might be difficult to implement though and as I said, might not be appropriate for all situations, as you might want to give immediate feedback in the GUI for any change.

Lichtso · on April 1, 2022

Assuming we define functional as referential transparent and imperative in turn as referential opaque.

Then, I think there are two different aspects of the problem:

- There is the identity of each object in your program (e.g. a key, reference or pointer), regardless of its value.

- And there are the versions of values each object has / had.

Imperative tracks identities but no versions while functional tracks versions but no identities. Ideally the program would run in a VCS so that one could track both versions and identities.

kkfx · on April 2, 2022

The real issue, IMVHO, about functional programming is that's more suited for a development model modern society do not follow.

That's not much an FP issue but an actual IT issue, since modern development for modern business reasons produce monsters and solutions waiting for problems. That's also hard to teach and probably the reason functional programming is seen as "superior but not for the real world by many.

otikik · on April 1, 2022

Naming things, and cache invalidation :).

Now seriously: while reading this, and especially the first problem, my head was shouting "immutable data structures! immutable data structures!".

The author surely must have encountered IDSs during their 10-year tenure with Lisp. I'm curious about why they thought they would not fit the bill for the problems they are writing about.

NyxWulf · on April 1, 2022

Did you read the article? The entire discussion was on the tension between immutable data structures and cognitive complexity on one side, and machine performance and mutability on the other side.

otikik · on April 1, 2022

First, if I sounded snarky I apologize. I assure you it was not my intention.

Secondly, I did read the article. If they are allocating a big chunk of memory and copying it over every single time they are making a change, they are not using immutable data structures. Ok, technically speaking, they might be. But they are using the worst implementation possible as a base to make conclusions from. That'd be like looking at the C language and concluding that strong types are not good, because of all the limitations C has.

sly010 · on April 1, 2022

Particularly the "parent-child" example feels like a non issue to me. You probably want to keep multiple versions of a glyph anyway to support things like undo, in which case explicit parent update is a requirement, not an annoyance. No reason you can't have both:

  g2 = updateGlyph(g1)
  f2 = updateFontInGlyph(f1, g2)

TuringTest · on April 1, 2022

I think the crux is that any complex data structure f1 using glyphs would not need to be updated explicitly in an imperative program through a function like updateFontInGlyph; the runtime would do it as part of its semantics. Of course that's a double-edged sword, as you may get undesired side effects as well as the desired ones like this.

There are ways to achieve such "automatic updating" of items in a functional data structure, but you need to build them explicitly instead of relying on the runtime, which gives you more responsibility but also more control. The most direct approach as compared to imperative code would be to build the data structure on top of mutable state-based streams, though there are other possibilities like monotonic or immutable structures.

https://blog.waleedkhan.name/monotonicity/

Sinidir · on April 1, 2022

"When a functional update on a child structure occurs, the mediator is updated with the reference to the new child. But all the parents still point at the same mediator".

This is literally atoms in clojure.

gorgoiler · on April 1, 2022

For the latter problem, I believe lenses are a known solution. I’ve never really used them mind you, but it might be worth reading up on them?

jerf · on April 1, 2022

So, as the trend of forcing functional programming into imperative languages continues on its inexorable path, I just can't help but bring the old joke to mind more and more often:

    "Doctor, it hurts when I do this."

    "Well, then, don't do that."

You mean, jamming a foreign paradigm into a language, where the advantages of the foreign paradigm are reduced and the disadvantages greatly magnified, hurts sometimes?

Have you considered... not doing that?

I simply can not apprehend the mindset that goes:

    1. Wow, map/filter/reduce sure is fun in this language over here
       designed to work with it.
    2. Now I'm going to use everywhere I go, no matter how much it hurts
       and no matter what compromises I have to make.
    3. When it hurts, I will blame anything and everything except myself
       for forcing a paradigm in where it doesn't belong.

The paradigm has to be good. It just has to! Even if it hurts! Even if I'm really not enjoying using it! Even if it's murdering my runtime and killing my performance and I'm in a language where functions take quite a lot of text rather than \x -> x + 1! Even if my code is hard to read and hard to abstract and hard to refactor. Even if the foreign paradigm is critically based on pervasive laziness my language doesn't have! Or some other foundation my current language totally lacks! Gosh darn it, I'm going to use this paradigm in this code if it kills me! Because it was so much fun... over there.

I'm not saying that it isn't sometimes useful to borrow paradigms in certain places even in languages that don't support it. When it doesn't hurt like this, by all means use it as you like. What I don't understand is this constant, if not growing, stream of posts about "my gosh, my forced FP programming in this not-FP language is kinda painful here... grimaces yup, it's pretty painful all right... grits teeth yup, this is really hurting me and making me really angry and isn't fun at all... but I'm so glad I'm using this style that is making things hard and hurting me and making me angry, wouldn't dream of doing anything else!"

I know Haskell. There are many lessons I've taken from Haskell and applied to other languages, like the virtue of very strongly separating IO code from logic, the virtues of taking extra steps to provide and use composability in your APIs, the virtues of certain things being immutable data. But taking these lessons from functional programming languages shouldn't mean you ape the surface solutions functional programming uses; you should be thinking about how to idiomatically bring the target virtues into the other languages you use. There will be compromises, but sometimes there will also be things the current language does better than Haskell. (Really quite a lot of things. Haskell's great but there's many reasons it hasn't conquered the world.) You shouldn't be so focused on importing the foreign paradigm that you completely throw away the virtues of the environment you're in.

I do not understand why so many of you are so vigorously adamant that you're going to program in the intersection of functional programming languages and imperative programming languages. It's an incredibly aggravating place to try to be. The (near) union, applied with the wisdom gleaned from both camps, is much more fun.

ImprobableTruth · on April 1, 2022

I don't really see how this relates to the article. The author specifically mentions that he also encountered these problems in Racket (a functional programming language). Anyway:

>I do not understand why so many of you are so vigorously adamant that you're going to program in the intersection of functional programming languages and imperative programming languages

Because it actually works pretty well? The vast majority of programming languages today are multi-paradigm for a reason. In fact, almost all functional programming languages (Lisp, Scheme, Racket, Clojure, (S)ML, OCaml, F#, Scala) are multi-paradigm. Haskell is really the odd one among functional programming languages. The same applies to lazyness by the way, to claim that functional programming is "critically based on pervasive laziness" is simply wrong.

The idea that people use FP in a language like e.g. JS only because of an ideology driven obsession and that they constantly have to fight against the language just doesn't bear out in reality.

jerf · on April 1, 2022

I'm not talking about using it when it's helpful. I do it myself all the time.

I'm talking about forcing it in when it hurts specifically.

It occurs to me that I may have accidentally answered myself at the end, when I referred to the concept of union vs. intersection. When you bring in helpful angles on your current paradigm from light cast by other paradigms, that's helpful. You're expanding your working set of options.

But when you get a bit of a good taste of those other paradigms, and then declare this is always the right way to do it, it may feel like you're expanding your paradigms, but now you're not. When you force sum types into a language that doesn't support them because they're just always the right answer, when you force lots of maps & filters in a language that can't do fusion and function calls are kind of expensive, when you force an algorithm in that depends on laziness to work, and so on, you are no longer working in a sort of union of paradigms. You're working in the intersection.

And that intersection sucks. A worst-of-both-worlds situation for sure.

You can tell, by the way people trying this are constantly screaming in pain about it.

When it hurts... stop.

ImprobableTruth · on April 1, 2022

Sure, it's best to work with rather than against the language, but for most languages support of functional programming is not binary, just like with OOP.

I'm pretty sure most of the complaints are generally of the "This FP thing would work well here, but the language is unfortunately not quite there" nature rather than "I will try to implement profunctor optics in C no matter the cost" square peg in round hole crazyness.

>when you force lots of maps & filters in a language that can't do fusion and function calls are kind of expensive

I think that's perfect example to show that it's really about degrees, because fusion is really a Haskell thing. Sure, it's a nice-to-have, but it's definitely not essential to functional programming.

And when people want sum types (which aren't even really FP, ALGOL had them first!), it's often because the idiomatic alternative just kind of sucks for what they're doing. Trying to "implement" sum types in a language without support is nonsensical, but complaints about lacking them are pretty understandable I think.

setr · on April 1, 2022

I’m pretty sure half the purpose of the article is to resolve the ambiguity: does it hurt because the paradigm is inelegant for this problem, or because I don’t know how to make it elegant?

Just stopping anytime you struggle is precisely how you learn nothing at all

ljloisnflwef · on April 1, 2022

There's genuinely good things that came from FP that creeped into popular languages:

* Algebraic data types

* Pattern matching (with exhaustiveness)

* Lambdas / Closures

* Type inference

dboreham · on April 1, 2022

I don't believe it's true that algebraic data types "came from" FP. The name did, but the concept has existed since before high level languages existed. Actually before computers existed since it's in Type Theory. And before that it's in ordinary human thinking for thousands of years.

lostcolony · on April 1, 2022

As a usable programming construct though it did.

I mean, Lambdas came from the Lambda calculus which predates FP. Type inferencing was also formulated as part of typed Lambda calculus. Pattern matching definitely came about prior to FP; SNOBOL (and its predecessor COMIT) is probably the first language that really made it 'a thing'.

Very few things FP uses came from FP; they came from underlying Math and CS, and are themselves specific formulations and implementations that fit within the context of broader well understood concepts.

tgflynn · on April 1, 2022

The question I am left with, as someone who has a bit of experience with functional languages but much more in imperative/OOP languages, is does it ever make sense to develop something like the GUI applications the author was talking about using a primarily functional paradigm ? Assume you get to pick whatever FP language you want so the issue here isn't with shoehorning FP techniques into a language that doesn't support them well, it's about whether there are certain types of programs that just aren't suited to a functional paradigm.

valcron1000 · on April 1, 2022

For GUI programming I found FRP to be a really good solution. And no, I don't consider React to actually do FRP.

The closest mainstream project to FRP is Angular using RxJS exclusively. You define how each component interacts with each other, and then you delegate to a runtime to actually perform the side effects (change the DOM in this case). Solutions like React + Redux/Elm involve a lot of plumbing that's not relevant to the issue: I just want to define relationships between components, not manually tread changes across a big state tree.

galaxyLogic · on April 1, 2022

That is a very good question. The discussion often circles around pro- and against FP. Proponents of "pure FP" seem to suggest that it is the best solution for everything always. Is it?

If not it would be nice to find discussion about when it is and when not and why? What is Haskell NOT the best solution for?

valcron1000 · on April 1, 2022

In my experience Haskell is a bad solution when you need fine control over memory usage and control flow. For those cases you should probably use something like C, C++ or Rust. For everything else I think is a viable option.

The hard part is understanding how you should model your problem. I found that you could classify problems into two big groups: pipelines like compilers or web services where you get an input, do some processing and produce an output, and graph problems like GUI applications where the system is a living thing and events change how components should behave. The first group is easy, the second not so much. Excel is the best example of this kind of system: an user describes relationships between cells and lets the runtime perform the appropriate side effects. Ideally we would be able to write arbitrary programs using this same model.

galaxyLogic · on April 1, 2022

It took me some time to understand how Haskell performs its "magic" but now I think I do. Anybody please correct me if I'm still getting it wrong...

Haskell main program in essence runs a big loop until it decides the program should exit. In Haskell syntax the loop is coded as a call to a tail-recursive function, which on a real hardware of course needs to be translated to do iteration. At the end of each loop it sets the new calculated value of the new state to be used during the next loop.

Using tail-recursion optimization it looks like our whole program executes a single top-level function-call of a recursive function, that is how you code it in Haskell.

This is great but somehow it feels like a trick. It is actually still doing iteration and state mutation internally while running. You could program that in an imperative language writing a top-level loop and iterating over it and modifying the statefull variables at the end of each loop.

And what happens if you cannot write your program as tail-recursive function? What if the problem domain requires you to use general recursion, which can not be optimized away like tail-recursion can?

hither_shores · on April 2, 2022

No, that's not accurate. The final form of your Haskell program before it's compiled down to some assembly language (currently C-- or LLVM) is a graph reduction problem. The partially imperative part is the runtime system, which actually reduces the graph (= runs your program), but this is significantly more complicated than the sort of high level loop transformations you're talking about. Search "spineless tagless G-machine" for a (somewhat idealized) description of the core architecture of GHC.

valcron1000 · on April 1, 2022

What you are trying to describe is not Haskell the language but an implementation, and right now the de facto implementation is GHC. I really can't talk about how GHC works internally, but when it comes to the language you need to think your program as a value, just like `5` or `[1,2,3]` or `{ "name": "susan" }`. That is, effects are values, you compose those effects, and, your program being some kind of effect, is also a value. That's the reason behind `main :: IO ()`: your program is an `IO value`.

When it comes to "executing" your program, a runtime (like GHC's runtime) takes your program which is a value and actually performs the requested actions, dealing with mutation, loops, jumps, etc. This is a hack just as much as any other language dealing with variables instead of the stack, heap or registries directly.

In case you're interested, this idea of using some kind of interpreter which loops over your program and performs mutations and side effects is very popular in Scala: libraries like Cats and ZIO do this, allowing you to effectively get Haskell's IO. As far as I know, tail-recursion is not really needed for this, but when implemented on the runtime it allows the end user to write loops using recursion without blowing up the stack.

galaxyLogic · on April 4, 2022

Right. I was thinking of how to write a game in Haskell or similar. I need to write some kind of loop to keep my program/game running indefinitely. And I need to avoid "blowing up" the stack by writing my program as a set of calls to tail-recursive functions. No?

The program must keep the previous state somewhere and then modify it based on the player's inputs and again store that new state somewhere somehow so it is remembered when next input comes. If loops which keep and modify state are not allowed, then I don't see how it could be done without tail-recursive functions.

rrradical · on April 1, 2022

Functional Reactive Programming (FRP) is a really nice paradigm for doing GUI in a functional setting (e.g. Haskell). I build games in Haskell., including UI. It's great.

deterministic · on April 5, 2022

Elm is a great example of UI done right. And it is 100% functional.

craigching · on April 1, 2022

I think you're being downvoted because your tone is a bit hostile, so I'll bump you up because I think you make a good point.

I love functional idioms as well, but you have to know the cost of those idioms in a non-functional language. The first thing I did when I had to learn go was determine if I could use functional idioms (map, fold, etc.) and did some research and testing and found that, no, functional idioms are a really bad idea in go. It just isn't meant to be used that way.

For the record, I had to learn go because the team I joined used it.

jerf · on April 1, 2022

Tried saying it politely. Still get downvoted. Thought I'd try a bit sharper on purpose.

Despite what it may superficially sounds like, it's an attempt to reach out and help people. If it hurts, you know, think about it. Maybe don't do something that hurts so much because it "must" be good.

FP techniques in imperative languages is a tool. A good tool sometimes. But it shouldn't be an aspiration.

avgcorrection · on April 1, 2022

↑ Knowing the price of everything and the value of nothing.

Why do you want to climb that mountain? Because it’s there.

jerf · on April 1, 2022

If that's your defense, party on. If you want masochism, by golly you've found it.

But could y'all be more clear about labeling at such? You're fooling young programmers into thinking this is a good engineering idea. In engineering, we do not climb mountains just because they are there. Quite the opposite.