Teaching Functional Programming: Two Big Picture Approaches

Ed Toro - Nov 6 '18 - Dev Community

Functional Programming (FP) has been around just as long as, if not longer than, Object-Oriented Programming (OOP). But it has only (relatively) recently been gaining popularity, particularly in the JavaScript community. Why?

I went to MIT in the early 00's. Structure and Interpretation of Computer Programs (SICP - sick-pee) was my textbook. So my first formally-taught programming language was functional. Then I worked in industry for over a decade and hardly ever thought about FP. Now I'm shocked to learn that the textbook from college I don't remember very well anymore is considered the "functional programming bible".

Don't get me wrong. It's a good textbook. I'm sure it made me a better programmer. But FP wasn't something I applied very often in my Java/ActionScript/PHP/Python/Ruby/JavaScript career. OOP patterns dominated.

Then I taught at Wyncode Academy for four years and found myself trying to explain some FP concepts to newcomers. In a world dominated by OOP, it's hard to explain FP. It's so different.

After learning OOP, why is FP so much harder?

Related questions: Why has it taken so long for FP to catch on? Why aren't I talking about techniques for learning OOP in an FP-dominated world?

We in the coding community need to grapple with why the OOP->FP transition is so hard to teach. Evangelizing FP like a religion repeats the same mistakes that caused FP to languish in the industry for so long.

Many introductions to FP are missing something. It's not just an alternative programming style. It's a new way of thinking. When introducing something big and new to my students, I try to ease them into it. These same tricks may also work with more experienced programmers from OOP backgrounds.

One of the techniques I used at Wyncode to get a running start into a hard concept is storytelling. If I can get my students to understand the context - the big picture - I find it easier to later explain the technical details.

So here are two big-picture strategies for introducing FP - particularly to an OOP audience.

Big Picture 1: History

Sometimes it's good to start from the beginning: How does a computer work?

The most common (popular? easy-to-understand?) model of computing is the Turing Machine. The state that FP programmers complain about is staring us right in the face in a Turing Machine. An algorithm for operating this machine represents transitions between different states, e.g. from some boxes being on/off (1 or 0) to some other boxes being on/off.

If we try to imagine two Turing Machines operating on the same section of tape at the same time, we can begin to understand why "shared state" and concurrency in OOP are hard problems. But that's a post for another time.

The Turing Machine is a universal machine. It can be used to solve every solvable (effectively calculable) math and logic problem. This simple collection of operations - move left, move right, write a dot, read a dot, erase a dot - is enough (given enough time and resources) to tackle every math problem in the universe. That's what Alan Turing proved in 1936.

In many ways, a Turing Machine is how a computer "works".

But this is also how a computer works.

[Image: a full adder circuit]

This is a circuit for addition. It's the kind of component found inside the CPU of a computer.

This is not a Turing Machine. It's not universal. It's just addition. It can't (easily) be "reprogrammed".

There's also no Turing-machine-like "state". Apply voltage to the inputs corresponding to the numbers-to-add and detect voltages in the outputs corresponding to the sum. As soon as the voltage is shut off, the answer goes away. There's no "tape" sitting around to read or manipulate. Two circuits can't operate on the same logic gates simultaneously. (I don't think they can, but I'm sure someone will comment to prove me wrong.)

This circuit is also fast. While a classic Turing Machine flips 1s and 0s back-and-forth on some medium, this circuit operates at the speed of electricity through a wire. There are no moving parts.

A circuit is a different model of computation. Each of the logic gates (AND, OR, NAND, NOR, XOR, etc.) is a pure function. They accept inputs and produce outputs with no side-effects. If all we have is the ability to create and combine these "functions", we can also solve every solvable math problem in the universe. That's what Alonzo Church proved, also in 1936.
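To make that concrete, here's a minimal Ruby sketch (my own illustration, in the spirit of the article's later code, not an excerpt from it) of logic gates as pure functions, composed into the full adder pictured above:

# Logic gates as pure functions: the output depends only on the
# inputs - no state, no side-effects
AND_GATE = -> (a, b) { a & b }
OR_GATE  = -> (a, b) { a | b }
XOR_GATE = -> (a, b) { a ^ b }

# Composing gates builds bigger circuits: first a half adder...
HALF_ADDER = -> (a, b) { [XOR_GATE[a, b], AND_GATE[a, b]] }

# ...then the full adder from the diagram above
FULL_ADDER = -> (a, b, carry_in) {
  sum1, carry1 = HALF_ADDER[a, b]
  sum,  carry2 = HALF_ADDER[sum1, carry_in]
  [sum, OR_GATE[carry1, carry2]]
}

p FULL_ADDER[1, 1, 1] # [1, 1], i.e. 1+1+1 = 0b11 (sum 1, carry 1)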

So we've got two different models of computing: the Turing Machine's little boxes of 0s and 1s (objects) and Alonzo Church's lambda calculus built out of logic gates (functions). Which one is correct?

For a time there was a debate about whether an abstract Turing Machine could solve the same set of math problems as lambda calculus (and vice versa). Eventually they were proven to be equivalent.

Being equivalent means that they're equally powerful. Any algorithm that can be written for a Turing Machine can also be written using functions. So any program that can be written in Turing Machine software can also be represented in circuitry hardware.

What does it mean to "program in hardware"?

We can see "hardware programming" embodied in Application-specific Integrated Circuits (ASICs). Circuits can be created that are "programmed" to do one thing very quickly, like mine Bitcoin or play chess.

Since the proposal of the Church-Turing Thesis, we've had two programming options. Hardware is faster and software is slower. Make a mistake in software? Just hit the delete key and try again. Make a mistake in hardware? It's time to grab a soldering iron. It's a classic engineering design trade-off.

So let's say we have an algorithm written in an OOP style that we'd like to convert into an ASIC. It's probably a good strategy to rewrite the program in a FP style so it better maps to the circuit diagram's domain. Most programming languages are flexible enough to do that, but some are better at it than others.

# Elixir pipes
"1" |> String.to_integer() |> Kernel.*(2) # returns 2

Many FP-oriented languages tend to look like circuits. Specifically, the "pipe operators" in Unix, Elixir, F#, JavaScript (maybe someday), and others make code look like a circuit diagram: inputs go in on the left and flow through a number of "gates" (pipes) until they're transformed into the final output on the right. It's probably not a coincidence that the pipe operator used by some languages (|>) looks like a logic gate.

[Image: a NOT gate]
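
Ruby (2.6+) can approximate the same left-to-right pipeline with Object#then, each block acting like one gate in the chain:

# Ruby's pipeline-ish equivalent of the Elixir example above
"1".then { |s| Integer(s) }.then { |n| n * 2 } # returns 2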

Putting my coding instructor hat back on, a good "big picture" way of introducing FP is to start by talking about how circuits work, how they can be "programmed", and how we can model circuit diagrams in code.

Big Picture 2: Philosophy

I picked up a Philosophy minor with my CS degree, so one of the things I'm fascinated by is the intersection between those two fields of study. I find talking about the overlap helpful when teaching new coders, particularly those with Humanities instead of STEM backgrounds.

A philosophically important concept in FP is "functional equivalence".

Perhaps the best example demonstrating this equivalence is Tom Stuart's great article "Programming From Nothing".

Stuart demonstrates how a program (specifically the ubiquitous FizzBuzz) can be written entirely out of functions. I'm not going to repeat that entire exercise here, but I am going to borrow his explanation of how numbers can be represented entirely with functions (the Church encoding).

Start by defining the concept of zero as a function that accepts a function argument and does nothing with it.

# Ruby
ZERO = -> (func) { 
  # does nothing
  func
}

Similarly, we can define all the natural numbers as functions that accept function arguments and call them n times.

ONE = -> (func) {
  # calls it once
  # same as "func.call()"
  func[]
  func
}

TWO = -> (func) {
  # calls it twice
  func[]
  func[]
  func
}

To test these "function-numbers", pass them a test function.

HELLO = ->() { puts "hello" }

# same as "ZERO.call(HELLO)"
ZERO[HELLO] # nothing displayed
ONE[HELLO]  # one "hello" displayed
TWO[HELLO]  # "hello" twice

This functional-numeric representation can be hard to play around with and debug.

p ZERO
# outputs #<Proc:0x000055d195ae57b0@(repl):3 (lambda)>

So, to make them easier to work with, we can define a method that converts these functional-numbers into the object-numbers we're used to.

# convert number function into number object
def to_integer(func)
  # count how many times counter is called
  n = 0
  counter = ->() { n += 1 }
  func[counter]
  n
end

p to_integer(ZERO) # 0
p to_integer(ONE)  # 1
p to_integer(TWO)  # 2

This converter creates a counting function and passes it to the numeric function. The ZERO function will call it zero times, the ONE function will call it one time, etc. We keep track of how many times the counter has been called to get the result.

Given these function-number definitions, we can implement addition.

ADD = -> (func1, func2) {
  -> (f) { func1[func2[f]] }
}

sum = ADD[ZERO, ZERO]
p to_integer(sum) # 0

sum = ADD[ZERO, ONE]
p to_integer(sum) # 1

sum = ADD[ONE, ONE]
p to_integer(sum) # 2

If TWO calls a function twice, then ADD[TWO, TWO] will return a function-number that calls its argument four times (the function-number FOUR).
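
We can check that, and even sketch multiplication in the same style (MULTIPLY is my own extension here, not part of Stuart's article):

FOUR = ADD[TWO, TWO]
p to_integer(FOUR) # 4

# Multiplication sketch: for each of func1's calls, run all of
# func2's calls
MULTIPLY = -> (func1, func2) {
  -> (f) {
    func1[-> () { func2[f] }]
    f
  }
}

p to_integer(MULTIPLY[TWO, TWO]) # 4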

It's a mind-bending exercise. When I get to the end of "Programming From Nothing", I get the sense that this is an interesting product of the clever application of a fundamental computer science concept, but not something I could use in my day job.

And that's exactly the sense that I (and I suspect many others) have about FP in general - it's clever, but doesn't seem very useful. That feeling of unnecessary complexity is exactly the problem we need to solve if we hope to make FP techniques more popular.

So a better place to start teaching FP than Church numerals is The Matrix.

In that 1999 sci-fi film, the reality perceived by most humans is actually a simulation called "The Matrix". A few months ago, Elon Musk suggested that this "simulation hypothesis" may be true, sparking weeks of "Philosophy 101"-level media coverage on the topic.

What does The Matrix have to do with FP?

The metaphysical debate, of which the "simulation hypothesis" is but one response, is very old and mind-numbingly complicated at times. So my attempt to summarize it won't do it justice. But the big idea is that we have no proof that the world around us is real. Maybe there are actual objects in the world or maybe we're just brains in jars.

So there are at least two contradictory theories of what, for example, the number one is. Is it a thing (a noun, an object) that we can interact with (touch and feel)? Or is it an action (a verb, a function), something that acts on the world, but isn't embodied?

The functional-one is a simulation of the number one. It's functionally equivalent to the object-one, meaning it does everything the object-one can do. For example, we can do arithmetic with it.

But it's not really "there" in the way that objects in OOP are "there". It's a Matrix simulation. It doesn't have inherent attributes - it isn't x, it just does x.

To pick a less abstract example, is the chair you're sitting in real or just forces pressing against your body? A "chair" may be a chair-object that exists in the real world or a chair-function: a (hopefully comfortable) force pushing against you with no underlying objective basis.

[Image: a red delicious apple]

Consider color. Is a red delicious apple really red (adjective describing a noun) or does it act red (verb)? Is color an inherent attribute of a real underlying apple-object or just an action that an apple-function is programmed to do when light shines on it? Is the apple real or just a simulation?

# A "real" apple
class Apple
  attr_reader :color
  def initialize
    @color = "ruby red"
  end
end

p Apple.new.color # "ruby red"

# A "simulated" apple
APPLE = -> (applied) {
  return "ruby red" if applied == "light"
}

p APPLE["light"] # "ruby red"

The difficulty of this philosophical concept is a good metaphor for why FP is so hard to teach in an OOP-dominated world. To help students understand, start by opening up their minds to the possibility of a world made up solely of "functions". Start with that big picture concept, then transition towards FP models of the world: how they differ from OOP representations yet maintain equivalent results. Ask an experienced OOP developer to consider rewriting a class into its functional equivalent.
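
As a concrete version of that exercise, here's a sketch (my own example, not from the article): the same counter written both ways. The class stores its count as state; the functional version stores nothing and treats incrementing as a pure function from an old count to a new one.

# OOP: the count is a thing the object has
class Tally
  attr_reader :count

  def initialize(count = 0)
    @count = count
  end

  def increment
    @count += 1
    self
  end
end

p Tally.new.increment.increment.count # 2

# FP: nothing is stored or mutated; "incrementing" just acts
INCREMENT = -> (count) { count + 1 }

p INCREMENT[INCREMENT[0]] # 2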

Conclusion

Transitioning from OOP into FP can be hard. It's not just a different programming style. It's an alternative model of the world. And the better we are at easing students into that paradigm shift, the easier it'll be to avoid another half-century of ignoring this useful tool in the coder's toolbox.

Edits
Writing is just as debuggable as code. So I've decided to clarify that I'm presenting teaching strategies for introducing FP to OOP-minded programmers. FP itself isn't hard. It's the paradigm shift that needs support.
