I found this Reddit thread a while ago when I was starting to get into Haskell:
“do-blocks look extremely imperative to me”.
The author argues that Haskell’s do blocks “seem extremely un-functional,”
due to the imperative-like sequencing of functions within.
What’s funny is that he then goes on to say that
“[he knows] that they are related to monads and binds and all that,”
as if this is some tangential detail.
But in fact, this is the reason do blocks exist in the first place.
What is a Monad?
I won’t go into the full details of what monads are here,
but suffice it to say that a monad can be thought of as a context
in which computations can be performed.
The simplest context is a container which might or might not hold a value,
which Haskell calls Maybe.
data Maybe a = Just a | Nothing
An instance of Maybe String, for example, can be one of two things:
it can be a Just String containing a usable string, or a Nothing, which contains no string at all.
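To make that concrete, here is a small sketch (the value names are just illustrative):

greeting :: Maybe String
greeting = Just "hello" -- holds a usable string

missing :: Maybe String
missing = Nothing       -- holds no string at all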
Monad itself is a Haskell type class with two basic operations:
return and bind, where bind is written as the infix operator >>=.
Fun fact: if you go to the homepage
for the Haskell programming language you may notice that its logo looks
suspiciously like the bind operator, with a lambda
thrown in for good measure.
class Monad m where
  return :: a -> m a
  (>>=) :: m a -> (a -> m b) -> m b
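Specialized to Maybe (that is, with m fixed to Maybe), those signatures read:

return :: a -> Maybe a
(>>=) :: Maybe a -> (a -> Maybe b) -> Maybe b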
We can create an instance of Monad for Maybe by implementing these functions.
In reality we would also have to implement Functor and Applicative, but that’s not important for this topic, so I’ll skip it for now.
instance Monad Maybe where
  return x       = Just x
  (Just x) >>= f = f x
  Nothing  >>= _ = Nothing
For Maybe, return simply wraps a value in its minimal context (Just), while bind applies the function to the contents if any exist: if there is a value, the function produces a new context from it; otherwise the result is Nothing.
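A few quick examples of how these behave on Maybe (the list and its name are mine, purely for illustration):

bindExamples :: [Maybe Integer]
bindExamples =
  [ return 3                          -- Just 3
  , Just 3 >>= (\x -> Just (x + 1))   -- Just 4
  , Nothing >>= (\x -> Just (x + 1))  -- Nothing: there is no value to apply the function to
  ]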
The Maybe Monad in Action
Maybe is frequently used in Haskell as a way to represent a computation that might fail.
Consider a “safe” integer division function which catches attempts to divide by zero.
safeDiv :: Integer -> Integer -> Maybe Integer
safeDiv 0 _              = Nothing
safeDiv divisor dividend = Just (dividend `div` divisor)
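Before moving on, a couple of illustrative calls (the list name is mine); note that the divisor comes first, which will matter shortly:

safeDivExamples :: [Maybe Integer]
safeDivExamples =
  [ safeDiv 2 10 -- Just 5: this is 10 divided by 2
  , safeDiv 0 10 -- Nothing: the division by zero is caught
  ]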
We can use this function to safely divide by integers that might be zero.
example1 = step4
  where
    start = 60
    d1 = 5
    d2 = 3
    d3 = 0
    d4 = 2
    step1 = safeDiv d1 start            -- returns (Just 12)
    step2 = case step1 of
      Just step1' -> safeDiv d2 step1'  -- returns (Just 4)
      Nothing     -> Nothing
    step3 = case step2 of
      Just step2' -> safeDiv d3 step2'  -- returns Nothing
      Nothing     -> Nothing
    step4 = case step3 of
      Just step3' -> safeDiv d4 step3'
      Nothing     -> Nothing            -- safeDiv isn't even called this time
This is safe, but awkward.
If we’re chaining the results of one step into the next like this, then we can treat the Maybe Integer output of safeDiv as a monad and use bind.
example2 = safeDiv d1 start >>= safeDiv d2 >>= safeDiv d3 >>= safeDiv d4
  where
    start = 60
    d1 = 5
    d2 = 3
    d3 = 0
    d4 = 2
If you’re wondering why I put the divisor first in the argument list of safeDiv, this is why. In this situation you can think of the divisor as the “configuration” of safeDiv, and through partial function application we can create a new function that divides by that specific divisor.
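For instance, partially applying safeDiv to a divisor yields a plain Integer -> Maybe Integer function (the name below is just illustrative):

divideBy3 :: Integer -> Maybe Integer
divideBy3 = safeDiv 3 -- the divisor 3 is baked in

-- divideBy3 12 == Just 4
-- divideBy3 7  == Just 2 (integer division truncates)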
Re-using Steps of the Computation
This is great, but what if we wanted to use an intermediate result in a more complex expression, while still cleanly aborting if any step produces a division by zero?
-- z / (5 + (x / y))
example3 :: Integer -> Integer -> Integer -> Maybe Integer
example3 x y z = answer
  where
    d1 = safeDiv y x
    d2 = case d1 of
      Just d1' -> Just (5 + d1')
      Nothing  -> Nothing
    answer = case d2 of
      Just d2' -> safeDiv d2' z
      Nothing  -> Nothing
It’s not clear how to chain safeDiv calls when one or more of the arguments is itself wrapped in a context of potential failure, so we’re back to the case-matching style we used before introducing bind.
Note that you could also use “applicative style” here, though that’s not the focus today.
For more complex monadic flows applicative style may not work as well,
but I’ll admit that for simple function applications like this one it works fine
(even if it looks totally alien to people unfamiliar with it,
a drawback that do notation does not suffer from).
-- z / (5 + (x / y))
example3' :: Integer -> Integer -> Integer -> Maybe Integer
example3' x y z = answer
  where
    d1 = safeDiv y x
    d2 = (+5) <$> d1
    -- join (from Control.Monad) flattens the Maybe (Maybe Integer)
    -- produced when safeDiv is applied inside the context
    answer = join (safeDiv <$> d2 <*> pure z)
How might you do this with bind instead? Remember that bind applies a function to the value inside a context.
-- z / (5 + (x / y))
example4 :: Integer -> Integer -> Integer -> Maybe Integer
example4 x y z = answer
  where
    d1 = safeDiv y x
    d2 = d1 >>= (\d1' -> Just (5 + d1'))
    answer = d2 >>= (\d2' -> safeDiv d2' z)
This is a massive improvement, but there is still a potential problem.
What if we wanted to use the result of d1 again in the outermost computation?
We can achieve this by nesting the binds, which gives the innermost
lambda access to the parameter of the outermost:
-- (z + (x / y)) / (5 + (x / y))
example5 :: Integer -> Integer -> Integer -> Maybe Integer
example5 x y z = answer
  where
    d1 = safeDiv y x
    answer = d1 >>= (\d1' -> return (5 + d1') >>= (\d2' -> safeDiv d2' (z + d1')))
This solves the problem of accessing the previous answer,
but now it’s a bit hard to read.
You could probably fix it by splitting across lines and fiddling with indentation,
but Haskell offers a better way: do notation.
-- (z + (x / y)) / (5 + (x / y))
example6 :: Integer -> Integer -> Integer -> Maybe Integer
example6 x y z = do
  d1 <- safeDiv y x
  d2 <- return (5 + d1) -- this can be rewritten as "let d2 = 5 + d1"
  safeDiv d2 (z + d1)
do notation is just syntactic sugar for creating a chain of monadic binds.
It is not imperative at all, at least not in the way that C code is imperative.
If any result in the chain becomes Nothing, the rest of the computation
short-circuits, so there is no need to check for failure and abort manually.
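To see the sugar for what it is, here is roughly what example6 desugars to (a sketch; GHC's full desugaring also involves MonadFail for patterns that can fail, which doesn't come up here):

example6Desugared :: Integer -> Integer -> Integer -> Maybe Integer
example6Desugared x y z =
  safeDiv y x >>= \d1 ->
    return (5 + d1) >>= \d2 ->
      safeDiv d2 (z + d1)

-- example6Desugared 12 3 11 == Just 1
-- example6Desugared 12 0 11 == Nothing (the chain stops at the first Nothing)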
This is especially convenient in other monads with effects.
Functions in the State or IO monads in particular resemble C-style imperative
operations much more strongly, and writing them as manual bind chains
would be extremely annoying. Instead, Haskell offers syntax that lets us
write such code in its natural style while remaining pure.
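As a small taste (a minimal sketch of my own, not tied to the examples above), here is a do block in the IO monad; it reads like a sequence of imperative statements, but it desugars into the same kind of bind chain:

greet :: IO ()
greet = do
  putStrLn "What's your name?"
  name <- getLine
  putStrLn ("Hello, " ++ name ++ "!")

-- Roughly equivalent to:
-- greet = putStrLn "What's your name?" >>= \_ ->
--           getLine >>= \name ->
--             putStrLn ("Hello, " ++ name ++ "!")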