“do” Notation in Haskell

· 6 min

I found this Reddit thread a while ago when I was starting to get into Haskell: “do-blocks look extremely imperative to me”. The author argues that Haskell’s do blocks “seem extremely un-functional,” due to the imperative-like sequencing of functions within.

What’s funny is that he then goes on to say that “[he knows] that they are related to monads and binds and all that,” as if this is some tangential detail. But in fact, this is the reason do blocks exist in the first place.

What is a Monad?

I won’t go into the full details of what monads are here now, but suffice it to say that a monad can be thought of as a context in which computations can be performed. The simplest context is a container which might or might not hold a value, which Haskell calls Maybe.

data Maybe a = Just a | Nothing

An instance of Maybe String, for example, can be one of two things: it can be a Just String, containing a usable string, or a Nothing, which doesn’t.

Monad itself is a Haskell type class with two basic operations: return and bind, where bind is written as the infix operator >>=. Fun fact: if you go to the homepage for the Haskell programming language you may notice that its logo looks suspiciously like the bind operator, with a lambda thrown in for good measure.

class Monad m where
  return :: a -> m a
  (>>=) :: m a -> (a -> m b) -> m b

We can create an instance of Monad for Maybe by implementing these functions. In reality we would also have to implement Functor and Applicative as well, but that’s not important for this topic so I’ll skip it for now.

instance Monad Maybe where
  return x = Just x

  (Just x) >>= f = Just (f x)
  Nothing  >>= _ = Nothing

For Maybe, return simply wraps a value in its minimal context (Just) while bind applies a function to the contents if any exist. If it does, the function returns a new context. Otherwise, the result is Nothing.

The Maybe Monad in Action

Maybe is frequently used in Haskell as a way to represent a computation that might fail. Consider a “safe” integer division function which catches attempts to divide by zero.

safeDiv :: Integer -> Integer -> Maybe Integer
safeDiv 0       _        = Nothing
safeDiv divisor dividend = Just (dividend / divisor)

We can use this function to safely divide by integers that might be zero.

example1 = step4
  where
    start = 60
    d1 = 5
    d2 = 3
    d3 = 0
    d4 = 2

    step1 = safeDiv d1 start  -- returns (Just 12)

    step2 = case step1 of
      Just step1' -> safeDiv d2 step1'  -- returns (Just 4)
      Nothing -> Nothing

    step3 = case step2 of
      Just step2' -> safeDiv d3 step2'  -- returns Nothing
      Nothing -> Nothing

    step4 = case step3 of
      Just step3' -> safeDiv d3 step3'
      Nothing -> Nothing  -- safeDiv isn't even called this time

This is safe, but awkward. If we’re chaining the results of one step to the next like this then we can treat the Maybe Integer output of safeDiv as a monad and use bind.

example2 = safeDiv d1 start >>= safeDiv d2 >>= safeDiv d3 >>= safeDiv d4
  where
    start = 60
    d1 = 5
    d2 = 3
    d3 = 0
    d4 = 2

If you’re wondering why I put the divisor first in the arguments list of safeDiv, this is why. In this situation you can think of the divisor as the “configuration” of safeDiv, and by partial function application we can create a new function to apply that specific divisor.

Re-using Steps of the Computation

This is great, but what if you wanted to use an intermediate result to compute a more complex expression, but still cleanly abort if any of those intermediate results would produce a division by zero?

-- z / (5 + (x / y))
example3 :: Integer -> Integer -> Maybe Integer
example3 x y z = answer
  where
    d1 = safeDiv y x
    d2 = case d1 of
      Just d1' -> Just (5 + d1')
      Nothing -> Nothing
    answer = case d2 of
      Just d2' -> safeDiv d2' z
      Nothing -> Nothing

It’s not clear how to chain safeDiv calls when one or more of the arguments are themselves wrapped in a context of potential failure, which has reduced us back to the style we used before introducing bind.

Note that you could also use “applicative style” here, but that’s not the focus today. For more complex monadic flows the simpler applicative style may not work as well, but I will admit that for simpler function applications like this it does work well (even if it does look totally alien to people unfamiliar with applicative style, which is a drawback that do notation does not suffer from).

-- z / (5 + (x / y))
example3' :: Integer -> Integer -> Maybe Integer
example3' x y z = answer
  where
    d1     = safeDiv y x
    d2     = (+5) <$> d1
    answer = safeDiv <$> d2 <*> pure z

Instead, how might you do this with bind instead? Remember that bind applies a function to the value inside a context.

-- z / (5 + (x / y))
example4 :: Integer -> Integer -> Maybe Integer
example4 x y z = answer
  where
    d1     = safeDiv y x
    d2     = d1 >>= (\d1' -> Just (5 + d1'))
    answer = d2 >>= (\d2' -> safeDiv d2' z)

This is a massive improvement, but there is still a potential problem. What if we wanted to use the result of d1 again in the outermost computation? We can achieve this by creating a nested bind, which gives the innermost lambda access to the parameters of the outermost:

-- (z + (x / y)) / (5 + (x / y))
example5 :: Integer -> Integer -> Maybe Integer
example5 x y z = answer
  where
    d1     = safeDiv y x
    answer = d1 >>= (\d1' -> return (5 + d1') >>= (\d2' -> safeDiv d2' (z + d1')))

This solves the problem of accessing the previous answer, but now it’s a bit hard to read. You could probably fix it by splitting across lines and fiddling with indentation, but Haskell offers a better way: do notation.

-- (z + (x / y)) / (5 + (x / y))
example6 :: Integer -> Integer -> Maybe Integer
example6 x y z = do
  d1 <- safeDiv y x
  d2 <- return (5 + d1)  -- this can be rewritten as "let d2 = 5 + d1"
  safeDiv d2 (z + d1)

do notation is just syntactic sugar for creating a chain of monadic binds. It is not imperative at all, at least not in the way that C code is imperative. If any of the results in the chain become Nothing at any point, the entire computation is terminated at that point, so there is no need to check anything to abort the computation manually.

This is extremely convenient in other monads with effects. Functions involving the State or IO monads in particular will more strongly resemble C-style imperative operations, and trying to write such functions with manual bind chains would be extremely annoying. Instead, Haskell offers syntax which allows writing such code in its natural style while remaining pure.