Thursday, February 20, 2014

Regular imperative statements versus FPL abstractions

When having talks over the Internets about FPL (Functional Programming Language), it’s not uncommon reading people exposing the same perennial misconceptions. They’re actually several sides around such a topic:
  • the ones who think FP is a toy and have poor arguments
  • the ones who think FP is not that good and have poor arguments
  • the ones who think FP sucks and can’t say why (no argument at all)
  • the ones who think FP is a “way of thinking the software” and that it’s not language-related
  • the ones who’s gone beyond the stupid toy / research toy / masturbating toy / brainfuck-like language stuffs and are now interested and want to discover more
  • the ones who think FP is great
  • the ones who have no idea what the heck FP is
Something interesting: in most cases, a lot of guys never actually experienced FP seriously. I mean “use it in a software different than Fibonacci generation or other school cases”. I don’t want to start a debate about whether FP is the best or the worse way of programming. If you’re there it’s because you’re from the guys who’s overstepped the barrier of FP and you want to dig in. If you’re there, you now understand that “it’s slow”, “it’s ugly”, “it doesn’t fit production requirements and serious business” and so on were valid arguments some decades ago, but not anymore – some people would even say some of those points weren’t actually close to be valid at all.

In this thread, I put forward an analogy between regular imperative – C/C++, Java, C#, Perl and so on – statements and FPL abstractions. The requirements for you to understand this thread are quite low: basic statements (for, if, else if, else, switch, and so on).

Loops
Looping over a collection is natural for all programmers. It’s a very common situation and is used to break the controlflow of your application. Without them, you’d have a program acting like a math function. It would take input, do some stuff on it, then exit. Iterations enables you to go backward, repeat the same functions and instructions a certain times or even do some exotic and weird things.

That was just the a “programmer” simple way of defining what looping could be. For a mathematician, regarding functions, looping does really not make any sense. You can’t apply a function and rewind.

Consider this C++ snippet:

std::vector; a = { 1, 8, 2, 34, 314 };
int sum = 0;

for (auto it = std::begin(a), e = std::end(a); it != e; ++it);
  sum += *it;
This is a typical case of looping in C++ using an iterator, which could be considered as a raw pointer in that previous example.

Now, how would you do the same thing in FPL? One say “well let’s just write a recursive function sumArray that would sum all elements!”. This would work, but it’s not that way we think in FPL. Instead, we come up with abstractions.

Abstracting is the process of generalizing something true to one concept to several other ones along a certain, well fixed, axis of abstraction. For instance, you know how to sum integers. You can generalize integers sum along some axis. If you choose the type axis, you can generalize integers sum to numbers sum. If you choose the operation axis, you can generalize integers sum into integers binary operation. And so on.

“Looping”. Try to forget programming. Does it still have any meaning to you? One of the hardest things to deal with when trying to get into functional programming is to think like a human again – yeah you’re all filfthy machines! Try reasoning about that collection, and forget looping. Ok, let’s give it a go. I’d say “I want to do something for each elements”. That’s a great start! But this is very abstract. Let’s try something else. “Well, I want to construct a value while traversing a collection”. This is much better! “In this case, I just want to sum all elements in the collection, then I just want to build a value by summing it to each element of the collection”. Quite concise. In C++, what was that again? “I want to loop over a collection and alter a variable by summing.”.

In Haskell – but also in perhaps all other FPL – it’s very common to use the abstract concept of a fold to iterate over a collection, building a value along. The collection is now called a foldable type and the iteration a fold. Folding a value is just accumulating a value using a function on each element. The idea is quite simple:
  • you have a function a -> b -> b
  • you have a foldable value with several a inside of it
  • you have an initial value for b
  • folding is just applying the function to each of the elements, passing along the previous value of the accumulator to the current call
Folding is then a special kind of iteration in which you accumulate a value. Because of polymorphic type, Haskell can handle any types of accumulator when using a folding function. That means you can fold a list and generate any kind of data.

Let’s write the same example as above, but in Haskell:

let a    = [1,8,2,34,314]
result = foldl1 (+) a
That’s all. You might say “ahah, foldl1, it’s barly readable” but in fact, it is. We find fold in foldl1. Then, we find a l. What could it be? Well, it’s a directional fold from left to right. You could also find a r as in foldr1 for instance – it would be a directional fold from right to left. What’s the 1? As mentionned above, folding needs an initial value to be able to start. Imagine the sum, the initial value is 0. foldl1 just uses the first value of the foldable type as initial value for the accumulator – it has an impact of the type of the folding function, which is now a -> a -> a; if you don’t understand why, it’s not really a matter).

Then, foldl1 (+) is a function that sums all elements of a collection (a list, actually). It could be a first implementation of the sum function, that does the same thing.

Conditional execution
This is quite the same thing in FPL, but it’s a bit different as well. In typical imperative languages, conditionals are control statement. The single exception might be the ternary operator – not all languages have it. You know, that’s operator with ? and : keywords. Well, for Haskell, we only have that one. We don’t use the same keywords though, but instead, we have if and else.

Consider this C++ snippet:

string name;

std::cin << name;
if (name.empty()) {
  std::cout << "woh, empty name!" << std::endl;
} else if (name.size() < 4) {
  std::cout << "this is a short name you have" << std::endl;
} else {
  std::cout << "what a long name!" << std::endl;
}
This code is quite simple. You enter your name, and if it’s null, you’ll told it’s empty, if it’s less than four characters, you’ll be told it’s short and more than four characters you have a long name.

Now the same thing, but in Haskell:

name <- getLine
putStrLn $
  if null name
  then
    "woh, empty name!"
    else if length name < 4
    then
      "this is a short name you have"
    else
      "what a long name!"
You don’t have to insert as much newlines and indents I do, but I think it’s clearer that way. As you can see, the test is now done inside the value construction we’re passing to putStrLn, which takes a String and outputs it into your terminal. In Haskell, conditionals yields values. You could then write something like that:

a <- getLine
let r = if a == 0 then 314 else a
Haskell also has a lot of ways doing conditional execution. For instance, it has native support for partial function implementation depending on a predicate:

foo :: String -> IO ()
foo = putStrLn . go . length
  where
    go l
        | l == 0    = "woh, empty name!"
        | l < 4     = "this is a short name you have"
        | otherwise = "what a long name"
Those | are called guards in Haskell. We also have pattern-matching to implement a function differently depending on the “form” of its argument. For instance:

bar :: String -> String
bar []     = "empty string"
bar (x:[]) = "singleton string
bar _      = "long string"
I won’t go into complex details, but we also have monadic conditional function that enables us to do pretty things like:

testSomething :: IO ()
testSomething =
    runMaybeT $ do
      userInput <- lift getLine
      guard (length userInput /= 0)
      lift (putStrLn "ok!")
In that last snippet, if the user enters an empty string, the string “ok!” won’t be printed out.

As I already told you, abstractions are everywhere in FPL and especially in Haskell.

Error handling
One last example to give you an idea: exception handling. In C++, you have the keywords throw and catch and the type std::exception to inherits from. One big mistake a lot of folks do is thinking that error-prone code should be handled through exception. This is not really a good idea since exceptions are exceptional.

In Haskell, you can handle errors via dozens of ways. For instance, just like the testSomething function just above, you can use the Maybe type or use it as a monad. Even better, you can use its monadic transformer form, MaybeT, (what I did here), in order to extend your code with failable with no message errors code. Maybe express this concept: “maybe there’s nothing, maybe there’s just something”. And well, the Maybe type has two constructors. Try to guess what. Just, and Nothing. It can express a value that can be absent, ill, misformed, and so on.

For errors with message, we have a lovely type, monad and monad transformer with corresponding combinators: Either. It shows this: “I can have either this type or this one”. This is the theory about Either. Pratically, we don’t use it that way. We use this way: “it can has a right value, or a value that represents something wrong”. Two constructors for that type: Right, for a right value, and Left, for a wrong value.

If you just want to compute a value, and express a potential error with a String, you’re eligible to use an Either String a type! You could then do that:

testSomething :: IO ()
testSomething = do
    e <- runEitherT $ do
      userInput<- lift getLine
      when (length userInput == 0) . left $
        "oh god, error! empty string!"
      lift (putStrLn "ok!")
    either err skip e
stderr. Pretty straight-forward huh? :)

Conclusion
I could have written much more lines about comparison, but I think it’s quite enough for today. I’ll go into further details if you want.

The last thing to keep in mind is that all you can do in imperative languages can be achieved in FPL as well.

1 comment:

  1. There is std::accumulate (and have been for a while actually) in C++, just to point that abstractions exists there as well. Actually C++ (const) iterators are a way to make the very same stuff you can achieve with Foldable/Traversable from Haskell, at about the same abstraction level.

    Also you have a semicolon after for loop making the body empty and first C++ example not compilable. And as you use C++11 auto, you could use range-based for loop (if you don’t want to use std::accumulate for some reason).

    In conditional execution example, I would extract string generation into own function. You do it Haskell example, so why not to do it C++ as well?

    Using exceptions for handling “not-happy” path is totally valid thing in C++. It’s way better than testing result value for each function call (as the Maybe’s or Either’s bind does, but that’s hidden behind the abstraction). Computing with exceptions is like computing in exists e : Exception, Either e -monad.

    Maybe the choice of C++ for imperative language is not the best possible. You can build powerful abstractions in C++ as well. Bare C would be much illustrative.

    ReplyDelete