Thursday, March 27, 2008

Monads - Another way to abstract computations in Scala

You can view monads as containers or as computations. Isn't this confusing enough ?

To a programmer, the biggest scare with the name "monad" is the fact that it originates in category theory, with all the rich mathematics that goes along with it. The simplest way to get an idea of monads is to look at applications and examples that use them to solve real life problems. Monads have been described as models of computation, code transformers, state transformers and what not. But unfortunately, for someone uninitiated in monadic programming and indoctrinated with the tenets of OO programming and design patterns, all these incarnations make little sense. Some time back, after listening to a long dissertation on monads, I asked myself .. what the heck do I need monads for ? I can very well do the same thing with a Composite Command design pattern that serializes the execution of its composing commands.

I have been playing around with Scala for some time now. Scala is a multiparadigm language. The nice thing about Scala is that you never feel out of place writing either imperative or functional code. On one hand, you have re-assignable vars, explicit IO and side-effecting constructs. On the other, you can write pure, clean functional code without any side-effecting features. Coming from a Java / C++ background and having seen enough of OO over the last decade, my interest is in exploring the functional features that the language has to offer. Ok, I know you will refer me to the backyards of Haskell or OCaml as purer functional languages. But hey, I need to have my application deployed on the JVM, and don't ask me why ..

This post is all about my exposure to monads in Scala. It is not meant to be a tutorial on monads, considered by some to be the stairway to the heaven of Haskell. It looks at monads purely as a means of designing complex computational abstractions using the phenomenal powers of closures and higher order functions in Scala. Functional programming is all about referential transparency, where you treat your functions algebraically, evaluating them in no specific order. Monads allow ordered computation within FP, letting us model the sequencing of actions in a nice structured form, somewhat like a DSL. And the greatest power comes with the ability to compose monads that serve different purposes into extensible abstractions within an application.

This sequencing and threading of actions through a monad is done by the compiler, which performs the transformation through the magic of closures. Scala provides some built-in support for monads, and offers the machinery to design your own monadic operations.

Consider the following Scala code fragment, which looks quite intuitive to anyone familiar with programming in a high level language :


for {
  x <- List(1, 2)
}
yield(x + 2)



The code snippet uses Scala for-comprehensions to perform a "+ 2" operation on the individual elements of a list of integers. The for-comprehension being syntactic sugar, the compiler does the heavy lifting and converts it to a more traditional map operation ..


List(1, 2) map {x => x + 2}



Quite trivial .. huh ! But how does it relate to monads ? Hang on ..

Now how about this ?


val first = List(1, 2)
val next = List(8, 9)

for {
  i <- first
  j <- next
}
yield(i * j)



I have added an extra step of computation in the sequence, that gets resolved as ..


first flatMap {
  f => next map {
    n => f * n
  }
}



and the last one ..


val first = List(1, 2)
val next = List(8, 9)
val last = List("ab", "cde", "fghi")

for {
  i <- first
  j <- next
  k <- last
}
yield(i * j * k.length)



that transforms to :


first flatMap {
  i => next flatMap {
    j => last map {
      k => i * j * k.length
    }
  }
}



The key abstraction is flatMap, which binds the computation through chaining. Each invocation of flatMap returns the same container type (though with different values), which serves as the input to the next step in the chain. In the above snippet, flatMap takes as input a closure (SomeType) => List[AnotherType] and returns a List[AnotherType]. The important point to note is that every flatMap in the chain takes the same kind of closure as input and returns the same container type as output. This is what "binds" the computation thread - every item of the sequence in the for-comprehension has to honor this same type constraint.
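To see this binding concretely, here is a small check (a sketch you can paste into the REPL) that the for-comprehension and its flatMap expansion produce the same result:

```scala
val first = List(1, 2)
val next = List(8, 9)

// the sugared form ..
val viaFor = for {
  i <- first
  j <- next
} yield i * j

// .. and the desugared chain of flatMap / map
val viaFlatMap = first flatMap { f =>
  next map { n => f * n }
}

assert(viaFor == viaFlatMap)
assert(viaFor == List(8, 9, 16, 18))
```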

The above is an example of the List monad in Scala. Lists support filtering, and so does the List monad. In a for-comprehension, we can also filter data items through if-guards :


for {
  i <- 1 until n
  j <- 1 until (i-1)
  if isPrime(i+j)
}
yield (i, j)
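Here is a runnable variant of the guard example, with a naive isPrime helper (a hypothetical definition of mine, just to make the snippet self-contained) and n fixed at 6:

```scala
// naive primality test, adequate for a small illustration
def isPrime(n: Int): Boolean = n > 1 && (2 until n).forall(n % _ != 0)

val n = 6
val pairs = for {
  i <- 1 until n
  j <- 1 until (i - 1)
  if isPrime(i + j)   // the if-guard filters the generated pairs
} yield (i, j)

assert(pairs.toList == List((4, 1), (5, 2)))
```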



Consider a different computation involving sequencing of operations (not on the List type though), from an example which I mentioned in my previous Scala post :


case class Order(lineItem: Option[LineItem])
case class LineItem(product: Option[Product])
case class Product(name: String)

for {
  order <- maybeOrder
  lineItem <- order.lineItem
  product <- lineItem.product
}
yield product.name



This gets me the product name through the chain of order, lineItem and product. One more round of desugaring, and the for-comprehension gets converted to :


maybeOrder flatMap {
  order => order.lineItem flatMap {
    lineItem => lineItem.product map {
      product => product.name
    }
  }
}



flatMap again!

Once again we have the magic of flatMap binding the thread of computation. And as in the earlier example of the List monad, every flatMap in the entire thread is homogeneously typed - the input type being (T => Option[U]) and the output type Option[U]. The type participating in this sequence of computation is the Maybe monad, modeled as Option[T] in Scala.
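The short-circuiting nature of the Maybe monad is easy to verify - if any link in the chain is None, the whole computation yields None. A minimal sketch:

```scala
case class Product(name: String)
case class LineItem(product: Option[Product])
case class Order(lineItem: Option[LineItem])

def productName(maybeOrder: Option[Order]): Option[String] =
  for {
    order <- maybeOrder
    lineItem <- order.lineItem
    product <- lineItem.product
  } yield product.name

val full = Some(Order(Some(LineItem(Some(Product("widget"))))))
val broken = Some(Order(None))  // the chain breaks at lineItem

assert(productName(full) == Some("widget"))
assert(productName(broken) == None)
```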

What is the commonality of the above two examples ?

  • It is the sequencing of operations that leads to the evolution of higher order functional abstractions.


And what is the variability part ?

  • It is the types that actually take part in the operations. In the first case, it is operations on the List type that get chained, while in the second case, it is the Option type.


And what is the secret sauce ?

  • The flatMap (aka bind in Haskell) operation, which works orthogonally across types and serves as the generic binder of the sequence of actions.
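The common shape behind both examples can be sketched as a trait (the names Monad and unit below are my own, not from the Scala library):

```scala
// a hypothetical interface capturing the monadic shape:
// unit wraps a value, flatMap (bind) chains computations
trait Monad[M[_]] {
  def unit[A](a: A): M[A]
  def flatMap[A, B](m: M[A])(f: A => M[B]): M[B]
}

// the List instance of that shape
val listMonad = new Monad[List] {
  def unit[A](a: A): List[A] = List(a)
  def flatMap[A, B](m: List[A])(f: A => List[B]): List[B] = m flatMap f
}

assert(listMonad.flatMap(List(1, 2))(x => List(x, x * 10)) == List(1, 10, 2, 20))
```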


In the above examples, the two monad types discussed, List[T] and Option[T], are container types, if we consider the latter to be a degenerate container holding at most one element. However, monads can be designed for other purposes as well, e.g. state manipulation or IO. And for those monads, it is possibly more intuitive to comprehend monads as computations. I hope to provide examples of some real life applications of the State and IO monads in Scala in a future post.

Whole is bigger than the sum of its parts

As I mentioned earlier, the biggest power of monads is the ability to combine diverse monadic operations to design modular and extensible code. The following snippet gives an example that combines a List monad and a Maybe monad within the same for-comprehension block :


val list = List("India", "Japan", "France", "Russia")
val capitals =
  Map("India" -> "New Delhi", "Japan" -> "Tokyo", "France" -> "Paris")

for {
  i <- list
  j <- capitals get(i) orElse(Some("None"))
}
yield(j)



The first operation in the sequence is on a List monad, while the next is on a Maybe monad. The syntactic sugar of the for-comprehension abstracts the details nicely enough for the user, who remains completely oblivious to the underlying machinery of binding monads. Here is what comes up after the code transformation :


list flatMap {
  i => capitals.get(i).orElse(Some("None")) map {
    j => j
  }
}
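Running either form gives the same result - "Russia" is absent from the map, so orElse supplies the default:

```scala
val list = List("India", "Japan", "France", "Russia")
val capitals =
  Map("India" -> "New Delhi", "Japan" -> "Tokyo", "France" -> "Paris")

// the List monad drives the iteration, the Maybe monad handles absence
val result = for {
  i <- list
  j <- capitals.get(i).orElse(Some("None"))
} yield j

assert(result == List("New Delhi", "Tokyo", "Paris", "None"))
```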



Developing modular software is all about working at the right level of abstraction. And monads offer yet another machinery to mix and match the right abstractions within your codebase. Consider adding the power of monads to your toolbox - they are certainly among the most potent design patterns you will ever need.

Monday, March 17, 2008

Are you fully using your Static Typing ?

Back in 2005 in an LtU discussion on Dynamic vs. Static Typing, Anton van Straaten had this post ..

Here's a nice bit of Java code I came across (here):


if ((value != null) && !returnedClass().isAssignableFrom(value.getClass())) {
    throw new IllegalArgumentException("Received value is not a [" +
       returnedClass().getName() + "] but [" + value.getClass() + "]");
}



This is from a piece of code that's trying very, very hard to avoid the need for the definition of boilerplate classes when persisting classes representing enumeration types to a SQL database.

This code is actually doing a kind of dynamic typechecking, illustrating the following generalization of Greenspun's 10th Law: "any sufficiently complicated program in a statically-typechecked language contains an ad-hoc, informally-specified bug-ridden slow implementation of a dynamically-checked language." ;)

Today's good Java frameworks use reflection quite sparingly and responsibly. Using Java generics, these frameworks allow compile time type checking for cases which would earlier have been implemented through a slow and bug ridden simulation of runtime type checking. Guice and EasyMock stand out as two frameworks I have been using that harness the power of generics to implement extraordinary type safety.

I really like the way small interface-heavy APIs of Guice enforce compile time type-safety.

Have a look at this piece of code, which binds an implementation SpecialServiceImpl to the interface Service using Guice Binder.


public class MyModule implements Module {
    public void configure(Binder binder) {
        binder.bind(Service.class)
              .to(SpecialServiceImpl.class)
              .in(Scopes.SINGLETON);
    }
}



Given the fact that DI frameworks are in the business of injecting implementations into objects dynamically, it may seem that the "implements" relationship between Service and SpecialServiceImpl is checked at runtime. Thanks to Java generics, every bit of this type checking is done at compile time.

A peek at the source code of Guice reveals that BinderImpl.bind() returns BindingBuilderImpl<T> ..


public <T> BindingBuilderImpl<T> bind(Class<T> clazz) {
    return bind(Key.get(clazz));
}



and BindingBuilderImpl<T>.to() takes as input Class<? extends T> - the bound on the wild card enforces the above "implements" relationship as part of compile time type checking of the arguments ..


public ScopedBindingBuilder to(Class<? extends T> implementation) {
    return to(TypeLiteral.get(implementation));
}
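The same compile-time enforcement can be sketched in miniature - the toy binder below is my own, not Guice's API, but the upper bound on the type parameter plays exactly the role of Class&lt;? extends T&gt;:

```scala
// a toy binder showing how an upper-bounded type parameter
// enforces the "implements" relationship at compile time
trait Service
class SpecialServiceImpl extends Service

class Binding[T](iface: Class[T]) {
  // impl must be a subtype of T, checked by the compiler
  def to(impl: Class[_ <: T]): (Class[T], Class[_ <: T]) = (iface, impl)
}

val b = new Binding(classOf[Service]).to(classOf[SpecialServiceImpl])
assert(b._2 == classOf[SpecialServiceImpl])

// new Binding(classOf[Service]).to(classOf[String])  // does not compile
```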



In comparison to Guice, Spring makes much heavier use of reflection, which, I think, is kind of expected from a pre-generics framework. Spring's implementation has lots of code similar to


// Check if required type matches the type of the actual bean instance.
if (requiredType != null && bean != null &&
      !requiredType.isAssignableFrom(bean.getClass())) {
    throw new BeanNotOfRequiredTypeException(name, requiredType, bean.getClass());
}



that does quite a bit of juggling with dynamic type checking at runtime.

Coming back to the above post by Anton - yes, this kind of runtime type checking exists in lots of popular Java frameworks, even today. And this is where frameworks like Guice and EasyMock really shine, with their strongly typed API sets that make you feel more secure within the confines of your IDE and refactoring abilities.

The Morale

When you are programming in a statically typed language, use the appropriate language features to do as much of your type checking as possible at compile time. This way, before you hit the run button, you can be assured that your code is well-formed within the bounds of the type system. And you have the power of easier refactoring and cleaner evolution of your codebase.

Tuesday, March 11, 2008

Maybe Scala

Working with languages that grow organically with your programs is real fun. You can simply visualize the whole program as a living organism evolving dynamically in front of you. Just fire up your REPL and see for yourself how the malleable syntactic structures of the language grow in front of your eyes, alongside your program. Whether this is through Lisp macros or Ruby meta-programming or Scala control structures, the secret sauce is in the ability to implement more and more powerful abstractions within the language. But what makes one language shine more compared to another is the ability to combine abstractions leading to more powerful syntactic structures.

Recently people have been talking about the Maybe monad and its myriads of implementation possibilities in Ruby. Because of its dynamic nature and powerful meta-programming facilities, Ruby allows you to write this ..


@phone = Location.find(:first, ...elided... ).andand.phone



Here andand is an abstraction of the Maybe monad that you can seamlessly compose with core Ruby syntax structures, effectively growing the Ruby language. Now, compare this with the Null Object design pattern in Java and C++, which is typically used in those languages to write null safe code. Abstractions like Null Object pattern encapsulate null handling logic, but do not relieve programmers from writing repetitive code structures - you need to have a Null Object defined for each of your abstractions!

I have been working with Scala in recent times and really enjoying every bit of it. Professionally I belong to the Java/JVM community, and the very fact that Scala is a language for the JVM has made it easier for me. After working for a long time in enterprise projects, I have also become affiliated with statically typed languages. I always believed that the static encoding of type information on your values serves as a first level of unit testing, constraining your data to the bounds of the declared type as early as compile time. However, at times with Java, I also felt that complete static typing annotation adds to the visual complexity of reading a piece of code. Now, working in Scala has made me realize that static typing is not necessarily a cognitive burden in programming - a good type inferencing engine takes away most of the drudgery of explicitly annotating the types of your values.

Enough of Scala virtues, let's get going with the main topic .. Scala's Maybe monad, its marriage to for-comprehensions and its extensibility through implicits ..

Scala offers the Option[T] datatype as its implementation of the Maybe monad. Suppose you have the following classes ..


case class Order(lineItem: Option[LineItem])
case class LineItem(product: Option[Product])
case class Product(name: String)



and you would like to chain accessors and find out the name of the Product corresponding to a LineItem of an Order. Without drudging through boilerplate null-checkers and Baton Carriers, Scala's Option[T] datatype and for-comprehensions offer a nice way of getting at the name ..


def productName(maybeOrder: Option[Order]): Option[String] =
    for {
        order <- maybeOrder
        lineItem <- order.lineItem
        product <- lineItem.product
    } yield product.name



The example demonstrates a very powerful abstraction in Scala - for-comprehensions, which work literally on anything that implements the map, flatMap and filter methods. This is also an illustration of how for-comprehensions in Scala blend beautifully with its implementation of the Maybe monad (Option[T]), leading to concise and powerful programming idioms. I think this is what Paul Graham calls an orthogonal language in his book On Lisp.
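That protocol is purely structural - any type defining map and flatMap participates in for-comprehensions. A hypothetical one-element container (Box is my own name, not a library type) demonstrates it:

```scala
// Box defines map and flatMap, so for-comprehensions work on it out of the box
case class Box[A](value: A) {
  def map[B](f: A => B): Box[B] = Box(f(value))
  def flatMap[B](f: A => Box[B]): Box[B] = f(value)
}

val sum = for {
  a <- Box(3)
  b <- Box(4)
} yield a + b

assert(sum == Box(7))
```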

David Pollack implements similar usage of the Maybe monad in his Lift framework. Here is a snippet from his own blog ..


def confirmDelete {
    (for (val id <- param("id"); // get the ID
        val user <- User.find(id)) // find the user
     yield {
         user.delete_!
         notice("User deleted")
         redirectTo("/simple/index.html")
     }) getOrElse {error("User not found"); redirectTo("/simple/index.html")}
}



The find method on the model class and the param method that extracts from the request both return Option[T] - hence we do not need any explicit guard to check for a null id or a "user not found" condition.

In the first example, if the method productName() encounters any null component in the path of accessing the name, it returns None (as it should). What if I would like to know which part is None, as part of a diagnostic message ?

Just pimp my Option !


class RichOption[T](value: Option[T]) {
    def ifNone(message: String) = {
        value.getOrElse(error(message))
    }
}



and add the implicit conversion ..


implicit def enrichOption[T](opt: Option[T]): RichOption[T]
    = new RichOption(opt)
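A quick check of the enriched behavior (using sys.error, the current spelling of error, to keep the sketch runnable):

```scala
class RichOption[T](value: Option[T]) {
  def ifNone(message: String): T = value.getOrElse(sys.error(message))
}
implicit def enrichOption[T](opt: Option[T]): RichOption[T] = new RichOption(opt)

// Some passes its value through ..
assert(Some(42).ifNone("null answer") == 42)

// .. while None raises with the diagnostic message
try {
  (None: Option[Int]).ifNone("null order")
  assert(false)
} catch {
  case e: RuntimeException => assert(e.getMessage == "null order")
}
```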



Now the method becomes ..


def productName(maybeOrder: Option[Order]): Option[String] = {
    try {
        val nm = maybeOrder.ifNone("null order")
                           .lineItem.ifNone("null line item")
                           .product.ifNone("null product")
                           .name
        if (nm == null)
            None
        else
            Some(nm)

    } catch {
        case e: RuntimeException => println(e.toString); None
    }
}



and the method invocation reports with meaningful messages in case of null accesses.

Wednesday, March 05, 2008

Sunshine on Dynamic Languages rolls on

Lighting up enterprisey Java shops with Ruby and Python.

Sun is pulling out all the stops to strengthen the ecosystem of the JVM. In fact, the J in JVM is starting to fade away - it is fast becoming the VM of choice for dynamic languages. I am sure, with Sun's hiring of Ted and Frank, invokedynamic is poised to gain more importance in the JCP world. After Ruby, Python is slated to be another first-class citizen on the JVM.

But what happens to Java, the language ?

With more and more dynamic language power being blessed on the JVM, are we prepared to see more and more polyglotism in enterprise applications ? BigCos and enterprise managers always like to remain sensitive to the platform they have invested in - you can implement in any language you want, so long as it runs on the JVM. In the days to come, expect more strategic support from Sun on features that enrich Java the platform, rather than Java the language.

Next stop for Sun ? Scala ?