Sunday, April 27, 2008

Greedy Coin Changer in Scala

OnLamp has published this Python implementation of the Greedy Coin Changer. Here is my functional implementation of the same algorithm in Scala.


object GreedyCoinChanger {
  def divMod(dividend: Int, divisor: Int) = {
      (dividend / divisor, dividend % divisor)
  }

  // the list has to be sorted in descending order of denomination
  def change(coins: List[Int], amount: Int) = {
    def changeOne(pair: (List[Int], Int), div: Int) = {
      val dm = divMod(pair._2, div)
      ((dm._1 :: pair._1), dm._2)
    }
    // fold the denominations over (empty result, amount), then restore order
    (((List[Int](), amount) /: coins)(changeOne(_,_)))._1.reverse
  }

  def main(args: Array[String]) = {
    println(change(List(25, 10, 5, 1), 71))
  }
}



Running this will print :

>> List(2, 2, 0, 1)

indicating 2 quarters, 2 dimes, 0 nickels and 1 penny. To make things a little more explanatory and verbose, here is a slightly more decorated version :


object GreedyCoinChanger {
  def divMod(dividend: Int, divisor: Int) = {
    (dividend / divisor, dividend % divisor)
  }

  def pluralize(no: Int, phrase: String) = phrase match {
    case "penny" if no > 1 =>
        "pennies"
    case something if no > 1 =>
        something + "s"
    case other => other
  }

  // the list has to be sorted in descending order of denomination
  def change(coins: List[(Int,String)], amount: Int) = {
    def changeOne(pair: (List[String], Int), denomination: (Int,String)) = {
      val (div, mod) = divMod(pair._2, denomination._1)
      div match {
        case 0 => (pair._1, mod)
        case _ => ((div + " " + pluralize(div, denomination._2) :: pair._1), mod)
      }
    }
    (((List[String](), amount) /: coins)(changeOne(_,_)))._1.reverse.mkString("(", ", ", ")")
  }

  def main(args: Array[String]) = {
    println(change(List((25,"quarter"), (10,"dime"), (5,"nickel"), (1,"penny")), 71))
  }
}



Running this will print :

>> (2 quarters, 2 dimes, 1 penny)

I am not an expert in Scala or functional programming. Any suggestion to make it more idiomatic is most welcome.
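
One small step in that direction, offered as a sketch rather than a definitive answer: foldLeft with a pattern match on the accumulator avoids the tuple projections (_1, _2) and reads a little more naturally:


// the list still has to be sorted in descending order of denomination
def change(coins: List[Int], amount: Int): List[Int] =
  coins.foldLeft((List[Int](), amount)) {
    case ((counts, remaining), coin) =>
      ((remaining / coin) :: counts, remaining % coin)
  }._1.reverse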

Tuesday, April 22, 2008

Syntactic Sugars - What makes them sweet ?

Stephen Colebourne talks about implementing for-each in Java maps. He has proposed changes to be made to javac and queued up his request for approval by the appropriate authorities. It is good to see Java community leads taking some serious steps towards syntactic sugars in the language. I am always for intention-revealing syntactic sugars - after all, programs must be written for people to read, and only incidentally for machines to execute.

Syntactic sugars, when properly designed, reduce the semantic distance between the problem domain and the solution domain. Syntactic sugars do not add new features or capabilities to an existing language. Still, we value them mainly for social reasons - they can make your abstractions much more explicit, thereby making your intentions much more direct. And syntactic sugars often lead to concise, succinct code that is pleasing to the eyes.

Java is not a language that boasts of concise syntax. Yet the smart for loop introduced in Java 5 removes a lot of accidental complexity and makes the programmer's intention much more explicit ..

for(String name : names) {
  // process name
}


is much more succinct than

for(Iterator<String> it = names.iterator(); it.hasNext(); ) {
  String name = it.next();
  // process name
}


since the latter snippet buries its intent under the verbosity of structures not directly related to the intention of the programmer.

names foreach println


is better, though not Java.

Many of the languages being used today offer lots of syntactic sugars abstracting rich capabilities of the underlying language. Take, for example, the Groovy Builder syntax, which exploits the mechanics of meta-programming, closures and named arguments to implement elegant, concise, intuitive APIs. Java developers use binding frameworks to manipulate XML and bind it to the object model or the relational database schema. Not that there is anything wrong with that. But the developer has to go through all the hoops of mapping the object structure to the XML schema, and using an external framework like JAXB yields a much longer version of the same solution than Groovy MarkupBuilders.
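
Scala makes a similar point with its built-in XML literals - markup is part of the language syntax, so no binding framework is needed to construct or query a document. A small hedged sketch, with element names invented for illustration:


// XML literals: construct and query markup without any external framework
val quantity = 100
val order =
  <order>
    <security>IBM</security>
    <quantity>{ quantity }</quantity>
  </order>

println(order \ "quantity")   // XPath-style projection on the node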

Syntactic sugars are nothing new in the landscape of programming languages. It all started (possibly) with Lisp, which offered macros as the means to design syntactic abstractions. To add a little sugar to the syntax the language offers, you need not wait till the next official release. In Lisp, the syntax of the program is a direct representation of the AST, and with macros you can manipulate the parse tree directly. Languages like Lisp are known for their syntax extensibility, allowing developers to implement their own syntactic sugar.

Ruby offers runtime meta-programming, another technique for adding your own syntactic sugars. Ruby does not have a macro system that lets you play around with the abstract syntax tree, though we do have a Ruby parser, released by Ryan Davis, that is written entirely in Ruby. The standard meta object protocol offered by Ruby gives the developer control over the language semantics (not the syntax) and the capability to generate classes and methods dynamically at runtime. Meta-programming, method_missing, open classes and optional parentheses are some of the features that make Ruby a great language for building syntax abstractions for runtime processing.

A language built on the philosophy of bottom-up programming offers extensible syntax (be it through the syntactic abstractions of Lisp or the semantic customizations of Ruby), on which syntactic sugars can be constructed by developers. Java believes in the democratization of all syntax offered by the language, and it may take quite a few years to officially adopt the little sugar that you have been yearning for. Remember the explosion in the number of blog posts celebrating the for-each loop when it came out with Java 5. In other languages, people build new syntax by the day and evolve new vocabulary within the same language that maps into the domain they model. If you miss features that you enjoyed in your earlier language, just build them on top of the new one. And it does not necessarily have to involve hooking into the compilation cycle or plugging customized modules into your language parsers.

Many of today's languages offer capabilities strong enough to build structures that look like syntax extensions. Scala is an example that makes the cut in this category. The advanced type system of Scala enables developers to write control structures, within the syntax of the language, that look like syntactic abstractions. Max likes deterministic finalization in C# and its idiomatic usage with the "using" keyword. He has implemented the same syntax in Scala using closures, view bounds and implicit conversions. Besides eliminating lots of boilerplate, his extension looks charmingly useful for the domain he is using it in.
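
Max's implementation relies on view bounds and implicit conversions; just to convey the flavor, here is a simpler hedged sketch of the same idea using a structural type and a closure (file name invented, error handling left to the caller):


object Loan {
  // anything with a close() method qualifies - no common interface needed
  def using[A <: { def close(): Unit }, B](resource: A)(f: A => B): B =
    try f(resource) finally resource.close()

  def main(args: Array[String]) = {
    // the reader is closed no matter how the block exits
    val firstChar = using(new java.io.FileReader("orders.txt")) { reader =>
      reader.read()
    }
    println(firstChar)
  }
}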

Syntax extensibility is a necessity if you want your language to support the evolution of DSLs. Extensible syntax scales much better than the framework-based approach so popularized by Java. When you add a new syntactic structure to your language, it meshes so nicely with the rest of the language constructs that you never feel it has been bolted on from outside. Although in reality it is nothing more than syntactic sugar presented in a form that makes more sense when coding for the particular problem at hand. When we talk about a language, we think in terms of parsers - and it is no different when we think about DSLs. Implementing an external DSL is hard, considering the enormous complexity that parser generators make you go through. Scala offers monadic parser combinators with which you can map your EBNF syntactic structures directly into implementations. And all this is done through syntactic sugars on top of the closures and higher order functions that the language offers.
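
As a quick taste of how close that mapping is - a hedged sketch using the standard scala.util.parsing.combinator library, with an invented toy grammar: the EBNF rule expr ::= number { "+" number } translates almost symbol for symbol:


import scala.util.parsing.combinator.JavaTokenParsers

object SumParser extends JavaTokenParsers {
  // expr ::= number { "+" number }
  def expr: Parser[Int] =
    wholeNumber ~ rep("+" ~> wholeNumber) ^^ {
      case first ~ rest => (first :: rest).map(_.toInt).sum
    }
}

// SumParser.parseAll(SumParser.expr, "1 + 2 + 3").get evaluates to 6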

Higher Order Functions - The Secret Sauce ?

There have been lots of debates on whether object-oriented interfaces scale better than syntax extension capabilities in language design. While OO certainly has its place in modularizing components and abstracting away the relationships between them, there are situations when objects force us to fit a round peg into a square hole. How many times have you cursed Java for forcing you to define an unnecessary interface just to apply a function over a set of abstractions defining a specific set of contracts ? You can do the same in Scala using structural typing (aka anonymous types) and higher order functions. Higher order functions seem to be the secret sauce for offering syntax extensibility in programming languages.
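
A hedged illustration with invented names - a higher order function whose parameter type is structural, so completely unrelated classes qualify without any common interface:


object Payroll {
  // two unrelated classes - no interface binds them together
  class Employee(val name: String)   { def monthlyCost: Double = 5000.0 }
  class Contractor(val name: String) { def monthlyCost: Double = 8000.0 }

  // structural type: anything with a monthlyCost method fits
  def totalCost(staff: List[{ def monthlyCost: Double }]): Double =
    staff.map(_.monthlyCost).sum

  def main(args: Array[String]) = {
    println(totalCost(List(new Employee("joe"), new Contractor("jane"))))
  }
}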

Monday, April 14, 2008

External DSLs made easy with Scala Parser Combinators

External DSLs are hard, since implementing them involves reinventing most of the mechanisms found in a general purpose language. Designing internal DSLs is equally hard, more so in a statically typed language. Dynamically typed languages like Ruby offer strong meta-programming facilities, which help in implementing internal DSLs. But metaprogramming in Ruby is still considered elitist by many, and is not an art mastered by programmers at large.

Parser combinators offer a unique value here. They allow programmers to write executable grammars, in the sense that designing and implementing a DSL is almost equivalent to writing the EBNF notations in the syntax of the native language. So what really are parser combinators and what kind of language support do we need to implement parser combinator libraries ? Here is how Gilad Bracha describes them ..
The basic idea is to view the operators of BNF (or regular expressions for that matter) as methods that operate on objects representing productions of a grammar. Each such object is a parser that accepts the language specified by a particular production. The results of the method invocations are also such parsers. The operations are called combinators for rather esoteric technical reasons (and to intimidate unwashed illiterates wherever they might lurk).

Combinators have their theoretical underpinnings in functional programming. A parser combinator is a higher order function that accepts parsers and applies transformation functions to generate more complex parsers. Hence parser combinators are easily implemented in languages that have strong support for functional programming. In Scala, parsers are implemented as monads - defining combinators for parsers then amounts to monadic transformations implementing sequencing, alternation and other composition operations.
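
To make the idea concrete, here is a deliberately simplified conceptual sketch (not the actual encoding used by the Scala library) of parsers as functions and combinators as higher order functions over them:


object TinyCombinators {
  // a parser consumes a prefix of the input and yields a result
  // plus the remaining input; None signals failure
  type Parser[A] = String => Option[(A, String)]

  // primitive parser recognizing a literal string
  def literal(s: String): Parser[String] =
    in => if (in.startsWith(s)) Some((s, in.substring(s.length))) else None

  // sequencing combinator: builds a new parser out of two existing ones;
  // note the monadic flavor of threading the rest of the input via flatMap
  def seq[A, B](p: Parser[A], q: Parser[B]): Parser[(A, B)] =
    in => p(in).flatMap { case (a, rest) =>
      q(rest).map { case (b, rest2) => ((a, b), rest2) }
    }

  // alternation combinator: try p, fall back to q on failure
  def alt[A](p: Parser[A], q: Parser[A]): Parser[A] =
    in => p(in).orElse(q(in))
}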

This post is not introductory material on parser combinators or their implementation in Scala. Here I would like to narrate my experience in designing an external DSL for a financial application using the parser combinator library of Scala. In one of my earlier posts, I talked about monads as abstractions for containers and computations. The parser combinator implementation in Scala is a great example of the power of monads in evolving a DSL.

In developing a financial application involving securities trade and settlement processing, we have been using XML as the DSL for getting buy/sell orders from clients, trade/execution information from the exchange, and settlement information from clearing agencies. Needless to say, XML processing is one of the key functions that pervade our codebase. In one of my very early posts, I ranted about executable XMLs (aka Lisp) and considered using Scheme as the DSL for securities trade processing operations. SISC offers a fully compliant Scheme implementation on top of the JVM, yet all those ideas fell through the cracks of enterprise decision-making deliberations. After a fairly long time, I have found another alternative - DSLs designed using Scala parser combinators ..

  • easy to implement

  • as concise as your EBNF productions

  • algebraic data types to generate ASTs and

  • powerful pattern matching techniques to inspect them.


Add to that the fact that I can have the entire stack running on the JVM, with Java objects still running the show at the backend. This implies that I do not have to reimplement my current Java application. I can just plug in the DSL and have the parser cook up my Java objects at the AST level. And that is exactly what I plan to do here.

Here is a sample DSL (simplified for brevity) for accepting client orders to buy/sell equities ..

(buy 100 IBM shares at max 45, sell 50 CISCO shares at min 25, buy 100 Google shares at max 800) for trading account "SSS1234"

The equivalent XML would be too verbose, too painful for the eyes, and would definitely need extraneous infrastructure, over and above native language support, for meaningful processing.

And here is the Scala parser for recognizing the DSL ..


import scala.util.parsing.combinator.syntactical._
object OrderDSL extends StandardTokenParsers {
  lexical.delimiters ++= List("(", ")", ",")
  lexical.reserved += ("buy", "sell", "shares", "at", "max", "min", "for", "trading", "account")

  def instr = trans ~ account_spec

  def trans = "(" ~> repsep(trans_spec, ",") <~ ")"

  def trans_spec = buy_sell ~ buy_sell_instr

  def account_spec = "for" ~> "trading" ~> "account" ~> stringLit

  def buy_sell = ("buy" | "sell")

  def buy_sell_instr = security_spec ~ price_spec

  def security_spec = numericLit ~ ident ~ "shares"

  def price_spec = "at" ~ ("min" | "max") ~ numericLit
}



This is really all that I need to parse my DSL. Really. And the most interesting part is that the methods above have almost a one-to-one correspondence to the EBNF production rules, as I would write them in a natural language. All the heavy lifting of lexical analysis and parsing is taken care of by the Scala parser combinator library.

The combinators used in the above example look like operators, though they are actually Scala methods. Every combinator method works on a portion of the input, parses it, and may optionally pass on the remaining part to the next combinator in the chain. For example, the sequencing combinator ~ composes two parsers sequentially. In the first production above, trans ~ account_spec succeeds only if trans succeeds and account_spec then succeeds on the portion of the input left over by trans. The final result is another parser, on which an optional function application combinator (^^) can work, applying a function to the result of the sequencing combinator.
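
Here is a tiny self-contained illustration of the two combinators just mentioned, on an invented one-rule grammar (the real productions of the order DSL get the same treatment in the full listing below):


import scala.util.parsing.combinator.syntactical._

object SeqDemo extends StandardTokenParsers {
  lexical.reserved += ("buy")

  // ~ sequences the three parsers; ^^ maps the composite result to a tuple
  def order: Parser[(String, Int)] =
    "buy" ~ ident ~ numericLit ^^ { case _ ~ sym ~ qty => (sym, qty.toInt) }
}

// SeqDemo.order(new SeqDemo.lexical.Scanner("buy IBM 100")) succeeds with ("IBM", 100)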

Once I have the basic parsing productions defined in Scala, I can work towards building my abstract syntax tree (AST), which will accumulate the necessary parsed information and provide me with the model of the abstraction that the DSL embodies. This model depends on how I would like to process the abstraction defined by the language, and may vary based on the requirements of the backend system which receives the AST. In the above example, I may like to abstract the client order details into a POJO and pass it on to the database layer for persistence. Alternatively, I may like to pass the AST on to a pretty printer function, which generates HTML confirming the client order. Hence, it is always better if we can decouple the two concerns - recognizing the language and processing the information to generate the AST. Gilad Bracha talks about similar decoupling in Newspeak using a combination of closures and inheritance. But Newspeak is a dynamic language, and I am not sure if this decoupling can be achieved in a statically typed language like Scala.

Hence, in Scala, it does not appear possible to ensure that multiple backend systems share the same grammar instance while working on different models of the AST. What Scala does offer is a way to plug function application into the above production rules, to be executed on successful parsing of a rule - the function application combinator (^^) provided by the library lets you hook in the processing code that evolves the AST.

Depending on what processing I would like to do with the AST, I can choose an appropriate data structure. If I choose to perform heavy recursive traversals, tree manipulations and node annotations, the AST can be modeled as Scala algebraic data types, which can then be inspected using pattern matching techniques. In the target application in which I propose to use this, the backend contains the POJO-based domain model, and I would like to generate domain objects from the AST to be used for transparent persistence in the data layer. Hence I choose to map the AST onto my domain model for processing client orders.
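
Just to show the other option before moving on to the POJO route - a hedged sketch of an AST modeled as Scala algebraic data types and inspected with pattern matching (all names invented):


// the AST as algebraic data types
sealed abstract class Side
case object Buy extends Side
case object Sell extends Side

case class Item(security: String, quantity: Int, side: Side, price: Int)
case class Order(account: String, items: List[Item])

// pattern matching traversal: total value of the buy-side items
def buyValue(order: Order): Int =
  order.items.collect { case Item(_, qty, Buy, price) => qty * price }.sum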

Here are some Java classes (simplified for brevity) for abstracting a client order ..


// ClientOrder.java
public class ClientOrder {
  public enum BuySell {
    BUY,
    SELL
  }
  private String accountNo;

  private List<LineItem> lineItems = new ArrayList<LineItem>();

  // constructors, getters ..
}

// LineItem.java
public class LineItem {
  private final String security;
  private final int quantity;
  private final ClientOrder.BuySell bs;
  private final int price;

  public LineItem(String security, int quantity, ClientOrder.BuySell bs, int price) {
    this.security = security;
    this.quantity = quantity;
    this.bs = bs;
    this.price = price;
  }
  //..
  //..
}



I can plug function application combinators into the production rules and have my AST model a ClientOrder. Remember, I am plugging the DSL into a system based on POJOs - hence I need to do some conversion between Java and Scala collections. But the final AST is a Java object, to be consumed by the existing backend that does the client order processing.


import scala.util.parsing.combinator.syntactical._
import org.dg.biz.{ClientOrder, LineItem}
object OrderDSL extends StandardTokenParsers {

  // the backend consumes java.util.List - convert from the Scala List
  def scala2JavaList(sl: List[LineItem]): java.util.List[LineItem] = {
    val jl = new java.util.ArrayList[LineItem]()
    sl.foreach(jl.add(_))
    jl
  }

  lexical.delimiters ++= List("(", ")", ",")
  lexical.reserved += ("buy", "sell", "shares", "at", "max", "min", "for", "trading", "account")

  def instr: Parser[ClientOrder] =
    trans ~ account_spec ^^ { case t ~ a => new ClientOrder(scala2JavaList(t), a) }

  def trans: Parser[List[LineItem]] =
    "(" ~> repsep(trans_spec, ",") <~ ")" ^^ { (ts: List[LineItem]) => ts }

  def trans_spec: Parser[LineItem] =
    buy_sell ~ buy_sell_instr ^^ { case bs ~ bsi => new LineItem(bsi._1._2, bsi._1._1, bs, bsi._2) }

  def account_spec =
    "for" ~> "trading" ~> "account" ~> stringLit ^^ {case s => s}

  def buy_sell: Parser[ClientOrder.BuySell] =
    ("buy" | "sell") ^^ { case "buy" => ClientOrder.BuySell.BUY
                          case "sell" => ClientOrder.BuySell.SELL }

  def buy_sell_instr: Parser[((Int, String), Int)] =
    security_spec ~ price_spec ^^ { case s ~ p => (s, p) }

  def security_spec: Parser[(Int, String)] =
    numericLit ~ ident ~ "shares" ^^ { case n ~ a ~ "shares" => (n.toInt, a) }

  def price_spec: Parser[Int] =
    "at" ~ ("min" | "max") ~ numericLit ^^ { case "at" ~ s ~ n => n.toInt }
}



Here is a function within OrderDSL that uses the AST model ..


def doMatch() {
  val dsl =
    "(buy 100 IBM shares at max 45, sell 40 Sun shares at min 24,buy 25 CISCO shares at max 56) for trading account \"A1234\""

  instr(new lexical.Scanner(dsl)) match {
    case Success(ord, _) => processOrder(ord) // ord is a ClientOrder
    case Failure(msg, _) => println(msg)
    case Error(msg, _) => println(msg)
  }
}
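
processOrder above is the existing backend entry point that consumes the ClientOrder. For completeness, a hypothetical stub (the accessor name is invented here, since the POJO's getters were elided above):


// hypothetical stub - the real version hands the order to the backend
def processOrder(order: ClientOrder) {
  println("processing order for account " + order.getAccountNo())
}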



The basic idea of polyglotism is to harness the power of multiple languages in their respective areas of strength. Languages like Scala, despite being statically typed, offer lots of flexibility and conciseness. Offering the strong features of both the OO and functional paradigms, Scala shines in providing parser combinator libraries straight out of the box. The above example shows how easy it is to get a DSL working if we use the power of combinators. And the best part is that you can still use your existing Java objects to do the heavy backend lifting - truly, it is the single platform of the JVM that unifies the diversity of multiple programming languages.