Scala Talk - Part 1

A Brief Introduction to a Few Scala Features

I gave a brief introduction to the Scala programming language at the New York City Java Meetup on October 20, 2008. I followed up with a second part to my presentation at the December 15, 2008 meeting. Notes to that talk are here.

Scala is a hybrid functional/object-oriented language that runs on the JVM. Because it is statically typed and compiles to Java bytecode, its performance is on a level with Java's. My intention was to show a few of Scala's features that might be immediately attractive to Java programmers, and to show that the language is not as strange and off-putting as it might seem at first glance.

I gave the entire presentation using the Eclipse IDE with the Scala plugin. The plugin is available at http://www.scala-lang.org/node/94. It is a complete development environment for Scala applications, including the compiler and the necessary libraries. While it does not have all the advanced features of the Java development environment (such as refactoring), it has many of the important ones: code completion, syntax highlighting, hover documentation, and so on. It is quite usable and is improving rapidly.

I started off with a small, useless Java program.

package javatest;
import java.util.*;

public class JavaTest {

	static void recurse(int i) {
		if (i % 1000 == 0) {
			System.out.println("hello");
		}
		recurse(i + 1);
	}

	public static void main(String[] args) {
		if (args.length > 0) {
			System.out.println(args[0]);
		}
		List<String> theList = new ArrayList<String>();
		theList.add("one");
		theList.add("two");
		theList.add("three");
		Iterator<String> i = theList.iterator();
		while ( i.hasNext()) {
			System.out.println(i.next());
		}
		recurse(0);
	}
}

Actually, less than useless, since obviously the infinitely recursive method will quickly use up the entire stack and throw an exception:

Exception in thread "main" java.lang.StackOverflowError
	at javatest.JavaTest.recurse(JavaTest.java:10)

Next is a very literal translation of the Java program into Scala. It is extremely awkward Scala code, but it illustrates the similarities of using the two languages, and it shows how Java libraries are directly usable from Scala.

package scalatest;
import java.util._;

object ScalaTest {
  
  def recurse(i: Int): Unit = {
    if (i % 1000 == 0) {
      System.out.println("hello");
    }
    recurse(i + 1);
  }

  def main(args: Array[String]): Unit = {
    if (args.length > 0) {
      System.out.println(args(0));
    }
    var theList: List[String] = new ArrayList[String]();
    theList.add("one");
    theList.add("two");
    theList.add("three");
    val i: Iterator[String] = theList.iterator();
    while ( i.hasNext()) {
      System.out.println(i.next());
    }
    recurse(0);
  }
}

The line-by-line correspondence makes the similarity obvious. There are definite syntactical differences but really only one major conceptual difference between the two. Scala does not have statics. Scala does have the "class" keyword, and its meaning is fairly similar to that of Java. But here we see it replaced with "object". An object is something like a singleton that does not have to be explicitly instantiated, and so its methods here, recurse and main, are pretty much equivalent to static methods.
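
As a minimal sketch of that idea (Greeter, greet and GreeterDemo are names made up for this illustration, not code from the talk), a method defined in an object is called on the object's name, much as a Java static method is called on its class name:

```scala
// Hypothetical example: Greeter and greet are illustrative names.
object Greeter {
  // Called as Greeter.greet(...); there is no `new Greeter` anywhere.
  def greet(name: String): String = "hello, " + name
}

object GreeterDemo {
  def main(args: Array[String]): Unit = {
    // Reads like a call to a Java static method.
    println(Greeter.greet("world"))
  }
}
```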

Starting at the top, let's examine the syntactical differences. The package declaration looks the same. Good. But in the next line, in the import statement, we see that the * wildcard is replaced with _. Then the "object" keyword we talked about, and now we get serious.

In the recurse method, we note that "def" indicates the start of the definition of a method or function. We see that the method's single parameter is surrounded by parentheses, as it is in Java. But inside the parens we see that the argument's name, i, comes before its type, Int, the opposite of Java. And they are separated by a colon. And once again unlike Java, the method's return type comes last, and is preceded by a colon. "Unit" is the equivalent of "void".

The recurse method body is identical to the Java version. It is surrounded by curlies, and the only difference is that the Scala version has an = before the opening curly.

Moving down into the main method (the application starting point, just like Java), notice that when the first element of the args array is referenced, it is indexed into using parens instead of square brackets; System.out.println(args(0)); instead of System.out.println(args[0]);. When creating the list and its iterator, we see the same reverse syntax mentioned before; the name comes before the type. The list is made a var (mutable) and the iterator is a val (immutable). In functional programming, mutable objects are avoided whenever possible, but we won't worry about that in this introduction.

And for the final syntax difference, notice that the generic types are parameterized using square brackets rather than angle brackets (ArrayList[String] instead of ArrayList<String>).

Armed with little more than this tiny bit of knowledge, a Java programmer could begin writing working Scala code with a heavy Java accent. Notice how we just imported and used the Java ArrayList right out of the box?

So now let's move on to a few of the features that prompt some people to call Scala a "better Java". For one thing, if you run the Scala version of this little application, it does not overflow the stack; it says "hello" until you terminate it. Because the recursive call is the last thing recurse does (a tail call), the compiler can apply tail-call optimization, turning the recursion into what is essentially a loop. Another instantly comprehensible improvement is scoped imports. We can remove the import statement from the top and move it into the main method, the only place it is needed:

  def main(args: Array[String]): Unit = {
    if (args.length > 0) {
      System.out.println(args(0));
    }
    import java.util._;
    var theList: List[String] = new ArrayList[String]();
    theList.add("one");
    theList.add("two");
    theList.add("three");
    val i: Iterator[String] = theList.iterator();
    while ( i.hasNext()) {
      System.out.println(i.next());
    }
    recurse(0);
  }

It's pretty hard to think of any downside to that feature. But what about some of those syntax differences that might seem to be designed just to make a Java programmer's life more difficult? For example, using parentheses to index into an array instead of square brackets. Well, the Scala philosophy is to put as much functionality into libraries as possible rather than into the language proper. Indexing into the Array class is merely calling a function, hence the use of parentheses.
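
To make the indexing point concrete, here is a small sketch. It relies on the fact that Scala rewrites x(i) into x.apply(i), so any class that defines an apply method can be "indexed" the same way an array is. Weekdays is a hypothetical class invented for this example.

```scala
// Hypothetical example class; not part of the Scala library.
class Weekdays {
  private val names = Array("Mon", "Tue", "Wed", "Thu", "Fri")
  // Defining apply is what allows the weekdays(1) syntax below.
  def apply(index: Int): String = names(index)
}

object ApplyDemo {
  def main(args: Array[String]): Unit = {
    val numbers = Array(10, 20, 30)
    println(numbers(1))        // shorthand for the next line
    println(numbers.apply(1))  // the explicit method call

    val weekdays = new Weekdays
    println(weekdays(1))       // calls weekdays.apply(1)
  }
}
```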

As for placing the variable's name before its type - this makes type inference clean and simple. In the following statement:

val theList: List[String] = new ArrayList[String]();

the compiler can easily infer the correct type of theList by what is being assigned to it. Thus, it can be written:

val theList = new ArrayList[String]();
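
One detail worth sketching: the inferred type is the type of the right-hand side, ArrayList[String], not the wider List[String] interface. The example below is only an illustration; the JList alias and the variable names are my own choices, not from the talk.

```scala
import java.util.{ArrayList, List => JList}

object InferenceDemo {
  def main(args: Array[String]): Unit = {
    // Inferred type: ArrayList[String], taken from the right-hand side.
    val inferred = new ArrayList[String]()
    inferred.add("one")

    // An explicit annotation is still allowed when the interface type is wanted.
    val declared: JList[String] = new ArrayList[String]()
    declared.add("two")

    // ensureCapacity exists on ArrayList but not on List, so this line
    // compiles only because `inferred` is statically an ArrayList.
    inferred.ensureCapacity(10)

    println(inferred)
    println(declared)
  }
}
```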

This brings up a difference I've noticed between the Scala best-practice mind-set and that of Java. Scala is conducive to writing very concise code, while the Java best-practice is to be explicit whenever there is the slightest chance of error or misinterpretation. For example, while it is legal Java to omit the curly braces around a single-statement block:

while ( i.hasNext()) System.out.println(i.next());

we are encouraged to write:

while ( i.hasNext()) {
	System.out.println(i.next());
}

In Scala, you can omit the semicolon at the end of a statement when the compiler can determine the end by the context. You can often omit the explicit return type of a function: the value of the function's last expression is its result, and the compiler infers the return type from that. (A recursive method such as recurse is an exception; its return type must be declared.) And you can also usually skip the "return" keyword.

Additionally, the dot separating the object name from the called method name can usually be replaced with a space; e.g., "theObject getVal()" instead of "theObject.getVal()". And when there are no arguments to the method, you can also eliminate the parens: "theObject getVal". If there is a single argument, a space can replace the parens; "theObject setVal 5" instead of "theObject.setVal(5)".

Using these tricks, we can convert this code from the Scala version of the app:

    var theList: List[String] = new ArrayList[String]();
    theList.add("one");
    theList.add("two");
    theList.add("three");
    val i: Iterator[String] = theList.iterator();
    while ( i.hasNext()) {
      System.out.println(i.next());
    }

into this:

    var theList = new ArrayList[String]
    theList add "one"
    theList add "two"
    theList add "three"
    val i = theList iterator()
    while (i hasNext) println(i next)

It looks a little less cluttered, doesn't it? Notice that I couldn't use every feature in every case, but the compiler lets you know when it can't figure out your intention from the context. I took advantage of the Scala Predef object, whose members are imported automatically and which lets you reduce System.out.println() to println(). Scala is serious about reducing clutter.

So far, there has been nothing introduced that would cause the slightest difficulty for a Java programmer. But Scala is also a full-featured functional language, and some of those concepts are quite foreign to a programmer who is only familiar with object-oriented development. In the talk, I attempted to give just a tiny taste of some of the functional features of Scala. I did this by replacing the Java ArrayList with a Scala List.

The Java collections are excellent, and if I had used the new style for loop, iterating through the ArrayList would have been simpler. And as we saw, the Java collections can easily be used in Scala. But Scala has its own collections, which are designed for functional programming. The workhorse is the List. An idiomatic way of building our list would be:

val theList = "one" :: "two" :: "three" :: Nil

which builds the list from right to left: starting with the empty list Nil, each :: prepends the next item, creating a new list whose tail is the previous list (the previous list is shared, not copied). But we'll use the more familiar constructor style:

val theList = List("one","two","three")
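
The two ways of building the list can be compared in a few lines. This is a sketch with names of my own choosing (ConsDemo, tail, full, explicit); it shows that :: is a method called on the list to its right, and that the new list reuses the existing list as its tail rather than copying it.

```scala
object ConsDemo {
  def main(args: Array[String]): Unit = {
    val tail = "two" :: "three" :: Nil
    val full = "one" :: tail

    // The :: operator is a method on the list to its right,
    // so the line above can also be written as an ordinary call.
    val explicit = tail.::("one")

    // Same elements in the same order as the constructor style.
    println(full == List("one", "two", "three"))
    println(full == explicit)

    // Reference equality: the new list's tail is the very same object.
    println(full.tail eq tail)
  }
}
```

All three lines print true.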

Rather than explicitly iterating through the list, we can call List's foreach method, which takes as its argument a function that is applied to each item in the list. That function is a single argument function, the type of the argument being the parameterized type of the list, in this case String (inferred by the compiler). In Scala, functions are objects, so we can create the function object like this:

val aFunction = (s: String) => { println(s) }

and use it like this:

theList.foreach(aFunction)

Or we can directly pass in an anonymous function:

theList.foreach((s: String) => { println(s) })

So there's no need to wait for closures to come to Java! Now we'll apply some of Scala's syntactic sugar to trim some excess code. We can remove the curlies, since the function body is a single statement, and we can use the list at its creation point rather than assigning it to a variable:

List("one","two","three") foreach((s: String) => println(s) )

And since the list's element type is String, the compiler knows that the argument of the function has to be a String:

List("one","two","three") foreach(s => println(s) )

So why even bother with the argument at all?

List("one","two","three") foreach println
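
The same shortening works with a method you write yourself, not just with println. In this sketch, shout is a hypothetical helper named only for the example; the three foreach calls are equivalent, and in the last one the compiler converts the method name into a function value.

```scala
object MethodAsFunctionDemo {
  // Hypothetical helper, named only for this example.
  def shout(s: String): Unit = println(s.toUpperCase + "!")

  def main(args: Array[String]): Unit = {
    val words = List("one", "two", "three")
    words.foreach((s: String) => shout(s)) // fully spelled out
    words.foreach(s => shout(s))           // parameter type inferred
    words.foreach(shout)                   // method name passed directly
  }
}
```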

And finally, the entire, more concise Scala-style version of the app that we ended up with:

package scalatest

object ScalaTest {
  
  def recurse(i: Int): Unit = {
    if (i % 1000 == 0) println("hello")
    recurse(i + 1)
  }

  def main(args: Array[String]) = {
    if (args.length > 0) println(args(0))
    List("one","two","three") foreach println
    recurse(0)
  }
  }
}
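
The recurse method above never returns, so it is hard to see the effect of the optimization that keeps it from overflowing the stack. The following is a terminating sketch of the same idea. It assumes a newer Scala than the one used in the talk: the @tailrec annotation was added in Scala 2.8, and countTo and TailCallDemo are hypothetical names for this illustration.

```scala
import scala.annotation.tailrec

object TailCallDemo {
  // Hypothetical helper: counts from `current` up to `limit`.
  // @tailrec makes the compiler reject the method if the recursive
  // call cannot be turned into a loop.
  @tailrec
  def countTo(limit: Int, current: Int = 0): Int =
    if (current >= limit) current
    else countTo(limit, current + 1) // last action in the method: a tail call

  def main(args: Array[String]): Unit = {
    // A million nested calls would overflow a default-sized stack
    // if each call really pushed a new frame.
    println(countTo(1000000))
  }
}
```

The annotation is optional; the compiler performs the optimization either way. Its value is that a change which accidentally breaks the tail-call shape becomes a compile error instead of a runtime StackOverflowError.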