Beginning Functional Programming

Published on February 15, 2018 by Vasily Vasinov

Functional programming has been getting more and more accepted by the wider community of software engineers. Lots of mainstream languages implement ideas that were considered esoteric only 15 years ago. Many engineers say that they learned more about computer science and computer systems after being introduced to the world of functional programming. Many also claim that it’s more fun to build software with functional tools.

The reason functional programming forces you to learn so much is because it challenges your every assumption about building software.

In this lesson we’ll cover some functional programming concepts that are considered the low-hanging fruit of functional programming and that are somewhat easy to grasp. My goal here is to spark interest in engineers that are already quite experienced in languages like Java, Ruby, and Python. Don’t worry if you don’t understand some of the Scala code that we’ll use throughout this lesson. Its main purpose is to demonstrate functional programming concepts in broad strokes.

Immutable State

First functional programming experiences are pretty similar: engineers start working on a project with the assumption that Scala (or other language with advanced functional capabilities) is like Java but it has a few cool features and “Rubyisms.” It’s pretty natural to have this assumption initially even though it’s wrong. Code written by functional rookies has mutable state all over the place and there is usually a lack of understanding what immutable data structures are good for. “How do I modify values in a list?” “How do I modify a map?” “How do I keep state in loops?” Those are some pretty common early questions.

To demonstrate the benefits of immutability let’s look at two versions of the same program: one in Java and one in Scala.

Immutability

Immutability is the property of objects describing their inability to be changed after they are created. In other words, once an object is initialized no outside component can change it.

The following imperative Java program filters a list of users by tht active flag, sorts them by ID, and then pulls last names from sorted objects into a list:

public class User {
  private final int id;
  private final String firstName;
  private final String lastName;
  private final Boolean active;
    
  // omitting constructors, getters, and setters for brevity...
}

public static List<String> activeById(List<User> us) {
   List<User> users = new ArrayList<User>();
   
   for (User u: us) {
     if (u.getActive()) users.add(u);
   }
   
   Collections.sort(users, new Comparator<User>() {
      public int compare(User a, User b) {
        return a.getId() - b.getId();
      } 
   });
   
   List<String> finalUsers = new ArrayList<String>();
   
   for (User u: users) {
     finalUsers.add(u.getLastname());
   }
   
   return finalUsers;
}

List<User> inputUsers = new ArrayList<User>();

inputUsers.add(new User(11, "Nick", "Smith", false));
inputUsers.add(new User(89, "Ken", "Pratt", true));
inputUsers.add(new User(23, "Jack", "Sparrow", true));

List<User> activeUsersById = activeById(inputUsers)

This is pretty typical pre-Java 8 code: each collection is mutated by a set of explicit instructions. The whole program is somewhat verbose: every line of activeById describes what we want to do with the data as opposed to how the data should flow from start to finish. Here is what the same program looks like when written functionally:

case class User(id: Int,
                firstname: String,
                lastname: String,
                active: Boolean)

def activeById(us: Seq[User]) =
  us.filter(_.active).sortBy(_.id).map(_.lastname)

val activeUsersById = activeById(Seq(
  User(11, "Nick", "Smith", false),
  User(89, "Ken", "Pratt", true),
  User(23, "Jack", "Sparrow", true)
))

This looks a lot cleaner than the imperative version largely because there is no state that we have to keep track of. ActiveById accepts one argument and then passes it through chained functions that are part of the standard collection library. It’s important to note that filter, sortBy, and map are not arbitrary. They are very well defined and studied by the functional programming practitioners.

A Rubyist would immediately notice that functional Scala code is similar to what she would write in Ruby. The code might look similar but the underlying mechanism of immutability is very different. The problem with Ruby is that it doesn’t enforce immutability. Every variable and data structure can potentially be mutated, which makes Ruby not as trustworthy. In Scala vals (read-only variables) and immutable collections are actually immutable from the programmer’s perspective.

What does immutability bring to the table? From the practical standpoint, we end up with cleaner code, a smaller margin for error (we always know what’s in our collections and read-only variables), and richer abstractions. Another benefit of immutability is that we can write concurrent programs without worrying about mutexes and semaphores.

Functions

Most of functional programming is about using functions—not a huge surprise. There are different kinds of functions and techniques of functional composition that we can use.

Pure Functions

Pure functions are one of the main pillars of functional programming. A pure function is a function that depends only on its input parameters. It also must only return a specific result without mutating the outside state. sin(x: Double) or md5(s: String) are great examples of pure functions that only depend on their input parameters and always return an expected result since they don’t rely on anything from the outside. This makes pure functions easily testable and less bug prone.

Not all abstractions can be directly implemented with pure functions. Think about input/output operations, logging, DB reading and writing, etc. In functional programming, there are models and abstractions that allow us to deal with those impure abstractions in a pure way, which results in cleaner and more composable code.

Functions in Scala are first-class citizens. It means that they are not just class methods that can be declared and called; they are separate typed entities that can be passed to other functions.

Functions can also return other functions. We can store them in variables or data structures. We can also work with them in the literal form without naming them:

val ints = Seq(1, 2, 3, 4, 5)

ints.filter(n => n % 2 == 0)

n => n % 2 == 0, or its full form (n) => { n % 2 == 0 }, is a function literal without a name that checks whether a number is even. We can pass it around as another function argument or use it as a return value.

Functions can be nested inside other functions. It’s a useful feature if we need to implement recursive calls with “subroutines” that we don’t want to put in the higher level scope.

def toList: List[A] = {
  @annotation.tailrec
  def go(s: Stream[A], l: List[A]): List[A] = s match {
    case Empty => l
    case Cons(h, t) => go(t(), h() +: l)
  }

  go(this, List[A]()).reverse
}

In this example we don’t want go to be in the same scope as toList because it’s too specific to its “parent” function and there could be other functions at the same level as toList that have its own functions named go.

Currying and Partial Function Application

Currying and partial function application are pure math concepts that can be practically applied in functional languages. Both approaches allow us to store partially invoked functions in variables or pass them to other functions. Let’s take a closer look at partial function application:

sealed trait Resource
case class User(id: Int) extends Resource
case class Record()
case class FormattedRecord()

def loadRecordsFor(r: Resource,
                   since: Long): List[Record] = ???

def formatRecords(f: Long => List[Record],
                  since: Long): List[FormattedRecord] = ???

val userRecordsLoader = loadRecordsFor(User(36), _: Long)

formatRecords(userRecordsLoader, System.currentTimeMillis - 86400000)

In this example we have two generic functions: loadRecordsFor and formatRecords. We can partially apply loadRecordsFor for some user and save it to userRecordsLoader. Then, in another part of the program, we can call formatRecords with userRecordsLoader as first parameter, since userRecordsLoader matches formatRecords’s signature perfectly. This kind of function composition comes in handy in a lot of situations and makes our code less rigid.

Currying is similar to partial function application. It’s the process of decomposing a function of multiple arguments into a chained sequence of functions of one argument. In Scala, we use multiple parameter lists to implement currying:

def line(a: Int, b: Int, x: Int): Int = a * x + b

def curriedLine(a: Int)(b: Int)(x: Int): Int = line(a, b, x)

def defaultLine(x: Int): Int = curriedLine(1)(0)(x)

defaultLine(5)
// returns `5`

The curriedLine method does all the currying work here. Its signature is Int => (Int => (Int => Int)). This means that curriedLine takes an integer as a parameter and returns another function that takes an integer as a parameter…and so on.

You can learn more about those useful functional concepts in another lesson.

Options and Pattern Matching

The Option data type is an abstraction that represents optional values. It might seem like it’s nothing special but in everyday work it’s an extremely powerful mechanism to represent null, empty, or corrupt object and variable values.

Options are containers that either contain a value of a certain type, represented by Some[T], or contain nothing, which is represented as None. You can also think of it as a list that’s either empty or has one value. When applied throughout the whole program this powerful abstraction allows us to eliminate countless edge cases that result in null pointer exceptions or type incompatibilities when null values get extracted.

Consider the following example:

case class Project(id: String, name: String,
                   priority: Int, description: Option[String])

object ProjectsAccessor {
  find(id: String): Option[Project] = ???
}

val project = ProjectsAccessor.find(123)

We are trying to retrieve a project record from the database but we don’t know if the project with a specific ID exists. Instead of returning null or throwing an exception, we are either going to return Some[Project] or None as defined by the Option[Project] return type of the find method.

Container types like that allow us to use a powerful tool called pattern matching. Pattern matching is a way to process data based on its structure. For example, if we wanted to process the result of the find method from the example above and extract the name of the project we’d do something like this:

ProjectsAccessor.find(123) match {
  case Some(p) => p.name
  case None => ""
}

Here we are matching the result of find to see if the project exists. If it exists then we return its name, otherwise we return an empty string. At first, it might look like a switch-case structure from Java but it’s very different. With pattern matching we can add custom logic to our patterns:

ProjectsAccessor.find(123) match {
  case Some(p) if 1 until 5 contains p.priority => p.name
  case Some(p) if name == "Default Project" => p.name
  case Some(p) => None
  case None => ""
}

We can also match our result directly based on the actual structure of the object:

def descriptionWrapper(p: Project) = p match {
  case Project(_, _, _, None) => "No description."
  case Project(id, _, _, Some(d)) => s"Project $id's description: $d"
}

One-Liners and For Comprehensions

One of the greatest things that advanced function composition brings to the table is function chaining. Instead of manually reiterating over the same data collection in a bunch of for loops, we can do it in one elegant expression, or a one-liner:

case class Participant(name: String, score: Int, active: Boolean)

val ps = Seq(Participant("Jack", 34, true),
             Participant("Tom", 51, true),
             Participant("Bob", 90, false))
             
ps.filter(_.score < 50).filter(_.active).map(_.copy(active = false))

// returns List(Participant(Jack, 34, false))

In this one-liner we grabbed all participants whose score is lower than 50 and who are still active; then we changed the active status of selected participants to false. There are many situations where similar one-liners save functional programmers time and dramatically reduce the amount of code.

If a one-liner becomes too dense we can always break it down with a technique called for comprehensions. Let’s rewrite our example from before:

for {
  loser <- ps if loser.score < 50
  activeLoser <- Some(loser) if activeLoser.active
  deactivatedLoser <- Some(activeLoser.copy(active = false))
} yield deactivatedLoser

// returns List(Participant(Jack, 34, false))

This is more verbose than a one-liner but in cases where logic becomes too dense, this can really help code readability, yet keep all the benefits of immutability and function chaining.

Type System

Coming from the world of Ruby programming strict typing in Scala can feel like a burden at the outset. After exploring its benefits most tend to change their minds. Some functional languages have extremely sophisticated type systems with properties that imperative programmers would never use. Those type systems allow for flexible and composable programs. Let’s go over some of the properties of functional programming language type systems.

Type Inference

Type inference is an ability of a programming language to deduce types of an expression without explicit type annotations. Scala is not great at type inference in certain cases and sometimes we have to hold its hand and give it hints about what types to use. Let’s look at a real example:

// We always have to specify types in method signatures:
def nameStartsWith(ns: Seq[String], n: String): Seq[String] =
  // Scala can't infer types for "generic" collections,
  // so we can't just say `Seq.empty`:
  ns.foldLeft(Seq.empty[String]) {
    // But it doesn't need types in this anonymous function:
    (l, r) => if(r.startsWith(n)) r +: l else l
  }
  
// Type inference works really well with list declarations:
val names = Seq("Bob", "Alice", "Ann")

nameStartsWith(names, "A")

// returns List(Ann, Alice)

This example demonstrates both sides of type inference in Scala: we still have to explicitly define some types but in other cases, like when we pass a function (l, r) => ..., types don’t have to be annotated. In purely functional languages, like Haskell, we hardly ever have to specify types. The compiler is smart enough to infer them.

Type Bounds

Type bounds is another important concept in functional programming. It means that we can support class hierarchies in generic type declarations. In Java, we can use generics in order to define types during runtime and still keep our code type safe. For example, to define a generic list of elements we’d use this interface in Java:

public interface MyList<T>

If we want to define a list of, say, maps, but we don’t know what the implementation of those maps is, we’d use an upper bound for a type in our generic:

public interface MyList<T extends Map>

Now we can use this list to fill it with Hashtable, LinkedHashMap, HashMap, and TreeMap, or in other words, all default descendants of the Map interface. If there are children inheriting from Map, they can be valid elements of MyList as well. No other type can be used because of the type bound. Here is an example of using an upper bound in Scala:

def convertToInts[T <: AnyVal](es: Seq[T], f: T => Int): Seq[Int] = {
  es.map(f(_))
}

AnyVal is a parent class of Double, Float, Int, and many others. In this function we simply define that we want T to be a child of AnyVal for type safety.

On top of defining upper bounds we can define lower bounds, like [T >: Int] that would match Int’s parents. We can also combine type bounds for different generics in the same function signature: [T >: Int <: Any].

Type Variance

One other important property of an advanced type system is type variance. In Java, if we have class List<T> then List<Object> and List<String> will be unrelated or invariant even though Object and String are directly related. Arrays are covariant, meaning that String[] is a subtype of Object[]. Since arrays are mutable, we can end up with ArrayStoreException runtime exceptions if values of different types are stored and retrieved incorrectly. In Scala, arrays are invariant by default and immutable collections (or container types) are covariant, which in Scala syntax is defined as [+A]. Since collections are immutable all of our potential type errors will be discovered at compile time as opposed to run time. We can also define a container as contravariant: [-A]. Contravariance means that a container with a parent type is a subtype of a container with a child type. Here is how it all works in real code:

case class InvariantContainer[A]()
case class CovariantContainer[+A]()
case class ContravariantContainer[-A]()

class Person
class User extends Person
class Admin extends User

// works
val inv1: InvariantContainer[User] = InvariantContainer[User]

// doesn't work
val inv2: InvariantContainer[User] = InvariantContainer[Admin]

// works
val cov1: CovariantContainer[User] = CovariantContainer[User]

// works
val cov2: CovariantContainer[User] = CovariantContainer[Admin] 

// works
val con1: ContravariantContainer[User] = ContravariantContainer[User] 

// doesn't work
val con2: ContravariantContainer[User] = ContravariantContainer[Admin] 

// works
val con3: ContravariantContainer[User] = ContravariantContainer[Person]

Covariance and cotravariance are widely used in collection implementations and function type trickery.

Lazy Evaluation and Infinite Data Structures

The concept of lazy evaluation doesn’t directly exist in non-functional languages but it’s pretty easy to grasp. Think of a typical if statement:

def expensiveOperation() = ???
val a = "foo"
val b = "foo"

if ((a == b) || expensiveOperation()) true else false

In most imperative languages the || operator evaluates its arguments (a == b) and expensiveOperation() lazily meaning that expensiveOperation() doesn’t get executed if (a == b) is true. It’s only executed if (a == b) returns false. Lazy evaluation in Scala allows us to define similar behavior in more contexts.

We can define our variables to be lazy, meaning that they don’t get executed until they are accessed for the first time as opposed to normal variables that are executed when they are defined. Once a lazy variable is executed its value is cached.

case class Order(name: String, price: Int)

case class User(id: Int, orders: Seq[Order]) {
  lazy val cheapOrders: Seq[Order] = orders.filter(_.price < 50)
  lazy val expensiveOrders: Seq[Order] = orders.filter(_.price >= 50)
}

In this example we have a case class for the user abstraction that contains a list of shopping orders. CheapOrders and expensiveOrders don’t get evaluated during case class initialization like a normal val would. They would only get executed if we call them directly. Why not use a method instead? The problem is that if we have an expensive computation or a DB call to make, calling a method multiple times will execute it multiple times. Lazy variables cache the return value once they are called, which makes for very effective optimizations in certain scenarios.

Another example of delayed execution are by-name function parameters. Normally, function parameters get executed right when they are passed. However, in some cases that involve DB access or heavy computations we don’t want to execute function parameters until absolutely necessary:

sealed trait Person
case class User() extends Person
case class Admin() extends Person

def loadAdminsOrUsers(needAdmins: => Boolean, loadAdmins: => Seq[Admin],
                      loadUsers: => Seq[User]): Seq[Person] = {
  if (needAdmins) loadAdmins
  else loadUsers
}

Here we have three by-name parameters with potentially expensive DB operations. We don’t want all of them to be executed, so we can’t pass them by value as we normally would. The arrow symbol => means that we are passing the function itself as opposed to the return value of the function. Now we can call it whenever we need to.

Laziness and by-name parameters are used to implement one of the most powerful constructs in functional programming: infinite data structures. In imperative programming, all data structures have a predefined size, which works fine in most cases but sometimes we don’t know what the size of the data structure is until the very end. With delayed execution it becomes possible to define our data structures in general form without “filling them up” with data until we actually have to do it.

All of this sounds great in theory but how does it actually work? Let’s use an infinite data structure, called stream, for generating prime numbers. In Java, in order to generate primes, we would write a function that would generate prime numbers up to some limit. Then we would call this function to generate a list of N prime numbers and pass the result elsewhere. If we need a different list of primes (say, prime numbers up to N + M), we’d have to recalculate our list from scratch. In Scala, we would do it differently:

val primes: Stream[Int] = {
  def generatePrimes (s: Stream[Int]): Stream[Int] =
    s.head #:: generatePrimes(s.tail filter (_ % s.head != 0))
    
  generatePrimes(Stream.from(2))
}

This syntax probably doesn’t make much sense if you never worked with Scala but it’s not important in this case. What’s important is what we can do with this infinite data structure (not needing an implicit lower or upper bound). Say, we want to get the first five prime numbers that are greater than 100. It’s a piece of cake with our implementation:

primes.filter(_ > 100).take(5).toList

This chain of functions will return List(101, 103, 107, 109, 113) as expected. We can pass primes around to functions anywhere in our program without it being executed. We can also chain any actions on top of it (like filter in the example above) and pass the expression around, again, without generating any actual results until we need them. This allows us to come up with composable abstractions and to mold programs like play-doh.


Join Discussion

Subscribe to our Grokked Weekly newsletter to receive excellent engineering articles and the latest lessons from Grok Academy.