Learning Concurrent Programming in Scala: Chapter No. 1 "Introduction"
Aleksandar Prokopec
Acknowledgments
First of all, I would like to thank my reviewers Samira Tasharofi, Lukas Rytz, Dominik
Gruntz, Michel Schinz, Zhen Li, and Vladimir Kostyukov for their excellent feedback
and valuable comments. They have shown exceptional dedication and expertise in
improving the quality of this book. I would also like to thank the editors at Packt
Publishing: Kevin Colaco, Sruthi Kutty, Kapil Hemnani, Vaibhav Pawar, and Sebastian
Rodrigues for their help in writing this book. It was really a pleasure to work with
these people.
The concurrency frameworks described in this book wouldn't have seen the light of day
without a collaborative effort of a large number of people. Many individuals have, either
directly or indirectly, contributed to the development of these utilities. These people are
the true heroes of Scala concurrency, and they deserve thanks for Scala's excellent
support for concurrent programming. It is difficult to enumerate all of them here, but I
have tried my best. If somebody feels left out, he should ping me, and he'll probably
appear in the next edition of this book.
It goes without saying that Martin Odersky is to be thanked for creating the Scala
programming language, which was used as a platform for the concurrency frameworks
described in this book. Special thanks go to him, to all the people who were a part of the
Scala team at the EPFL for the last 10 or more years, and to the people at Typesafe, who
are working hard to make Scala one of the best general-purpose languages out there.
Most of the Scala concurrency frameworks rely on the works of Doug Lea in one way or
another. His Fork/Join framework underlies the implementation of the Akka actors, Scala
Parallel collections, and the Futures and Promises library; and many of the JDK
concurrent data structures described in this book are his own implementations. Many of
the Scala concurrency libraries were influenced by his advice. Furthermore, I would like
to thank the Java concurrency experts for the years of work they invested into making
JVM a solid concurrency platform, and especially, Brian Goetz, whose book inspired
our front cover.
The Scala Futures and Promises library was initially designed by Philipp Haller, Heather
Miller, Vojin Jovanović, and myself, from the EPFL; Viktor Klang and Roland Kuhn
from the Akka team; Marius Eriksen from Twitter; with contributions from Havoc
Pennington, Rich Dougherty, Jason Zaugg, Doug Lea, and many others.
Although I was the main author of the Scala Parallel Collections, this library benefited
from the input of many different people, including Phil Bagwell, Martin Odersky, Tiark
Rompf, Doug Lea, and Nathan Bronson. Later on, Dmitry Petrashko and I started
working on an improved version of parallel and standard collection operations, which
were optimized through the use of Scala Macros. Eugene Burmako and Denys Shabalin
are among the main contributors to the Scala Macros project.
The work on the Rx project was started by Erik Meijer, Wes Dyer, and the rest of the Rx
team. Since its original .NET implementation, the Rx framework has been ported to many
different languages, including Java, Scala, Groovy, JavaScript, and PHP, and has gained
widespread adoption, thanks to the contributions and the maintenance work of Ben
Christensen, Samuel Grütter, Shixiong Zhu, Donna Malayeri, and many other people.
Nathan Bronson is one of the main contributors to the ScalaSTM project, whose default
implementation is based on Nathan's CCSTM project. The ScalaSTM API was designed
by the ScalaSTM expert group, which comprised Nathan Bronson, Jonas Bonér, Guy
Korland, Krishna Sankar, Daniel Spiewak, and Peter Veentjer.
The initial Scala actor library was inspired by the Erlang actor model, and developed by
Philipp Haller. This library inspired Jonas Bonér to start the Akka actor framework. The
Akka project had many contributors, including Viktor Klang, Henrik Engström, Peter
Vlugter, Roland Kuhn, Patrik Nordwall, Björn Antonsson, Rich Dougherty, Johannes
Rudolph, Mathias Doenitz, Philipp Haller, and many others.
Finally, I would like to thank the entire Scala community for their contributions, and for
making Scala an awesome programming language.
Learning Concurrent
Programming in Scala
Concurrency is everywhere. With the rise of multicore processors in the consumer
market, the need for concurrent programming has overwhelmed the developer world.
Where it once served to express asynchrony in programs and computer systems, and was
largely an academic discipline, concurrent programming is now a pervasive methodology
in software development. As a result, advanced concurrency frameworks and libraries are
sprouting at an amazing rate. Recent years have witnessed a renaissance in the field of
concurrent computing.
As the level of abstraction grows in modern languages and concurrency frameworks, it
is becoming crucial to know how and when to use them. Having a good grasp of the
classical concurrency and synchronization primitives, such as threads, locks, and
monitors, is no longer sufficient. High-level concurrency frameworks, which solve
many issues of traditional concurrency and are tailored towards specific tasks, are
gradually overtaking the world of concurrent programming.
This book describes high-level concurrent programming in Scala. It presents detailed
explanations of various concurrency topics and covers the basic theory of concurrent
programming. Simultaneously, it describes modern concurrency frameworks, shows
their detailed semantics, and teaches you how to use them. Its goal is to introduce
important concurrency abstractions, and at the same time show how they work in
real code.
We are convinced that, by reading this book, you will gain both a solid theoretical
understanding of concurrent programming, and develop a set of useful practical skills
that are required to write correct and efficient concurrent programs. These skills are
the first steps toward becoming a modern concurrency expert.
We hope that you will have as much fun reading this book as we did writing it.
The goal of this book is not to give a comprehensive overview of every dark corner of the
Scala concurrency APIs. Instead, this book will teach you the most important concepts of
concurrent programming. By the time you are done reading this book, you will not just be
able to find additional information in the online documentation; you will also know what
to look for. Rather than serving as a complete API reference and feeding you the exact
semantics of every method, the purpose of this book is to teach you how to fish. By the
time you are done reading, you will not only understand how different concurrency
libraries work, but you will also know how to think when building a concurrent program.
Introduction
"For over a decade prophets have voiced the contention that the organization of a
single computer has reached its limits and that truly significant advances can be
made only by interconnection of a multiplicity of computers."
Gene Amdahl, 1967
Although the discipline of concurrent programming has a long history, it gained
a lot of traction in recent years with the arrival of multicore processors. The recent
development in computer hardware not only revived some classical concurrency
techniques, but also started a major paradigm shift in concurrent programming. At
a time when concurrency is becoming so important, an understanding of concurrent
programming is an essential skill for every software developer.
This chapter explains the basics of concurrent computing and presents some Scala
preliminaries required for this book.
Concurrent programming
In concurrent programming, we express a program as a set of concurrent
computations that execute during overlapping time intervals and coordinate in some
way. Implementing a concurrent program that functions correctly is usually much
harder than implementing a sequential one. All the pitfalls present in sequential
programming lurk in every concurrent program, but there are many other things
that can go wrong, as we will learn in this book. A natural question arises: why
bother? Can't we just keep writing sequential programs?
At the lowest level, concurrent executions are represented by entities called processes
and threads, covered in Chapter 2, Concurrency on the JVM and the Java Memory Model.
Processes and threads traditionally use entities such as locks and monitors to order
parts of their execution. Establishing an order between the threads ensures that the
memory modifications done by one thread are visible to a thread that executes later.
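As a minimal sketch of the idea above, and not code from this chapter, the following example uses two threads that update a shared counter under a lock. The names LockSketch, increment, and countWithTwoThreads are hypothetical. The synchronized block orders the increments, and the join calls ensure that the final value is visible to the calling thread.

```scala
object LockSketch {
  // Hypothetical names, chosen for illustration only.
  private val lock = new AnyRef
  private var counter = 0

  // Only one thread at a time can execute the body of the synchronized block.
  private def increment(): Unit = lock.synchronized { counter += 1 }

  def countWithTwoThreads(perThread: Int): Int = {
    lock.synchronized { counter = 0 }
    val threads = (0 until 2).map { _ =>
      new Thread {
        override def run(): Unit = for (_ <- 0 until perThread) increment()
      }
    }
    threads.foreach(_.start())
    // Waiting for both threads orders their updates before the final read.
    threads.foreach(_.join())
    lock.synchronized { counter }
  }
}
```

Without the synchronized block, the two threads could overwrite each other's updates, and the result could be smaller than expected.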
Often, expressing concurrent programs using threads and locks is cumbersome.
More complex concurrent facilities have been developed to address this, such as
communication channels, concurrent collections, barriers, countdown latches, and
thread pools. These facilities are designed to more easily express specific concurrent
programming patterns, and some of them are covered in Chapter 3, Traditional
Building Blocks of Concurrency.
Traditional concurrency is relatively low level and prone to various kinds of errors,
such as deadlocks, starvation, data races, and race conditions. You will rarely use
low-level concurrency primitives when writing concurrent Scala programs. Still,
a basic knowledge of low-level concurrent programming will prove invaluable in
understanding high-level concurrency concepts later.
Preliminaries
This book assumes basic familiarity with sequential programming. While we advise
the readers to get acquainted with the Scala programming language, an understanding
of a similar language, such as Java or C#, should be sufficient for reading this book.
A basic familiarity with concepts in object-oriented programming, such as classes,
objects, and interfaces is helpful. Similarly, a basic understanding of functional
programming principles, such as first-class functions, purity, and type-polymorphism,
is beneficial in understanding this book, but is not a strict prerequisite.
We can run this program using the Simple Build Tool (SBT), as described in the
Preface. When a Scala program runs, the JVM runtime allocates the memory required
for the program. Here, we consider two important memory regions: the call stack
and the object heap. The call stack is a region of memory in which the program
stores information about the local variables and parameters of the currently executed
methods. The object heap is a region of memory in which the objects are allocated by
the program. To understand the difference between the two regions, we consider a
simplified scenario of this program's execution.
First, in figure 1, the program allocates an entry to the call stack for the local variable
s. Then, it calls the square method in figure 2 to compute the value for the local
variable s. The program places the value 5 on the call stack, which serves as the
value for the x parameter. It also reserves a stack entry for the return value of the
method. At this point, the program can execute the square method, so it multiplies
the x parameter by itself, and places the return value 25 on the stack in figure 3. This
is shown in the first row in the following illustration:
After the square method returns the result, the result 25 is copied into the stack
entry for the local variable s, as shown in figure 4. Now, the program must create the
string for the println statement. In Scala, strings are represented as object instances
of the String class, so the program allocates a new String object to the object heap,
as illustrated in figure 5. Finally, in figure 6, the program stores the reference to the
newly allocated object into the stack entry x, and calls the println method.
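The listing of the program discussed in this walkthrough is not reproduced in this excerpt. The following is a sketch that is consistent with the description, assuming a SquareOf5 singleton object with a square method, a local variable s, and a println statement; the exact output string is an assumption.

```scala
object SquareOf5 {
  // Multiplies the x parameter by itself; x lives on the call stack.
  def square(x: Int): Int = x * x

  def main(args: Array[String]): Unit = {
    // The return value 25 is copied into the stack entry for s.
    val s = square(5)
    // The interpolated string is a String object allocated on the heap.
    println(s"Result: $s")
  }
}
```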
Although this demonstration is greatly simplified, it shows the basic execution
model for Scala programs. In Chapter 2, Concurrency on the JVM and the Java Memory
Model, we will learn that each thread of execution maintains a separate call stack,
and that threads mainly communicate by modifying the object heap. We will learn
that the disparity between the state of the heap and the local call stack is frequently
responsible for certain kinds of error in concurrent programs.
Having seen an example of how Scala programs are typically executed, we now
proceed to an overview of Scala features that are essential to understand the
contents of this book.
A Scala primer
In this section, we present a short overview of the Scala programming language
features that are used in the examples in this book. This is a quick and cursory glance
through the basics of Scala. Note that this section is not meant to be a complete
introduction to Scala. This is to remind you about some of the language's features,
and contrast them with similar languages that might be familiar to you. If you would
like to learn more about Scala, refer to the books mentioned in the summary
of this chapter.
A Printer class, which takes a greeting parameter, and has two methods named
printMessage and printNumber, is declared as follows:
class Printer(val greeting: String) {
  def printMessage(): Unit = println(greeting + "!")
  def printNumber(x: Int): Unit = {
    println("Number: " + x)
  }
}
In the preceding code, the printMessage method does not take any arguments,
and contains a single println statement. The printNumber method takes a single
argument x of the Int type. Neither method returns a value, which is denoted
by the Unit type. The Unit type can be omitted, in which case it is inferred
automatically by the Scala compiler.
Scala allows the declaration of singleton objects. This is like declaring a class and
instantiating its single instance at the same time. We saw the SquareOf5 singleton
object earlier, which was used to declare a simple Scala program. The following
singleton object, named Test, declares a single Pi field and initializes it with the
value 3.14:
object Test {
  val Pi = 3.14
}
Where classes in similar languages extend entities that are called interfaces, Scala
classes can extend traits. Scala's traits allow declaring both concrete fields and
method implementations. In the following example, we declare the Logging trait
that outputs custom error and warning messages using the abstract log method,
and then mix the trait into the PrintLogging class:
trait Logging {
  def log(s: String): Unit
  def warn(s: String) = log("WARN: " + s)
  def error(s: String) = log("ERROR: " + s)
}
class PrintLogging extends Logging {
  def log(s: String) = println(s)
}
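To illustrate how the concrete warn and error methods delegate to whichever log implementation is mixed in, here is a small sketch. The BufferLogging class is hypothetical and does not appear in the text; it collects messages in a buffer instead of printing them, which makes the effect easy to inspect.

```scala
import scala.collection.mutable.ArrayBuffer

trait Logging {
  def log(s: String): Unit                       // abstract, supplied by the class
  def warn(s: String): Unit = log("WARN: " + s)  // concrete, defined in the trait
  def error(s: String): Unit = log("ERROR: " + s)
}

// Hypothetical implementation, used here for illustration only.
class BufferLogging extends Logging {
  val messages = ArrayBuffer.empty[String]
  def log(s: String): Unit = { messages += s; () }
}

object LoggingDemo {
  def main(args: Array[String]): Unit = {
    val logger = new BufferLogging
    logger.warn("disk almost full")
    logger.error("disk full")
    logger.messages.foreach(println)
  }
}
```

The trait never needs to know where the messages end up; only the class that extends it decides that.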
Classes can have type parameters. The following generic Pair class takes two type
parameters P and Q, which determine the types of its arguments, named first
and second:
class Pair[P, Q](val first: P, val second: Q)
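As a brief sketch of how such a class might be instantiated, the type arguments can be written explicitly or left for the compiler to infer from the constructor arguments. The PairDemo object and its value names are hypothetical.

```scala
class Pair[P, Q](val first: P, val second: Q)

object PairDemo {
  // Type arguments written out explicitly.
  val explicit: Pair[Int, String] = new Pair[Int, String](1, "one")
  // Type arguments inferred as Pair[Double, Boolean].
  val inferred = new Pair(2.5, true)
}
```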
Scala has support for first-class function objects, also called lambdas. In the following
code snippet, we declare a twice lambda, which multiplies its argument by two:
val twice: Int => Int = (x: Int) => x * 2
In the preceding code, the (x: Int) part is the argument to the lambda, and x *
2 is its body. The => symbol must be placed between the arguments and the body
of the lambda. The same => symbol is also used to express the type of the lambda,
which is Int => Int. In the preceding example, we can omit the type annotation
Int => Int, and the compiler will infer the type of the twice lambda automatically,
as shown in the following code:
val twice = (x: Int) => x * 2
Alternatively, we can omit the type annotation in the lambda declaration and arrive
at a more convenient syntax, as follows:
val twice: Int => Int = x => x * 2
Finally, whenever the argument to the lambda appears only once in the body of the
lambda, Scala allows a more convenient syntax, as follows:
val twice: Int => Int = _ * 2
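The syntactic variants described above should all denote the same function. The following sketch places them side by side under hypothetical names (twiceA to twiceD) and shows that a lambda can be passed to a method such as map like any other value.

```scala
object LambdaDemo {
  val twiceA: Int => Int = (x: Int) => x * 2 // fully annotated
  val twiceB = (x: Int) => x * 2             // type of the value inferred
  val twiceC: Int => Int = x => x * 2        // argument type inferred
  val twiceD: Int => Int = _ * 2             // placeholder syntax

  // A lambda is an ordinary value and can be passed to other methods.
  def doubled(xs: Seq[Int]): Seq[Int] = xs.map(twiceD)
}
```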
A by-name parameter is formed by putting the => annotation before the type.
Whenever the runTwice method references the body argument, the expression
is re-evaluated, as shown in the following snippet:
runTwice { // this will print Hello twice
  println("Hello")
}
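The definition of runTwice is not shown in this excerpt. A plausible sketch, assuming a single by-name parameter called body, is given below. The countEvaluations helper is hypothetical and only demonstrates that the argument expression is evaluated once per reference.

```scala
object ByNameDemo {
  // Assumed definition: body is a by-name parameter, so the argument
  // expression is re-evaluated each time body is referenced.
  def runTwice(body: => Unit): Unit = {
    body
    body
  }

  // Hypothetical helper that counts how often the argument is evaluated.
  def countEvaluations(): Int = {
    var evaluations = 0
    runTwice { evaluations += 1 }
    evaluations
  }
}
```

Had body been declared as an ordinary Unit parameter, the argument would be evaluated once, before the method is entered.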
Scala for expressions are a convenient way to traverse and transform collections.
The following for loop prints the numbers in the range from 0 until 10, where 10
is not included in the range:
for (i <- 0 until 10) println(i)
In the preceding code, the range is created with the expression 0 until 10, which
is equivalent to the expression 0.until(10), which calls the method until on
the value 0. In Scala, the dot notation can sometimes be dropped when invoking
methods on objects.
Every for loop is equivalent to a foreach call. The preceding for loop is translated
by the Scala compiler to the following expression:
(0 until 10).foreach(i => println(i))
The negatives value contains negative numbers from 0 until -10. This for-comprehension is equivalent to the following map call:
val negatives = (0 until 10).map(i => -1 * i)
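The for-comprehension that this map call corresponds to is not reproduced in this excerpt. It presumably has the following shape, in which the yield keyword turns the loop into an expression that produces a new collection. The names viaFor and viaMap are hypothetical.

```scala
object NegativesDemo {
  // A for-comprehension with yield produces a new collection.
  val viaFor = for (i <- 0 until 10) yield -i
  // The equivalent map call from the text.
  val viaMap = (0 until 10).map(i => -1 * i)
}
```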
It is also possible to transform data from multiple inputs. The following for-comprehension creates all pairs of integers between zero and four, where four is not included:
val pairs = for (x <- 0 until 4; y <- 0 until 4) yield (x, y)
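A for-comprehension with several generators is translated along the same lines: every generator except the last becomes a flatMap call, and the last becomes a map call. The sketch below, with hypothetical names, writes out that translation by hand for the pairs example.

```scala
object PairsDemo {
  val viaFor = for (x <- 0 until 4; y <- 0 until 4) yield (x, y)
  // Hand-written translation: outer generator -> flatMap, inner -> map.
  val viaFlatMap = (0 until 4).flatMap(x => (0 until 4).map(y => (x, y)))
}
```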
Throughout this book, we rely heavily on the string interpolation feature. Normally,
Scala strings are formed with double quotation marks. Interpolated strings are
preceded with an s character, and can contain $ symbols with arbitrary identifiers
resolved from the enclosing scope, as shown in the following example:
val magic = 7
val myMagicNumber = s"My magic number is $magic"
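Beyond plain identifiers, interpolated strings also accept arbitrary expressions inside ${...}, and a literal dollar sign is written as $$. The following sketch illustrates both; the value names computed and escaped are hypothetical.

```scala
object InterpolationDemo {
  val magic = 7
  val simple = s"My magic number is $magic"
  // Arbitrary expressions go inside ${...}.
  val computed = s"Twice my magic number is ${magic * 2}"
  // A literal dollar sign is written as $$.
  val escaped = s"It costs $$$magic"
}
```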
Pattern matching is another important Scala feature. For readers with Java, C#,
or C background, it suffices to say that Scala's match statement is like the switch
statement on steroids. The match statement can decompose arbitrary datatypes,
and allows you to express different cases in the program concisely.
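As a small sketch of what decomposing datatypes can look like, the hypothetical describe method below matches on constants, types with guards, tuples, options, and lists. It is an illustration only and is not taken from the book.

```scala
object MatchDemo {
  // Cases are tried from top to bottom; the first one that matches wins.
  def describe(value: Any): String = value match {
    case 0               => "zero"
    case n: Int if n < 0 => "a negative integer"
    case n: Int          => s"the integer $n"
    case (a, b)          => s"a pair of $a and $b"
    case Some(x)         => s"an option containing $x"
    case head :: _       => s"a list starting with $head"
    case _               => "something else"
  }
}
```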
Finally, Scala allows defining package objects to store top-level method and value
definitions for a given package. In the following code snippet, we declare the
package object for the org.learningconcurrency package. We implement the
top-level log method, which outputs a given string and the current thread name:
package org
package object learningconcurrency {
  def log(msg: String): Unit =
    println(s"${Thread.currentThread.getName}: $msg")
}
We will use the log method in the examples throughout this book to trace how the
concurrent programs are executed.
This concludes our quick overview of important Scala features. If you would like to
obtain a deeper knowledge about any of these language constructs, we suggest that
you check out one of the introductory books on sequential programming in Scala.
Summary
In this chapter, we studied what concurrent programming is and why Scala is a good
language for concurrency. We gave a brief overview of what you will learn in this
book, and how the book is organized. Finally, we stated some Scala preliminaries
necessary for understanding the various concurrency topics in the subsequent
chapters. If you would like to learn more about sequential Scala programming, we
suggest that you read the book Programming in Scala, Martin Odersky, Lex Spoon, and
Bill Venners, Artima Inc.
In the next chapter, we will start with the fundamentals of concurrent programming
on the JVM. We will introduce the basic concepts in concurrent programming,
present the low-level concurrency utilities available on the JVM, and learn about
the Java Memory Model.
Exercises
The following exercises are designed to test your knowledge of the Scala
programming language. They cover the content presented in this chapter, along
with some additional Scala features. The last two exercises contrast the difference
between concurrent and distributed programming, as defined in this chapter. You
should solve them by sketching out a pseudocode solution, rather than a complete
Scala program.
1. Implement a compose method with the following signature:
def compose[A, B, C](g: B => C, f: A => B): A => C = ???
The resulting Option object should contain a tuple of values from the
Option objects a and b, given that both a and b are non-empty. Use
for-comprehensions.
3. Implement a check method, which takes a set of values of the type T and
a function of the type T => Boolean:
def check[T](xs: Seq[T])(pred: T => Boolean): Boolean = ???
The method must return true if and only if the pred function returns true
for all the values in xs without throwing an exception. Use the check
method as follows:
check(0 until 10)(40 / _ > 0)
4. Modify the Pair class from this chapter so that it can be used in a
pattern match.
If you haven't already, familiarize yourself with pattern
matching in Scala.