Learning F# Functional Data Structures and Algorithms - Sample Chapter
If you have just started your adventure with F#, then this book
will help you take the right steps to become a successful F#
coder. An intermediate knowledge of imperative programming
concepts, and a basic understanding of the algorithms and
data structures in .NET environments using the C# language
and BCL (Base Class Library), would be helpful.
Preface
"If there's a book that you want to read, but it hasn't been written yet, then you
must write it."
Toni Morrison
F# is a multiparadigm programming language that encompasses object-oriented,
imperative, and functional programming language properties. The functional
paradigm can be defined as programming with pure functions, programming by
function composition, or a combination of both. For over a quarter of a century,
functional programming languages such as Lisp, Haskell, and Standard ML existed
in academia, but industry adoption has been quite slow. With the introduction
of F#, an open source functional programming language, this trend is witnessing
a significant change. F# runs on the .NET runtime and supports libraries from other
IL-based programming languages.
Due to the seemingly overarching title of this manuscript, a few disclaimers are in
order. This book is an introduction to F#, functional data structures, and algorithms.
These topics are fairly large in their individual capacity. A large body of growing
literature exists in each of these areas itself. Therefore, it won't be a reasonable
expectation to provide a comprehensive picture of data structures and algorithms in
the limited amount of space available in this book. Instead, this book is intended as a
cursory introduction to the use and development of data structures and algorithms
using F#. The goal is to provide a broader overview and resources to the reader to
get started with functional programming using F#.
This book is written with a few assumptions, keeping the reader in mind. We assume
that the reader has basic knowledge of an imperative programming language and
object-oriented concepts. Readers are highly encouraged to try out examples, use
the resources listed in Chapter 10, Where to Go Next?, and review specialized texts for
a more comprehensive treatment of algorithms and data structures.
Starting with the basic concepts of F#, this book will help you to solve complex
computing problems with simple, maintainable, and robust code. Using easy-to-understand examples, you will learn how to design data structures and algorithms
in F# and apply these concepts in real-life projects, as well as gain insights into how
to reuse libraries available in community projects. You will also learn how to set
up Visual Studio .NET and F# compiler to work together, implement the Fibonacci
sequence and Tower of Hanoi using recursion, and apply lazy evaluation for
quick sorts. The book will then cover built-in data structures and take you through
enumerations and sequences. You will gain knowledge about stacks, graph-related
algorithms, and implementations of binary trees. Next, you will understand the
custom functional implementation of a queue and look at the already available
collection and concurrent collection structures. You will also review sets and maps
and explore the implementation of a vector.
In the final leg of this book, you will find resources and references that will give you
a great overview of how to build an application in F# and do great things. We have
tried our best to provide attribution to all the resources used in this book. However,
if anything has been missed, let us know. To build upon the fundamentals you
will learn in this book, we have created a code repository to solve Project Euler
algorithmic problems. Project Euler is a series of challenging mathematical and
computer programming problems that require working with algorithms and data
structures. You will see our solutions on the GitHub repo at
https://github.com/adnanmasood/Euler.Polyglot.
On the cover, the choice of a lush landscape and a central figure reminiscent of the General
Sherman trail is an attempt to portray the variety of programming paradigms and
the potential strength of functional concepts. In the words of Ryan Bozis, learn these
functional constructs, and you'll be able to program your very own forest. Being
polyglot is good! Learning a new programming language broadens your thinking
and provides you a competitive edge. Happy functional programming!
Chapter 2, Now Lazily Get Over It, Again, will prepare you to delve into the
intermediate F# concepts which you are going to utilize later. It will help you in
setting up the Visual Studio .NET and F# Compiler to work together along with the
environment and runtime, review how to run your F# programs in IDE and through
interactive REPL shell, implement the Fibonacci sequence and Tower of Hanoi using
recursion, and apply lazy evaluation for quick sort.
Chapter 3, What's in the Bag Anyway?, will provide insights about the built-in data
structures (array, list, set, and map) and will present their typical use cases.
Chapter 4, Are We There Yet?, delves into sequence expressions (seq), implementation
of custom enumeration for the purpose of sequence expressions (that is, paging
functionality), and application of simple custom types using records and tuples.
Chapter 5, Let's Stack Up, will help you build a basic ADT of a stack using F#,
implement the fundamental operations, and proceed to make a concurrent version
of a stack. You will also learn how to do unit testing in C# for an F# program and
implement the same test method in F#.
Chapter 6, See the Forest for the Trees, will explain graph related algorithms, and
teach you the implementation of your own trees. You will also learn to tackle tree
searching and various other traversal techniques.
Chapter 7, Jumping the Queue, discusses the custom functional implementation
of a queue. You will then be introduced to the FSharpX open source collection
of functional data structures. Finally, you will explore the F# agent,
MailboxProcessor, for creating async workflows, throttling, and post-processing
the results of asynchronous calls as an example usage of a queue.
Chapter 8, Quick Boost with Graph, will briefly discuss how a graph can be
implemented in a functional language, and why it is a rather difficult task to
undertake. You will then discover some commonly used graph implementations
and explore one of the most typical shortest-path algorithms, Dijkstra's.
Chapter 9, Sets, Maps, and Vectors of Indirections, reviews sets and maps, and explores
a custom implementation of a vector. Additionally, you are going to discuss
Intermediate Language and how it works in the .NET ecosystem.
Chapter 10, Where to Go Next?, is a reference chapter in which you can acquaint
yourself with the detailed list of different resources around the functional eco-system,
and the F# programming language. You will also find various guides, source code
and links, which will assist you in getting additional information you will need to
polish your knowledge about F#.
By the end of this chapter, you will be familiar with a brief history of functional
programming. With comparative code examples, we will analyze code samples using
mutable and immutable data structures as well as imperative control flow syntax, which
will allow you, the reader, to fully understand and embrace the hybrid nature of F#.
In this chapter, we will cover the following topics:
As its name suggests, the functional programming paradigm uses pure functions, as in
mathematical functions, as its core construct. The précis of function as a programming
language construct stems from Alonzo Church and J. Barkley Rosser's work on lambda
calculus. As in mathematical functions, the imperative in function-based programming is to
avoid state and mutation. Like mathematical functions, one should be able to invoke a
function multiple times with no side effects, that is, always get the same answers.
This style of programming has deep implementation consequences; a focus on
immutable data (and structures) leads to programs written in a declarative manner
since a data structure cannot be modified piecemeal.
Pure functions offer referential transparency, that is, a function always returns the
same value when given the same inputs. Pure functions are not always feasible
in real-world scenarios such as when using persistent storage, randomization,
performing I/O, and generating unique identifiers. Technically speaking, a time
function shouldn't exist in a pure functional programming language. Therefore,
pure functional languages such as Haskell use the notion of IO Monads to solve this
dogmatic conundrum. Luckily for us, the hybrid (albeit more practical) languages
such as F# are multi-paradigm and can get around this restriction quite easily.
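The contrast between a pure function and one that depends on outside state can be sketched as follows. The names square and secondsSinceMidnight are illustrative assumptions, not listings from this chapter.

```fsharp
open System

// Pure: the result depends only on the argument.
let square (n: int) = n * n

// Impure: the result depends on the system clock, so two calls may differ.
let secondsSinceMidnight () = DateTime.Now.TimeOfDay.TotalSeconds

// Referential transparency: `square 7` can be replaced by its value anywhere.
let sameEveryTime = (square 7 = square 7)

printfn "square 7 = %d, stable = %b" (square 7) sameEveryTime
printfn "seconds since midnight: %.0f" (secondsSinceMidnight ())
```

F# lets both kinds of function live side by side, which is the practical latitude that the hybrid approach offers.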
Functional syntax tends to be terser than its imperative or
object-oriented counterpart. The terseness keeps KLOC low and often results in
improved developer productivity. In terms of productivity, since functional
programming promotes and encourages rapid prototyping, it benefits building and
testing out proof of concept implementations. This results in code that has more
brevity, is more resilient to change, and has fewer bugs.
Even though these are not strictly features of functional programming, several cross-cutting concerns come standard with most functional programming languages.
These include protected environments, pattern matching, tail-call optimization,
immutable data structures, and garbage collection.
If you have written multi-threaded code, you'd know that debugging the concurrency
issues in a multi-threaded environment is difficult to say the least. Arguably, one of
the best features of functional programming is thread safety through immutability.
The notion of concurrent collections in modern programming languages has its roots
in functional programming. The design and use of immutable data structures prevents
the process from running into race conditions and therefore removes the need for
explicit locking, semaphores, and mutexes. This also helps in parallelization,
one of the unrealized promises of functional programming.
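As a minimal sketch of why immutability helps here, consider an F# list shared between several workers. The values and names below are assumptions chosen for illustration.

```fsharp
// An F# list is immutable: "adding" an element builds a new list
// and leaves the original untouched.
let original = [1; 2; 3]
let extended = 0 :: original

// Because `original` can never change, parallel workers can read it
// without any locking.
let doubled =
    original
    |> List.toArray
    |> Array.Parallel.map (fun n -> n * 2)

printfn "%A %A %A" original extended doubled
```

No worker can observe a half-updated structure, because there is no update operation to begin with.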
In this book, we will discuss these and various other functional programming
features in detail, especially in the context of F#. As a reader who is potentially familiar
with either object-oriented or imperative programming, you will enjoy the use of
fluent-interface methods, lazy and partial evaluation, currying and memoization,
and other unique and interesting concepts that make your life as a developer more
fulfilling, and easier too.
A historical primer of F#
With the advent of a multi-paradigm language with functional programming
support such as Lisp in 1958 by John McCarthy, the functional paradigm gained
mainstream exposure. Due to its multi-paradigm nature, there is a debate around
Lisp being a pure functional programming language. However, Scheme, one of
the Lisp dialects which didn't appear till 1975, tends to favor the functional style.
The salient features of this style include the use of tail recursion and continuations to
express control flow.
Yes, this is all you need. Notice the terseness, simplicity, and lack of clutter. Now
let's run this in the F# interactive environment. In order to run it, you would need to
have ";;" at the end of the statement. We will provide more details on this interactive
environment setup later in Chapter 2, Now Lazily Get Over It, Again.
This is the response that you see when you run the preceding line of code. It is
a minimal viable example; however, these attributes of simplicity and
terseness extend beyond HelloWorld samples, as you will see.
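A one-line program of the kind described here might look like the following sketch. The greeting text and the binding name are assumptions; in F# interactive, each line would be terminated with ;; as noted above.

```fsharp
// Hypothetical greeting text; any string works here.
let greeting = "Hello World"
printfn "%s" greeting
```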
Let's look at a simple function, square. You can write a function in F# as follows:
let square = fun n -> n * n
Or you can write it in a simpler syntax like the next one. Notice the first-class
citizenship in action here:
let square n = n * n
When this function is executed in F# interactive, you can immediately see the results
upon invocation as in the following screenshot:
let b = 10uy                                 // byte
let sb = -128y                               // sbyte
let i16 = -100s                              // int16
let ui16 = 100us                             // uint16
let i = -42                                  // int
let ui = 0x42u                               // uint32
let i64 = 238900L                            // int64
let ui64 = 2660000000UL                      // uint64
let f = 3.14159265359                        // float
let db = 2.718281828459045                   // double (an alias of float)
let f32 = 2.7182818f                         // float32
let d = 3.14159265358979323846264338M        // decimal
let gogol = 10I ** 100                       // bigint
let s = "n, mi gunxi"                        // string
Similar to standard CLR data types, F# also uses the standard mathematical
operators such as +, -, *, /, % (modulus), and ** (power). Logical operators
such as && (and), || (or), and not (negation) are supported along with mathematical functions
such as abs, ceil, exp, floor, log, sqrt, cos, sin, tan, and pown. A detailed F# language
reference, including Symbol and Operator Reference, can be found at
http://msdn.microsoft.com/en-us/library/dd233228.aspx.
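A few of these operators and functions in use, as a small sketch; the binding names are illustrative only.

```fsharp
let remainder = 17 % 5            // modulus: 2
let power = 2.0 ** 10.0           // floating-point power: 1024.0
let intPower = pown 2 10          // integer power: 1024
let both = (3 > 2) && (2 > 1)     // logical and: true
let either = (3 < 2) || (2 > 1)   // logical or: true
let negated = not (3 < 2)         // logical negation: true
let root = sqrt 16.0              // 4.0
let magnitude = abs (-5)          // 5

printfn "%d %f %d %b %b %b %f %d"
    remainder power intPower both either negated root magnitude
```

Note that ** works on floating-point operands, while pown takes an integer exponent.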
At this time, we would also like to briefly introduce you to one of the highly useful
features of the F# IDE, the REPL. A REPL (Read-Eval-Print Loop) is an interactive
language shell that takes expression inputs, evaluates them, and provides output to the user.
REPL allows developers to interact with the language easily and to invoke and test
expressions in real-time before writing the entire program. FSI (F Sharp Interactive)
is the REPL implementation in F#. You will read more about installing and
configuring FSI in Chapter 2, Now Lazily Get Over It, Again. For now you can use the
command line version of FSI by invoking it directly in a console:
C:\Program Files\Microsoft F#\v4.0\fsi.exe
You can also use the #help;; directive to list other directives inside FSI.
You will see the let binding being used quite frequently for declaring variables,
functions, and so on. Functions put the "function" in functional programming and hence,
they are ubiquitous. Technically speaking, F# doesn't have any statements; it only
has expressions. The following is a simple example of a function:
let cubeMe x = x * x * x;;
Recursive functions are defined using the keyword rec. Here is a simple
implementation of a recursive Fibonacci function:
let rec fib n =
    if n <= 2 then 1
    else fib (n - 1) + fib (n - 2)
The preceding code for the Fibonacci function takes one parameter as an input.
However, a function can have more than one parameter, following the same syntax:
let Mult x y = x * y ;;
However the preceding function will fail upon execution without a hint. We will see
the following error on screen:
error FS0001: This expression was expected to have type
    float
but here has type
    int
But the same method will work just fine if the specified data type is passed as float.
> areaOfCircle 8.0;;
val it : float = 201.0619298
Moreover, you cannot call the inner function directly. That is why the direct call to
the square method will return the following error:
square 10;;
^^^^^^
error FS0039: The value or constructor 'square' is not defined
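A function with this behavior could be sketched as follows. The body is an assumption, written to agree with the output and the two errors shown above: a float-only parameter and a square helper that is visible only inside areaOfCircle.

```fsharp
// Hypothetical reconstruction: `square` is local to areaOfCircle.
let areaOfCircle (radius: float) =
    let square x = x * x              // visible only inside areaOfCircle
    System.Math.PI * square radius

let area = areaOfCircle 8.0

// areaOfCircle 8   would not compile: 8 is an int, not a float (FS0001)
// square 10        would not compile: square is not in scope here (FS0039)
printfn "%.7f" area
```

Nesting the helper keeps it out of the surrounding scope, which is why the direct call is rejected by the compiler.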
=
= 0 then
"Number ends in 0"
"Number does not end in zero";;
The print expression will return a value. You can also see the use of elif which is
used as a shortcut for else if.
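A complete if/elif/else expression of this shape might look like the sketch below. The function name and the two non-zero messages are hypothetical; only the "Number ends in 0" message comes from the text.

```fsharp
// Hypothetical function name and messages for the non-zero branches.
let describeLastDigit (n: int) =
    if n % 10 = 0 then
        "Number ends in 0"
    elif n % 2 = 0 then
        "Number ends in a non-zero even digit"
    else
        "Number ends in an odd digit"

printfn "%s" (describeLastDigit 120)
printfn "%s" (describeLastDigit 7)
```

Because if is an expression, every branch must produce a value of the same type, here a string.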
Tuples are now part of the standard CLR type system, but most of us remember the
struggle before tuples. Tuples are containers for values of potentially different types,
as seen in the following code:
let t = ("cats", "dogs", System.Math.PI, 42, "C#", "Java");;
val t : string * string * float * int * string * string =
("cats", "dogs", 3.141592654, 42, "C#", "Java")
This also applies to the strings where you can access an individual element of a
string as follows:
let str = "Lǎo péngyǒu, nǐ kànqǐlái hěn yǒu jīngshén."
printfn "%c" str.[9]
They can be sliced using an index range (arrays are zero-based) as seen in the
following code:
let TopThree = OneToHundred.[0..2];;
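For a self-contained version of the slice, assume OneToHundred holds the integers 1 through 100; that definition is an assumption, as is the lastTwo binding.

```fsharp
// Assumed definition: the integers 1 through 100.
let OneToHundred = [| 1 .. 100 |]

let first = OneToHundred.[0]          // indexing starts at zero
let TopThree = OneToHundred.[0..2]    // elements at indices 0, 1, and 2
let lastTwo = OneToHundred.[98..]     // open-ended slice to the end

printfn "%d %A %A" first TopThree lastTwo
```

Both bounds of a slice are inclusive, so 0..2 yields three elements.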
This can be explicitly called like the original method, allowing us to compose complex
methods using the basic ones. Here, Add10 is a closure that takes one argument and
adds 10 to it as seen in the following code:
Add10 42
>
val it : int = 52
Closures are functionally defined as a first-class function with free variables that are bound
in the lexical environment. In F#, functions are first class members of the programming
society; closures encapsulate an environment for pre-bound variables and create a code
block. This way we can pre-define some arguments (in this case, 10). Closure promotes
reuse and helps in building complex functions from simpler ones.
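One way to arrive at a function like Add10 is partial application, sketched here. The two-argument Add function is a hypothetical starting point.

```fsharp
// Hypothetical two-argument function that Add10 is derived from.
let Add x y = x + y

// Partial application fixes x = 10 and returns a function of y.
let Add10 = Add 10

let result = Add10 42
printfn "%d" result
```

The value 10 is captured when Add10 is defined, so every later call reuses it.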
With functions as the first class citizens, in F# we can create higher order functions,
that is, functions upon functions. Higher order functions operate by taking a function
as an argument, or by returning a function. Following are two simple functions:
let increment n = n + 1
let divideByTwo n = n / 2
Now we will define a higher order function which applies a function repeatedly to its own result:
let InvokeThrice n (f:int->int) = f(f(f(n)))
Now we will use the InvokeThrice function, which will apply the given function
three times in succession, as defined in the preceding line of code:
let res = InvokeThrice 6 increment
>
val res : int = 9
In this example, you witnessed the amazing power of declaring functions. A similar
approach can be applied to division as follows:
let res = InvokeThrice 80 divideByTwo
>
val res : int = 10
In the preceding syntax for the InvokeThrice function, you will notice the use of a
lambda expression. Lambda expressions are ubiquitous in functional programming.
In reality, these expressions are syntactic sugar (directives, shortcuts, or a terse way of
defining something) to declare anonymous methods. A lambda expression is created
using the fun keyword, that is, function, followed by arguments which are supposed
to be passed to the function. This function declaration is then followed by the lambda
arrow operator -> and the lambda expression which defines the body of the function.
For example, instead of passing a named function, we can pass a lambda expression during
the InvokeThrice invocation to apply an exponential operation (power 3):
let InvokeThrice n (f:double->double) = f(f(f(n)))
let x = InvokeThrice 2.0 (fun n -> n ** 3.0)
The following is the mapping function that will square all the elements in the array,
and return a new array:
let squares =
    nums
    |> Array.map (fun n -> n * n)
When you run the square method on nums, you get the following output:
val squares : int [] =
[|0; 1; 4; 9; 16; 25; 36; 49; 64; 81; 100; 121; 144; 169; 196; 225;
256;
//snip
8649; 8836; 9025; 9216; 9409; 9604; 9801|]
The counterpart of the map operation is the fold operation. You can think of folding
operations as aggregations. As seen in the preceding code snippet, map takes a
collection and generates another collection. The folding operation, however,
takes a collection as input and returns a single value.
For example, in the next statement, Array.fold takes three arguments: a function,
an initial value for the accumulator, and an array. It adds up all the elements of
squares, that is, the squared numbers, and returns the total:
let sum = Array.fold (fun acc n -> acc + n) 0 squares
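Put together, map and fold can be run end to end as in the following sketch. The definition of nums is an assumption, chosen to be consistent with the output shown earlier (0 through 99).

```fsharp
// Assumed input, consistent with the output shown earlier: 0 through 99.
let nums = [| 0 .. 99 |]

// map: one output element per input element.
let squares = nums |> Array.map (fun n -> n * n)

// fold: threads an accumulator through the array, starting at 0.
let sum = Array.fold (fun acc n -> acc + n) 0 squares

// Array.sum is the built-in shortcut for this particular fold.
printfn "%d %d" sum (Array.sum squares)
```

Under that assumption both expressions yield 328350.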
Along with map and fold, filtering is another operation which comes in handy to
select and filter elements based on a condition (predicate). In the following example,
Array.filter takes an array of last names and filters them based on the length.
Any last name longer than 6 characters will be classified as a long name.
let castNames = [| "Hofstadter"; "Cooper"; "Wolowitz"; "Koothrappali";
                   "Fowler"; "Rostenkowski" |]
let longNames = Array.filter (fun (name: string) -> name.Length > 6) castNames
Similar to map, which applies a function on a collection, a zipping function takes two
collections and combines them. In the following example we have two arrays:
let firstNames = [| "Leonard"; "Sheldon"; "Howard"; "Penny"; "Raj";
                    "Bernadette"; "Amy" |]
let lastNames = [| "Hofstadter"; "Cooper"; "Wolowitz"; "";
                   "Koothrappali"; "Rostenkowski"; "Fowler" |]
A zip operation applied to the two arrays pairs each first name with its last name:
let fullNames = Array.zip firstNames lastNames
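Array.zip produces an array of tuples rather than joined strings. If single strings are wanted, a follow-up map can format each pair, as in this sketch; the shortened input arrays and the names pairs and joined are assumptions made for brevity.

```fsharp
let someFirstNames = [| "Leonard"; "Sheldon" |]
let someLastNames = [| "Hofstadter"; "Cooper" |]

// Array.zip yields (first, last) tuples.
let pairs = Array.zip someFirstNames someLastNames

// Hypothetical follow-up step: format each tuple as one string.
let joined =
    pairs
    |> Array.map (fun (first, last) -> sprintf "%s %s" first last)

printfn "%A" pairs
printfn "%A" joined
```

Array.zip expects both arrays to have the same length and raises an exception otherwise.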
Last but not least, another salient feature of the F# language is lazy, or delayed,
evaluation. These lazy expressions only get evaluated when forced, or when a value
is required to be returned. The value then gets memoized (a fancy functional name for
caching), and is returned on future recalls. The following is a simple divide method:
let divide x y =
    printfn "dividing %d by %d" x y
    x / y
val divide : x:int -> y:int -> int
When you wrap the call with the lazy keyword, the output shows that the
value does not get created right away.
let answer = lazy(divide 8 2)
val answer : Lazy<int> = Value is not created.
> dividing 8 by 2
4
val it : unit = ()
Now upon force invocation, you would see that the value was evaluated by calling
the function, and therefore you also see dividing 8 by 2 printed on the FSI
console. Upon consecutive calls such as
printfn "%d" (answer.Force())
you would not see dividing 8 by 2 printed on the FSI console because the
value has been computed and memoized. Collections such as sequence are lazy by
default, which you will learn in subsequent chapters.
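The memoization can be made visible by counting how often the body runs. In this sketch the counter and the compute helper are hypothetical additions; only lazy and Force come from the text.

```fsharp
// Hypothetical counter used only to observe how often the body runs.
let evaluations = ref 0

let compute () =
    evaluations.Value <- evaluations.Value + 1
    8 / 2

let answer = lazy (compute ())

let first = answer.Force()     // runs compute
let second = answer.Force()    // returns the cached value

printfn "%d %d, evaluated %d time(s)" first second evaluations.Value
```

The counter stays at 1 after the second Force, which is the caching behavior described above.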
This concludes our whirlwind introduction to the F# programming language; if you
are new to F#, you should revise this section a couple of times and run this in the
interactive environment to gain familiarity with these fundamental language constructs.
Here you see the use of one of F#'s celebrated operators, that is, the |> pipe forward
operator. It passes the result of the expression on its left side as the argument to the
function on its right side, and several such steps can be chained.
Running this program in the F# interactive console yields the following results for
sumOfSquares 2
and
sumOfSquares 3
respectively:
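A pipeline-based sumOfSquares could be written as in the sketch below. The definition is an assumption: it squares the integers from 1 through n and adds them up.

```fsharp
// Assumed definition: squares 1 through n, then sums the squares.
let sumOfSquares n =
    [1 .. n]
    |> List.map (fun i -> i * i)
    |> List.sum

printfn "%d %d" (sumOfSquares 2) (sumOfSquares 3)
```

With this definition the two calls give 5 (1 + 4) and 14 (1 + 4 + 9).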
Again, the C# version is quite verbose and can be made more functional by using
LINQ as seen next:
public static int SquaresSum(int n)
{
    return Enumerable.Range(1, n)
                     .Select(i => i * i)
                     .Sum();
}
In this case, Enumerable.Range produces the sequence, Select squares each number,
and Sum aggregates the squares into a total.
Project Euler provides a series of mathematical and programming problems that can
be solved using programming languages of your choice. Following is problem #1
from Project Euler:
If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5,
6 and 9. The sum of these multiples is 23. Find the sum of all the multiples of 3 or 5
below 1000.
In this case we operate on 1-999, chain the operator with map to perform a modulus
operation, and then sum up the results. An alternate approach is to use a filter that
categorizes the results and provides a collection to perform a sum on. This approach
can be listed as follows:
let total =
    [1..999]
    |> List.filter (fun i -> i % 5 = 0 || i % 3 = 0)
    |> List.sum
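The map-based and filter-based approaches can be compared side by side. The map variant below is a sketch of the idea described above, not the original listing, and the binding names are assumptions.

```fsharp
// map variant: keep multiples of 3 or 5, replace everything else with 0.
let totalViaMap =
    [1 .. 999]
    |> List.map (fun i -> if i % 3 = 0 || i % 5 = 0 then i else 0)
    |> List.sum

// filter variant: drop the non-multiples before summing.
let totalViaFilter =
    [1 .. 999]
    |> List.filter (fun i -> i % 3 = 0 || i % 5 = 0)
    |> List.sum

printfn "%d %d" totalViaMap totalViaFilter
```

Both pipelines produce 233168; the filter variant simply avoids carrying the zeros along.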
Another better way of doing this can be seen in the next code listing:
var total = Enumerable.Range(1, 999).Sum(x => x % 3 == 0 || x % 5 == 0 ? x : 0);
The F# solutions to Project Euler problems, which can further help you understand algorithms and
data structures, can be found at https://github.com/adnanmasood/Euler.Polyglot.
A good analogy is LaTeX versus Microsoft Word. You may have used LaTeX for
typesetting. For complicated tasks, Word becomes too complex or even unusable.
Marko Pinteric explains why you would want to use LaTeX instead of Word with
the following graph:
[Figure: Complexity and learning curve for MS Word and LaTeX, with a region marked "impossible to do". Source: Using LaTeX on Windows by Marko Pinteric (www.pinteric.com/miktex.html)]
The same applies to F#. Functional programming does have a learning curve but it
equips you with the tools needed to go further in algorithmic software development.
This eventually leads to the argument of general benefits of functional programming
over imperative and object oriented languages.
Using functional programming with F#, one can arguably formulate and design
solutions in an easier, more effective manner, especially if these problems pertain
to the algorithmic domain. As a functional language, F# helps keep the
solution close to the problem's definition in a concise manner. From the testability
perspective, the resulting code becomes less error-prone due to F#'s powerful type
system, intuitive recursive representation of algorithms, and built-in immutability.
Immutable data structures are especially helpful in multi-threaded
scenarios.
The specific F# advantages include the following:
1. Interoperability with the .NET CLR languages.
2. Ease of asynchronous programming, intuitive use of async {} expressions.
3. Full Visual Studio .NET IDE integration with compiler and debugger support.
4. Suitability for writing domain-specific languages and compilers.
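The async {} expressions mentioned in item 2 can be sketched as follows. The squareAsync workload and its simulated delay are hypothetical.

```fsharp
// Hypothetical workload: square a number after a short simulated delay.
let squareAsync (n: int) =
    async {
        do! Async.Sleep 10
        return n * n
    }

// Run the three computations in parallel and wait for all results.
let results =
    [1; 2; 3]
    |> List.map squareAsync
    |> Async.Parallel
    |> Async.RunSynchronously

printfn "%A" results
```

Async.Parallel returns the results in the same order as the inputs, regardless of which computation finishes first.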
Summary
To summarize, F# provides the combined benefits of succinct syntax, immutable
types, interoperability, efficiency, concurrency, and scalability: an impressive list.
Functional programming has a well-established reputation as an efficient way of
modeling complex problems in its respective mathematical form. F#, as a modern
multi-paradigm language, is quite practical for enterprises, and gives developers
and software architects an excellent reason to start using functional programming in
their projects.
We recommend reading Functional thinking: Why functional programming is on the
rise, by Neal Ford, who is a software architect at ThoughtWorks, at
www.ibm.com/developerworks/library/j-ft20/ as follow-up reading to reinforce some of the
concepts discussed in this chapter.
In this chapter, we have covered an introduction to the functional programming
paradigm along with some key syntactical elements of the F# programming language.
We have established the notion of thinking in functional style and explained why
functional programming matters. We also elaborated on the benefits of functional
programming and functional data structures along with code-based comparisons of
imperative and functional paradigms.
In the next chapter, we will gain further knowledge about the F# tooling, syntax,
and semantics of the language and learn to write some programs using F#.