Julia For Beginners Sample
Preface
I began programming as a teenager with fun books containing comic book strips
with wizards and turtles. I read magazines that showed me how to make my own
simple games or make silly stuff happen on the screen. I had fun.
But when I went to university, books started talking about bank accounts,
balances, sales departments, employees and employers. I wondered if my life as
a programmer would mean putting on a gray suit and writing code handling
payroll systems. Oh the horror!
At least half of my class hated programming with a passion. I could not blame
them. Why did programming books have to be so boring, functional and
sensible?
Where was the sense of adventure and fun? Fun is underrated. Who cares if a
book is silly and has stupid jokes if it makes you learn and enjoy learning?
That is one of the reasons I wrote this book. I wanted the reader to enjoy
learning programming. Not through cracking jokes, but by working through
programming examples that are interesting or fun to do.
I promise you, there will be no examples modeling a sales department. Instead
we will simulate rocket launches, pretend to be Caesar sending a secret message
to his army commanders using old Roman encryption techniques, as well as
simulating a beautiful old handheld mechanical calculator, the Curta, and many
other things.
The second important reason I wanted to write this book is because people keep
telling me: “Julia? Isn’t that a language only for science and scientists?”
Julia has had major success in this area, which is why the Julia community today
is full of brainy people working on hard problems such as developing new drugs,
modeling the spread of infectious diseases, climate change or the economy.
But no, you don’t need to be a genius or a scientist to use Julia. Julia is a
wonderful general purpose programming language for everyone! I am not a
scientist and I have enjoyed using it for over 7 years now. With Julia you will
find that you can solve problems more quickly and elegantly than you have done
in the past. And as a cherry on top, computationally intensive code will run
blisteringly fast.
Introduction
Software is everywhere around us. Every program you see on your computer,
smartphone or tablet has been made by someone who wrote code in a programming
language.
But those aren’t the only places you’ll find software. You may think of a
computer as a box that sits on your desk or as a laptop, but there are tiny
computers we call microcontrollers inside almost every kind of technical device
we use. In cars, for instance, these little computers figure out how much
gasoline needs to be injected into the engine cylinder.
Have you ever seen the Falcon 9 rocket land at sea on a barge by firing its rocket
engines right before it smashes into the deck? There are numerous computers
inside this rocket keeping track of how fast the rocket is going, how far it is from
the ground and exactly how much thrust it has to apply, and for how long, to
avoid crashing.
4. Batteries included. Don’t you hate it when you unpack a cool new thing
and it doesn’t work, because batteries are sold separately? A lot of
programming languages are like that, but not Julia.
5. Fast. You can write code that runs slowly in any language, but Julia gives
you the ability to write very high-performance code.
Overview
The chapters in this book are meant to be read in sequence, and build upon each
other. However not every single chapter needs to be read.
This book tries to balance the needs of three different kinds of readers:
Another problem was to figure out the correct angle to orient the artillery can-
nons to hit their targets. Say you want to hit an enemy bunker 8 km away. At
10 WORKING WITH NUMBERS
This is a mathematical problem, and the poor grunts fighting the war could not
bust out a pen and paper and start doing complicated math calculations each
time they wanted to fire a cannon. How do you think they did it?
The soldiers manning the artillery cannons had books with numerous tables. The
tables would tell them at what elevation (angle) to set the cannon in order to fire
the artillery projectile (cannon ball) the desired distance. But these tables could
get very complicated, because so many things affect how far the cannon ball
would go:
• wind
• amount of gunpowder
• the particular kind of cannon (artillery) used.
This meant they needed countless tables, which is why during WWII the Allies
had huge rooms filled with people calculating these tables. People doing these
calculations were called computers. That was what a computer was before
electronic computers: a person doing lots of calculations.
The first computers were made to replace thousands of human computers, so
that all these tables could be quickly calculated by a machine. Let us look at
how these calculations were done.
Angle of reach
Say you have a cannon and you want to shoot at an enemy. Your enemy is at a
distance $d$ from you. The cannon ball you fire exits the cannon with a velocity
$v$. To what angle does your cannon need to be elevated?
That angle is called the “angle of reach” and is calculated as follows, where $g$
is the acceleration caused by gravity:
$$\theta = \frac{1}{2} \arcsin\left(\frac{g d}{v^2}\right)$$
Figure 4: A triangle with sides of length a, b and h. The longest side h is called
the hypotenuse.
$$\sin(\theta) = \frac{a}{h} \qquad \arcsin\left(\frac{a}{h}\right) = \theta$$
Normally you give an angle to the sin function. The result is $a/h$, where $h$ is
the hypotenuse and $a$ the length of the side of the triangle opposite the angle
$\theta$.
Thus if we want to calculate multiple trajectories for our artillery book we could
use Julia as a calculator.
julia> 0.5*asin(9.81*12000/762.425^2)
0.10196244313304187
julia> 0.5*asin(9.81*25000/762.425^2)
0.217772776595389
The angles here are in radians so they will be from -π to π, rather than from 0
to 360.
Imagine doing thousands of these calculations. No wonder they needed rooms
full of people calculating! This is going to get boring and tedious even with a cal-
culator. And they did not have electronic calculators but mechanical calculators
which could basically only add and subtract.
This poses several problems. We have to keep writing the same long numbers
over and over again. It is boring and time consuming to write 762.425 again
and again. Sooner or later we will get the wrong answer, mixing up one digit or
forgetting another one.
That is what we do in natural language. Remember how I talked about our
ability to talk about context? Humans talking to each other don’t need to keep
repeating 762.425 when talking about the velocity. They can just say “initial
velocity”.
concepts in programming?
In this case it will actually be a constant as our initial velocity will be fixed to
the same value every time we do the calculations. But in programming we still
usually refer to it as a variable.
So what would be a good way of telling Julia that we want to use a letter or a
word to refer to a number?
There are many possible ways of doing this, which are equally valid. It is the
designers of the programming language who decide how to do it. What is most
important, is that it is done in a manner which is easy for programmers to re-
member.
v <- 762.425
v := 762.425
v = 762.425
NOTE
This is potentially confusing, because it is exactly the same symbol
used in mathematics for comparing two numbers. Keep in
mind that in Julia (and most other programming languages) = is
used for assignment and not for comparing values. The mathematical
expression $x = y$ is written x == y in Julia.
Thus on the Julia REPL (command line) we can input the values for velocity v
and the acceleration g caused by gravity:
julia> v = 762.425
762.425
julia> g = 9.81
9.81
Once written, Julia will remember the values of v and g and we can keep writing:
julia> 0.5*asin(g*12000/v^2)
0.10196244313304187
However it is still tedious to write this whole equation. Imagine writing this a
thousand times when the only thing you really change is the distance. Everything
else stays the same each time, yet we have to keep writing it.
Is there perhaps a way in which we can get Julia to not just give a name to a
number, but to give a name to a whole equation or calculation?
Functions
You guessed it, functions. Functions allow you to give names to whole calcu-
lations. Thus instead of having to remember the details of how something is
calculated, you can simply refer to a calculation by name.
Let's make a naive attempt at writing such a function:
julia> angle = 0.5*asin(g*12000/v^2)
Nope, that won’t work. All we did was calculate an angle and stick it in a variable
named angle. We can write
julia> angle
over and over again, but the problem is that we get the same result each time. We
haven’t told Julia yet which variable we want to keep changing the value of. We
want to change the distance from 8 to 12 to 25 km.
Thus somehow, when we define a function with a name, we have to tell Julia
which variable will keep changing. Having used variables already, we can use
a variable name to refer to this distance. But how can we tell Julia that the
variable named distance should keep changing but not the other variables?
Function arguments
When we define the function we specify function arguments. That is a list of the
variables which we want to change each time we use our function.
julia> angle(distance) = 0.5*asin(g*distance/v^2)
angle (generic function with 1 method)
This defines a function angle, with one single argument named distance. Now
I can write:
julia> angle(8000)
0.06771158922454301
julia> angle(12000)
0.10196244313304187
julia> angle(25000)
0.217772776595389
That is a lot better. Now we can calculate angles much faster and we don’t have
to memorize the gravitational acceleration, initial velocity and the equation any-
more.
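A function can take more than one argument. As a sketch that goes slightly beyond the text, the hypothetical angle_of_reach below also takes the initial velocity as an argument, so that tables for cannons with different muzzle velocities could be computed with one function. The name and the second argument are assumptions made for illustration.

```julia
g = 9.81  # acceleration caused by gravity, in m/s^2

# Hypothetical two-argument variant of angle: both the distance and the
# initial velocity are supplied on every call.
angle_of_reach(distance, velocity) = 0.5 * asin(g * distance / velocity^2)

angle_of_reach(12_000, 762.425)  # same result as angle(12000) above
```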
Arrays
But it is still tedious to do the calculations. Say we have multiple distances in a
list:
and you want to calculate the angle for each of these distances. Is there a simpler
way than writing angle() five times? And what if we had a hundred or a thousand
numbers? Manually calling angle for each of these numbers would be very time
consuming.
You may already be familiar with spreadsheet applications such as Microsoft
Excel or Apple’s Numbers. These programs excel at performing a calculation
on multiple numbers stored in tables. Below is an example from a spreadsheet
for calculating angles for different distances.
At the bottom of the image you can spot the formula used to calculate angles
given distances.
Julia also offers a way of working with tables of numbers. In Julia you create
the equivalent of tables with the Array data type. Julia arrays allow you to work
with numbers in rows and columns. When an array is just a row or column, we
call it one-dimensional. When an array is made up of several rows or columns
we call it a two-dimensional array or a matrix.
[Figure: a one-dimensional array v, with v[2] pointing at its second element, next to a two-dimensional array with numbered rows and columns]
We can store an array of numbers in a variable just like single numbers (scalars).
Notice how we can use _ to separate digits in long numbers.
julia> distances = [8_000, 12_000, 25_000, 31_000, 42_000]
5-element Array{Int64,1}:
8000
12000
25000
31000
42000
The benefit of having numbers in arrays is that you can work with the numbers
collectively. For instance you can tell Julia to find the average of all the numbers
in an array or the sum of them.
julia> sum(distances)
118000
To get the median or mean we need to load the Statistics package, which is part
of Julia’s standard library.
julia> using Statistics
julia> mean(distances)
23600.0
julia> median(distances)
25000.0
Accessing Elements
You can get hold of one or more values stored in an array in different ways. If
we deal with a one-dimensional array, then the position of the element is defined
by an index. If we have a two-dimensional array, the position is given by a row and
a column.
julia> distances[1]
8000
julia> distances[2]
12000
julia> distances[5]
42000
julia> distances[end]
42000
julia> two_dim[1, 2]
4
julia> two_dim[2, 2]
12
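The definition of two_dim is not shown in this excerpt. As a sketch, a two-dimensional array can be written with spaces between columns and a semicolon between rows. The values in the hypothetical matrix grid below are made up, chosen only so that the same indices give the same answers as above.

```julia
# Made-up 2×3 matrix standing in for two_dim.
grid = [2 4 6; 10 12 14]

grid[1, 2]   # row 1, column 2
grid[2, 2]   # row 2, column 2
size(grid)   # number of rows and columns
```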
map is a function which takes two arguments. The first argument is a function
and the second is an array. When you run map(f, xs) it will give you a new
array, where each number is the result of applying the function f given as first
argument to each successive number in the array xs given as second argument.
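Applied to the artillery table, map can be sketched as follows. The block repeats the earlier definitions of g, v, angle and distances so that it can be run on its own.

```julia
g = 9.81      # acceleration caused by gravity
v = 762.425   # initial velocity

angle(distance) = 0.5 * asin(g * distance / v^2)

distances = [8_000, 12_000, 25_000, 31_000, 42_000]

# Call angle on every distance and collect the results in a new array.
angles = map(angle, distances)
```

The result has one angle per distance, in the same order as the input array.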
NOTE
Higher-order functions are functions which take other functions
as arguments. So map is a higher-order function, while sum isn't.
Higher-order functions will be covered in more detail in
[Functional Programming].
Let us apply map one more time. We got our angles in radians. How about
turning all your angles into degrees? You can use the Julia function rad2deg
for this purpose.
julia> rad2deg(π)
180.0
julia> rad2deg(π/2)
90.0
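To convert a whole array at once, rad2deg can be handed to map like any other function. A minimal sketch, using the first three angles computed earlier:

```julia
angles = [0.06771158922454301, 0.10196244313304187, 0.217772776595389]

# Convert every angle from radians to degrees.
degrees = map(rad2deg, angles)
```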
Functions such as sum and map allow us to do what early computers did, when
they made artillery trajectory tables: perform the same calculation repeatedly.
This makes a computer different from a calculator. Calculators must be manu-
ally operated for every repeated calculation.
These functions may seem magical. Somehow they are able to look at each indi-
vidual element in an array and do something with it. How do you do that? Can
you build functions like this yourself? Yes you can!
Loops
Most programming languages today have what we call loop constructs or
statements. The common ones are the for loop and the while loop.
Now we will do something we haven’t done thus far. We will write functions
spanning multiple lines. When writing statements or functions which span
multiple lines we need to inform Julia of where they begin and end. All the code
between the function and end keywords is part of the function. Here is the angle
function written over multiple lines:
function angle(distance)
    0.5*asin(g*distance/v^2)
end
The first statement is a function statement which Julia can figure out by look-
ing at the first keyword. The very last end keyword indicates the end of the
function statement.
The for loop starts with the for keyword and ends with end. Every statement
between these two lines is performed multiple times. One way to think about
how a function is executed (performed) is to imagine a recursive substitution of
variables and function calls for values.
Step 1
Step 2
When calling the function we’ve got to imagine substituting the arguments of
the function with the passed value:
function addup([2, 4, 6])
    total = 0
    for num in [2, 4, 6]
        total = total + num
    end
    total
end
Step 3
Let's focus on the for loop alone. We will successively assign the variable num a
value in the array [2, 4, 6].
for 2 in [2, 4, 6]
    total = 0 + 2
end
Step 4
total has the new value 2 on the next iteration.
for 4 in [2, 4, 6]
    total = 2 + 4
end
Step 5
total is now 6, and num is also 6, as that is the last value in the numbers array.
for 6 in [2, 4, 6]
    total = 6 + 6
end
Step 6
The last expression in a function gives the value of the whole function call, so
total = addup(nums) becomes:
total = 12
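The steps above trace a call to a function addup. Its definition is not shown in this excerpt; judging from the substituted version in Step 2, it would look roughly like this sketch:

```julia
# Add up all the numbers in nums with a for loop.
function addup(nums)
    total = 0
    for num in nums
        total = total + num   # total takes the values 2, 6 and 12 for [2, 4, 6]
    end
    total                     # the last expression is the value of the call
end

nums = [2, 4, 6]
total = addup(nums)
```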
the range and j is the last index (number) in the range. 2:6 is an example of a
range.
total = 0
for x = 2:6
    total = total + x
end
total will get the value 2 + 3 + 4 + 5 + 6, as those are the values x will
successively assume.
You can use ranges in all sorts of circumstances. You can use them instead of
specifying an array.
julia> sum(2:6)
20
A more basic but equivalent way of writing the for loop above is to use a while
loop:
total = 0
x = 2
while x <= 6
    total = total + x
    x = x + 1
end
In this case we will keep performing the lines between while and end until the
condition x <= 6 is no longer true. So when x turns into 7, it will no longer
be true as 7 is not less than or equal to 6. We can compare numbers with these
operators:
• < less than
• > greater than
• <= less than or equal
• >= greater than or equal
• == equal to
• != not equal to
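As a quick check that the two loop forms agree, the sketch below wraps each in a function and compares the results with sum. The function names sum_for and sum_while are made up for this example.

```julia
# Sum the integers from `from` to `to` with a for loop over a range.
function sum_for(from, to)
    total = 0
    for x in from:to
        total = total + x
    end
    total
end

# The same sum with a while loop and a manually updated counter.
function sum_while(from, to)
    total = 0
    x = from
    while x <= to
        total = total + x
        x = x + 1
    end
    total
end

sum_for(2, 6) == sum_while(2, 6) == sum(2:6)   # all three give 20
```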
You might wonder where I am going with all this? This is prerequisite knowl-
edge to be able to explain how you can write a map function yourself. Still we
have only looked very briefly at how you access individual values in an array.
julia> angles = []
0-element Array{Any,1}
That doesn’t look right. It says we made an empty array to hold Any value. That
means we could put Bool, AbstractString or whatever there. Since we didn’t
put any numbers into it, Julia isn’t able to figure out what we intend to use it
for. So we need to help out Julia by writing the Type of the items as a prefix:
julia> angles = Float64[]
0-element Array{Float64,1}
We can use a for loop to iterate over distances and calculate angles.
julia> for dist in distances
           push!(angles, angle(dist))
       end
push! is a function which pushes values at the end of an array. You can see the
previously empty angles array has been filled with the same results as we got
previously with map.
julia> angles
5-element Array{Float64,1}:
0.06771158922454301
0.10196244313304187
0.217772776595389
0.27527867645775067
0.3938981904086532
When we put all these parts together we get our own map function. Let us call
the function transform. This map function is of course not as flexible as the
one bundled with Julia. For instance ours assumes the output is always floating
point numbers.
We cannot solve that problem until we have covered more about types. The only
types you know of thus far are different types of numbers.
function transform(fun, xs)
    ys = Float64[]
    for x in xs
        push!(ys, fun(x))
    end
    ys
end
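A sketch of how transform could be used, and how its result compares with the built-in map. The block repeats the definition so that it runs on its own; the helper double is made up for this example.

```julia
function transform(fun, xs)
    ys = Float64[]
    for x in xs
        push!(ys, fun(x))
    end
    ys
end

double(x) = 2x   # small helper, made up for this example

from_transform = transform(double, [1, 2, 3])   # always Float64 elements
from_map = map(double, [1, 2, 3])               # keeps integer elements

transform(sqrt, [4, 9]) == map(sqrt, [4, 9])    # true
```

The integer input shows the limitation mentioned above: transform converts every result to Float64, while map keeps whatever type the function returns.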
However due to sloppy writing, these are the recordings the scientists got:
Then we assign the accurate measured distances and poorly recorded distances
to two separate variables.
measured = [11.5, 10.8, 11.4, 12.1, 12.2, 10.9]
recorded = [11.5, 10.8, 71.4, 12.1, 122, 10.9]
julia> mean(recorded)
39.78333333333334
filter is similar but it expects a function that returns true or false instead of
a number. We call this a boolean value. Where do these values come from?
julia> x = 4
4
julia> x > 5
false
julia> x < 5
true
julia> x == 4
true
julia> x != 4
false
You can see that expressions using comparison operators such as >, < and ==
give true or false as a result. Functions which give a boolean result instead of
a number are called predicates. Let us define a predicate to use with the filter
function.
isvalid(x) = x < 14
julia> isvalid(15)
false
We can use it with filter to get only valid distances and calculate the average.
julia> filter(isvalid, recorded)
4-element Array{Float64,1}:
11.5
10.8
12.1
10.9
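Putting the pieces together, the average of the valid recordings could be computed as in this sketch. mean comes from the Statistics standard library, and the variable name valid is made up for the example.

```julia
using Statistics

recorded = [11.5, 10.8, 71.4, 12.1, 122, 10.9]
isvalid(x) = x < 14

# Keep only the plausible recordings, then average them.
valid = filter(isvalid, recorded)
average = mean(valid)   # about 11.3, far closer to the true values than 39.78
```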
For the actual measured values this filtering will have no impact on the result.
julia> filter(isvalid, measured)
6-element Array{Float64,1}:
11.5
10.8
11.4
12.1
12.2
10.9
julia> mean(filter(isvalid, measured))
11.483333333333334
The code between if and end is only run if the expression x < 14 evaluates to
the boolean value true. This if-statement is equivalent to:
if isvalid(x)
    push!(ys, x)
end
It does not matter what isvalid does with x as long as it evaluates to (returns)
a boolean value (true or false).
With this knowledge we can modify the pick function to make it work.
function pick(pred, xs)
    ys = Float64[]
    for x in xs
        if pred(x)
            push!(ys, x)
        end
    end
    ys
end
In this version we use the predicate function pred passed in as first argument
with the if-statement to decide whether an element should be added or not.
Let us test out our function and see if it works.
julia> pick(isvalid, [4, 14, 18, 3, 1])
3-element Array{Float64,1}:
4.0
3.0
1.0
We can compare it with Julia’s built-in filter function to see if it gives the same
result
julia> xs = [4, 14, 18, 3, 1]
5-element Array{Int64,1}:
4
14
18
3
1
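The comparison could continue along these lines. This is a sketch: the built-in filter keeps the element type of its input (integers here), while pick always returns Float64 values, so the two arrays print differently even though == treats them as equal.

```julia
isvalid(x) = x < 14

function pick(pred, xs)
    ys = Float64[]
    for x in xs
        if pred(x)
            push!(ys, x)
        end
    end
    ys
end

xs = [4, 14, 18, 3, 1]

builtin = filter(isvalid, xs)   # elements 4, 3, 1 stored as integers
ours = pick(isvalid, xs)        # elements 4.0, 3.0, 1.0 stored as floats
builtin == ours                 # true: arrays are compared by value
```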
Roman Numerals
While Roman numerals are not very practical to use today, they are useful to
learn about in order to understand number systems. In particular, when
programming you will encounter various number systems.
Both Roman numerals and the binary system used by computers may seem
very cumbersome to use. However it often appears that way because we don’t
use the numbers as they were intended.
It is hard to make calculations using Roman numerals with pen and paper
compared to Arabic numerals (which is what we use). However the Romans did
not use pen and paper to perform calculations. Rather they performed their
calculations using a Roman abacus.
It is divided into multiple columns. You can see the I, X and C column:
• In the I column every pebble is a 1.
• In X, every pebble represents 10.
• In C, every pebble represents 100.
Above each of these columns we have the V, L and D columns, which represent
the values 5, 50 and 500.
NOTE
The beauty of the Roman system is that you can quickly write
down exactly what the pebbles on the abacus say. Likewise it is
28 STORING DATA IN DICTIONARIES
Let us look at how we can use this knowledge to parse Roman numerals and turn
them into Arabic numerals. Put the code below into a text file and save it.
roman_numerals =
    Dict('I' => 1, 'X' => 10, 'C' => 100,
         'V' => 5, 'L' => 50, 'D' => 500,
         'M' => 1000)
function parse_roman(s)
    s = reverse(uppercase(s))
    vals = [roman_numerals[ch] for ch in s]
    result = 0
    for (i, val) in enumerate(vals)
        if i > 1 && val < vals[i - 1]
            result -= val
        else
            result += val
        end
    end
    result
end
THE DICT TYPE 29
Load this file into the Julia REPL environment to test it out. This is an example
of using parse_roman with different Roman numerals as input.
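The original example calls are not included in this excerpt. The following sketch repeats the code above so that it runs on its own and tries a few numerals chosen for illustration.

```julia
roman_numerals = Dict('I' => 1, 'X' => 10, 'C' => 100,
                      'V' => 5, 'L' => 50, 'D' => 500,
                      'M' => 1000)

function parse_roman(s)
    s = reverse(uppercase(s))
    vals = [roman_numerals[ch] for ch in s]
    result = 0
    for (i, val) in enumerate(vals)
        if i > 1 && val < vals[i - 1]
            result -= val   # smaller value in front of a larger one
        else
            result += val
        end
    end
    result
end

parse_roman("II")        # 2
parse_roman("xiv")       # 14, lowercase input is accepted
parse_roman("XVI")       # 16
parse_roman("MCMXCIV")   # 1994
```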
julia> dump(pair) # 3
Pair{Char,Int64}
first: Char 'X'
second: Int64 10
julia> pair.first # 4
'X': ASCII/Unicode U+0058 (category Lu: Letter, uppercase)
julia> pair.second
10
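The steps that create pair are not shown in this excerpt. A pair is written with the => operator, so the value inspected above was presumably built along these lines (an assumption):

```julia
pair = 'X' => 10     # same as Pair('X', 10)

pair.first           # the key part, the character 'X'
pair.second          # the value part, the integer 10
typeof(pair)         # a Pair of a Char and an integer
```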
When used in a dictionary we refer to the first values in each pair as the keys in
the dictionary. The second values in each pair form the values of the dictionary.
So I, X and C are keys, while 1, 10 and 100, for example, are values.
We can ask a dictionary for the value corresponding to a key. This takes a Ro-
man letter and returns the corresponding value.
julia> roman_numerals['C']
100
julia> roman_numerals['M']
1000
julia> for ch in s
           push!(vals, roman_numerals[ch])
       end
julia> vals
3-element Array{Int8,1}:
10
1
5
“XIV” is turned into the array of values [10, 1, 5] named vals. However the
job is not quite done. Later we need to combine these values into one number.
Before converting input strings, our code turns every letter into uppercase. “xiv”
would not get processed correctly, because all the keys to our dictionary are
uppercase.
julia> s = "xiv"
"xiv"
julia> s = reverse(uppercase(s))
"VIX"
Enumerate
In our for-loop we need to keep track of the index of the value val of each loop
iteration. To get the index we use the enumerate function. That is what you
see used in the line for (i, val) in enumerate(vals). Here is a simple
demonstration of how it works:
julia> collect(2:3:11)
4-element Array{Int64,1}:
2
5
8
11
julia> collect(enumerate(2:3:11))
4-element Array{Tuple{Int64,Int64},1}:
(1, 2)
(2, 5)
(3, 8)
(4, 11)
The collect function will simulate looping over something, just like a for-loop.
Except it will collect all the values encountered into an array, which it returns.
So you can see with enumerate you get a pair of values upon each iteration: an
integer index and the value at that index.
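Inside a for loop the index and the value can be unpacked directly, as parse_roman does. A small sketch, with a made-up letters array:

```julia
letters = ['X', 'I', 'V']

# Each iteration yields an (index, value) tuple, unpacked into i and ch.
lines = String[]
for (i, ch) in enumerate(letters)
    push!(lines, "$i: $ch")
end

lines   # "1: X", "2: I", "3: V"
```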
Conversion
We cannot simply add up the individual Roman letters converted to their
corresponding values. Consider the Roman numeral XVI. It turns into [10, 5, 1].
We could add that and get the correct result 16. However XIV is supposed to
mean 14, because with Roman numerals when you have a smaller value in front
of a larger one, such as IV, you subtract the smaller value from the larger.
So we cannot just sum up the corresponding array [10, 1, 5]. Instead we
reverse it and work our way upwards. At every index we ask if the current value
is lower than the previous one. If it is, we subtract it from the result. Otherwise
we add it.
if i > 1 && val < vals[i - 1]
    result -= val
else
    result += val
end
That is what val < vals[i - 1] does. It compares the current value val to the
previous value vals[i - 1]. result is used to accumulate the value of all the
individual Roman letters.
Using Dictionaries
Now that we have looked at a practical code example utilizing the dictionary
type Dict in Julia, let us explore some more ways of interacting with a dictio-
nary.
Creating Dictionaries
There are a multitude of ways to create a dictionary. Here are some examples.
Multiple arguments, where each argument is a pair object:
julia> Dict("two" => 2, "four" => 4)
Dict{String,Int64} with 2 entries:
"two" => 2
"four" => 4
Pass an array of pairs to the dictionary constructor (a function named the same
as the type it makes instances of).
julia> pairs = ["two" => 2, "four" => 4]
2-element Array{Pair{String,Int64},1}:
"two" => 2
"four" => 4
julia> Dict(pairs)
Dict{String,Int64} with 2 entries:
"two" => 2
"four" => 4
Pass an array of tuples to the dictionary constructor. Unlike pairs, tuples may
contain more than two values. For dictionaries they must only contain a key
and a value though.
julia> tuples = [("two", 2), ("four", 4)]
2-element Array{Tuple{String,Int64},1}:
("two", 2)
("four", 4)
julia> Dict(tuples)
Dict{String,Int64} with 2 entries:
"two" => 2
"four" => 4
Notice the {Any, Any} part. This describes what Julia has inferred is the type
of the key and value in the dictionary. Compare this with the other examples
where you see {String, Int64}. When you provide some keys and values upon
creation of the dictionary, Julia is able to guess the type of the key and value.
When you create an empty dictionary, Julia cannot guess the types anymore
and assumes the key and value could be Any type.
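The empty dictionary this paragraph describes could be created as in the following sketch. Because nothing is known about the contents, keys and values of any type are accepted.

```julia
d = Dict()          # no pairs given, so no types can be inferred

typeof(d)           # Dict{Any, Any}
d["two"] = 2        # a string key with an integer value ...
d[4] = "four"       # ... and an integer key with a string value both work
length(d)           # 2
```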
You can however explicitly state the type of the key and value:
julia> d = Dict{String, Int64}()
Dict{String,Int64} with 0 entries
julia> d["five"] = 5
5
This means that if you try to use values of the wrong type for the key or value, you
will get an error (something called an exception is thrown). In this case we are
trying to use the integer 5 as a key when a text string key is expected.
julia> d[5] = "five"
ERROR: MethodError: Cannot `convert` an object of type Int64 to an object of type String
Closest candidates are:
convert(::Type{T}, !Matched::T) where T<:AbstractString at strings/basic.jl:209
convert(::Type{T}, !Matched::AbstractString) where T<:AbstractString at strings/basic.jl:210
convert(::Type{T}, !Matched::T) where T at essentials.jl:171
Element Access
We have already looked at one way of getting and setting dictionary elements.
But what happens if we try to retrieve a value for a key that does not exist?
julia> d["seven"]
ERROR: KeyError: key "seven" not found
julia> d["seven"] = 7
7
julia> d["seven"]
7
But how do we avoid producing an error when we are not sure if a key exists?
One solution is the get() function. If the key does not exist, a sentinel value is
returned instead. The sentinel can be anything. The example below uses -1.
julia> get(d, "eight", -1)
-1
julia> d["eight"] = 8
8
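Besides get, Julia's standard library offers haskey to test whether a key exists and get! to look up a key and store a default when it is missing. A short sketch, using a small dictionary like d:

```julia
d = Dict{String, Int64}("five" => 5, "seven" => 7)

haskey(d, "seven")     # true
haskey(d, "nine")      # false

get(d, "nine", -1)     # -1, and d is left unchanged
get!(d, "nine", 9)     # 9, and "nine" => 9 is now stored in d
d["nine"]              # 9
```

get suits read-only lookups, while get! is handy when a missing key should be filled in on first use.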
36 SHELL SCRIPTING
It shows a taxonomy of animals, grouped into various subgroups. You can cre-
ate this hierarchy yourself either using a graphical file manager or the command
line.
Once done, you can use the Unix command line tools to go into the amphibians
directory and create an empty file called frog:
$ cd animals/vertebrates/amphibians
$ touch frog
$ cd ..
$ cd ..
$ cd ..
julia> cd("..")
julia> cd("..")
julia> cd("..")
julia> pwd()
"~"
You can see in this example, that when cd calls pwd we are in the ani-
mals/vertebrates/amphibians location, but afterwards we are back to our
home directory ~. Please note I have edited the output of pwd for clarity. You
will likely see a full path and not ~.
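The call described here is not shown in the excerpt. The sketch below reproduces the idea in a throwaway directory made with mktempdir, so it does not depend on an existing animals hierarchy. The variable names are made up.

```julia
root = mktempdir()                        # temporary scratch directory
amphibians = joinpath(root, "animals", "vertebrates", "amphibians")
mkpath(amphibians)

before = pwd()
inside = cd(pwd, amphibians)   # run pwd inside amphibians, then go back
after = pwd()

endswith(inside, "amphibians")   # true
before == after                  # true: cd(f, dir) restores the old directory
```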
This example is not very useful, so let us pair cd with a more useful function,
such as readdir. This is the Julia equivalent of the Unix shell command ls:
julia> readdir("animals")
2-element Array{String,1}:
"invertebrates"
"vertebrates"
julia> readdir("animals/vertebrates/amphibians")
2-element Array{String,1}:
"frog"
"salamander"
If we combine it with cd you can get a better sense of how useful it is to take a
function as an argument.
julia> cd(readdir, "animals/vertebrates/amphibians")
2-element Array{String,1}:
"frog"
"salamander"
Remember whenever the first argument is a function we can use the do-end
form instead. This makes it easy for us to add more files using the touch func-
tion.
cd("animals/vertebrates/mamals") do
    touch("cow")
    touch("human")
end
This can be done more succinctly by using the foreach function. foreach will
apply a function to every element in a collection.
cd("animals/vertebrates/birds") do
    foreach(touch, ["crow", "seagul", "mockingjay"])
end
We don’t have any insects yet. They belong under arthropods. If we don’t know
if that directory already exists we can use:
mkpath("animals/invertebrates/arthropods/insects")
To add crabs, for example, we need the crustaceans group. To avoid writing the same
paths out multiple times one can store it in a variable, and use joinpath to
After all these file and directory manipulations, we should have a hierarchy of
files and directories looking like this:
animals
├── invertebrates
│   ├── arthropods
│   │   ├── crustaceans
│   │   └── insects
│   ├── flatworms
│   └── molluscs
└── vertebrates
    ├── amphibians
    │   ├── frog
    │   └── salamander
    ├── birds
    │   ├── crow
    │   ├── mockingjay
    │   └── seagul
    ├── fish
    └── mamals
        ├── cow
        └── human
julia> basename("animals/vertebrates/mamals/human")
"human"
With dirname we can get the directory part of the path. For instance, what
directory the file human is inside:
julia> mamals = dirname("animals/vertebrates/mamals/human")
"animals/vertebrates/mamals"
As seen before we can join a directory path with a file to create a file path:
julia> joinpath(mamals, "human")
"animals/vertebrates/mamals/human"
Julia has various functions to get the absolute path, relative path and home
directory:
julia> abspath("animals")
"/Users/erikengheim/animals"
julia> relpath("animals/vertebrates/../invertebrates")
"animals/invertebrates"
julia> abspath(homedir())
"/Users/erikengheim"
    cd(root) do
        for file in readdir()
            visitfiles(fn, file)
        end
    end
end
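Only the tail of visitfiles survives in this excerpt. The sketch below shows one way such a recursive visitor could be completed and tried out on a throwaway directory. The name visit_each_file and the isdir check at the top are assumptions, not the book's code.

```julia
# Hypothetical completion of a recursive file visitor.
function visit_each_file(fn, root)
    if !isdir(root)
        fn(root)          # a plain file: hand its name to the callback
        return
    end
    cd(root) do
        for file in readdir()
            visit_each_file(fn, file)
        end
    end
end

# Try it on a small temporary hierarchy.
root = mktempdir()
mkpath(joinpath(root, "animals", "birds"))
touch(joinpath(root, "animals", "birds", "crow"))
touch(joinpath(root, "animals", "frog"))

visited = String[]
visit_each_file(name -> push!(visited, name), joinpath(root, "animals"))
visited   # "crow" and "frog", each named relative to its own directory
```

Because the function changes directory before recursing, the callback always runs inside the directory that holds the file it is given.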
With any function manipulating the filesystem, it is useful to back up your files
first or simply print out the actions which would have been performed, rather
than actually performing them. In this case the latter is not practical since we
actually need to create a directory to enter. Here is a walkthrough:
julia> replace_animal("foobar")
shell> ls foobar
description.txt looks.jpg
When you are confident it works, you can visit all the files and perform a replace:
julia> visitfiles(replace_animal, "animals")
When are these functions useful? Remember how we created rocket engines
and tanks by reading CSV files? In this case we processed every line and ev-
ery line produced an object. In such cases, seeking through a file and marking
positions has little value.
However in other cases you work with larger files where there are only particu-
lar parts you are interested in or the data isn’t clearly structured by lines. For
instance when parsing a source code file, a statement doesn’t necessarily limit
itself to a line.
Let us use the construction of the book you are reading as an example. It was
originally written in markdown, but there are many flavors of markdown and
you may have to switch from one type of markdown to another. In this case
most of the text can be preserved but there are particular syntactic structures
you want to change.
For instance in Pandoc or GitHub style markdown, inline math equations are
written as:
NAVIGATE INSIDE FILES 43
$y = 10x + b$
While in Markua style markdown, inline math equations would be written as:
`y = 10x + b`$
Converting this kind of text can be tricky, because you have to distinguish inline
math which uses a single $ and math blocks which use double dollar signs $$,
like this:
$$y = 10x + b$$
The function below takes the name of a file, opens that file, searches for inline
math equations, replaces them and writes the result back to the file.
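function replace_inline_math(filename)
    out = IOBuffer()
    open(filename) do io
        while !eof(io)
            # Read up to and including the next dollar sign.
            s = readuntil(io, '$', keep=true)
            if !endswith(s, '$')
                # No dollar sign left: copy the remaining text and stop.
                write(out, s)
                break
            end
            s = chop(s)   # drop the trailing dollar sign again
            write(out, s)
            if endswith(s, '`')
                # `...`$ is already Markua style: keep it unchanged.
                write(out, '$')
                continue
            end
            mark(io)
            s = readuntil(io, '$')
            if isempty(s)
                # Two dollar signs in a row: a math block, copied unchanged.
                reset(io)
                s = readuntil(io, raw"$$")
                write(out, '$')
                write(out, s)
                write(out, raw"$$")
                continue
            end
            # Inline math: wrap the equation Markua style.
            write(out, '`')
            write(out, s)
            write(out, raw"`$")
        end
    end
    seekstart(out)
    s = read(out, String)
    open(filename, "w") do io
        write(io, s)
    end
end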
size is also partially an outcome of the imperative nature of the code: state is
repeatedly mutated. More functionally oriented code tends to be easier and more
natural to write as short functions.
Anyway, let us talk through this function. The bulk of the code is made up of the
while !eof(io) loop which keeps reading from the file. The loop ends when
we have reached EOF (End Of File).
As you can see, checking the last character of s verifies whether this was
the case. In this case we are lucky because we don’t need any further processing.
There is no need to continue running the rest of the code; we can just skip to
the beginning of the loop again, hence the use of continue.
Readuntil
Most of the code is built around skipping through the text until the next
interesting part using the readuntil function.
It will be tricky to understand this code without a clarification of how exactly this
function works. We will go through some simple examples to demonstrate how
it works. The IOBuffer type gives a practical way of simulating interactions
with a file. It allows us to treat a simple text string as if it were the contents of a
file.
julia> buf = IOBuffer("two + four = six");
julia> eof(buf)
true
Notice that after readuntil has reached the end of the IO stream object (EOF),
it will just keep returning empty strings.
Also observe that the character +, which we read until, gets swallowed but not
included in the string returned. We can include it if we want to.
We use seekstart to move to the start of the stream, so we can repeat the
reading, this time with keep=true, to retain the character we are reading until.
julia> seekstart(buf);
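To see the whole sequence in one place, here is a minimal sketch of how readuntil behaves on an IOBuffer, written as a script rather than a REPL session. The variable names a to e are only illustrative and do not come from the original walkthrough.

```julia
# Treat a string as if it were the contents of a file.
buf = IOBuffer("two + four = six")

a = readuntil(buf, '+')   # "two "   the '+' is consumed but not returned
b = readuntil(buf, '=')   # " four "
c = readuntil(buf, '=')   # " six"   no more '=', so the rest is returned
at_end = eof(buf)         # true
d = readuntil(buf, '=')   # ""       nothing left to read

# Rewind and read again, this time keeping the delimiter.
seekstart(buf)
e = readuntil(buf, '+', keep=true)   # "two +"
```

Once the stream is exhausted, every further call returns an empty string, which is why loops over a stream normally check eof first.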
In our relace_inline_math function you can see that we read until the dollar
sign without keeping it in the returned string s.
s = readuntil(io, '$')
The reason for this is that we are replacing expressions enclosed with dollar
symbols with backticks. Thus we don’t need to save the dollar symbols.
if isempty(s)
    reset(io)
    s = readuntil(io, raw"$$")
    write(out, '$')
    write(out, s)
    write(out, raw"$$")
    continue
end
We deal with math blocks by locating the end of the block using readuntil(io,
raw"$$"). Other than that we write to our output stream exactly what we read.
Afterwards we are done and can jump to the beginning of the loop with continue.
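The interplay between mark, reset and readuntil is easier to follow on a small input. The following is a sketch using an IOBuffer in place of a file; the sample text and variable names are made up for illustration.

```julia
io = IOBuffer(raw"a $$y = 10x + b$$ c")

before = readuntil(io, '$')      # "a "
mark(io)                         # remember the position just after the first '$'
lookahead = readuntil(io, '$')   # "" because the very next character is also '$'

reset(io)                        # jump back to the marked position
block = readuntil(io, raw"$$")   # "$y = 10x + b", up to the closing "$$"
rest = read(io, String)          # " c"
```

The empty lookahead is what tells us we hit a math block. After reset, the second dollar sign is read again as part of block, which is why the function only has to write one extra '$' in front of it.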
Seekstart
Our output is an IOBuffer named out which we keep writing our transformed
text to. When done processing we want to write the contents of this IOBuffer
to the same file as the one we read from. That is accomplished with this code
segment:
seekstart(out)
s = read(out, String)
open(filename, "w") do io
    write(io, s)
end
Notice that we use seekstart(out) before calling read. That is because at this
point we are at the end of the IO stream object. Any attempt at reading from it
would produce an empty string. There is nothing at the end. We need to move
to the beginning and read from there.
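A short sketch makes the effect of seekstart visible. The second half shows String(take!(...)), which is an alternative from Julia's standard library and not what the function above uses; the buffer names are illustrative.

```julia
out = IOBuffer()
write(out, "transformed text")

at_end = read(out, String)       # "" because we are positioned at the end
seekstart(out)
from_start = read(out, String)   # "transformed text"

# Alternative (not used in the function above): take! hands back all the
# bytes written so far, without any need to rewind first.
out2 = IOBuffer()
write(out2, "transformed text")
taken = String(take!(out2))
```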
But how can this functionality be utilized from Julia instead of reimplementing
the functionality from scratch? Let me begin by showing you a code example.
function findfiles(start, glob)
    readlines(`find $start -type f -name $glob`)
end
julia> files[1:3]
3-element Array{String,1}:
"animals/vertebrates/mamals/human/looks.jpg"
"animals/vertebrates/mamals/cow/looks.jpg"
"animals/vertebrates/amphibians/frog/looks.jpg"
julia> files[end-3:end]
4-element Array{String,1}:
"animals/vertebrates/amphibians/salamander/description.txt"
"animals/vertebrates/birds/mockingjay/description.txt"
"animals/vertebrates/birds/seagul/description.txt"
"animals/vertebrates/birds/crow/description.txt"
Notice that we are able to read the output from the process as if it were a regular
file. Using readlines we even get an array of strings, which we can easily slice
and dice.
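If you do not have the animals directory at hand, the sketch below tries findfiles on a throwaway directory instead. It assumes a Unix-like system where the find command is available; the directory and file names are made up for the test.

```julia
function findfiles(start, glob)
    readlines(`find $start -type f -name $glob`)
end

# Build a tiny temporary directory tree to search in (hypothetical names).
root = mktempdir()
mkpath(joinpath(root, "birds", "crow"))
touch(joinpath(root, "birds", "crow", "looks.jpg"))
touch(joinpath(root, "birds", "crow", "description.txt"))

jpgs = findfiles(root, "*.jpg")      # full paths of the matching files
names = basename.(jpgs)              # only the file names
everything = findfiles(root, "*")    # both files
```

Note that "*.jpg" reaches find unchanged. Julia passes each interpolated value as a single argument and no shell is involved, so the pattern is not expanded before find sees it.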
Let us look at a simple example to better understand how this works:
julia> dir = "animals"
"animals"
julia> cmd = `ls $dir`
`ls animals`
julia> typeof(cmd)
Cmd
Notice this kind of looks like string interpolation. The value of the variable dir
gets interpolated with $dir. However the backticks cause a Cmd object rather
than a String object to be created.
A Cmd object can be opened and read from just like a regular file:
julia> io = open(cmd)
Process(`ls animals`, ProcessRunning)
julia> typeof(io)
Base.Process
julia> readline(io)
"invertebrates"
julia> readline(io, keep=true)
"vertebrates\n"
julia> close(io)
The io object returned when we open a Cmd object is of type Process. Remember
the IO type hierarchy we showed before. It shows that Process is a concrete
type at the bottom of this hierarchy.
[Figure: IO type hierarchy. IO at the top, with IOContext (IOContext(io,
properties), haskey(io, key), get(io, key, default)) and Process (read and write
to external process) below it; regular filesystem files are also shown.]
In this case we are using run instead of readline. This is basically the same as
running the shell command and getting the output sent to stdout (your terminal
window).
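To make the contrast concrete, here is a minimal sketch, assuming a Unix-like system with an echo command. The variable names are only illustrative. read captures the output as a string, while run lets the output go straight to the terminal.

```julia
cmd = `echo hello`

text = read(cmd, String)   # captured as a string: "hello\n"

proc = run(cmd)            # prints hello to stdout and returns a Process
ok = success(proc)         # true if the command exited without error
```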
Pipes
In a Unix shell we have an awesome concept called pipes. These allow you to
pipe the output of one command into the input of another command. Here is a
demonstration of this concept:
$ ls animals/vertebrates
amphibians birds fish mamals
$ ls animals/vertebrates | sort -r
mamals
fish
birds
amphibians
The ls command sends a list of filesystem entries to stdout. The sort command
will take everything you write on the keyboard (stdin) and send it sorted
to stdout (console).
However by using the pipe symbol | we connect the stdout of ls to the stdin of
sort -r. The sort command has no idea that its input is coming from another
command.
Pipes gave a lot of flexibility to early Unix systems. Small programs doing a
single thing could be chained together using pipes to create new functionality.
We can create this sort of pipe between Julia Cmd objects as well:
julia> dir = "animals/vertebrates"
"animals/vertebrates"
julia> pipe = pipeline(`ls $dir`, `sort -r`);
julia> io = open(pipe);
julia> readline(io)
"mamals"
julia> readline(io)
"fish"
julia> readline(io)
"birds"
julia> close(io)
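The same idea can be tried without the animals directory. This sketch assumes a Unix-like system with ls and sort, and uses a temporary directory with made-up file names. It collects all the lines at once with readlines instead of calling readline repeatedly.

```julia
# Create a temporary directory containing three empty files.
dir = mktempdir()
foreach(name -> touch(joinpath(dir, name)), ["a", "b", "c"])

# Connect the stdout of ls to the stdin of sort -r.
pipe = pipeline(`ls $dir`, `sort -r`)
sorted = readlines(pipe)   # ["c", "b", "a"]
```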
Environment Variables
Another important part of working with the Unix shell is environment variables.
These are accessible through a special global dictionary called ENV.
julia> ENV["JULIA_EDITOR"]
"mate"
julia> ENV["SHELL"]
"/usr/local/bin/fish"
julia> ENV["TERM"]
"xterm-256color"
julia> ENV["LANG"]
"en_US.UTF-8"
Environment variables can be useful in many contexts, not just when working
with the shell. For instance, TextMate, the text editor I typically use for
programming, has a plugin system based around:
• Stdin and stdout redirection.
• Environment variables.
A plugin script basically reads input from stdin and writes output to stdout.
In addition, information can be conveyed from TextMate to the plugin script
through environment variables. Scripts launched by TextMate will see these
environment variables. You don’t see them in your regular shell.
This is based on the Unix behavior whereby running programs (processes) inherit
their environment from their parent process (the process that spawned
them). Here are some of the environment variables used by TextMate:
To use Julia code in your plugin, you have to turn the source code file into an
executable script.
#!/usr/local/bin/julia
row = ENV["TM_LINE_NUMBER"]
col = ENV["TM_LINE_INDEX"]
println(row, ", ", col)
The first line has to start with a hashbang #!, followed by the location of the
interpreter to run the script. Because the hash symbol # marks the beginning of
comments in most scripting languages, including Julia, this line is ignored by the
interpreter executing the code.
Another variant commonly used when you don’t want to hardcode the location
of the interpreter is to use the env command:
#!/usr/bin/env julia
In this case the OS will use the Julia executable which can be located using the
PATH environment variable.
Our example script is not very useful, other than to demonstrate how a plugin-
system can work. The picture below shows the plugin editor for TextMate,
where the code for the plugin has been added.
column 3:
$ export TM_LINE_NUMBER=8
$ export TM_LINE_INDEX=3
How you set environment variables will differ depending on the shell you use.
The example above is from the Bash² shell, as that is widely used. If you use
the Fish³ shell instead, it would be:
$ set -x TM_LINE_NUMBER 8
$ set -x TM_LINE_INDEX 3
We can then run the script from the command line and see what output it gives.
$ ./where.jl
8, 3
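The same experiment can also be driven from Julia instead of the shell. The sketch below assumes a Unix-like system with the printenv command. withenv sets the variables temporarily, and the child processes started inside the block inherit them. The last line shows get with a default, which avoids an error when a variable is not set.

```julia
# Temporarily set the two variables, then ask child processes to report them.
row, col = withenv("TM_LINE_NUMBER" => "8", "TM_LINE_INDEX" => "3") do
    (readchomp(`printenv TM_LINE_NUMBER`), readchomp(`printenv TM_LINE_INDEX`))
end
println(row, ", ", col)

# Outside the block the temporary values are gone again, so a script that may
# run without them can fall back on a default instead of failing.
fallback = get(ENV, "TM_SURELY_NOT_SET_DEMO", "unknown")
```

The variable name TM_SURELY_NOT_SET_DEMO is hypothetical and only chosen because it is unlikely to exist.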
Now you may wonder why I picked TextMate as an example, given that it is not
a widely used text editor and only works on macOS. It is very simple: Most other
text editors have their plugin-system tied to specific programming languages.
In this example animals, -type, f, -name and "*.jpg" are the command line
arguments. If you want to create a shell command by writing a Julia script you
need a way of obtaining these arguments.
This is done in a very similar way to how we obtained environment variables.
Instead of a dictionary we have a global variable named ARGS, containing all the
arguments.
Here is a simple demonstration of replicating the Unix cat command:
#!/usr/bin/env julia
for file in ARGS
    s = read(file, String)
    print(s)
end
We loop over each element in the ARGS array, each of which should contain a file
name. Then we read the file and print its contents.
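A slightly more defensive variant is sketched below. The helper name printfiles and the wording of the error message are my own inventions for illustration. Taking the output stream as an argument makes the function easy to try out without running it as a script.

```julia
#!/usr/bin/env julia

# Hypothetical helper: write the contents of each file to io,
# skipping names that do not refer to a readable file.
function printfiles(io::IO, files)
    for file in files
        if isfile(file)
            write(io, read(file, String))
        else
            println(stderr, "cannot read ", file)
        end
    end
end

printfiles(stdout, ARGS)
```

Calling printfiles with an IOBuffer instead of stdout lets you collect the output in memory and inspect it.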
Say we put this code inside a file called cat.jl. We then have to give it execute
permission:
$ chmod +x cat.jl
To test the command I made two files foo.txt and bar.txt, with a single line
in each. You can test with whatever files you like.
² The Bourne Again Shell, which is a play on the name Bourne Shell.
³ Fish is a lesser-known, user-friendly shell.
$ ./cat.jl foo.txt
foo text
$ ./cat.jl bar.txt
bar text
The reason you need to put ./ in front of the file to execute it is that
we have not placed it in a location stored in the PATH environment variable. If
you put cat.jl in, for instance, /usr/local/bin or another location which the
OS typically searches for executable files, then you would not need the ./ prefix.
If we write the same in Julia it is far more obvious what is being done:
s = "Hello World"
println(replace(s, "World" => "Mars"))
println(s[4:end])
Using Julia you get access to superior handling of arrays, and the ability to do
set operations, and write proper functions.
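As a small illustration of the array and set operations mentioned above, the sketch below compares two lists of file names. The lists are made up; in a real script they might come from readdir.

```julia
expected = ["description.txt", "looks.jpg"]
found = ["looks.jpg", "notes.md"]    # hypothetical directory listing

missing_files = setdiff(expected, found)      # expected but not found
extra_files = setdiff(found, expected)        # found but not expected
common = intersect(expected, found)           # present in both lists
jpgs = filter(f -> endswith(f, ".jpg"), found)
```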
The major downside is that Julia is not installed on every operating system.
However Julia programs can be ahead-of-time compiled for easier distribution.
We will not cover that in this book as that is a more advanced topic. Anyone
interested should explore the PackageCompiler package.