MATLABsimplified
MATLABsimplified
Getting Started
T
his introduction to programming and signal processing with
MATLAB is aimed towards teaching basic programming skills to
scientists who have never programmed before. While computers
increasingly allow collections of large datasets, software for
analyzing these datasets typically lags behind. Scientists are then left with
two choices. The first is to try to get by with pre-packaged software. These
software packages are often less than ideal for the task at hand, and often
lead to a computer aided ‘brute-force’ data analysis. While this is sometimes
adequate, if any of the parameters used in the data analysis change, it often
has to be repeated again by brute force. The second option is to learn how
to program with a software package geared towards scientists and engineers,
such as MATLAB. While this may involve a fair amount of pain at the
beginning, being a proficient programmer allows you the flexibility to ask
questions unbounded by what software packages currently exist. A major
advantage of writing your own software is that if you decide to change
parameters in the analysis, it is little effort to make a small change and run
the software again to process the data. OK—so there is a third
option…paying someone to program for you. But, programming is fun, so
why pay someone else to do it?
5
code until you get it right. Do not get discouraged. It is a rewarding
experience to break through the bugs to get something that works.
MATLAB
MATLAB was chosen as the programming environment for this text,
because it is easily available (although, relatively expensive) and has a
tremendous number of pre-programmed routines available to the scientific
programmer. There are other languages that you could use to do the same
things, such as C or the open source language Python. You will find that
once you learn to code in MATLAB, it is not too difficult to migrate to
another language. They all share the same basic constructs.
Launch MATLAB
The examples in this text will use MATLAB 7. While earlier versions of
MATLAB will work just as well, MATLAB 7 has vastly improved graphics
editing capabilities. The basic functionality of MATLAB can be extended
with various toolboxes. Some chapters will use the filter functions of the
signal processing toolbox. Otherwise, the basic MATLAB program will be
sufficient for the examples in this text. A student version of MATLAB is
available.
6
Figure 1-1. MATLAB interface.
Type
>>x=10
MATLAB tells you x is now equal to 10 and a new variable ‘x’ has appeared
in the pane on the upper left called the workspace (Figure 1-2).
7
Figure 1-2. Screenshot showing creation of a variable in MATLAB.
>>y=x/20
A new variable ‘y’ is created in the workspace, and its value is calculated as
0.5.
8
Figure 1-3. Creation of a second variable in the MATLAB workspace by dividing x by
20.
Double click on the ‘x’ variable in the workspace, and you will see a
spreadsheet window pop-up showing you the value of ‘x’. Close this
window.
9
If a line does not end with a semicolon ‘;’, its output will be printed on the
MATLAB screen (as in Figures 1-2 and 1-3). Many times you do not need
to see the output of every variable. Use a semicolon ‘;’ to suppress the
output.
>> y=x/20;
10
Table 1-1 MATLAB Basic Math Functions
Function Command
Addition +
Subtraction -
Multiplication *
Division /
Power ^
Exponential exp
Average mean
11
Matrices
Setting variables to single values is great, but if you have a lot of data to
analyze it will not get you far. Fortunately, MATLAB allows you to create
and perform math on arrays or matrices of numbers. In fact, that is why it is
called MATLAB, for ‘MATrix LABoratory’.
Type:
>> z=[1:10];
Note that nothing seems to have happened. The semicolon suppressed the
output to the screen. But, notice that a new variable ‘z’ has appeared in the
workspace.
Figure 1-5. Creation of a matrix with values from 1 to 10 in the variable ‘z’.
Double click on the ‘z’ in the Workspace to open it in the array editor. You
will now see that ‘z’ has values ranging from 1 to 10 increasing by 1 (Figure
12
1-6). Also notice that the Size of ‘z’ listed in the workspace is 1x10. That
tells you there is 1 row and 10 columns.
Say we want these values to go from 10 to 100. We can simply multiply ‘z’
by 10.
Type:
>>z=z*10
This is an important trick (it saves you from coming up with a lot of variable
names). In plain English it means, ‘the new values of z will equal the old
values of z times 10’. The new values of z will be shown as:
z =
10 20 30 40 50 60 70 80 90 100
A First Program
Most (if not all) programming texts start off with a simple program called
‘Hello World’ that prints the text ‘Hello World’ to the screen. Well, this one
won’t. What you really want to do as a scientist is manipulate, analyze, and
plot data. So, our first program will be a simple program to create and plot
two variables against one another. We’ll call this the scientific equivalent of
‘Hello World’.
Example 1-1
>> x=[0:10];
>> y=[0:2:20];
>> plot(x,y)
14
Figure 1-7. Basic MATLAB plot.
Program Description
>> x=[0:10];
This line creates a matrix in variable ‘x’ from 0 to 10 counting by 1.
>> y=[0:2:20];
This line creates a matrix in variable ‘y’ from 0 to 20 counting by 2. If you
look in the Workspace window in MATLAB you will see both ‘x’ and ‘y’ are
matrices of size 1x11 (i.e. 1 row and 11 columns).
>> plot(x,y)
Plot is one of the most fundamental MATLAB commands. There are many
variations of plot. This one plots two variables against each other. The first
variable in the list is plotted on the x-axis and the second variable is plotted
on the y-axis. It is important that each variable have the same number of
values. Once you type plot, a new window is created and in this case titled
‘Figure No. 1’. You can edit the plot using the menus in the Figure.
15
Figure 1-8. MATLAB help. Use it.
You can also access basic help descriptions from the MATLAB command
line.
Type:
You will see a detailed (but not complete) text help for the MATLAB plot
command.
Try This
Using the MATLAB help try plotting the same data from Example 1-1 with
red dots (without the line) so that the output looks like Figure 1-9.
16
Figure 1-9. Variation on MATLAB plot.
Mathworks Online
The MATLAB help file is also available online (www.mathworks.com). The
Mathworks also posts m-files written by other users of MATLAB. These
are often quite good and can save a good deal of time programming basic
functions. So, if you think MATLAB should have a command for what you
are trying to do, but doesn’t, check their online database of programs.
17
2
Chapter
Learning to Program
This chapter introduces two basic programming constructs: loops and
conditional statements. The majority of programs can be written with these
programming constructs. Up until now we have just run through a set of
statements that mover from one to the next. Loops are used to cycle
through blocks of code to repeatedly perform the same set of operations.
Conditional statements are used to test for certain conditions, such as
equality of a variable with a value, and are used in program flow control.
To create an m-file, click on the File menu, select New and then M-file.
Type in the commands used for Example 1-1 (see Figure 2.1).
Save file as example1.m (Click on the File menu, then click on Save. Be sure
to remember where you saved this file.).
The first two lines that start with a percentage sign (%) are comments. Any
line that starts with a % is ignored by MATLAB. Comments are important
18
for documenting your m-files, especially when they get complicated. They
allow other people to figure out what your code does, and remind you when
you go back to an m-file that you wrote months ago.
Figure 2-1. An example m-file using the commands from Example 1-1.
MATLAB Path
Before you can run the m-file, you need to make sure that MATLAB knows
where to find the file you just created.
The MATLAB path is simply the list of directories where MATLAB ‘looks’
for your m-files. It starts at the top of the list and works its way down. So, if
you save a new m-file to a folder that is not in the path, and then try to run it
from the command line, you will get an undefined function error.
>> example1
19
To add the correct path, click on the File menu and select Set Path (Figure
2-2).
Click Add Folder, then search for the folder where you saved example1.m.
Click OK.
Click the Save button in the Set Path dialog box, then click the Close button.
To run example1.m simply type the m-file name in the MATLAB command
window:
>> example1
Figure 2-2. MATLAB Set Path dialog box that allows the addition of directories for
MATLAB to search for m-files.
20
TIP: M-file naming
For all of these examples, type the code below into an m-file and then run it.
For Loops
The easiest way to learn about for loops is to run one.
Example 2-2
% example2: a basic for loop
for x=1:3 % loop x from 1 to 3 counting by 1
y=x*2 % do some math; y is x times 2
end % end of for loop
When you run this example, you get the following output.
>> example2
y = 2
y = 4
y = 6
Just like the MATLAB colon operator, the ‘for’ loop can step using different
step sizes. Try the following code example:
Example 2-3
for x=10:-2:2 % loop x from 10 to 2 stepping by -2
y=x*2 % do some math;
end % end of for loop>> x=[0:10];
If
The most important control statement is the ‘if’ statement. ‘If’ statements
are used to check whether a value satisfies a certain condition, and if it does
it executes a block of code. Like loops there are several versions of the
‘if…then’ statement. Here we will learn the most basic version.
Example 2-4
for x=1:5 %loop from 1 to 10
if (x>4) % check if x is >4
x % show the x value to the screen
disp('X is greater than 4');
end % end of if block
end % end of for loop
Example 2-4 is a simple (and somewhat useless) example that sets up a for
loop where x will step from 1 to 5. The ‘if’ statement checks whether x>4.
If x is less than or equal to 4, as it is on the first four steps, then the program
skips to the ‘end’ following the ‘if’. If x is greater than 4, as it is when x=5,
the lines of code after the ‘if’ statement are executed. The first just displays
the value of x to the command window. The ‘disp’ command displays the
text between the single quotes to the command window.
22
TIP: Indentation
Notice in Example 2-4 how the program is indented after the ‘for’ and ‘if’
statements. This is done so that it is clear where the ‘if’ and ‘for’ code blocks
are. Since each has an ‘end’, it is easy to see which statements are within the
‘if’ statement and which are within the ‘for’ loop.
Table 2-1. Relational operators that can be used with ‘if’ statements.
Operation Symbol
Equal ==
Not equal ~=
23
Table 2-2. Logical operators that can be used to test multiple conditions in
an if statement.
Operation Symbol
And &
Or |
Example 2-5 demonstrates the use of the double equal sign and the ‘or’
logical operator to test whether a value is odd or even.
The ‘else’ statement is useful when you want to run a set of commands
when the ‘if’ statement fails.
Example 2-5
for x=1:5 % loop from 1 to 5
x % display x value
if((x==1)|(x==3)|(x==5)) % test whether x is odd
disp('x is odd'); % if x is odd, say x is odd
else % otherwise, if x is not odd
disp('x is even'); % say x is even
end % end of if
end % end of for loop
Now you know most of the basic tools of programming to actually get
something accomplished.
Let’s look at a more useful program that uses the looping functions. Say we
have a time series of temperature measured every hour for one month, and
we want to calculate the minimum, maximum, and average temperature for
each day.
First we need to get the time series into MATLAB (the time series has been
supplied as temp.txt). To do this we will use the MATLAB importing
features.
24
Navigate to temp.txt and click on the file. This will open the MATLAB
import wizard, which will load the data file into a variable called temp. Now,
let’s write an m-file to step through the temp matrix to calculate the
minimum, maximum, and average for each day.
Example 2-6
% tempanalysis
% m-file to calculate daily minimum, maximum, and average
% temperatures.
% This program assumes that temperature data have been
% loaded into the Workspace and given the variable name
% temp.
% The temp data have been collected hourly for one month
Program Description
The ‘length’ function returns how many data points are in a given matrix if it
is one-dimensional (has 1 row or 1 column). For matrices that are two-
dimensional (has multiple rows and columns), ‘length’ returns the number of
row. The purpose of this function is to determine how much data we have
to process. This lets you feed the same program with datasets with different
amounts of data without reprogramming.
25
counter=0; %initialize counter to 0
A counter is a variable that gets incremented in a program, and is usually
used to keep track of the number of times something has been done. Here
we initialize a counter, which we have called ‘counter’, to zero. It will later
be used to provide and index into the matrices holding the minimum,
maximum, and average temperatures.
The value returned by the min function is saved in a matrix called minT.
The (counter) part allows us to save into specific columns in the matrix. The
first time through the loop, counter=1, so the value is saved in minT(1); i.e.
the first column. On the second time through the loop, counter=2, so the
value is saved in minT(2); i.e. the second column.
26
maxT(counter)=max(temp(x:x+23)); % get max value
avgT(counter)=mean(temp(x:x+23)); % calc average
The max function calculates the maximum value, just as the min function
calculated the minimum value. The mean function calculates the average.
end
This is the end of the for loop. You can see that there are four functions in
this for loop. The first increments the counter, and the other three calculate
the minimum, maximum, and average values. The for loop continues until
x>=nsamples-23. Why do we have the -23 in here? Think about what
would happen in the min function. If on the last pass through x=nsamples,
the min function would read min(temp(nsamples:nsamples+23)). If the
temp matrix only has a total of nsamples, you can’t ask for data beyond that;
nsamples+23 would not be in the matrix.
Try running the program without the -23 in the for loop, and you get my
favorite MATLAB error:
??? Index exceeds matrix dimensions.
28
3
Chapter
>> test=importdata('test.txt')
29
What you will likely get is the following error:
This is because MATLAB can’t find your file. MATLAB looks in the
current directory for the text file, and no further. It does not use the
MATLAB path to look for data files, just m-files.
>> test=importdata('c:\test.txt')
Note that any text characters you pass to MATLAB functions need to be
surrounded by single quotes as above. Otherwise, MATLAB will try to
interpret the text as another variable in the workspace.
Changing Directories
Another way to handle this is to change the current directory to the directory
that has your data in it prior to running the importdata command.
There are two ways to do this. The first is to use the Browse button on the
MATLAB toolbar, and navigate to the directory of interest.
30
The second way to do this is to use the change directory command, cd.
>> cd('c:\')
Now, you can import the test.txt file without specifying the path.
>> test=importdata('test.txt')
Header Lines
Many files you will want to import will have header lines containing text.
Since MATLAB arrays typically hold only one kind of data (e.g. numbers or
text), you need to be able to deal with these when you import data.
If you look at the Help file for importdata, you will see several extended
forms of this command. One form is able to deal with header lines.
A = importdata(filename,delimiter,headerline)
Create new text file, this time with a header line. Let’s say you have
maximum temperatures for six days in a month. There are two columns,
one for day, and one for temperature. Make sure the values are separated by
spaces in the text file. Save this in the ‘c:’ directory as temperature.txt.
Day Temperature
10 22
11 23
12 24
13 21
14 20
15 25
>> temperature=importdata('c:\temperature.txt','
',1)
We are loading the values into a matrix called temperature. The delimiter is
set to a space (' '), because that is the character separating columns
(commas and tabs are also commonly used as delimiters). Headerline is set
to 1, because the first row contains the header text data. MATLAB returns
the following:
temperature =
32
the column headers). Structures are useful ways of organizing data of
different types, like character text and numbers.
MATLAB Structures
In the example above temperature is a structure containing three matrices.
The syntax for accessing particular matrices within a structure is as follows:
structure.matrix, where structure is the structure name, and matrix is the name of
the matrix within the structure you want to access.
>>temperature.data
ans =
10 22
11 23
12 24
13 21
14 20
15 25
>> temperature.colheaders
ans =
'Day' 'Temperature'
33
If we wanted to plot the date versus the temperature, we can address each
part of the structure separately.
>>
plot(temperature.data(:,1),temperature.data(:,2)
);
>> xlabel(temperature.colheaders(1));
>> ylabel(temperature.colheaders(2));
We get this plot, which has the appropriate label for each axis.
25
24.5
24
23.5
23
Temperature
22.5
22
21.5
21
20.5
20
10 10.5 11 11.5 12 12.5 13 13.5 14 14.5 15
Day
34
File Open Example
For Example 3-1 we will build on Example 2-6 and modify it so that it can
automatically load and process multiple files within a directory.
We will make three text files with temperature data in them as before (called
temperature1.txt, temperature2.txt, and temperature3.txt). Place these all in
one folder called Example31 (with no other text files).
temperature1.txt
Day Temperature
10 22
11 23
12 24
13 21
14 20
15 25
temperature2.txt
Day Temperature
10 25
11 26
12 22
13 19
14 20
15 31
temperature3.txt
Day Temperature
10 12
11 13
12 14
13 11
14 10
15 15
35
Example 3-1
% Example 3.1
% M-file to calculate minimum, maximum, & average temperatures
% This program assumes there are multiple .txt files with
% temperature data in the same folder.
temperature=importdata(files(x).name,' ',1);
% calculate average
avgT(x)=mean(temperature.data(:,2));
end
Program Description
>> files
files =
This tells us files is a 3x1 structure array. The 3 is because we put three .txt
files in the directory. Here we are just concerned with the file names. To
look at the file names we can type into the MATLAB command window:
>> files.name
ans =
temperature1.txt
ans =
temperature2.txt
ans =
temperature3.txt
37
These are the names of all of the *.txt files in the current directory. You
need to be careful when using this command, because the filenames are
retrieved in alphabetical order. Don’t count on them being in chronological
order.
for x=1:nfiles
The for statement will increment from 1 to nfiles. In this case nfiles=3,
because we placed three .txt files in the directory.
temperature=importdata(files(x).name,' ',1);
The importdata command is used as before. This time we need to use
the variable x to get the file name from the files structure. The first time
through the loop, x=1, so files(1).name will be ‘temperature1.txt’. The ' '
tells importdata that the text files are space-delimited. The 1 means that
the first row is a header. The file is then loaded into the structure called
temperature
minT(x)=min(temperature.data(:,2));
maxT(x)=max(temperature.data(:,2));
avgT(x)=mean(temperature.data(:,2));
The structure temperature holds both the header text as well as the date and
temperature data, as we saw above. The min function finds the minimum
value from the second column of temperature.data, which contains the
temperature readings. The max function finds the maximum value, and the
mean function calculates the average.
The values are stored in new matrices called minT, maxT, and avgT, which are
incremented with the x. So, minT(1) holds the minimum value of the second
column in the file temperature1.txt. avgT(2) holds the mean value of the
second column in the file temperature2.txt.
end
38
The end statement is the end of the for loop. When the end is reached,
the program execution returns to the for line, and increments the value of
x. This way each file is opened in turn and analyzed.
Next Steps
You can see the power of this simple routine. It can go through a directory
with thousands of files, determine their names, open them, and process
them. Simply by changing the path variable setting, any folder could be
analyzed.
There were undoubtedly bits of the examples here that did not make sense.
The structure returned by dir was not fully explained. Check out the
MATLAB Help file to learn what data the additional members of the
structure hold. Explore these in the command window. These other
structure members can come in handy when sorting through large numbers
of data files.
There are also many other ways to import data into MATLAB. We will
explore some of these in later chapters, but it is also a good idea to read this
section of the MATL
39
4
Chapter
Matrix Manipulation
Up until now we have just created matrices by either using MATLAB’s
colon operator or by reading in text files. There are many times you will
want to add or delete rows or columns. This chapter covers basic matrix
manipulation in MATLAB as well as some techniques for using data
returned from one function to index into a matrix.
Deleting Rows
Let’s create a simple [5x3] (5 row by 3 column matrix).
Here we have directly entered the data manually into an array. Values in
columns are separated by spaces (you could also use commas), and rows are
separated by semicolons. If we look at the value in the array editor we see
our five rows by three columns.
40
To delete the second row, type:
>> a(2,:)=[];
Here we use the Null matrix operator (closed square brackets) to delete the
part of the matrix specified by (2,:), which means the 2nd row and all of the
columns. In the Array Editor window you can see that the second row has
been deleted.
Deleting Columns
Deleting columns is similar to deleting rows.
>> a(:,3)=[];
This tells MATLAB to delete ([]) from the matrix a, all of the rows (:) from
column 3. The data in the array editor now look like:
If you get real angry at MATLAB, you can delete an entire matrix.
41
>> a=[];
This is actually useful when you want to make sure a matrix is empty before
filling it with data again. Say you had loaded a matrix with 10 rows of data.
Then you start processing another dataset and load only 5 rows of data into
the same matrix. You will still have the last 5 rows from the first set of data
you loaded. So, you need to clear the matrix before loading the second set.
Concatenating Matrices
Concatenating means combining. Say you have to matrices that you want to
combine into one. Perhaps they were data files from different years that had
the same format, and you want to put them all together.
>> a=[1:10];
>> b=[20:2:30];
Matrix a.
Matrix b.
42
Matrix a goes from 1 to 10, counting by 1. Matrix b goes from 20 to 30,
counting by 2. So, matrix a has 10 values, and matrix b has 6 values.
Let’s concatenate them into a new matrix (c), which should end up with 16
values.
Again, we have used the square brackets to manipulate matrices. This time
we told MATLAB to put a and b into the same matrix. The space means to
put them in the same row.
You could also use the function horzcat to accomplish the same thing.
>> c=horzcat(a,b);
>> c=[a;b];
43
You get an error saying that the dimensions are not consistent. What
happened here was that you were trying to make a new matrix where the
first row was a and the second row was b. The problem is that a had 10
values, but b only had 6 values. Standard MATLAB arrays do not allow
missing values (unlike most spreadsheet programs).
For example, let’s create a [2x10] matrix that we can flip using some of the
techniques we just learned.
>> a=[1:10];
>> b=[21:30];
>> c=vertcat(a,b);
So, matrix a is 1 row x 10 columns and counts from 1 to 10. Matrix b is also
1x10, and counts from 21 to 30. We use the vertcat function to create
matrix c, which is now 2 rows x 10 columns. Below you can see these values
in the Array Editor window.
>> d=c';
44
Flipping (flipud and fliplr)
Another useful technique is flipping a matrix. MATLAB provides flipud
(flip up-down) to flip a matrix along rows, and fliplr (flip left-right) to
flip a matrix along columns.
>> e=fliplr(c);
Now, instead of starting at 1, the matrix starts at 10 and counts down across
columns.
>> r_unsorted=rand(10,1)*10;
45
List of 10 random numbers between 0 and 10.
>> r_sorted=sort(r_unsorted);
You can see the MATLAB help file for other versions of sort (like
descending). Often you have a two-dimensional matrix that you want to
sort based on the values in one column, but keeping the rows together. The
MATLAB function sortrows does this.
>> r_unsorted=rand(5,4)*100;
46
Now, let’s sort by rows in ascending order on the first column.
>> row_sorted=sortrows(r_unsorted);
>> a=[1:5];
>> b=[2:2:10];
>> c=a.*b;
47
If you tried using just * (without the period), you would get an error in this
case.
Remember, these are the data files we had for Example 3-1.
temperature1.txt
Day Temperature
10 22
11 23
12 24
13 21
14 20
15 25
temperature2.txt
Day Temperature
10 25
11 26
12 22
13 19
14 20
15 31
temperature3.txt
Day Temperature
10 12
11 13
12 14
13 11
14 10
48
15 15
Example 4-1
% Example 4.1
Program Description
temperature=importdata(files(x).name,' ',1);
This is the same line as in Example 3-1. It imports the data from the file
where the filename is in the files structure. The second argument indicates
the file is space-delimited, and the third argument indicates that there is 1
header line.
end
This is the end of the for loop. Notice that we have not done any of our
minimum, maximum, or mean calculations inside the loop. That is because
we are loading all of the temperature values into one matrix first.
50
After the loop all of the temperature values are in the all_temp matrix, so we
can use the simple min, max, and mean functions to calculate the values
for each column. Compare this to what we did in Example 3-1. There are
many ways to achieve the same end in programming, and usually it doesn’t
matter which route you take (as long as you get there).
51
5
Chapter
The two fundamental aspects of digitization are the sample rate and the
sampling resolution. They are important to understand, because it is
important to understand the limitations and potential pitfalls of digital
sampling.
52
Sampling Rate
This plot shows a continuous tidal signal, which cycles once per day (Fig. 5-
1).
0.5
Tidal Height (m)
-0.5
-1
0 5 10 15 20 25 30 35 40 45 50
Time (h)
Figure 5-1. Time series showing tidal height continuously. This is an analog
signal.
Tides can be measured sampled by recording water depth every hour (Fig. 5-
2). In this example, the sample rate is 1/hour. The sample period is
1/sample rate; and is thus, 1 hour. This digital signal captures tidal variation.
0.5
Tidal Height (m)
-0.5
-1
0 5 10 15 20 25 30 35 40 45 50
Time (h)
Figure 5-2. Digital sampling of the analog signal from Fig 5-1. Each dot
shows a digital sample measured hourly.
53
Aliasing
Natural signals are never so clean. There may be variation in water height
due to waves, which vary on a shorter time scale than the tide we are
interested in measuring. If we ignore this we run the risk of collecting
erroneous data.
Figure 5-3 shows a more realistic dataset for what water depth might look
like when there are waves.
1.5
Tidal Height (m)
0.5
-0.5
-1
0 10 20 30 40 50
Time (h)
Figure 5-3. Tidal time series with variation in height caused by waves.
If we collect data once every hour, we end up collecting erroneous data with
respect to the tidal signal we are characterizing (Fig. 5-4).
54
2
0.5
-0.5
-1
0 10 20 30 40 50
Time (h)
Figure 5-4. Digital sampling of the time series from Fig. 5-3 with digital
samples measured every hour.
This is more obvious if we connect the sampled data points with a line.
There are oscillations in the measured time signal that were not present in
the original signal. This type of undersampling is called aliasing. Aliasing is
bad.
1.5
Tidal Height (m)
0.5
-0.5
-1
0 10 20 30 40 50
Time (h)
Figure 5-5. Aliasing as a result of sampling an analog signal that varies faster
than the digital sampling rate.
55
Avoiding Aliasing
Nyquist Theorem
It is usually (but not always) easy to avoid aliasing. The first rule of thumb
(the Nyquist Theorem) is that you need to sample the data at least twice as
fast as the highest frequency in the system. For example, if you know that
the highest frequency of change in the system is once per hour, you need to
sample at least twice per hour. Indeed, you are better off sampling at an
even higher rate to give yourself some room for error.
Put another way, the sampling period needs to be at least half of the shortest
period of change in the data. So, if you know that the shortest period of
change of the signal is 1 day, you need to sample at least every 0.5 days.
Anti-Aliasing Filters
The second way to avoid aliasing is to use an anti-aliasing filter on the signal
from the transducer before you digitize the signal. A low-pass filter will let
frequencies lower than the cut-off frequency through the filter. If you have
a low-pass filter with a cut-off of 10 Hz (10 cycles per second), you need
only sample the filtered signal at 20 Hz (20 samples per second). In our tide
example, if we had an analog pressure transducer reporting the water depth,
we could (in theory) use a low-pass filter to filter out the wave ‘noise’ from
the tidal signal before digitizing.
A second way to do this is to sample the water depth at a high enough rate
to capture the variation due to waves (say once per second), and then
average the resulting data to yield the tide signal. Simple averaging is a
method of low-pass filtering. We will cover filters in more detail in later
chapters, as there are more sophisticated techniques that yield filters with
more desirable properties.
There are times when it is not possible to use an anti-aliasing filter, such as
when you are sampling populations of animals and it is impossible to do it
continuously. Consider monitoring changes in zooplankton populations at
one location over time with a plankton net tow. Each plankton net tow
might last 10 minutes with 20 minutes in between tows to rinse and prepare
the samples. In this case, there is no way to know if the plankton population
signal is being aliased. You would have to develop a new sampling protocol
to be sure to avoid aliasing. Such a protocol might involve some sampling at
high rates to understand the high frequency variability. In other cases, you
may know something fundamental about the system that limits the rate of
variability. For instance, if you are studying nesting birds, and count the
56
number of eggs laid, you may know that the biology of bird reproduction
does not allow more than one clutch of eggs to be laid per day. This
fundamental knowledge can be used to constrain sampling rates.
Sampling Resolution
Now that you know how often to sample, you need to also consider the
resolution with which you record the measurements. Up until now we have
ignored how numbers are stored on computers. There are several
computer data formats, which are directly tied to digital sampling
considerations and to other issues such as file formats.
Bits
Computers store information as bits whose value can be either 0 or 1. This
is a binary code—meaning that each digit can have only two values. The
reason for using a binary code is that data can be transmitted using switches
(a signal is either on or off). Binary coding is a base-2 numbering system.
To code numbers that are higher than one, you must string bits together.
Table 6-1 illustrates how 8 bits can be used to represent values from 0-255 (a
total of 256 values).
57
Table 5-1. Binary numbering system using 8 bits to represent an integer
value.
00000000 0
00000001 1
00000010 2
00000011 3
00000100 4
00000101 5
... ...
11111111 255
58
Digital Resolution
If you sample a signal with 8-bit resolution, it means that 8-bits are used to
encode the value at each time point the signal is sampled (Figure 5-6). This
gives you 256 different values that can be used.
Time Value
Point
1 0
2 39
3 75
4 103
5 121
6 128
7 121
Figure 5-6. 8-bit encoding of an analog signal. At each time point the level
is measured and encoded as an 8-bit number.
While this looks good, you can see that there will be values that can not be
encoded with 8-bits. For instance, say your measurement from Figure 6-6 at
time point 2 was 39.2. There is no way to encode this as an 8-bit integer.
So, it will be rounded to 39. This rounding is called quantization error, and
it raises two issues: how to increase measurement resolution and how to
represent decimal values.
Measurement Resolution
Since the number of values that can be coded in binary is 2bits, to increase the
resolution increase the number of bits (Table 5-2). For example, a 12-bit
number can encode 212, or 4096, different values.
59
Table 5-2. Relationship between number of bits and the number of values
that can be represented.
Decimal Numbers
Decimal numbers can be represented in different ways. For instance some
bits can be used to represent the integer part of a decimal number, and the
remaining bits can be used to represent the fractional part of a decimal
number. A more commonly used format is floating point, which is the
default MATLAB data type. Double precision floating point uses 64 bits to
encode a number, with 15 decimal digits and an exponent. Double-
precision floating point numbers will not code more than 15 decimal places.
However, most data acquired from instruments is saved at a lower
resolution, like 16-bit integers. When it is imported into MATLAB, it is
usually encoded as double precision data. This essentially decreases the
amount of rounding error that occurs in subsequent calculations (at the
expense of memory).
60
8-bit Binary Data: 00000001 00000001 00000001
00000001
Integer Value: 1 1 1 1
If the same data file is read in as 16-bit binary data, the values are
misinterpreted.
MATLAB has the flexibility to load data files with arbitrary data formats.
For instance, if you need to load a 16-bit binary file you can use MATLAB’s
basic file reading functions.
%name of file
filename='c:\tides.I16';
Table 5-3. Common MATLAB data types for use with fread.
62
MATLAB Functions
RMS
One way to characterize signal levels of oscillating signals is to calculate the
RMS level. The reason is if a signal oscillates around a zero value, the
average value of the signal will be zero, no matter what the amplitude. The
RMS technique overcomes this problem by squaring all of the values before
taking the average.
RMS stands for root-mean-square. Simply put, first you square the data (to
change negative values to positive values), then you take the mean of those
squared values, and finally take the square root of the result to get back to
the original units. There is no MATLAB function to calculate rms levels, so
let’s write a simple one that we can use in the future.
Type in the following rms function and save it in a new folder called
MyToolbox as rms.m. The function file name has to be the same name as the
function. This is where you will save all of the MATLAB functions you
create, so that you can easily find them and use them. Remember to add
MyToolbox to the MATLAB Path (see Chapter 1).
function h=rms(data)
% h=rms(data)
% calculates root-mean-square
% of input matrix data
63
% Calculate the mean of the squared data
mean_ds=mean(datasquared);
Program Description
function h=rms(data)
% h=rms(data)
% calculates root-mean-square
% of input matrix data
The first line of a MATLAB function starts with the word function followed
by a space and then the function declaration. In this case the function
declaration is h=rms(data). This determines what you will type from
the MATLAB command line or other m-files to use the rms function. The
matrix data will be passed from the calling function (i.e. the command line or
other m-file). The matrix h will be returned by the rms function, so we
want to make sure that we put the final result into h.
The three commented lines following the function declaration are important.
They will be the lines returned to the screen when you type:
MATLAB returns:
h=rms(data)
calculates root-mean-square
of input matrix data
These are useful in that they describe the syntax for how to use the rms
command, and what the function does.
64
% Square the data using cell-by-cell multiplication
datasquared=data.*data;
First we square the data by multiplying the data matrix by itself. Notice that
the .* function is used, because we want to multiply cell-by-cell.
%name of file
filename='c:\tides.I16';
tideRMS=rms(tides);
Program Description
65
tideRMS=rms(tides);
The rms function is used to calculate the rms of tides. The data we pass to
the rms function is the tides matrix. We store the resulting rms value in the
matrix tideRMS. Notice that in the rms function, we used the matrix h to
store the rms value. The calling function does not need to use the same
argument name or returned matrix name as the rms function.
Look in the MATLAB workspace. Notice that there are no matrices called
data, data_squared, mean_ds, or h. The matrices created in the rms function
are only in memory temporarily; they are not saved in the MATLAB
workspace.
66
6
Chapter
Fourier Transform
This chapter introduces one of the most powerful (and old) techniques for
analyzing the periodicity of signals in a time series. We will not go through a
derivation of the Fourier Transform, but instead present an intuitive
description of what it is and how it is used in MATLAB. Other texts (e.g.
Digital Signal Processing for Scientists and Engineers) may be consulted for
mathematical details of how the Fourier Transform works.
The Fourier transform converts a signal from the time domain into a signal
in the frequency domain. When we use the word ‘domain’ we are referring
to the units on the x-axis. The term ‘transform’ means that you can convert
in both directions (Fig. 7-1).
Figure 6-1. Schematic of process of Fourier Transform converting from time domain into
67
frequency domain, and the reverse process, the Inverse Fourier Transform. Each plot has
the signal amplitude on the y-axis.
The Fourier Transform is one of the most commonly used tools for
analyzing signals. The idea behind it is that any signal can be broken down
into the sum of sine waves with different amplitudes and phases.
To understand the Fourier Transform we will create some sine wave signals
in MATLAB and then analyze them with the Fast Fourier Transform (FFT).
The FFT is a mathematically efficient way of calculating the Fourier
Transform.
1. To create a sine wave we first need to decide on our sample rate. Let’s
create a 1 s long signal with a 100 Hz sample rate. Note that we subtract off
one speriod from the dur when we create the signal, since we start counting
from 0.
2. Now let’s make a sine wave with a frequency of 2 Hz (i..e. 2 cycles per
second). Note that since t is a matrix, we use ‘.*’ in the equation with t.
In the plot see that we have exactly two sine wave cycles in one second.
68
1
0.8
0.6
0.4
0.2
-0.2
-0.4
-0.6
-0.8
-1
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
3. Now, let’s look at this signal with the FFT. First, we use the fft function
to calculate the Fourier Transform.
sfft=fft(s); % calculate Fast Fourier Transform
The first thing to note is that the values are returned as a complex number
which contains both the frequency components and the phase components.
For now we are just going to look at the frequency components by using the
abs (absolute value) function to return the magnitude of the FFT.
If you examine the sfft_mag matrix in the workspace, you will see it has the
same number of points (100) as the original time series. The plot shows two
peaks. If we zoom in on the first peak we see that it is plotted at index 3.
69
The second peak is the mirror image of the first peak rotated around the
Nyquist frequency. Since our sample rate is 100 Hz, the Nyquist frequency
would be at 50 Hz. Since the first index (or bin) in the FFT corresponds to
a frequency of 0 Hz, the index at 3 corresponds to 2 Hz.
60
50
40
30
20
10
0
0 20 40 60 80 100 120
Figure 6-3. Plot of magnitude of FFT of 2 Hz signal. Note that the x-axis values
correspond to the indices of the sfft_mag matrix, they do not correspond to frequency.
70
50
45
40
35
30
25
20
15
10
0
1 2 3 4 5 6 7
Examine the magnitude of the peak at index 3. Notice that it is 50. If you
look back at Figure 6-2, you will see that the peak amplitude of the sine wave
was 1, not 50. Because of the way MATLAB returns the FFT, it needs to be
scaled by the number of points used in the FFT.
fftpts=length(s);
hpts=fftpts/2; % half of the number points in FFT
sfft_mag_scaled=sfft_mag/hpts; % scale FFT result by
half of the number of
points used in the FFT
plot(sfft_mag_scaled); % plot scaled fft result
Now when we plot the results, the y-axis is properly scaled to 1 (Fig. 6-5).
71
1
0.9
0.8
0.7
0.6
0.5
0.4
0.3
0.2
0.1
0
0 20 40 60 80 100 120
Figure 6-5 is now properly scaled for the amplitude, but we still do not have
a proper frequency scale. So, we need to create one. First, we need to know
the frequency resolution (or binwidth) we obtained from the FFT.
In our example:
So, our binwidth (or frequency resolution) is 1 Hz. Since the first bin in the
FFT result is 0 Hz, we create a matrix for the frequency axis starting at 0.
The resulting plot, now has both axes correctly scaled (Figure 6-6).
72
from 0 to sample rate.
Since we are counting from zero,
we subtract off one binwidth to
get the correct number of points
plot(f,sfft_mag_scaled); % plot fft with frequency axis
xlabel(‘Frequency (Hz)’); % label x-axis
ylabel(‘Amplitude’); % label y-axis
0.9
0.8
0.7
0.6
Amplitude
0.5
0.4
0.3
0.2
0.1
0
0 10 20 30 40 50 60 70 80 90 100
Freuqency (Hz)
xlabel('Frequency (Hz)');
ylabel('Amplitude');
73
1
0.9
0.8
0.7
0.6
Amplitude
0.5
0.4
0.3
0.2
0.1
0
0 5 10 15 20 25 30 35 40 45 50
Frequency (Hz)
Figure 6-7. First half of FFT magnitude results plotted, showing peak at 2
Hz.
74
Measuring Amplitudes of Specific Frequencies
The frequency resolution of the FFT is determined by the number of points
in the FFT.
Example,
If you expect a signal at 800 Hz, and you want to measure its magnitude,
then you would look in the 101st bin. This is because 800 Hz/8 Hz
binwidth=100. Since 0 Hz is the first bin, 8 Hz the second, 16 Hz the 3rd...
you would need to look in bin 101 (not bin 100).
[magnitude, index]=max(sfft_mag_scaled(1:hpts))
Returns:
magnitude = 1
index = 3
So, to calculate the frequency of the peak signal, we need to look up the
index in our frequency scale variable.
maxfreq=f(index)
If the number of points in the signal is longer than fftpts, MATLAB will
truncate the signal and only use the first fftpts.
If the signal is shorter than fftpts, MATLAB will zero-pad the signal before
calculating the FFT.
Additional Exercises
1. Create a 1 second long sine wave with a 1000 Hz sample rate and a
frequency of 10 Hz and amplitude of 1.
2. Create a 1 second long sine wave with a 1000 Hz sample rate and a
frequency of 100 Hz and amplitude of 1.
4. Perform FFT on signal from part 3 and plot with appropriate axes.
6. Perform FFT on signal from part 5, and plot with appropriate axes.
76
7
Chapter
Digital Filters
The two filter types we will discuss are named the Finite Impulse Response
(FIR) filter and the Infinite Impulse Response (IIR) filter. While their names
may sound complicated, the implementation of these filters is
straightforward and involves nothing more than multiplication and addition.
77
Finite Impulse Response (FIR) Filters
Let’s start with an FIR filter that you are probably already familiar with, but
didn’t know it had such a fancy name: the moving average. In a moving
average a certain number of points in the time series are averaged together
over a sliding window to create a smoothed time series.
For example, say we have the following data (Table 7-1), and we wish to
calculate a 3-point moving average. We add the first three rows (4+2+1) to
get 7, which we divide by 3 to calculate the average. Then we slide the
window down one place, and add the second through fourth rows (2+1+5)
to get 8, which we divide by 3 to calculate the average.
2 7 7/3=2.333
1 8 8/3=2.667
5 9 9/3=3
3 13 13/3=4.333
5 9 9/3=3
1 8 8/3=2.667
2 9 9/3=3
78
This is the general scheme of an FIR filter. The only difference is that in
FIR filter implementations we first multiply by a fraction (i.e. divide) and
then sum the results. For example, for a three-point moving average we
would multiply each value by 0.3333333 first and then sum the results. You
can see that this is mathematically the same as adding first and then dividing
by the number of values added together.
For our first FIR filter in MATLAB we will implement a 5-point moving
average, so we will multiply each value by 0.2 (=1/5) and then add the
results together. We will apply this filter to a random noise sequence, so we
can see the affect of the moving average.
The first step is to create an array with the factors we are going to use in the
moving average. These are referred to as filter coefficients . They are also
referred to as taps . We are going to use the MATLAB (and standard)
nomenclature and call these ‘b’. We then use the filter function to filter a
random time series called raw_data.
npts=1000;
b=[.2 .2 .2 .2 .2]; % create filter coefficients for 5-
point moving average
x=(rand(npts,1)*2)-1; % raw data from -1 to +1
filtered_data=filter(b,1,x); % filter using 'b'
coefficients
subplot(2,1,1); % 1st subplot
plot(x); % plot raw data
title('Raw Data');
Program Description
79
filtered_data=filter(b,1,x);
When using the filter function for FIR filters, the second argument is always
1. The reason for this will become clear later when we discuss IIR filters.
We pass it the filter coefficients ‘b’ and the data to be filtered ‘x’.
One can see how the moving average removes the peaks, and smooths, the
original time series (Figure 7-1).
Raw Data
1
0.5
-0.5
-1
0 100 200 300 400 500 600 700 800 900 1000
Filtered Data
1
0.5
-0.5
-1
0 100 200 300 400 500 600 700 800 900 1000
Time
80
Figure 7-1. Smoothing a random time series with a 5-point moving average. The moving
average removes the peakiness of the raw data.
You need to pay attention to this for large filters (filters with a lot of
coefficients), because it results in a time shift in the filtered signal.
81
Program Description
The resulting plots (Figure 7-2) shows that the original signal has energy at
all frequencies. The signal filtered with the 5-point moving average shows a
reduction in energy at the higher frequencies. This makes intuitive sense,
because the filters smooths out the sharp excursions. This is a type of low-
pass filter, because it allows low-frequencies to pass through.
82
Raw Data
0.08
0.06
0.04
0.02
0
0 50 100 150 200 250 300 350 400 450 500
Filtered Data
0.08
0.06
0.04
0.02
0
0 50 100 150 200 250 300 350 400 450 500
Frequency
Figure 7-2. Frequency domain plots from FFT of time series from Figure 7-1. The top
plot shows the original noise signal had energy at all frequencies. The bottom plot shows
that the 5-point moving average acts as a low-pass filter (and filters off high frequencies and
lets low frequencies through the filter).
ntaps=20;
83
b=ones(1,ntaps)/ntaps; % create filter coefficients for
ntap-point moving average
There are many types of IIR filters, usually named for some fantastic
mathematician. None are named for a biologist. They include Butterworth,
Chebyshev, Bessel, and elliptic. For this section we will just demonstrate use
of the Butterworth filter (mostly because we like the MATLAB name for it:
butter).
Let’s start with a noise signal and assume it is sampled at 1000 Hz and is
2000 points long (i.e. 2 seconds in duration).
% Calculate FFT
fftpts=npts; % number points in FFT
hpts=fftpts/2; % half number of FFT points
binwidth=srate/fftpts;
f=[0:binwidth:srate-binwidth];
subplot(2,2,3)
plot(t,filtered_data);
title('Filtered Time Series');
xlabel('Time (s)');
subplot(2,2,2)
plot(f(1:hpts),x_fft(1:hpts));
title('Raw FFT');
subplot(2,2,4)
plot(f(1:hpts),filtered_fft(1:hpts));
title('Filtered FFT');
xlabel('Frequency (Hz)');
Program Description
85
order FIR filter (the filter order is 1 less than the number of ‘b’ filter
coefficients).
If we look at the ‘b’ coefficients, we see that they are not all the same value:
0.167179268608490
0.668717074433960
1.00307561165094
0.668717074433960
0.167179268608490
86
Raw Time Series Raw FFT
1 0.08
0.5 0.06
0 0.04
-0.5 0.02
-1 0
0 0.5 1 1.5 2 0 200 400 600
1 0.06
0 0.04
-1 0.02
-2 0
0 0.5 1 1.5 2 0 200 400 600
Time (s) Frequency (Hz)
Figure 7-3. Raw time series (top) filtered with a 4th order Butterworth filter with a 300
Hz corner frequency (bottom). The time series are shown on the left and the frequency
domain plots are shown on the right.
Try This
1. Try changing the filter order to see the effect on the sharpness of the
cutoff.
FDAtool: MATLAB has a Filter Design and Analysis Toolbox that can be
used to create and visualize filters. The ‘b’ and ‘a’ coefficients can then be
exported for use in an m-file. Just type:
87
>FDAtool
Figure 7-4. MATLAB’s FDAtool showing the calculation of a 5th order Butterworth
high-pass filter.
88
In the IIR filter there is also feedback in which the ‘a’ feedback coefficients
are multiplied by the previous outputs and subtracted from the feedforward
results. Thus, each filtered output value of y can be calculated as:
y(n)=b(1)*x(n)+b(2)*x(n-1)+ ... + (feed-forward)
b(nb+1)*x(n-nb)
89
8
Chapter
Energy Detectors
The first type of signal detection we will implement is the energy detector.
Energy detectors use a threshold to determine when the signal of interest
goes above a certain level.
90
Step 2: Rectify signal.
% Plot signal
subplot(5,1,1)
91
plot(t,signal);
hold on
%plot red dot on location of sine wave start
plot(sinepos/srate,0,'.r');
hold off
title('Original Signal');
Original Signal
5
0
-5
0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5
Step 1. Filter Signal
2
0
-2
0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5
Step 2. Rectify Signal
2
1
0
0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5
Step 3. Envelope Signal
1
0.5
0
0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5
Step 4. Threshold Signal
1
0.5
0
0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5
Time (s)
Figure 8-1. Original signal with sine wave embedded in a noise signal. The lower four
plots show the sequential signal processing steps used to detect the signal in the time domain.
In this case we will filter the signal with a 10 Hz wide bandpass filter
centered at the frequency of the sine wave. See the second plot in Figure 9-1
to see the result of the filtering in the time domain. Note that in each plot
we have plotted a red dot at the start time of the sine wave.
subplot(5,1,2)
plot(t,signal_f); %plot filtered signal
hold on
%plot red dot on location of sine wave
plot(sinepos/srate,0,'.r');
hold off
title('Step 1. Filter Signal');
subplot(5,1,3)
plot(t,signal_r); %plot filtered signal
hold on
%plot red dot on location of sine wave
plot(sinepos/srate,0,'.r');
hold off
title('Step 2. Rectify Signal');
93
Step 3. Envelope Signal
We want to detect when the tone starts and when the tone stops. We do
not want to detect each small peak corresponding to the tone, so we will
low-pass filter the signal to extract the envelope (or the edges of the signal
that you might trace). Note that there are other techniques for doing this.
In this example we use a 6-pole Butterworth filter with a corner frequency at
10 Hz. This captures the envelope of the signal (Figure 8-1, fourth plot).
Notice that the filters are producing delays in the timing of the sine wave
(the signal is after the location of the red dot).
subplot(5,1,4)
plot(t,signal_env); %plot filtered signal
hold on
%plot red dot on location of sine wave
plot(sinepos/srate,0,'.r');
hold off
title('Step 3. Envelope Signal');
In this case we set the threshold value to half of the amplitude of the original
sine wave. The result is either 0 or 1. It is 1 when the signal_env time series is
greater than the threshold value. This has been plotted on top of the
enveloped signal, and appears as a rectangular shape where the enveloped
time series exceeds the threshold. This is the gated signal. You can see that
there is a delay between the onset of this detected signal and the original
signal.
94
% Step 4: Threshold signal
threshold=amp_t/2;
gated=(signal_env>threshold);
subplot(5,1,5)
plot(t,signal_env); %plot filtered signal
hold on
%plot red dot on location of sine wave
plot(sinepos/srate,0,'.r');
plot(t, gated, 'r'); % plot threshold signal
hold off
title('Step 4. Threshold Signal');
xlabel('Time (s)');
Now we need to determine the time when the gated signal first increases to
1. The diff function performs a derivative-like process in which a new value is
calculated by subtracting the previous value from the current value. The time
series that results has one less point than the original time series, because
there is no point to subtract from the first point in the time series. For our
gated time series, this results in a 1 at the time of signal onset and a -1 at the
time the signal goes back below the threshold (Figure 8-2).
d_gated=diff(gated);
plot(d_gated);
1
0.5
-0.5
-1
0 500 1000 1500 2000 2500 3000 3500 4000 4500 5000
Figure 8-2. Differenced time series in which a value of +1 occurs at the sine wave onset
and a value of -1 occurs at the end of the sine wave.
95
find
The find function can be used to find the indices of values equal to +1.
find(d_gated==1)
This returns 2689, which is the index in the differenced time series equal to
1. To find the time corresponding to the onset, we need to use the index+1:
2690.
t(2690)=2.689
So, we detect the signal at 2.689 seconds. In this case, we know the signal
started at sinepos, which is index 2496. This corresponds to 2.495 s. Thus,
there is approximately a 200 ms delay in detection of the signal. The source
of the delay is from each of the filters and from the thresholding, which
defined the start as the time the enveloped signal exceeded half of the sine
wave amplitude. You can see that if the threshold was lowered, the delay
would be reduced. The downside of this is that it increases the probability
of falsely detecting background noise.
Try This
96
9
Chapter
Image Processing
Images are composed of pixels (picture elements). This makes them easy to
store in MATLAB, because a two-dimensional matrix can be used to map a
color or intensity to each pixel in an image. The purpose of this chapter is to
introduce you to the basic MATLAB functions for importing images and
plotting them in figures. MATLAB also produces a more advanced image
processing toolbox, which has advanced functions for image anlaysis,
including Fourier analysis, edge finding, and filtering. MATLAB also
produces a mapping toolbox (maps are also images) that has useful features
for including maps in MATLAB figures.
Note that images can also be constructed from vectors, which are simply the
instructions for how to draw lines and polygons to recreate an image. We
will restrict our discussion to pixel-based images.
imagesc: image scale is similar to image, but it allows you to scale the intensity
of the pixel range.
97
colormap: color maps are used by indexed images, where the matrix value
points into another colormap matrix where each index specifies the pixel
color (e.g. three values for RGB indexed images).
Click on the y matrix in the workspace view, and you will see a series of
columns of alternating between 1 and 0 (Fig. 9-1). You can imagine this as
our image. These values will directly map to pixels in the image, with the
value of 1 being one color and the value of 0 being another color.
98
Figure 9-1. Array Editor view of the y matrix showing alternating columns of 1’s and
0’s. This matrix will be used to create the image plot.
Plotting the Matrix as an Image
imagesc is used to plot the y matrix as an image.
The clim color limit sets the range of values that we want to use to scale the
color in the image. Since we want a black and white image, with 1 for white
and 0 for black, we set the color limits to [0 1]. Once imagesc has created the
99
figure with the image, we set the colormap to gray. This results in a figure
with alternating white and black bars (Fig. 9-2).
10
12
14
16
18
20
1 2 3 4 5 6 7 8 9 10
Figure 9-2. Figure created using imagesc of matrix from figure 10-1. The rows are plotted
starting from the top of the figure.
Colormap
To understand what colormap is doing, click on the Edit menu of the figure
and select Colormap...
100
Figure 9-3. Colormap Editor showing currently selected gray colormap. Note that the
Color data min value is set to 0 (this is what makes 0 values black in the image), and the
Color data max value is set to 1 (this is what makes 1 values white in the image).
101
Try changing to a different colormap by clicking on the Tools menu and
selecting Autumn from the Standard Colormaps dropdown menu. You will
see that the figure now alternates between yellow and red bars. This is
because values of 0 in this colormap correspond to red, and values of 1
correspond to yellow.
2. Double click on the bar on the far right below the colormap (Fig. 9-
4).
Double-click on the
bar below the
colormap to change
the color.
3. This opens the Select Marker Color dialog box (Fig. 9-5). Now
select blue and click OK
102
Figure 9-5. Dialog box to select marker color.
4. Now, you will see the the Colormap change (Fig. 9-6) as well as your
original figure (Fig. 9-7).
Figure 9-6. The Colormap Editor maximum value is now blue, and the intermediate
values grade evenly between white and blue.
103
Figure 9-7. Figure with the newly created color map. With the new colormap, values of 1
correspond to blue, and values of 0 correspond to white.
Figure 9-8. Figure with colorbar indicating the scale corresponding to the image colors.
104
Try This
Modify the program to produce a gradation of bars ranging from 0 to 1 (e.g.
bars with values of 0, .1, .2, ..., 1), rather than alternating bars of 0 and 1 (Fig.
9-9).
10
12
14
16
18
20
1 2 3 4 5 6 7 8 9 10
Figure 9-9. Example modified to produce a gradation of bars with the gray
colormap.
>> r=imread('reef.tif','tif');
105
the data type is uint8. This means an unsigned 8-bit integer is used to
represent the color of each pixel.
>> r(1,1,1:3)
ans(:,:,1) =
66
ans(:,:,2) =
153
ans(:,:,3) =
174
Remember each pixel is specified by three colors. The first pixel is made up
of Red: 66, Green 153, and Blue 174. Since they are 8-bit values they can
have 256 different values (28=256). The values range from 0 to 255. The
‘unsigned’ part of uint8, means that there are no negative values. Using
uint8 saves a lot of memory over MATLAB’s standard double data type.
>> image(r)
The image function decodes the RGB scale to create the color image (Fig. 9-
10).
106
50
100
150
200
250
300
350
400
450
Thresholding Images
To give an example of basic image processing, let’s write a script to calculate
the total area covered by reefs. The idea is that we will threshold the image
so that all pixels that are blue will be set to 0, and all pixels that are not blue
(brown where there are corals) will be set to 1. Then we just need to add all
the ones together.
We will only work with the blue layer (layer 3) for this. We will load the blue
layer into a new matrix called rb.
107
rb=r(:,:,3); %just grab 3rd layer (blue)
imagesc(rb);
colormap(gray);
This results in an image where the light areas (areas with high values) are
areas with a lot of blue (Fig. 9-11). Areas with little blue, come out as dark
(areas with low values).
50
100
150
200
250
300
350
400
450
Figure 9-11. Blue layer of image from figure 10-10 plotted as a scaled image.
Now we threshold the image to turn it into a black and white image. The
goal is to have areas with corals white, and areas of sand result in black. In
this code, a new matrix called rbt is created which has values of 1 where rb
was less than 130, and values of 0 where rb was greater than or equal to 130.
The threshold value of 130 was determined by trial-and-error, for this
example. This code yields the black and white image shown in figure 9-12.
108
rbt=rb<130; %threshold dark values
imagesc(rbt)
colormap(gray)
Compare figure 9-11 and figure 9-12. This simple routine does a fairly good
job of finding areas with reefs. But, you can see in the upper right where
areas of sand have been ‘identified’ as reef (white), and areas in the left of the
image where reefs have been identified as ‘sand’ (black). We could achieve a
better discrimination if we used all color channels.
50
100
150
200
250
300
350
400
450
Figure 9-12. Thresholded image of coral reefs. Areas with reefs are in white (values of 1)
and areas of sand are black (values of 0).
To calculate the percent of the image that is composed of reefs, simply add
up all of the values in the matrix to get the number of white pixels (which
109
have the value 1). Then, divide this by the total number of pixels in the
image.
We have to use sum twice because the first sum returns the sum of each
column. The second sum then adds the column totals together. From this
we find that 0.1329 of the image is composed of reefs.
110
Formal Mathematical Description
Discrete Fourier Transform (DFT)
N-1
X(m)=∑x(n)[cos(2πnm/N)-isin(2πnm/N)]
n=0
where,
X(m)=the mth DFT output component, i.e. X(0), X(1), X(2)…X(N-1)
m=the index of the DFT output in the frequency domain
m=0,1,2,3,…,N-1
To convert the complex number to something that makes sense (i.e. as the magnitude and phase of different
frequency components), you perform the following:
Xmag(m)= │X(m)│=√(Xreal(m)2+Ximag(m)2)
Windowing
Signals are often windowed before performing the FFT to minimize leakage between frequency bins.
111
112