12 important Unix Commands
Carl Mason
carlm@demog.berkeley.edu
Contents
1 Introduction
2 Terminal windows
3 The Filesystem
1 Introduction
Although Unix has a point-and-click graphical user interface, called X11,
which works just like those of other operating systems, Unix is at heart a
command-line operating system. So while it is often possible to do what you
want by pointing and clicking, using the command line and other text-based
tools will make you happier and much more efficient... eventually.
To operate with the command line, you will need to know the 12 most
important Unix commands described in Section 5. To enjoy it you will also
need to know a few tricks that are also covered in this document.
You don’t need to know much about Unix in order to start doing Science,
but it would not hurt to learn more. In your copious free time, check out
some of the Unix primers on the web. Ask Google something like “Unix
beginner” to find more resources than you could possibly want.
Note that since Mac OS is itself built on Unix, nearly everything in this
document works the same way on a Mac. On a Mac, the terminal window
application is under Applications/Utilities.
2 Terminal windows
In order to use the command line, or shell, you must open a terminal
window (also known as an xterm window). A terminal window can be launched
from: [Application]→[Accessories]→[Terminal].
It should look something like Figure 1. Notice that the window features
a menu bar. Unfortunately, the menu bar is both useless and misleading.
Make the menu bar disappear by right-clicking in the window and clearing
the “show Menubar” check-box.
Now aside from the title bar at the top, the only words in the terminal
window should be the Unix prompt. The purpose of the Unix prompt
is to indicate that the shell is ready to accept commands. It also contains
useful information. In Figure 1, the prompt is [carlm@twins ~]$,
indicating the user (carlm), the machine (twins), and the current directory,
which is indicated by the ~. In this and other documents, the Unix prompt
will look like this: @:> . In the real Unix prompt, the symbol ~ is a
special character whose meaning is “home directory”. ~/Dissertation
means a file or directory called “Dissertation” which is located within your
home directory. In my case this would be /hdir/0/carlm/Dissertation.
~wachter/Brilliant/insight translates to a file (or possibly a directory)
called insight in a directory [1] called Brilliant in Ken Wachter’s home
directory, or /hdir/0/wachter/Brilliant/insight. More about home
directories can be found in Section 3.
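One way to watch the shell expand ~ is to echo it. The lines below are a minimal sketch, assuming a bash-like shell; the variable names are made up for illustration, and the Dissertation directory does not need to exist for the expansion to happen.

```shell
# The shell replaces a leading ~ with the path of your home directory
# before the command runs, so echo simply prints the expanded path.
home_dir=$(echo ~)
thesis_path=$(echo ~/Dissertation)

echo "home directory: $home_dir"
echo "thesis path:    $thesis_path"
```

Because the expansion is done by the shell and not by the command, the same shorthand works with any command that takes a file name.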
If you are in no particular hurry to finish your dissertation, you can
modify a large number of colors and beeps and other important features
of the terminal window. Right-click and choose [Edit Current Profile] to
start wasting time.
If you have already wasted time on this sort of thing and are thus old
enough to find the default font a bit small, a useful trick with terminal
windows (and browser windows too) is CTRL + SHIFT + + to
increase and CTRL + - to decrease the size of the typeface.
[1] Directories are also called “folders”.
Figure 1: terminal window
3 The Filesystem
Whenever you log in to a machine on the Demography network, your initial
present working directory – the location within the filesystem in which
applications will begin looking for the files that you specify – is your home
directory. Every user has exactly one home directory.
In a multiuser system such as the Demography Lab, your home directory
is one of a huge number of interconnected directories that form a single
unified filesystem. The magic of the filesystem is that even though the
various files and directories of which it is composed are “physically” (or
electromagnetically) present on various different machines all over the
network, to us users, the whole thing appears to be one single thing, and that
thing looks and feels the same no matter which Demography Lab machine
we happen to be using at the moment.
An upside down tree makes a pretty good metaphor for the filesystem.
Such a “tree” is shown in Figure 2. At the top of the figure is a directory
called “/” which is the “root” of the filesystem. Every file and directory
in the filesystem can be uniquely specified by a filepath that begins with
root. For example, the file that holds my correspondence with my mother
is /hdir/0/carlm/mail/mom.
As you can see in Figure 2 home directories all live in a directory called
/hdir/0. Although it is just one of many directories within this giant upside
down tree of a filesystem, your home directory is a special place that you
will come to know and love and where you will do your very best work. It is
the part of the filesystem that you own and the “place” where you will find
yourself when you first log in.
Because the entire filesystem looks the same to all users all the time,
it is easy to share data with your colleagues. This is a good thing because
humanity benefits when scientists collaborate. But unfortunately scientists
can occasionally turn out to be creeps, so sharing a filesystem is a little
scary as well.
The “solution” to the creep problem is to not keep sensitive information
on Demography computers. You have already promised not to keep data
covered by SB 1386 [2]. It goes without saying that files that can tie you to
illegal activities are also a no-no. There are, however, a few files that belong
on the network and yet where privacy is an issue (e.g. email). For those
files, managing who may read and/or change them requires understanding
the mode and ownership of files. Each file and directory has an owner, and
the owner can determine who is allowed to read, write and/or execute each
file. See the chmod command in Section 5 for how to change the various file
modes, or permissions.
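As a rough illustration of modes, the sketch below makes a file readable and writable by its owner only. It works in a throwaway directory created with mktemp; the file name mom is just a stand-in for a private mail file, and the numeric mode 600 is one common choice, not a lab requirement.

```shell
# Work in a throwaway directory so nothing real is touched.
scratch=$(mktemp -d)
touch "$scratch/mom"

# 600 = owner may read and write; group and others get no access.
chmod 600 "$scratch/mom"

# The first ten characters of ls -l show the file type and the mode.
mode=$(ls -l "$scratch/mom" | cut -c1-10)
echo "$mode"
```

Reading the mode string from left to right: the first character is the file type, followed by three read/write/execute slots each for the owner, the group, and everyone else.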
Figure 2: The Demography Lab filesystem
are done talking. Many Unix utilities need this explicitly. Sometimes,
ctrl + d will have a similar effect to ctrl + c – it is certainly
something to type when you are desperate.
history The history feature allows you to recall and edit any command
that you have previously issued. To make the previous command appear at
the @:> prompt, hit ctrl + p or, equivalently, the up arrow key. To see
even earlier commands, type ctrl + p more times. ctrl + n or,
equivalently, the down arrow will make the next command appear – obviously,
this makes no sense unless you have typed ctrl + p at least once.
You can operate on a recalled command using several standard emacs
editing keys:
You can also use the left arrow and right arrow to move about within
a recalled line. The delete and backspace keys do what you would
expect.
TAB completion If you hit the tab key anytime while constructing
a command, the shell will do its best to figure out what you are planning to
type next. If you are typing a command, it will try to find a command that
starts with what you have already typed. If you are typing the name of
a file, the shell will try to complete it for you. If what you have typed does
not uniquely determine a command or filename, the shell will beep at you
and provide a list of possible completions. You can then type a few more
characters and hit tab again.
scripting Whenever you find yourself typing the same command several
times, it’s time to consider scripting. A shell script is just a file of commands
that you could have entered at the keyboard, but typed into a file instead.
You can then set the file’s execute bit (see Most Important Command number
5) and execute that file – perhaps now, perhaps later. You will need to
use an editor such as emacs to create that shell script. Knowing how to use
emacs can save you lots of time and hair loss – particularly if many of the
commands you are typing are quite similar.
Scripts are also very useful for people who like the idea of being able to
reproduce results.
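To make this concrete, here is a minimal sketch of the write, chmod, run cycle. The script name count_files.sh and the scratch directory are hypothetical; in practice you would create the script in an editor such as emacs instead of with a here-document.

```shell
# Create a tiny script in a scratch directory (count_files.sh is a made-up name).
scratch=$(mktemp -d)
cat > "$scratch/count_files.sh" <<'EOF'
#!/bin/sh
# Report how many entries the directory given as the first argument contains.
ls "$1" | wc -l | tr -d ' '
EOF

# Set the execute bit, then run the script like any other command.
chmod +x "$scratch/count_files.sh"
touch "$scratch/a.data" "$scratch/b.data"
count=$("$scratch/count_files.sh" "$scratch")
echo "entries: $count"
```

The script counts itself along with the two data files, since all three live in the same directory.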
cool stuff The shell is also responsible for displaying the results of the ls
command (see item 1) in lots of colors.
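Regular expressions are combinations of symbols that the shell
interprets in clever ways. (Strictly speaking, the file-matching patterns
that the shell understands are usually called wildcards or globs, but the
idea is similar.) Generally we use regular expressions to specify lists
of files or directories on which a command should operate, but they have
many other uses as well. A typical use would be to delete from your current
working directory all of the .pdf files whose names begin with a vowel:
@:> rm [AEIOUaeiou]*.pdf
The bracketed set matches exactly one character, so the * is needed to match
the rest of the name; without it, only one-letter names such as a.pdf would
be deleted.
Regular expressions come up in several of the “12” important commands.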
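A safe way to try a pattern before deleting anything is to hand it to ls first. The sketch below assumes a scratch directory and invented file names; it lists what a vowel pattern of the form [AEIOUaeiou]*.pdf matches and only then removes those files.

```shell
scratch=$(mktemp -d)
cd "$scratch"
touch apple.pdf Eagle.pdf a.pdf banana.pdf notes.txt

# [AEIOUaeiou] matches exactly one character from the set; the * that
# follows matches whatever comes next, including nothing at all.
matched=$(ls [AEIOUaeiou]*.pdf | wc -l | tr -d ' ')
echo "files matched: $matched"

# Once the listing looks right, the same pattern can be given to rm.
rm [AEIOUaeiou]*.pdf
remaining=$(ls | wc -l | tr -d ' ')
echo "files left: $remaining"
```

Checking with ls first costs a few seconds and catches patterns that match more, or less, than intended.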
how to do them from the command line, makes you more efficient, reduces
errors and opens the possibility of automating tasks with shell scripts.
3. cd <directory-name> The “change directory” command makes another
directory your present working directory. With no argument, it
“moves you” to your home directory. To move one directory “higher”,
use “..” (two dots) in place of the directory’s name. The one and only
parent directory of the current directory is always addressable as “..”.
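The three uses of cd described above can be sketched as follows. The directory names project and data are invented for the example and are created in a scratch area first.

```shell
scratch=$(mktemp -d)
mkdir -p "$scratch/project/data"

cd "$scratch/project/data"   # move into a subdirectory
cd ..                        # ".." is the parent, so we are now in project
parent_name=$(basename "$(pwd)")
echo "after cd ..: $parent_name"

cd                           # no argument: back to the home directory
home_now=$(pwd)
echo "after cd:    $home_now"
```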
@:> rm -rf directory1
The above command will remove directory1 and all of the files and
subdirectories within it; the -f argument ensures that rm will not
ask for permission with each file. rm -rf * is a VERY dangerous
command. If you find yourself typing it, make sure you are not
drunk.
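If you want to see what a recursive remove does without risking real files, something like the following sketch can be run in a scratch area. The directory and file names are placeholders built for the demonstration.

```shell
scratch=$(mktemp -d)
mkdir -p "$scratch/directory1/sub"
touch "$scratch/directory1/file1" "$scratch/directory1/sub/file2"

# Look before you delete: list everything the command would remove.
ls -R "$scratch/directory1"

# -r descends into subdirectories; -f suppresses the per-file questions.
rm -rf "$scratch/directory1"
```

Spelling out the full path, as here, is a little more typing than a bare *, but it makes it much harder to delete the wrong directory.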
It would be a good idea to create the above link right now. Use the
mkdir command to create a new directory in /data/commons called your userid.
Then create a link in your home directory so that you can start storing
and accessing huge data sets right away.
7. mv file-name new-file-name The “move” command changes the name
or location within the filesystem of a file or a directory.
8. less file-name Variant of the more command – less is used to scroll
through a file on the screen. While displaying a file, enter scrolls
one additional line; space scrolls one additional screen full; b
scrolls backwards, q quits, /word searches forward for “word”,
?word searches backward for “word”.
9. lpr <-Pprinter-name> filename The “line printer” command prints
a file to the named printer. While you should know about lpr for the
test, one generally prints from within applications. When printing
from outside of an application, xpp is a printer GUI that is a bit easier
to use than lpr.
Note that by default each of the above printers (except for the ones
in Barrows Hall) is configured to print in economy mode thus saving
toner and by extension the world. If you need high quality printouts
add “HQ” to the printer name e.g. lpr -PsesHQ filename.
10. pwd “present working directory” tells you where you are, that is, it
tells you which directory the shell thinks is the current directory.
11. du <directory> The “disk use” command is designed to tell you how
much disk space each directory is consuming. Its main use, however,
is simply to display the directory structure.
12. ssh <-l userid> hostname “ssh” stands for “secure shell”. It is really
a separate application, but it behaves like a shell command and is
really useful, so it is included here. If you type ssh galton at the Unix
prompt (and then your password when prompted), a remote shell will
open on an entirely different machine from the one you are sitting in
front of.
The new remote shell on galton will have a prompt like: [userid@galton ~]$
indicating that the commands that you type will be executed on the
machine galton, which happens to live in the basement of 2224 Piedmont.
Happily, the new shell will see the same filesystem and understand the
same Unix commands.
The reason for ssh’ing to galton is that it is much more powerful than
any of the workstations.
NOTE: when using ssh or ssh-like programs on machines outside of the
Demography Lab, you will need to specify both your userid (with the -l
flag) and the fully qualified hostname, e.g. galton.demog.berkeley.edu [5].
NOTE even more urgently: Since we started running freeNX
servers, ssh’ing from outside of the department instantly became
anachronistic. FreeNX provides a much better way of connecting to the
Demography network if your goal is to do science.
http://lab.demog.berkeley.edu/LabWiki is the place to go to find out
about freeNX.
[5] Surprise: from outside the department you will probably end up on coale
rather than galton if you type this. The reason is that from non-Demography
Lab machines, all .demog.berkeley.edu hostnames “resolve” to the outside
interface of our firewall. You can ssh to coale from whichever host you
wind up on.
12. exit or logout closes the current Unix window, and logs you off – if
the current window is the console window.
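Several of the commands in the list above (mv, pwd and du) can be tried safely in a scratch directory. The sketch below uses invented names such as notes.txt and results; nothing in it refers to real Demography Lab paths.

```shell
scratch=$(mktemp -d)
cd "$scratch"
mkdir -p results/old
echo "draft" > notes.txt

# mv renames a file, or moves it elsewhere in the filesystem.
mv notes.txt chapter1.txt
mv chapter1.txt results/

# pwd reports the directory the shell considers current.
pwd

# du prints one line per directory, which doubles as a map of the tree.
du results
```

Note that the second mv gives a directory as the destination, so the file keeps its name and simply changes location.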
* The asterisk or “star” character is used in regular expressions (See item 1).
When the shell sees a * by itself as in @:> ls * it replaces * with a
list of all the files and subdirectories in the current directory. @:>
ls * tells the shell to run the ls command on each and every file and
subdirectory in the current directory. So where @:> ls will show files
and subdirectories @:> ls * will list the files that live in subdirectories
of the current directory as well.
& The ampersand tells the shell to run the process in the “background”.
When a process is launched in the background, the xterm (see Section 2)
immediately returns with a prompt. When you run a process in the
foreground (the usual case) the prompt comes back only when the
process exits.
NOTE: it only makes sense to run programs in the background if
the program spawns a new window. So emacs, Stata, userfirefox,
or oowriter are all fine running in the background. The 12 most
important Unix commands are not. They all write their responses to
the terminal window. If you put them in the background they cannot
do this.
REALLY important: R should not be run with the & for the same
reason: it runs in the window from which it was launched. This will
all make sense after the first week or two of 213.
To bring a backgrounded program to the foreground, type
@:> fg <%n>
where %n is the percent sign followed by a number indicating which
backgrounded process you want to foreground. You only need to enter
the %n if you have more than one process running in the background.
Type @:> jobs to list your backgrounded processes and their numbers.
| The “pipe” is used to send the standard output of one process into the
standard input of another. For example, if you wanted to know the
number of data files in the current directory you might
type: @:> ls *.data | wc -l . The ls *.data produces a list of
files in the current directory that end in “.data”, one name per line; the
| then feeds this list to the word count command “wc”. The -l argument
tells wc to only report the number of lines, which here equals the number
of files. This example assumes that you have
named all of your data files somethingorother.data.
> The right angle bracket (or greater than sign) is used to send the output
of a process into a file. @:> ls > file.list would produce
a file called file.list containing (surprise) a list of files. Use double
angle brackets to append a process’s output to an existing file.
@:> ls ~/public_html >> file.list
would add the names of the files in your public_html directory.
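The pipe and the two redirection operators can be tried together in a scratch directory. The following is a sketch with made-up file names; tr -d ' ' is there only because some versions of wc pad their output with spaces.

```shell
scratch=$(mktemp -d)
cd "$scratch"
touch one.data two.data three.data readme.txt

# Pipe: ls writes one name per line when its output is not a terminal,
# so wc -l ends up counting the matching files.
n_data=$(ls *.data | wc -l | tr -d ' ')
echo "data files: $n_data"

# > creates (or overwrites) file.list; >> appends to it.
ls *.data > file.list
ls *.txt >> file.list
n_listed=$(wc -l < file.list | tr -d ' ')
echo "names in file.list: $n_listed"
```

Using > a second time in place of >> would have thrown away the first listing, which is the usual way to lose output by accident.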