Introduction To Linux
Courtesy of unix.org
History of Linux (3)
● UNIX originated as a research project at AT&T Bell Labs in
1969 by Ken Thompson and Dennis Ritchie.
● One of the first multiuser and multitasking operating systems
in the world.
● Developed in several different versions for various hardware
platforms (Sun Sparc, Power PC, Motorola, HP RISC
Processors).
● In 1991, a student at the University of Helsinki (Linus Torvalds)
created a UNIX-like system to run on the Intel 386 processor.
Intel had already started dominating the PC market, but UNIX
was nearly absent from the early Intel processor market.
Why should I choose Linux?
● Best price/performance ratio
● Reliable
● User friendly
● Ubiquitous (from your mobile phone to a
supercomputer)
● Scientific software is developed mostly in Linux
today.
What is Linux made of?
Linux distributions
● Often referred to as 'distros'.
● The Linux kernel with a set of
programs/applications (text editors, compilers,
office suites, web browsers, etc.) that make the
system usable.
● Slackware was one of the first Linux distributions.
● Debian, Red Hat (Fedora, RHEL) and Canonical
(Ubuntu) are some of the most popular ones today.
Linux distributions (2)
● There is a plethora of Linux distros out there,
one of the strongest points of the Linux
community.
● Which one to choose?
● General distributions: to replace your average
desktop/server
● Function-specific distributions: tailored
towards a specific audience (e.g. life sciences)
Linux distributions (3)
Generic distros:
● Red Hat based: Fedora, RHEL, CentOS, Scientific Linux
● Debian based: Debian, Ubuntu
Or task-specific ones (tailored distributions):
● BioLinux
● BioKnoppix
● BioSLAX
● And many others
Package repositories
● Each Linux distro can connect to one or more
package repositories
● They make it easy to search for/install/uninstall
specific applications
● Package manager (yum, apt)
● “Find me all sequence analysis apps and install
them”
How to choose a Linux distro
● Try more than one to get a feel for them.
● What do your colleagues/team members use?
● Do the package repositories have the applications you
wish to use?
● How long will the distro authors keep maintaining it?
● Do you have a less common laptop/desktop that might
have hardware compatibility problems with that distro?
(rare but it happens)
● http://en.wikipedia.org/wiki/Linux_distribution
Interacting with Linux
● Using it via a Graphical User Interface (GUI)
(like Windows/Mac or your
smartphone/tablet)
● Using it via the command line (like the
PowerShell on Windows, or your Terminal
window on your Mac)
● Pros and cons in each approach
Linux GUI mode (GNOME)
Linux GUI mode (KDE)
Linux Command Line mode
So how to install/try Linux?
● Without affecting your current computer setup:
– Use a Live CD (boot your computer from it)
– Do a full installation of Linux on a virtual machine
● Links to distro Live CDs:
– http://fedoraproject.org/wiki/FedoraLiveCD
– http://www.debian.org/CD/live/
● Link to a video (install a Linux OS on Windows using
VirtualBox):
– https://www.youtube.com/watch?v=7jOnscRjaFs
Basic demo of a Linux system
Example: cd /storage/mydata
Tip: Always make sure that you have a space between a shell
command and its argument(s).
Basic Shell Principles (2):
● All UNIX shells are case sensitive with regard to both the
commands and their arguments, in contrast to
Windows/DOS systems. This means that typing:
cd /mydirectory/programs
Is not the same as typing:
CD /MYDIRECTORY/PROGRAMS
Tip: Usually, shell commands are lower case, unless otherwise stated.
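A minimal sketch of the case-sensitivity rule, assuming a Linux system with a case-sensitive filesystem. The scratch directory and the variable names (demo, lower_ok, upper_ok) are made up for illustration:

```shell
# Hypothetical scratch directory, created only for this demonstration.
demo=$(mktemp -d)
mkdir "$demo/programs"

# The lower-case name exists, so this cd succeeds.
cd "$demo/programs" && lower_ok=yes

# The upper-case name is a different directory that does not exist here,
# so this cd fails on a case-sensitive filesystem.
if cd "$demo/PROGRAMS" 2>/dev/null; then
    upper_ok=yes
else
    upper_ok=no
fi

echo "lower case: $lower_ok, upper case: $upper_ok"
```

On a case-insensitive filesystem (common on Macs) the second cd may succeed, so the result depends on where you run it.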
The shell prompt
● The shell prompt is an indication that the system is ready
to execute your commands, but it also gives you useful
info:
georgios@biotin /usr/bin/virexp $
Tip: Remember not to confuse the term 'path' with the shell's
execution path, as described in earlier slides.
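The example prompt above shows three things: the user name (georgios), the host name (biotin) and the current directory (/usr/bin/virexp). A small sketch that rebuilds a similar line from standard commands. The variable names are made up for illustration, and your own prompt may be configured differently:

```shell
# Collect the same three pieces of information that the prompt displays.
prompt_user=$(id -un)     # name of the logged-in user
prompt_host=$(uname -n)   # name of the machine
prompt_dir=$(pwd)         # current working directory

# Print them in the same layout as the example prompt.
echo "$prompt_user@$prompt_host $prompt_dir \$"
```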
Directory Hierarchy Diagram
(Diagram: the tree descends from / to the gm directory and then to mydata. The full path shown is /home/gm/mydata/backseq1)
Navigating the filesystem
● Use 'pwd' to Print your Working Directory. For example, if I
log in to the host 'biotin' and I type pwd, I get the following:
georgios@biotin ~ $ pwd
/mn/biotroll/u1/georgios
georgios@biotin ~ $
georgios@biotin ~ $ cd mysequences
georgios@biotin ~/mysequences $
Navigating the filesystem (4)
● The “cd” command (Change Directory) can be used for moving
around the filesystem. It takes a path as its argument.
● The path can be “absolute”. For example: From your home
directory, you can go to the /usr/bin directory by typing:
georgios@biotin ~ $ cd /usr/bin
georgios@biotin /usr/bin $
● The path can also be “relative”. For example: If you are already
under the /usr directory, you could just type:
georgios@biotin /usr $ cd bin
georgios@biotin /usr/bin $
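A sketch of the same idea that you can run safely in a scratch area. The temporary tree stands in for /usr and /usr/bin, and the variable names (top, abs_result, rel_result) are made up for illustration:

```shell
# Hypothetical scratch tree standing in for /usr and /usr/bin.
top=$(mktemp -d)
mkdir -p "$top/usr/bin"

# Absolute path: spelled out from the top, works from any directory.
cd "$top/usr/bin"
abs_result=$(pwd -P)

# Relative path: resolved against the directory you are currently in.
cd "$top/usr"
cd bin
rel_result=$(pwd -P)

# Both routes end in the same directory.
echo "$abs_result"
echo "$rel_result"
```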
Navigating the filesystem (5)
● The command “cd ..” will get you one level up. For example, if we go back
to slide 30 and we assume that you are under the 'mysequences' directory,
if you want to go back to the top level of your home directory, you type:
georgios@biotin ~/mysequences $ cd ..
georgios@biotin ~ $
● “..” is a shorthand notation for the parent directory (one level up) and it can
really save you from typing long directory names that you cannot remember. It
always works in a relative path context.
● The alternative would be to give an “absolute” path to the cd command:
georgios@biotin ~/mysequences $ cd /mn/biotroll/u1/georgios
georgios@biotin ~ $
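A short sketch comparing the two routes, using a temporary directory as a stand-in for the home directory. The names home_dir, after_dotdot and after_absolute are made up for illustration:

```shell
# Hypothetical stand-in for a home directory with a mysequences folder.
home_dir=$(mktemp -d)
mkdir "$home_dir/mysequences"

# Route 1: go one level up with the ".." shorthand.
cd "$home_dir/mysequences"
cd ..
after_dotdot=$(pwd -P)

# Route 2: give the full absolute path instead.
cd "$home_dir/mysequences"
cd "$home_dir"
after_absolute=$(pwd -P)

# Both commands print the same directory.
echo "$after_dotdot"
echo "$after_absolute"
```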
Listing files
● You are back at the mysequences directory under
your home directory. Your instructor asked you to list
the files in the directory:
georgios@biotin ~/mysequences $ ls
seqdocs v2.3_admin.pdf xlrhodop.fasta
georgios@biotin ~/mysequences $
● Note the wildcard character (*) towards the end of the filename
we are searching for. It says that we know that the name
contains the string “xlrhodop.fas”. This matches all relevant
filenames (reporting their exact location in the directory tree):
/mn/biotroll/u1/georgios/xlrhodop.fasta
/mn/biotroll/u1/georgios/mysequences/xlrhodop.fasta
File permissions (1)
● Every file in UNIX has a set of permission flags that define, in a
strict way, who is allowed to read, write (modify) or execute that
file. For example, let's take one of the files listed in the output of
the ls -la command:
Starting from the left, this says: The file xlrhodop.fasta can be
read (r), modified (w) and executed (x) by its owner (georgios). Ignore
the rest of the flags for now.
File permissions (2)
● Directories are no exception to this rule and they also have
permission flags. For example:
The + sign says add write permissions (w) for the user (u) that owns
the file.
● You can also add/remove more than one flag at a time:
georgios@biotin ~/mysequences $ chmod u-wx v2.3_admin.pdf
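A minimal sketch of adding and removing flags with chmod, run on a throwaway file. The file name example.pdf and the variable names are made up for illustration:

```shell
# Hypothetical scratch file, created only for this demonstration.
scratch=$(mktemp -d)
target="$scratch/example.pdf"
: > "$target"

# Start from a known state: owner may read, write and execute, nobody else.
chmod u=rwx,go= "$target"

# Remove two flags at once (write and execute) for the owner.
chmod u-wx "$target"
after_remove=$(ls -l "$target" | cut -c1-10)

# Add the write flag back for the owner.
chmod u+w "$target"
after_add=$(ls -l "$target" | cut -c1-10)

echo "$after_remove"   # permission flags after chmod u-wx
echo "$after_add"      # permission flags after chmod u+w
```

The cut command keeps only the first ten characters of the ls -l line, which hold the file type and the permission flags.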
This removes the xlrhodop.fasta file and re-generates it with the name
myxlr.fasta, under the same directory.
-r-------- 1 georgios biotek 1777 Mar 26 15:22 myxlr.fasta
'mv' preserves not only file permissions and ownership rights but also
timestamps, so it is an effective way to rename a file.
Some Linux systems also ship a separate rename command, but mv is all
you need to rename a single file.
Tip: All the points we have made about mv for files are also true for
directories.
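A sketch that checks the permission claim on a throwaway copy. The file contents and the variable names (work, before, after) are made up for illustration, and only the permission flags are compared here:

```shell
# Hypothetical scratch directory with a read-only sequence file.
work=$(mktemp -d)
cd "$work"
echo ">example sequence" > xlrhodop.fasta
chmod u=r,go= xlrhodop.fasta
before=$(ls -l xlrhodop.fasta | cut -c1-10)

# Rename the file within the same directory.
mv xlrhodop.fasta myxlr.fasta
after=$(ls -l myxlr.fasta | cut -c1-10)

# The flags are unchanged and only the new name exists.
echo "$before"
echo "$after"
ls
```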
Redirecting command output
● The > symbol is the output redirection operator and can be
used to re-direct the output of any UNIX command that prints
something on the screen.
● Let's suppose that you want to merge two fasta sequences into
a single file:
cat myseq1.fasta myseq2.fasta
would print the contents of both files one after the other on the
screen (stdout). But what you really want is to place this output in
a file. You can then type:
cat myseq1.fasta myseq2.fasta > mergedseq.fasta
to place the output in the file mergedseq.fasta .
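The merge can be tried end to end with two tiny made-up sequences. The sequence contents and the work variable are assumptions for illustration only:

```shell
# Hypothetical scratch directory with two small fasta files.
work=$(mktemp -d)
cd "$work"
printf '>seq1\nACGT\n' > myseq1.fasta
printf '>seq2\nTTGA\n' > myseq2.fasta

# Send the combined output to a file instead of the screen.
cat myseq1.fasta myseq2.fasta > mergedseq.fasta

# Show the merged file: seq1 first, then seq2.
cat mergedseq.fasta
```

Keep in mind that > replaces the target file if it already exists. Use >> instead if you want to append to it.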
Redirecting command input
● Suppose that you have a file with numbers and you wish
to sort it from the smallest to the largest number
sort -g < numbers.txt
● Normally, 'sort' would take its input from the keyboard.
However, because you use the input redirection symbol
(<), it is like typing the contents of the file (numbers) in
one step.
● Bottom line: You get your numbers sorted.
● Question: What do you think about this command?
sort -g < numbers.txt > sortednumbers.txt
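One way to explore the question is to run the command on a small made-up file. The numbers and the work variable are assumptions for illustration, and the -g option is assumed to be available in your version of sort (it is in GNU sort):

```shell
# Hypothetical scratch directory with a few unsorted numbers.
work=$(mktemp -d)
cd "$work"
printf '10\n9\n2.5\n100\n' > numbers.txt

# Input comes from numbers.txt, output goes to sortednumbers.txt.
sort -g < numbers.txt > sortednumbers.txt

# Nothing was printed by sort itself; the result is in the new file.
cat sortednumbers.txt
```

Both redirections work together: sort reads the file as if it were typed on the keyboard and writes the sorted numbers to the new file instead of the screen.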
The Shell Pipe
● Do you ever wonder how the term 'pipeline' was
established in a computing/bioinformatics context?
● One of the most powerful concepts of the
command line environment.
● The more you learn to use it, the more you will
appreciate its power.
● Mastering the shell pipe will allow you to build very
powerful processing utilities to solve your
problems.
The Shell Pipe (2)
The Shell Pipe (3)
● Quite often, we need to direct the standard output of one
command to the standard input of another.
● The most commonly used operator to do that is the pipe operator |
● Suppose for example that we need to count the number of lines of
a text file to see how long it is.
cat mytext.txt | wc -l
The 'cat' command will print all the lines of the file. However, instead
of doing that on the screen, it gives all the output to the 'wc -l'
command. The result is an integer representing the number of lines
of the mytext.txt file.
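A small sketch of the line count on a made-up three-line file. The file contents and the names work and line_count are assumptions for illustration:

```shell
# Hypothetical scratch directory with a three-line text file.
work=$(mktemp -d)
cd "$work"
printf 'first line\nsecond line\nthird line\n' > mytext.txt

# cat feeds the file to wc -l through the pipe; a second pipe to tr
# strips the padding spaces that some versions of wc add.
line_count=$(cat mytext.txt | wc -l | tr -d ' ')

echo "$line_count"
```

Pipes can be chained, as the extra tr step shows: each command reads the output of the one before it.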
References
● UNIX has a built-in reference manual. The
'man' command should be your best friend
whenever you need help with a particular
command. For example, type
man cat
Every UNIX system should have this facility.
References (2)
● What if you don't know which command to use?
Let's say for example that I am looking for
pattern matching commands. I would type
apropos pattern
at the shell prompt, and this would give me a list
of relevant commands
References (3)
● University of Surrey Unix Tutorial for Beginners on the World Wide Web:
http://www.ee.surrey.ac.uk/Teaching/Unix/