
Unit 1


RTS

UNIT I
Introduction
Unix is an operating system that is the ancestor or model of many later operating systems, such as Solaris, the BSDs, and Unix-like systems such as Ubuntu (Linux); the POSIX standard grew out of it. Its development began in 1969 and continued through the 1970s, led by Ken Thompson, Dennis Ritchie, and others at AT&T Bell Laboratories. It was originally meant for programmers developing software rather than for non-programmers.
Unix and the C language were developed at AT&T and distributed to government and academic institutions, which led to both being ported to a wider variety of machine families than any other operating system. The main focus of the developers was the kernel, which is considered to be the heart of the operating system. The system structure of the Unix OS is shown in the figure further below.
UNIX is a family of multitasking, multiuser computer operating systems whose development started at Bell Labs in 1969. It was originally developed for minicomputers and has since been ported to various hardware platforms. UNIX has a reputation for stability, security, and scalability, making it a popular choice for enterprise-level computing.
The basic design philosophy of UNIX is to provide simple, powerful tools that can be combined to
perform complex tasks. It features a command-line interface that allows users to interact with the
system through a series of commands, rather than through a graphical user interface (GUI).
Some of the key features of UNIX include:
1. Multiuser support: UNIX allows multiple users to simultaneously access the same system and
share resources.
2. Multitasking: UNIX is capable of running multiple processes at the same time.
3. Shell scripting: UNIX provides a powerful scripting language that allows users to automate
tasks.
4. Security: UNIX has a robust security model that includes file permissions, user accounts, and
network security features.
5. Portability: UNIX can run on a wide variety of hardware platforms, from small embedded
systems to large mainframe computers.
6. Communication: UNIX supports communication methods using the write command, mail
command, etc.
7. Process Tracking: UNIX maintains a record of the jobs that the user creates. Monitoring CPU usage in this way helps in tuning system performance. It also allows you to keep track of how much disk space each user uses, and to use that information to regulate disk space.
Today, UNIX is widely used in enterprise-level computing, scientific research, and web servers.
Many modern operating systems are based on UNIX or modeled on it: macOS descends from BSD UNIX, and Linux is a Unix-like system written independently.

Figure – system structure

SEAT Page 1

• Layer-1: Hardware: It consists of all the hardware components of the machine.
• Layer-2: Kernel: This is the core of the operating system. It is software that acts as the interface between the hardware and the rest of the software. Most tasks, such as memory management, file management, network management, and process management, are done by the kernel.
• Layer-3: Shell commands: This is the interface between the user and the kernel. The shell is the utility that processes your requests. When you type in a command at the terminal, the shell interprets the command and calls the program that you want. There are various commands like cp, mv, cat, grep, id, wc, nroff, a.out and more.
• Layer-4: Application Layer: It is the outermost layer, in which external applications are executed.

Essential Unix Commands



Input-output system calls in C | Create, Open, Close, Read, Write



System calls are the requests that a program makes to the system kernel for services to which the program does not have direct access, for example access to input and output devices such as monitors and keyboards. C programs reach the input/output system calls through functions such as creat, open, close, read, and write.
Before we move on to the I/O system calls, we need to know about a few important terms.
Important Terminology
What is the File Descriptor?
The file descriptor is a small non-negative integer that uniquely identifies an open file within a process.
File Descriptor table: A file descriptor table is an array indexed by file descriptor whose elements are pointers to file table entries. The operating system provides each process with its own file descriptor table.


File Table Entry: A file table entry is an in-memory structure that stands in for an open file. It is created when a request to open the file is processed, and it maintains the current file position.
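That the file position lives in the file table entry, and not in the descriptor, can be observed with two descriptors that point to the same entry. The following is a minimal sketch under assumptions: it uses dup(), which the text above does not cover, to obtain a second descriptor for the same entry, and the function name and scratch path are made up for illustration.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Writes `len` bytes through one descriptor and returns the file position
 * as seen through a duplicate of it, or -1 on error.
 * (Hypothetical helper, not part of any library.) */
off_t offset_seen_by_dup(const char *path, const char *data, size_t len)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    int copy = dup(fd);            /* second descriptor, same file table entry */
    if (copy < 0) {
        close(fd);
        return -1;
    }

    off_t pos = -1;
    if (write(fd, data, len) == (ssize_t)len)
        pos = lseek(copy, 0, SEEK_CUR);   /* query the position via the copy */

    close(copy);
    close(fd);
    unlink(path);                  /* remove the scratch file again */
    return pos;
}
```

Writing five bytes through fd moves the position reported through copy to 5 as well, because both descriptors share one file table entry.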

Standard File Descriptors: When a process starts, file descriptors (fd) 0, 1, and 2 in its file descriptor table are already open. By default, when the program is started from a terminal, each of these three descriptors references the file table entry for a file named /dev/tty.
/dev/tty: In-memory surrogate for the terminal.
Terminal: Combination of keyboard and video screen.


Read from stdin => read from fd 0: Whenever we type a character on the keyboard, it arrives in the file named /dev/tty, and the process reads it from stdin through fd 0.
Write to stdout => write to fd 1: Whenever the process writes output to stdout through fd 1, it goes to the file named /dev/tty and appears on the video screen.
Write to stderr => write to fd 2: Error messages that the process writes to stderr through fd 2 go to the same file and also appear on the screen.
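As a small illustration of the three standard descriptors, the sketch below writes directly to fd 1 and fd 2 with write(). The helper names put_fd and report are invented for this example; STDOUT_FILENO and STDERR_FILENO are the symbolic names for 1 and 2 from <unistd.h>.

```c
#include <string.h>
#include <unistd.h>

/* Write a C string to the given descriptor.
 * Returns the number of bytes written, or -1 on error. */
ssize_t put_fd(int fd, const char *text)
{
    return write(fd, text, strlen(text));
}

/* Normal output goes to fd 1 (stdout), diagnostics go to fd 2 (stderr). */
int report(void)
{
    if (put_fd(STDOUT_FILENO, "result: ok\n") < 0)
        return -1;
    if (put_fd(STDERR_FILENO, "warning: demo message\n") < 0)
        return -1;
    return 0;
}
```

When the program is started from a terminal both lines appear on the screen; redirecting with `2>errors.txt` in the shell separates them, because only fd 2 is then attached to a different file.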
Input/Output System Calls
There are five basic I/O system calls: creat, open, close, read, and write.
1. C creat
The creat() function (spelled without a final "e") is used to create a new empty file in C. We can specify the name of the file which we want to create and its permissions. It is declared in the <fcntl.h> header file; the mode_t type and the permission constants are defined in <sys/stat.h>.
Syntax of creat() in C
int creat(const char *filename, mode_t mode);
Parameter
• filename: name of the file which you want to create
• mode: indicates permissions of the new file.
Return Value
• returns the first unused file descriptor (generally 3 when it is the first file opened by the process, because fds 0, 1, 2 are reserved)
• returns -1 on error
How C creat() works in OS
• Create a new empty file on the disk (an existing file of the same name is truncated to length zero).
• Create file table entry.
• Set the first unused file descriptor to point to the file table entry.
• Return file descriptor used, -1 upon failure.
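The steps above can be sketched as follows. This is an illustrative helper, not a fixed recipe: the function name make_empty_file and the permission value 0644 are assumptions chosen for the example.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create `path` as an empty file with permissions rw-r--r-- (before the
 * umask is applied) and close it again.
 * Returns 0 on success, -1 on failure. */
int make_empty_file(const char *path)
{
    /* creat() behaves like open(path, O_WRONLY | O_CREAT | O_TRUNC, mode) */
    int fd = creat(path, 0644);
    if (fd < 0)
        return -1;
    return close(fd);              /* release the descriptor right away */
}
```

Because creat() truncates, calling this helper on an existing file empties it; the O_EXCL flag of open() is the usual way to avoid that.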
2. C open
The open() function in C is used to open the file for reading, writing, or both. It is also capable of creating the file if it does not exist. It is declared in the <fcntl.h> header file, which also defines the flags that are passed as arguments.
Syntax of open() in C
int open(const char *Path, int flags);
int open(const char *Path, int flags, mode_t mode);
The three-argument form is needed when O_CREAT is given; mode then sets the permissions of the new file.
Parameters
• Path: Path to the file which we want to open.
    ◦ Use an absolute path beginning with “/” when the file is not in the current working directory of the process.
    ◦ Use a relative path, such as just the file name with extension, when the file is in the current working directory. Relative paths are resolved against the working directory of the running process, not against the location of the C source file.
• flags: It is used to specify how you want to open the file. We can use the following flags.

Flags         Description
O_RDONLY      Opens the file in read-only mode.
O_WRONLY      Opens the file in write-only mode.
O_RDWR        Opens the file in read and write mode.
O_CREAT       Create a file if it doesn’t exist.
O_EXCL        Used with O_CREAT: prevent creation if the file already exists.
O_APPEND      Opens the file so that every write is placed at the end of the contents.
O_ASYNC       Enable signal-driven I/O: a signal is sent when input or output becomes possible.
O_CLOEXEC     Enable close-on-exec mode on the open file.
O_NONBLOCK    Opens the file in non-blocking mode.
O_TMPFILE     Create an unnamed temporary file in the specified directory (Linux-specific).
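Flags are combined with the bitwise OR operator. As a hedged example of one common combination, the sketch below uses O_CREAT together with O_EXCL to create a file only if it is not there yet. The function name create_if_absent and the mode 0600 are assumptions made for this illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Create `path` only if it does not exist yet.
 * Returns 1 if this call created it, 0 if it already existed,
 * -1 on any other error. */
int create_if_absent(const char *path)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd >= 0) {
        close(fd);
        return 1;
    }
    /* With O_EXCL, an existing file makes open() fail with EEXIST. */
    return (errno == EEXIST) ? 0 : -1;
}
```

The check and the creation happen in one system call, so two processes racing for the same name cannot both get a return value of 1.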

How C open() works in OS
• Find the existing file on the disk.
• Create file table entry.
• Set the first unused file descriptor to point to the file table entry.
• Return file descriptor used, -1 upon failure.
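A minimal sketch of open() in use, together with write(), read(), and close(): one helper appends text to a file, the other reads it back. The helper names are invented for this example, and read_text assumes a small file, since it issues a single read() call.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Append `text` to `path`, creating the file if needed. Returns 0 or -1. */
int append_text(const char *path, const char *text)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd < 0)
        return -1;
    size_t len = strlen(text);
    ssize_t n = write(fd, text, len);
    close(fd);
    return (n == (ssize_t)len) ? 0 : -1;
}

/* Read up to cap-1 bytes of `path` into buf and NUL-terminate the result.
 * cap must be at least 1. Returns the number of bytes read, or -1. */
ssize_t read_text(const char *path, char *buf, size_t cap)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    ssize_t n = read(fd, buf, cap - 1);   /* one read(): enough for small files */
    close(fd);
    if (n < 0)
        return -1;
    buf[n] = '\0';
    return n;
}
```

Calling append_text twice on the same path leaves both pieces of text in the file, one after the other, because O_APPEND positions every write at the end.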

A program in Unix is a sequence of executable instructions on a disk. You can use the command size to get a very cursory check of the structure and memory demands of the program, or use the various invocations of objdump for a much more detailed view. The only aspect that is of interest to us is the fact that a program is a sequence of instructions and data (on disk) that may potentially be executed at some point in time, maybe even multiple times, maybe even concurrently.
Such a program in execution is called a process. The process contains the code and initial data of the program itself, and the actual state at the current point in time for the current execution. That is the memory map and the associated memory (check /proc/pid/maps), but also the program counter, the processor registers, the stack, and finally the current root directory, the current directory, environment variables and the open files, plus a few other things (in modern Linux, for example, we find the process’s cgroups and namespace relationships, and so on; things have become a lot more complicated since 1979).
In Unix, processes and programs are two different and independent things. You can run a program more than once, concurrently. For example, you can run two instances of the vi editor, which edit two different texts. Program and initial data are the same: it is the same editor. But the state inside the processes is different: the text, the insert mode, cursor position and so on differ. From a programmer’s point of view, “the code is the same, but the variable values differ”.
A process can run more than one program: the currently running program throws itself away and asks the operating system to load a different program into the same process. The new program will inherit some reused process state, such as current directories, file handles, privileges and so on. All of that is done in original Unix, at the system level, with only four syscalls:


• fork()
• exec()
• wait()
• exit()
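How the four calls fit together can be sketched in one small function. This is an illustration under assumptions, not the only way to do it: the name run_and_wait is invented, execvp() stands in for the exec() family, waitpid() is used as the variant of wait() that waits for one specific child, and _exit() is the direct form of exit() that skips stdio flushing in the child.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run the program argv[0] with the given arguments in a child process
 * and wait for it. Returns the child's exit status (127 if the program
 * could not be executed), or -1 if fork/wait failed or the child did
 * not exit normally. */
int run_and_wait(char *const argv[])
{
    pid_t pid = fork();            /* one process becomes two */
    if (pid < 0)
        return -1;

    if (pid == 0) {                /* child: load a different program */
        execvp(argv[0], argv);
        _exit(127);                /* only reached if exec failed */
    }

    int status;                    /* parent: collect the exit status */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, with `char *args[] = { "sh", "-c", "exit 7", NULL };` the call run_and_wait(args) returns the status that the shell passed to exit. This is essentially what a shell does for every command you type.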

Usermode and Kernel



Context switching: Process 1 is running for a bit, but at (1) the kernel interrupts the execution and switches to process 2. Some time later, process 2 is frozen, and we context switch back to where we left off with (1), and so on. For each process, this seems to be seamless, but it happens in intervals that are not continuous.
Whenever a Unix process does a system call (and at some other opportunities) the current process leaves the user context and the operating system code is activated. This is privileged kernel code, and the activation is not quite a subroutine call, because not only is privileged mode activated, but a kernel stack is also used and the CPU registers of the user process are saved. From the point of view of the kernel function, the user process that made the call is inert data and can be manipulated at will.
The kernel will then execute the system call on behalf of the user program, and then will try to exit the kernel. The typical way to leave the kernel is through the scheduler. The scheduler will review the process list and current situation. It will then decide into which of all the different userland processes to exit. It will restore the chosen process’s registers, then return into this process’s context, using this process’s stack. The chosen process may or may not be the one that made the system call.
In short: Whenever you make a system call, you may (or may not) lose the CPU to another process. That’s not too bad, because this other process at some point has to give up the CPU and the kernel will then return into our process as if nothing happened. Our program is not being executed linearly, but in a sequence of subjectively linear segments, with breaks in between. During these breaks the CPU is working on segments of other processes that are also runnable.
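A process cannot see these breaks directly, but the kernel keeps count of them. As a rough sketch, the function below reads the counters with getrusage(). Note the assumptions: the helper name is made up, and the ru_nvcsw and ru_nivcsw fields are maintained on Linux and the BSDs but are not required by POSIX, so other systems may leave them at zero.

```c
#include <sys/resource.h>
#include <sys/time.h>

/* Fetch how often the calling process has been switched off the CPU so far.
 * Returns 0 on success, -1 on failure. */
int context_switches(long *voluntary, long *involuntary)
{
    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    *voluntary = ru.ru_nvcsw;      /* gave up the CPU, e.g. blocked in a system call */
    *involuntary = ru.ru_nivcsw;   /* was preempted by the scheduler */
    return 0;
}
```

Calling it before and after a sleep() or a blocking read() typically shows the voluntary counter going up, which matches the point made above: a system call is an opportunity for the kernel to hand the CPU to somebody else.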

fork() and exit()

In traditional Unix the only way to create a process is using the fork() system call. The new process gets a copy of the current program, but a new process id (pid). The process id of the parent process (the process that called fork()) is registered as the new process’s parent pid (ppid) to build a process tree. In the parent process, fork() returns and delivers the new process’s pid as a result. The new process also returns from the fork() system call (because that is when the copy was made), but the result of the fork() is 0.
So fork() is a special system call. You call it once, but the function returns twice: once in the parent, and once in the child process. fork() increases the number of processes in the system by one. Every Unix process starts its existence by returning from a fork() system call with a 0 result, running the same program as the parent process. Parent and child can have different fates because the result of the fork() system call is different in the parent and child incarnation, and that can drive execution down different if() branches. In code:
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

int main(void) {
    pid_t pid = 0;

    pid = fork();    /* returns twice: 0 in the child, the child's pid in the parent */
    if (pid == 0) {
        printf("I am the child.\n");
    }
    if (pid > 0) {
        printf("I am the parent, the child is %d.\n", (int)pid);
    }
    if (pid < 0) {
        perror("In fork()");
    }

    exit(0);
}

Running this, we get:

kris@linux:/tmp/kris> make probe1
cc probe1.c -o probe1
kris@linux:/tmp/kris> ./probe1
I am the child.
I am the parent, the child is 16959.
We are defining a variable pid of the type pid_t. This variable saves the fork() result, and using it we activate one (“I am the child.”) or the other (“I am the parent”) branch of an if(). Running the program we get two result lines. Since we have only one variable, and this variable can have only one state, an instance of the program can only be in either one or the other branch of the code. Since we see two lines of output, two instances of the program with different values for pid must have been running. If we called getpid() and printed the result we could prove this by showing two different pids (change the program to do this as an exercise!).
The fork() system call is entered once, but left twice, and increments the number of processes in the system by one. After finishing our program the number of processes in the system is as large as before. That means there must be another system call which decrements the number of processes. This system call is exit(). exit() is a system call you enter once and never leave. It decrements the number of processes in the system by one. exit() also accepts an exit status as a parameter, which the parent process can receive (or even has to receive), and which communicates the fate of the child to the parent.
In our example, all variants of the program call exit(): we are calling exit() in the child process, but also in the parent process. That means we terminate two processes. We can only do this because even the parent process is a child, in fact a child of our shell. The shell does exactly the same thing we are doing:
bash (16957) --- calls fork() ---> bash (16958) --- becomes ---> probe1 (16958)

probe1 (16958) --- calls fork() ---> probe1 (16959) --> exit()
   |
   +---> exit()
exit() closes all files and sockets, frees all memory and then terminates the process. The parameter
of exit() is the only thing that survives and is handed over to the parent process.
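One possible way to check the claim that two processes with different pids are involved is sketched below. The child compares its own getpid() and getppid() with the parent's pid and reports the outcome through its exit status. The function name is invented for this example, and the parent collects the status with waitpid(), a variant of the wait() call described in the next section.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork once and let the child check its own identity.
 * Returns 1 if the child had a pid different from the parent's and a
 * ppid equal to the parent's pid, 0 if not, -1 on error. */
int child_has_own_pid(void)
{
    pid_t parent = getpid();       /* taken before fork(), so the child has a copy */
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {                /* child */
        int ok = (getpid() != parent) && (getppid() == parent);
        _exit(ok ? 0 : 1);         /* the exit status is all that survives */
    }

    int status;                    /* parent */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 1 : 0;
}
```

The child uses _exit() rather than exit() so that stdio buffers copied from the parent are not flushed a second time.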

wait()

Our child process ends with an exit(0). The 0 is the exit status of our program and can be passed on to the parent. We need to make the parent process pick up this value, and we need a new system call for this. This system call is wait(). In code:
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    pid_t pid = 0;
    int status;

    pid = fork();
    if (pid == 0) {
        printf("I am the child.\n");
        sleep(10);
        printf("I am the child, 10 seconds later.\n");
    } else if (pid > 0) {
        printf("I am the parent, the child is %d.\n", (int)pid);
        pid = wait(&status);    /* blocks until the child terminates */
        printf("End of process %d: ", (int)pid);
        if (WIFEXITED(status)) {
            printf("The process ended with exit(%d).\n", WEXITSTATUS(status));
        }
        if (WIFSIGNALED(status)) {
            printf("The process ended with kill -%d.\n", WTERMSIG(status));
        }
    } else {
        perror("In fork()");
    }

    exit(0);
}

And the output at run time:

kris@linux:/tmp/kris> make probe2
cc probe2.c -o probe2
kris@linux:/tmp/kris> ./probe2
I am the child.
I am the parent, the child is 17399.
I am the child, 10 seconds later.
End of process 17399: The process ended with exit(0).
The variable status is passed to the system call wait() by reference (as a pointer), and will be overwritten by it. The value is a bitfield, containing the exit status and additional information about how the program ended. To decode this, <sys/wait.h> offers a number of macros with predicates such as WIFEXITED() or WIFSIGNALED(). We also get extractors, such as WEXITSTATUS() and WTERMSIG(). wait() also returns the pid of the process that terminated, as a function result. wait() stops execution of the parent process until either a signal arrives or a child process terminates. You can arrange for a SIGALRM to be sent to you in order to time-bound the wait().
The init program, and Zombies
The program init with the pid 1 does basically nothing but call wait(): it waits for terminating processes and collects their exit status, only to throw it away. It also reads /etc/inittab and starts the programs configured there. When something from inittab terminates and is set to respawn, it will be restarted by init.
When a child process terminates while the parent process is not (yet) waiting for the exit status, exit() will still free all memory, file handles and so on, but the struct task (basically the ps entry) cannot be thrown away. It may be that the parent process at some point in time arrives at a wait() and then we have to have the exit status, which is stored in a field in the struct task, so we need to retain it. And while the child process is dead already, the process list entry cannot die because the exit status has not yet been collected by the parent. Unix calls such processes, which have no memory or other resources associated with them, Zombies. Zombies are visible in the process list when a process generator (a forking process) is faulty and does not wait() properly. They do not take up memory or any other resources but the bytes that make up their struct task.
The other case can happen, too: the parent process exits while the child moves on. The kernel will set the ppid of such children with dead parents to the constant value 1, or in other words: init inherits orphaned processes. When the child terminates, init will wait() for the exit status of the child, because that’s what init does. No Zombies in this case.
When we observe the number of processes in the system to be largely constant over time, then the number of calls to fork(), exit() and wait() have to be balanced. This is because for each fork() there will be an exit() to match and for each exit() there must be a wait() somewhere. In reality, and in modern systems, the situation is a bit more complicated, but the original idea is as simple as this. We have a clean fork-exit-wait triangle that describes all processes.
