
PROCESSES

Chapter 3
Introduction to processes
A process is an instance of a program running in a computer. A process consists of an
execution environment together with one or more threads; a thread is the operating
system's abstraction of an activity.
The term process is close in meaning to task, a term used in some operating systems.
Like a task, a process is a running program with which a particular set of data is
associated so that the process can be kept track of. An application that is being shared
by multiple users will generally have one process at some stage of execution for each
user. In short, a process is a program in execution: it contains the program code and its
current activity, and it represents the basic unit of work to be implemented in the
system. A process can initiate a sub-process, which is called a child process (and the
initiating process is sometimes referred to as its parent). A child process is a replica of
the parent process and shares some of its resources, but cannot exist if the parent is
terminated. Processes can exchange information or synchronize their operation through
several methods of interprocess communication (IPC).
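To make the parent/child relationship concrete, here is a minimal Python sketch using the standard multiprocessing module; the function and task names are illustrative, not taken from the text above. The parent spawns a child that runs as a separate process with its own process ID and its own copy of the parent's data.

    import multiprocessing
    import os

    def child_task(name):
        # The child runs in its own process, with its own PID and
        # a copy of (not a reference to) the parent's data.
        print(f"child {name}: pid={os.getpid()}, parent={os.getppid()}")

    if __name__ == "__main__":
        print(f"parent: pid={os.getpid()}")
        child = multiprocessing.Process(target=child_task, args=("worker-1",))
        child.start()   # create the child process
        child.join()    # the parent waits for the child to terminate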
Figure: Layout of a process inside main memory

To put it in simple terms, we write our computer programs in a text file, and when we
execute the program it becomes a process that performs all the tasks specified in the
program. When a program is loaded into memory and becomes a process, it can be
divided into four sections: stack, heap, text, and data. The figure shows a simplified
layout of a process inside main memory.
Stack: The process stack contains temporary data such as method/function
parameters, return addresses, and local variables.

Heap: The heap is the segment where dynamic memory allocation takes place at
run time.

Text: This includes the current activity, represented by the value of the program
counter and the contents of the processor's registers.

Data: This section contains the global and static variables.


Process Life Cycle
When a process executes, it passes through different states. These stages may differ in
different operating systems, and the names of these states are also not standardized.
The operating system maintains management information about a process in a process
control block (PCB). Modern operating systems allow a process to be divided into
multiple threads of execution, which share all process management information except
for information directly related to execution. This information is held in a thread control
block (TCB). Threads in a process can execute different parts of the program code at the
same time. They can also execute the same parts of the code at the same time, but with
different execution state:
● They have independent current instructions, that is, they have (or appear to have)
independent program counters.
● They are working with different data; that is, they are (or appear to be) working
with independent registers.
Figure: Process Life cycle
Start: This is the initial state when a process is first started/created.
Ready: The process is waiting to be assigned to a processor. Ready
processes are waiting for the processor to be allocated to them by
the operating system so that they can run. A process may come into
this state after the Start state, or while running, when the scheduler
interrupts it to assign the CPU to some other process.
Running: Once the process has been assigned to a processor by the
OS scheduler, the process state is set to running and the processor
executes its instructions.
Waiting: The process moves into the waiting state if it needs to wait
for a resource, such as user input or a file to become available.
Terminated or Exit: Once the process finishes its execution, or is
terminated by the operating system, it is moved to the terminated
state, where it waits to be removed from main memory.
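As an illustration, the Python sketch below models these states and the legal transitions between them. The state names follow the figure; the transition table is a simplified assumption, since real schedulers add more states and edges.

    import enum

    class State(enum.Enum):
        START = "start"
        READY = "ready"
        RUNNING = "running"
        WAITING = "waiting"
        TERMINATED = "terminated"

    # Simplified transition table: which states may follow which.
    TRANSITIONS = {
        State.START: {State.READY},
        State.READY: {State.RUNNING},
        State.RUNNING: {State.READY, State.WAITING, State.TERMINATED},
        State.WAITING: {State.READY},
        State.TERMINATED: set(),
    }

    def move(current, new):
        if new not in TRANSITIONS[current]:
            raise ValueError(f"illegal transition {current.name} -> {new.name}")
        return new

    # Example walk through the life cycle:
    s = State.START
    for nxt in [State.READY, State.RUNNING, State.WAITING,
                State.READY, State.RUNNING, State.TERMINATED]:
        s = move(s, nxt)
    print("final state:", s.name)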
Threads
Within a process, a thread is a single sequential activity being executed; such an
activity is also known as a thread of execution or thread of control. Any operating
system process can execute threads, and a process can have multiple threads.
A thread is a single sequence stream within a process. Threads are also called
lightweight processes, as they possess some of the properties of processes. Each
thread belongs to exactly one process. In an operating system that supports
multithreading, a process can consist of many threads. However, threads can run
truly in parallel only if there is more than one CPU; otherwise two threads must
context-switch on the single CPU.
Example: one thread can process a client's request while a second thread, serving
another request, waits for a disk access to complete.
Threads are very useful in modern programming whenever a process has multiple
tasks to perform independently of the others. This is particularly true when one of
the tasks may block, and it is desired to allow the other tasks to proceed without
blocking. For example, in a word processor, a background thread may check
spelling and grammar while a foreground thread processes user input (keystrokes),
while yet a third thread loads images from the hard drive, and a fourth does
periodic automatic backups of the file being edited.
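As a rough Python sketch of this idea (the task names are invented for illustration), the standard threading module lets one thread block on I/O while the others keep making progress:

    import threading
    import time

    def spell_check():
        # Simulated background task that runs in several passes.
        for i in range(3):
            time.sleep(0.1)
            print("spell check pass", i + 1)

    def load_images():
        # Simulated blocking I/O: sleeping stands in for a disk read.
        time.sleep(0.5)
        print("images loaded")

    def handle_keystrokes():
        for key in "hi!":
            time.sleep(0.1)
            print("processed keystroke:", key)

    threads = [threading.Thread(target=f)
               for f in (spell_check, load_images, handle_keystrokes)]
    for t in threads:
        t.start()    # all three activities proceed concurrently
    for t in threads:
        t.join()     # wait for every thread to finish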

Another example is a web server: multiple threads allow multiple requests to be
satisfied simultaneously, without having to service requests sequentially or to fork
off a separate process for every incoming request. (The latter is how this sort of
thing was done before the concept of threads was developed: a daemon would
listen at a port, fork off a child for every incoming request to be processed, and
then go back to listening to the port.)
Major benefits of multi-threading
● Responsiveness: If the process is divided into multiple threads, then as soon as one
thread completes its execution, its output can be returned immediately.
● Faster context switch: Context switch time between threads is lower compared to the
process context switch. Process context switching requires more overhead from the CPU.
● Effective utilization of multiprocessor system: If we have multiple threads in a single
process, then we can schedule multiple threads on multiple processors. This will make
process execution faster.
● Resource sharing: Resources like code, data, and files can be shared among all threads
within a process. Note: Stacks and registers canʼt be shared among the threads. Each thread
has its own stack and registers.
● Communication: Communication between multiple threads is easier, as the threads share
a common address space, whereas processes must use specific inter-process
communication techniques to communicate with each other.
● Enhanced throughput of the system: If a process is divided into multiple threads, and
each thread function is considered as one job, then the number of jobs completed per unit
of time is increased, thus increasing the throughput of the system.
Properties of a Thread
● A single system call can create more than one thread (lightweight process).
● Threads share data and information.
● Threads share the instruction, global, and heap regions, but each thread has its
own stack and registers.
● Thread management consumes few or no system calls, as communication
between threads can be achieved using shared memory.
● The isolation property of a process increases its overhead in terms of
resource consumption.
Thread Model
There are two types of threads to be managed in a modern system: user threads
and kernel threads. User threads are supported above the kernel, without kernel
support; these are the threads that application programmers would put into their
programs. Kernel threads are supported within the kernel of the OS itself. All
modern operating systems support kernel-level threads, allowing the kernel to
perform multiple simultaneous tasks and/or to service multiple kernel system calls
simultaneously. In a specific implementation, the user threads must be mapped to
kernel threads, using one of the following strategies.
Many to one multithreading model:
The many to one model maps many user-level threads to one kernel thread. This type
of relationship facilitates an effective context-switching environment and is easily
implemented even on a simple kernel with no thread support.

The disadvantage of this model is that, since only one kernel-level thread is
scheduled at any given time, it cannot take advantage of the hardware acceleration
offered by multithreaded processors or multiprocessor systems. In this model, all
thread management is done in user space, and if one thread makes a blocking call,
the whole process is blocked.
One to one multithreading model
The one-to-one model maps a single user-level thread to a single kernel-level
thread. This type of relationship facilitates the running of multiple threads in
parallel. However, this benefit comes with a drawback: every new user thread
requires creating a corresponding kernel thread, an overhead that can hinder the
performance of the parent process. The Windows and Linux operating systems try
to tackle this problem by limiting the growth of the thread count.
Many to many multithreading model
In this type of model, there are several user-level threads and several kernel-level
threads. The number of kernel threads created depends upon the particular
application. The developer can create any number of threads at both levels, but the
numbers need not be the same. The many to many model is a compromise between
the other two models: if any thread makes a blocking system call, the kernel can
schedule another thread for execution. Also, the complexity present in the previous
models does not arise here. Though this model allows the creation of multiple
kernel threads, true concurrency cannot be achieved, because the kernel can
schedule only one process at a time.
Differences between Process and Thread
As mentioned earlier, in many respects threads operate in the same way as
processes. Some of the similarities and differences are:

Similarities:

● Like processes, threads share the CPU, and only one thread is active (running) at a
time.
● Like processes, threads within a process execute sequentially.
● Like processes, a thread can create children.
● And like processes, if one thread is blocked, another thread can run.

Major differences between process and thread:

● Each process runs in its own address space, while all threads of a process share
their process's address space (code, data, and open files).
● Process creation and process context switching are relatively heavyweight;
creating and switching between threads is much cheaper.
● Processes are isolated from one another, so one crashing process normally does
not affect others, whereas a misbehaving thread can bring down its entire process.
● Processes exchange information through inter-process communication
mechanisms, while threads can communicate directly through shared variables.
Virtualization
Virtualization is one of the most effective ways to reduce IT expenses and boost
efficiency for large, mid-size, and small organizations. Virtualization lets you run
multiple operating systems and applications on a single server, consolidate hardware
to get vastly higher productivity from fewer servers, and simplify the management,
maintenance, and deployment of new applications.

Virtualization creates a virtual version of a device or resource, such as a server,
storage device, network, or even an operating system, where the framework divides
the resource into one or more execution environments. The term virtualization is
now associated with a number of computing technologies, including the following:
1. Application Virtualization: Application virtualization gives a user remote access
to an application from a server. The server stores all personal information and other
characteristics of the application, but the application can still run on a local
workstation through the internet. An example would be a user who needs to run two
different versions of the same software. Technologies that use application
virtualization include hosted applications and packaged applications.

2. Network Virtualization: The ability to run multiple virtual networks, each with a
separate control and data plane, co-existing on top of one physical network. The
virtual networks can be managed by individual parties that remain isolated from
each other. Network virtualization provides a facility to create and provision virtual
networks, logical switches, routers, firewalls, load balancers, Virtual Private Networks
(VPNs), and workload security within days or even weeks.
3. Desktop Virtualization: Desktop virtualization allows a user's OS to be stored
remotely on a server in the data center. It allows the user to access their desktop
virtually, from any location, on a different machine. Users who want an operating
system other than Windows Server will need a virtual desktop. The main benefits of
desktop virtualization are user mobility, portability, and easy management of software
installation, updates, and patches.

4. Storage Virtualization: Storage virtualization is an array of servers that are managed
by a virtual storage system. The servers aren't aware of exactly where their data is
stored and instead function more like worker bees in a hive. It allows storage from
multiple sources to be managed and utilized as a single repository. Storage
virtualization software maintains smooth operations, consistent performance, and a
continuous suite of advanced functions despite changes, breakdowns, and differences
in the underlying equipment.
5. Server Virtualization: This is a kind of virtualization in which the masking of server
resources takes place. Here, the central (physical) server is divided into multiple virtual
servers by changing the identity numbers and processors, so each virtual server can run
its own operating system in an isolated manner, while each sub-server knows the
identity of the central server. It increases performance and reduces operating cost by
deploying main server resources into sub-server resources. It is beneficial for virtual
migration, reducing energy consumption, reducing infrastructure costs, etc.
6. Data Virtualization: This is the kind of virtualization in which data is collected from
various sources and managed in a single place, without the consumer needing to know
technical details such as how the data is collected, stored, and formatted. The data is
arranged logically so that its virtual view can be accessed remotely by interested
people, stakeholders, and users through various cloud services. Many large companies,
such as Oracle, IBM, AtScale, and CData, provide data virtualization services.
Pros of Virtualization
Efficient Utilization of Hardware: With the help of virtualization, hardware is used
efficiently by both the user and the cloud service provider. The user's need for physical
hardware decreases, which lowers cost; from the service provider's point of view,
hardware virtualization reduces the amount of hardware required from the vendor side.
High Availability: One of the main benefits of virtualization is that it provides advanced
features that allow virtual instances to be available at all times.
Disaster Recovery is Efficient and Easy: With the help of virtualization, data recovery,
backup, and duplication become very easy. With traditional methods, if a server system
is damaged in a disaster, there is little assurance the data can be recovered. With
virtualization tools, real-time data backup, recovery, and mirroring become easy tasks
and provide assurance of zero data loss.
Virtualization Saves Energy: Virtualization helps save energy because moving from
physical servers to virtual servers reduces the number of servers, which in turn lowers
monthly power and cooling costs and saves money.
Quick and Easy Setup: With traditional methods, setting up physical systems and
servers is very time-consuming: hardware must first be purchased in bulk, then shipped,
then set up, and then more time is spent installing the required software. With
virtualization, the entire process is done in far less time, resulting in a productive setup.
Cloud Migration Becomes Easy: Many companies that have already invested heavily in
servers hesitate to shift to the cloud. But it is often more cost-effective to move to cloud
services, because all the data present on their servers can easily be migrated to a cloud
server, saving on maintenance charges, power consumption, cooling costs, the cost of a
server maintenance engineer, and so on.
Resource Optimization: Virtualization allows efficient utilization of physical hardware
by running multiple virtual machines (VMs) on a single physical server. This
consolidation leads to cost savings in terms of hardware, power, cooling, and space.
Cons of Virtualization
High Initial Investment: While virtualization reduces costs in the long run, the initial
setup costs for storage and servers can be higher than a traditional setup.
Complexity: Managing virtualized environments can be complex, especially as the
number of VMs increases.
Security Risks: Virtualization introduces additional layers, which may pose security
risks if not properly configured and monitored.
Learning New Infrastructure: As organizations shift from servers to the cloud, they
need skilled staff who can work with the cloud easily. They must either hire new IT staff
with the relevant skills or train existing staff, which increases the company's costs.
Data Can Be at Risk: Working on virtual instances on shared resources means that our
data is hosted on a third-party resource, which puts it in a vulnerable position. A hacker
can attack our data or try to gain unauthorized access; without a security solution, our
data is under threat.
CLIENTS
In the previous chapters we discussed the client-server model, the roles of clients
and servers, and the ways they interact. Let us now take a closer look at the
anatomy of clients and servers, respectively.

The client is the machine (workstation or PC) running the front-end applications. It
interacts with a user through the keyboard, display, and pointing device such as a
mouse. The client also refers to the client process that runs on the client machine.
The client has no direct data access responsibilities. It simply requests processes
from the server and displays data managed by the server. Therefore, the client
workstation can be optimized for its job. For example, it might not need large disk
capacity, or it might benefit from graphic capabilities.
Networked user interfaces
A major task of client machines is to provide the means for users to interact with
remote servers. There are roughly two ways in which this interaction can be
supported. First, for each remote service the client machine will have a separate
counterpart that can contact the service over the network. A typical example is a
calendar running on a user's Smartphone that needs to synchronize with a remote,
possibly shared calendar. In this case, an application-level protocol will handle the
synchronization, as shown in the figure below.

A second solution is to provide direct access to remote services by offering only a
convenient user interface. Effectively, this means that the client machine is used only
as a terminal with no need for local storage, leading to an application-neutral solution
as shown in the figure below. In the case of networked user interfaces, everything is
processed and stored at the server. This thin-client approach has received much
attention with the increase of Internet connectivity and the use of mobile devices.
Thin-client solutions are also popular as they ease the task of system management.
Client-side software can transparently collect all responses and pass a single
response to the client application. Regarding failure transparency, masking
communication failures with a server is typically done through client middleware.
For example, client middleware can be configured to repeatedly attempt to
connect to a server, or perhaps try another server after several attempts. There are
even situations in which the client middleware returns data it had cached during a
previous session, as is sometimes done by Web browsers that fail to connect to a
server.
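A minimal Python sketch of such client-side masking (the function name, server list, and retry policy here are illustrative assumptions, not any specific middleware's API):

    import socket
    import time

    def connect_with_failover(servers, attempts_per_server=3, delay=1.0):
        """Try each (host, port) several times before giving up.

        This mimics client middleware that masks communication failures:
        retry the primary server, then fail over to a replica.
        """
        for host, port in servers:
            for attempt in range(attempts_per_server):
                try:
                    return socket.create_connection((host, port), timeout=2.0)
                except OSError:
                    time.sleep(delay)   # back off, then retry
        raise ConnectionError("all servers unreachable")

    # Usage (hypothetical replica list):
    # conn = connect_with_failover([("primary.example.com", 8080),
    #                               ("backup.example.com", 8080)])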
SERVERS
A server is a piece of computer hardware or software that provides functionality for
other programs or devices, called clients. This architecture is called the client-server
model. Servers can provide various functionalities, often called services, such as
sharing data or resources among multiple clients, or performing computation for a
client. A single server can serve multiple clients, and a single client can use multiple
servers. A client process may run on the same device or may connect over a network
to a server on a different device. Typical servers are database servers, file servers,
mail servers, print servers, web servers, game servers, and application servers.
General server design issues in distributed system
A server is a process implementing a specific service on behalf of a collection of clients. In essence,
each server is organized in the same way: it waits for an incoming request from a client and
subsequently ensures that the request is taken care of, after which it waits for the next incoming
request.
1. Concurrent versus iterative servers: There are two server design choices:
Iterative server: A single-threaded server that processes requests from a queue. While
processing the current request it adds incoming requests to a wait queue and, once processing is
done, takes the next request from the queue. Requests are therefore never executed in parallel
on the server.
Concurrent server: Here, when a new request comes in, the server spawns a new thread to
service it, so requests are executed in parallel. This is the thread-per-request model. There is
also one more flavor of the multi-threaded server: unlike the previous case, where a new thread
is spawned every time a request comes in, the server keeps a pool of pre-spawned threads ready
to serve requests, managed by a thread dispatcher or scheduler.
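The contrast is easy to see with Python's standard socketserver module. The sketch below is a toy echo service (the port number and handler are illustrative); swapping one base class turns an iterative server into a thread-per-request concurrent one:

    import socketserver

    class EchoHandler(socketserver.StreamRequestHandler):
        def handle(self):
            # Echo lines back until the client disconnects.
            for line in self.rfile:
                self.wfile.write(line)

    # Iterative: requests are handled one at a time, in arrival order.
    # server = socketserver.TCPServer(("0.0.0.0", 9000), EchoHandler)

    # Concurrent: a new thread is spawned for every incoming request.
    server = socketserver.ThreadingTCPServer(("0.0.0.0", 9000), EchoHandler)
    server.serve_forever()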
2. Contacting a server: end points
The client can determine the server it needs to communicate with using one of two
techniques:

● Hard-code the port number of the server in the client. If the server moves, the
client will need to be recompiled.
● Use a directory service to register a service with a port number and IP address.
The client connects to the directory service to determine the location of the
server offering a particular service, and then sends the request to it. In this case,
the client only needs to know the location of the directory service.
In all cases, clients send requests to an end point, also called a port, at the machine where the
server is running. Each server listens to a specific end point. How do clients know the end point of
a service? One approach is to globally assign end points for well-known services. For example,
servers that handle Internet FTP requests always listen to TCP port 21. Likewise, an HTTP server
for the World Wide Web will always listen to TCP port 80. These end points have been assigned by
the Internet Assigned Numbers Authority (IANA). With assigned end points, the client needs to find
only the network address of the machine where the server is running. Name services can be used
for that purpose.

There are many services that do not require a pre-assigned end point. For example, a time-of-day
server may use an end point that is dynamically assigned to it by its local operating system. In that
case, a client will first have to look up the end point. One solution is to have a special daemon
running on each machine that runs servers. The daemon keeps track of the current end point of
each service implemented by a co-located server. The daemon itself listens to a well-known end
point. A client will first contact the daemon, request the end point, and then contact the specific
server, as shown in Figure below. It is common to associate an end point with a specific service.
However, actually implementing each service by means of a separate server may be a waste of
resources. For example, in a typical UNIX system, it is common to have lots of servers running
simultaneously, with most of them passively waiting until a client request comes in.
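A rough Python sketch of this daemon-based lookup (the daemon's port 5000, its line-based wire format, and the service names are invented for illustration): the daemon listens on a well-known end point and maps service names to their dynamically assigned ports.

    import socket

    DAEMON_PORT = 5000       # well-known end point of the daemon (assumed)

    def lookup_endpoint(host, service):
        """Ask the daemon on `host` which port `service` listens on."""
        with socket.create_connection((host, DAEMON_PORT)) as s:
            s.sendall(service.encode() + b"\n")
            port = int(s.recv(64).decode().strip())   # daemon replies with a port
        return port

    def call_service(host, service, request):
        port = lookup_endpoint(host, service)         # step 1: ask the daemon
        with socket.create_connection((host, port)) as s:  # step 2: contact server
            s.sendall(request)
            return s.recv(4096)

    # e.g. call_service("server.example.com", "time-of-day", b"now?\n")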
3. Interrupting a server

Another issue that needs to be taken into account when designing a server is whether
and how a server can be interrupted. For example, consider a user who has just decided
to upload a huge file to an FTP server. Then, suddenly realizing that it is the wrong file,
he wants to interrupt the server to cancel further data transmission. There are several
ways to do this. One approach that works only too well in the current Internet (and is
sometimes the only alternative) is for the user to abruptly exit the client application
(which will automatically break the connection to the server), immediately restart it,
and pretend nothing happened. The server will eventually tear down the old
connection, thinking the client has probably crashed.

A much better approach for handling communication interrupts is to develop the client
and server such that it is possible to send out-of-band data, which is data that is to be
processed by the server before any other data from that client. One solution is to let the
server listen to a separate control end point to which the client sends out-of-band data,
while at the same time listening (with a lower priority) to the end point through which
the normal data passes. Another solution is to send out-of-band data across the same
connection through which the client is sending the original request. In TCP, for example,
it is possible to transmit urgent data. When urgent data are received at the server, the
latter is interrupted (e.g., through a signal in UNIX systems), after which it can inspect
the data and handle them accordingly.
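In Python, for instance, TCP urgent data can be sent with the MSG_OOB flag. The sketch below (ports and payloads invented) shows a client flagging an interrupt on the existing connection, and a server noticing pending urgent data via select's exceptional-condition set:

    import select
    import socket

    # --- client side: send urgent (out-of-band) data on the same connection ---
    # sock.sendall(b"...bulk upload data...")
    # sock.send(b"!", socket.MSG_OOB)     # one urgent byte: "please interrupt"

    # --- server side: watch for the urgent byte while reading normal data ---
    def serve(conn):
        while True:
            readable, _, exceptional = select.select([conn], [], [conn], 1.0)
            if exceptional:
                urgent = conn.recv(1, socket.MSG_OOB)  # handle the interrupt first
                print("client interrupt:", urgent)
                break
            if readable:
                data = conn.recv(4096)                 # normal in-band data
                if not data:
                    break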
4. Stateless versus stateful servers
A final, important design issue is whether or not the server is stateless.
Stateful servers are those which keep state about their connected clients. For example,
push servers are stateful servers, since the server needs to know the list of clients it
must send messages to.
Stateless servers, on the other hand, do not keep any information on the state of their
connected clients, and can change their own state without having to inform any client.
In this case, the client keeps its own session information. Pull servers, where clients
send requests to the server as and when required, are examples of stateless servers.
The advantages of stateless servers are that a server crash does not impact clients,
making the system more resilient to failures, and that it is more scalable, since clients
take on some of the workload.
Soft-state servers maintain the state of the client for a limited period of time; when the
session times out, the information is discarded.
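The difference is visible in a simple file-read service, sketched below in Python (the function names and request shapes are invented): the stateful version remembers each client's read position, while the stateless version requires the client to supply the offset on every request, so a server restart is invisible to clients.

    # Stateful: the server tracks each client's current read position.
    positions = {}                        # (client_id, path) -> offset

    def stateful_read(client_id, path, nbytes):
        offset = positions.get((client_id, path), 0)
        with open(path, "rb") as f:
            f.seek(offset)
            data = f.read(nbytes)
        positions[(client_id, path)] = offset + len(data)
        return data

    # Stateless: the client supplies the offset itself; the server keeps
    # nothing, so crash-and-restart loses no client session state.
    def stateless_read(path, offset, nbytes):
        with open(path, "rb") as f:
            f.seek(offset)
            return f.read(nbytes)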
Advantages of both Stateful and Stateless servers

The main advantage of stateful servers is that they can provide better performance
for clients: because clients do not have to provide full file information every time
they perform an operation, the size of messages to and from the server can be
significantly decreased. Likewise, the server can make use of knowledge of access
patterns to perform read-ahead and other optimizations. Stateful servers can also
offer clients extra services such as file locking, and can remember read and write
positions.
The main advantage of stateless servers is that they can easily recover from failure.
Because there is no state that must be restored, a failed server can simply restart
after a crash and immediately provide services to clients as though nothing
happened. Furthermore, if clients crash the server is not stuck with abandoned
opened or locked files. Another benefit is that the server implementation remains
simple because it does not have to implement the state accounting associated with
opening, closing, and locking of files.
Object Servers
An object server is a server tailored to support distributed objects. The important
difference between a general object server and other servers is that an object
server by itself does not provide a specific service. Specific services are
implemented by the objects that reside in the server. Essentially, the server
provides only the means to invoke local objects, based on requests from remote
clients. As a consequence, it is relatively easy to change services by simply adding
and removing objects.

An object server thus acts as a place where objects live. An object consists of two
parts: data representing its state and the code for executing its methods. Whether
or not these parts are separated, or whether method implementations are shared
by multiple objects, depends on the object server. Also, there are differences in the
way an object server invokes its objects. For example, in a multithreaded server,
each object may be assigned a separate thread, or a separate thread may be used
for each invocation request.
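A toy Python sketch of this idea (the class and method names are invented): the server itself holds only a registry of objects and dispatches invocation requests to them, so services are changed simply by adding or removing objects.

    class ObjectServer:
        """Holds objects and invokes their methods on request."""

        def __init__(self):
            self._objects = {}             # object name -> live object

        def register(self, name, obj):
            self._objects[name] = obj      # adding an object adds a service

        def unregister(self, name):
            del self._objects[name]        # removing it removes the service

        def invoke(self, name, method, *args):
            # Dispatch an invocation request to a local object.
            return getattr(self._objects[name], method)(*args)

    # Usage with a trivial object:
    class Counter:
        def __init__(self):
            self.n = 0
        def increment(self):
            self.n += 1
            return self.n

    server = ObjectServer()
    server.register("counter", Counter())
    print(server.invoke("counter", "increment"))   # -> 1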
CODE MIGRATION
Traditionally, code migration in distributed systems took place in the form of
process migration in which an entire process was moved from one machine to
another. The basic idea is that overall system performance can be improved if
processes are moved from heavily-loaded to lightly-loaded machines.

Code migration in the broadest sense deals with moving programs between
machines, with the intention to have those programs be executed at the target. In
some cases, as in process migration, the execution status of a program, pending
signals, and other parts of the environment must be moved as well.
Reasons for Migrating Code
Code migration is essential in distributed systems for several reasons:
Dynamic Environment: Code migration allows systems to adapt to frequent changes
by transferring code to different locations as needed.
Resource Optimization: It optimizes resource utilization by moving code to nodes with
excess capacity, balancing workload.
System Maintenance and Updates: Facilitates seamless deployment of updates,
ensuring applications remain functional.
Scaling: Enables dynamic resource allocation to handle increased traffic and workload
demands.
Fault Tolerance and Resilience: Aids in quick recovery from failures by migrating code
to ensure continuity of service.
