Unix System Programming Module-3 RR
Module-03
UNIX FILE APIs
General File APIs
Files in a UNIX and POSIX system may be any one of the following types:
Regular file
Directory file
FIFO file
Block device file
Character device file
Symbolic link file
There are special APIs to create each of these file types. There is also a set of generic APIs that can be used to
create and manipulate more than one type of file. These APIs are:
open
This is used to establish a connection between a process and a file, i.e. it is used to open an existing file for
data transfer, or it may also be used to create a new file.
The value returned by the open system call is the file descriptor, a small nonnegative integer that indexes the
process's file descriptor table. That entry refers to a file table entry, which in turn refers to the file's inode.
The prototype of open function is
#include <sys/types.h>
#include <fcntl.h>
int open(const char *pathname, int accessmode, mode_t permission);
If successful, open returns a nonnegative integer representing the open file descriptor.
If unsuccessful, open returns –1.
The first argument is the name of the file to be created or opened. This may be an absolute pathname or
relative pathname.
If the given pathname is a symbolic link, the open function resolves the link and opens the non-symbolic-link
file to which it refers.
The second argument is access modes, which is an integer value that specifies how actually the file should be
accessed by the calling process.
Generally the access modes are specified in <fcntl.h>. Various access modes are:
O_RDONLY - open for reading file only
O_WRONLY - open for writing file only
O_RDWR - opens for reading and writing file.
There are other access modes, termed access modifier flags. One or more of the following can be specified by
bitwise-ORing them with one of the above access mode flags to alter the access mechanism of the file.
O_APPEND - Append data to the end of file.
O_CREAT - Create the file if it doesn’t exist
O_EXCL - Generate an error if O_CREAT is also specified and the file already exists.
O_TRUNC - If file exists discard the file content and set the file size to zero bytes.
O_NONBLOCK - Specify subsequent read or write on the file should be non-blocking.
O_NOCTTY - Specify not to use terminal device file as the calling process control terminal.
To illustrate the use of the above flags, the following example statement opens a file called /usr/divya/usp for
read and write in append mode:
int fd = open("/usr/divya/usp", O_RDWR | O_APPEND, 0);
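If a file is opened read-only, the modifier flags that only affect writing (O_APPEND, O_TRUNC) serve no purpose; the
effect of combining O_RDONLY with O_TRUNC is not defined by POSIX.
If a file is opened write-only or read-write, any of the modifier flags may be combined with the access mode.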
The third argument is used only when a new file is being created. The symbolic names for file permission are
given in the table in the previous page.
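As a minimal sketch of how the modifier flags and the third argument work together, the helper below creates a file only if it does not already exist. The name create_exclusive and the chosen permission bits are assumptions made for illustration; they are not part of the open API.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <errno.h>

/* Hypothetical helper: create path for writing, failing if it already exists.
 * Returns the new file descriptor, or -1 with errno set (EEXIST if present). */
int create_exclusive(const char *path)
{
    /* O_CREAT | O_EXCL makes "check for existence, then create" one atomic step.
     * The third argument is consulted because O_CREAT is given:
     * read/write for the owner, read for group and others, before the
     * process umask is applied. */
    return open(path, O_WRONLY | O_CREAT | O_EXCL,
                S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);
}
```

A caller would check the result against -1 and inspect errno; a second call on the same pathname is expected to fail with EEXIST.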
creat
This system call is used to create new regular files.
The prototype of creat is
#include <sys/types.h>
#include <fcntl.h>
int creat(const char *pathname, mode_t mode);
Returns: file descriptor opened for write-only if OK, -1 on error.
The first argument pathname specifies name of the file to be created.
The second argument, mode, specifies the access permission of the file for the owner, group and others.
The creat function can be implemented using open function as:
#define creat(path_name, mode) \
open(path_name, O_WRONLY | O_CREAT | O_TRUNC, mode)
read
The read function fetches a block of data of a given size from the file referenced by a file descriptor.
The prototype of read function is:
#include<sys/types.h>
#include<unistd.h>
ssize_t read(int fdesc, void *buf, size_t nbyte);
If successful, read returns the number of bytes actually read.
If unsuccessful, read returns –1.
The first argument is an integer, fdesc that refers to an opened file.
The second argument, buf is the address of a buffer holding any data read.
The third argument specifies how many bytes of data are to be read from the file.
The size_t data type is defined in the <sys/types.h> header and is an unsigned integer type. The return type is the
signed type ssize_t so that -1 can be returned on error.
There are several cases in which the number of bytes actually read is less than the amount requested:
o When reading from a regular file, if the end of file is reached before the requested number of bytes has
been read. For example, if 30 bytes remain until the end of file and we try to read 100 bytes, read returns
30. The next time we call read, it will return 0 (end of file).
o When reading from a terminal device. Normally, up to one line is read at a time.
o When reading from a network. Buffering within the network may cause less than the requested amount to
be returned.
o When reading from a pipe or FIFO. If the pipe contains fewer bytes than requested, read will return only
what is available.
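Because read may return fewer bytes than requested in all of these cases, callers that need a full block usually loop. The following is a minimal sketch of such a loop; read_full is a hypothetical helper name, not a system call.

```c
#include <sys/types.h>
#include <unistd.h>
#include <errno.h>

/* Hypothetical helper: keep calling read until nbyte bytes have arrived,
 * end of file is reached, or an error occurs.
 * Returns the number of bytes stored in buf, or -1 on error. */
ssize_t read_full(int fdesc, void *buf, size_t nbyte)
{
    char *p = buf;
    size_t total = 0;

    while (total < nbyte) {
        ssize_t n = read(fdesc, p + total, nbyte - total);
        if (n == 0)                 /* end of file: return what we have */
            break;
        if (n < 0) {
            if (errno == EINTR)     /* interrupted by a signal: retry */
                continue;
            return -1;
        }
        total += (size_t)n;         /* short read: ask for the rest */
    }
    return (ssize_t)total;
}
```

A return value smaller than nbyte then means end of file was reached, not merely that less data happened to be available at that moment.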
write
The write system call is used to write data into a file.
The write function writes a block of data of a given size to the file referred to by a file descriptor.
close
The close system call is used to terminate the connection to a file from a process.
The prototype of the close is
#include<unistd.h>
int close(int fdesc);
If successful, close returns 0.
If unsuccessful, close returns –1.
The argument fdesc refers to an opened file.
The close function frees unused file descriptors so that they can be reused to reference other files. This is
important because a process may have at most OPEN_MAX files open at any time; closing descriptors allows a
process to access more than OPEN_MAX files in the course of its execution.
The close function also de-allocates system resources, such as the file table entry and the memory buffer
allocated to hold read/write data.
fcntl
The fcntl function helps a user to query or set the access control flags and the close-on-exec flag of any file descriptor.
The prototype of fcntl is
#include<fcntl.h>
int fcntl(int fdesc, int cmd, ...);
The first argument is the file descriptor.
The second argument cmd specifies what operation has to be performed.
The third argument is dependent on the actual cmd value.
The possible cmd values are defined in <fcntl.h> header.
cmd value   Use
F_GETFL     Returns the access control flags of the file descriptor fdesc.
F_SETFL     Sets or clears the access control flags specified in the third argument to
            fcntl. The allowed access control flags are O_APPEND and O_NONBLOCK.
F_GETFD     Returns the close-on-exec flag of the file referenced by fdesc. If the return
            value is zero, the flag is off; otherwise it is on.
F_SETFD     Sets or clears the close-on-exec flag of fdesc. The third argument to fcntl is
            an integer value, which is 0 to clear the flag or 1 to set it.
F_DUPFD     Duplicates the file descriptor fdesc. The third argument to fcntl is an integer
            value; the duplicated file descriptor must be greater than or equal to that
            value. The return value of fcntl is the duplicated file descriptor.
The fcntl function is useful in changing the access control flag of a file descriptor.
For example: after a file is opened for blocking read-write access and the process needs to change the access to
non-blocking and in write-append mode, it can call:
int cur_flags = fcntl(fdesc, F_GETFL);
int rc = fcntl(fdesc, F_SETFL, cur_flags | O_APPEND | O_NONBLOCK);
The following statements change the standard input of a process to a file called FOO:
int fdesc = open("FOO", O_RDONLY);      /* open FOO for read */
close(0);                               /* close standard input */
if (fcntl(fdesc, F_DUPFD, 0) == -1)     /* on success, stdin is from FOO now */
    perror("fcntl");
char buf[256];
int rc = read(0, buf, 256);             /* read data from FOO */
The dup and dup2 functions in UNIX perform the same file duplication function as fcntl.
They can be implemented using fcntl as:
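One possible sketch of that idea is given below. The names my_dup and my_dup2 are hypothetical, chosen so as not to clash with the real functions, and the second one is deliberately simplified: unlike the real dup2 it closes and duplicates in two separate steps, so it is not atomic.

```c
#include <fcntl.h>
#include <unistd.h>

/* Sketch of dup: return the lowest unused descriptor (>= 0)
 * that refers to the same open file as fdesc. */
int my_dup(int fdesc)
{
    return fcntl(fdesc, F_DUPFD, 0);
}

/* Sketch of dup2: make fd2 refer to the same open file as fdesc.
 * Assumption: no other thread opens a file between close and fcntl. */
int my_dup2(int fdesc, int fd2)
{
    if (fcntl(fdesc, F_GETFD) == -1)   /* fdesc must be a valid descriptor */
        return -1;
    if (fdesc == fd2)                  /* nothing to do */
        return fd2;
    close(fd2);                        /* error ignored: fd2 may not be open */
    return fcntl(fdesc, F_DUPFD, fd2); /* lowest free descriptor >= fd2 */
}
```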
lseek
The lseek function is used to change the file offset to a different value.
Thus lseek allows a process to perform random access of data on any opened file.
The prototype of lseek is
#include <sys/types.h>
#include <unistd.h>
off_t lseek(int fdesc, off_t pos, int whence);
On success it returns new file offset, and –1 on error.
The first argument, fdesc, is an integer file descriptor that refers to an opened file.
The second argument pos, specifies a byte offset to be added to a reference location in deriving the new file
offset value.
The third argument whence, is the reference location.
Whence value   Reference location
SEEK_CUR       Current file pointer address
SEEK_SET       The beginning of a file
SEEK_END       The end of a file
They are defined in the <unistd.h> header.
If an lseek call results in a new file offset that is beyond the current end-of-file, the call itself succeeds and
what happens next depends on how the file was opened:
o If the file is opened for read-only, the file is not extended and a subsequent read returns 0 (end of file).
o If the file is opened for write access, a subsequent write at the new offset extends the file.
o The data between the old end-of-file and the new file offset then reads back as NULL
characters.
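As a small sketch of random access with lseek, the helper below uses SEEK_END to find the size of an open file and SEEK_SET to restore the original offset. The name file_size is a hypothetical choice for this illustration.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: return the size of an open file in bytes,
 * leaving the current file offset unchanged. Returns -1 on error
 * (for example, if fdesc refers to a pipe, which cannot seek). */
off_t file_size(int fdesc)
{
    off_t cur = lseek(fdesc, 0, SEEK_CUR);          /* remember where we are */
    if (cur == (off_t)-1)
        return -1;
    off_t end = lseek(fdesc, 0, SEEK_END);          /* offset of end-of-file */
    if (end == (off_t)-1)
        return -1;
    if (lseek(fdesc, cur, SEEK_SET) == (off_t)-1)   /* go back */
        return -1;
    return end;
}
```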
link
The link function creates a new link for the existing file.
The prototype of the link function is
#include <unistd.h>
int link(const char *cur_link, const char *new_link);
If successful, the link function returns 0.
If unsuccessful, link returns –1.
The first argument cur_link, is the pathname of existing file.
The second argument new_link is a new pathname to be assigned to the same file.
If this call succeeds, the hard link count will be increased by 1.
The UNIX ln command is implemented using the link API.
unlink
The unlink function deletes a link of an existing file.
This function decreases the hard link count attributes of the named file, and removes the file name entry of
the link from directory file.
A file is removed from the file system when its hard link count is zero and no process has any file descriptor
referencing that file.
The prototype of unlink is
#include <unistd.h>
int unlink(const char * cur_link);
If successful, the unlink function returns 0.
If unsuccessful, unlink returns –1.
The argument cur_link is a path name that references an existing file.
ANSI C defines the rename function, which gives an existing file a new pathname and removes the old one, much like a
link followed by an unlink.
The prototype of the rename function is:
#include<stdio.h>
int rename(const char * old_path_name,const char * new_path_name);
The UNIX mv command can be implemented using the link and unlink APIs as shown:
#include <iostream>
#include <unistd.h>
#include <string.h>
int main(int argc, char *argv[])
{
    /* reject a wrong argument count or identical source and target names */
    if (argc != 3 || !strcmp(argv[1], argv[2]))
        std::cerr << "usage: " << argv[0] << " <old_link> <new_link>\n";
    else if (link(argv[1], argv[2]) == 0)
        return unlink(argv[1]);
    return 1;
}
stat, fstat
The stat and fstat functions retrieve the file attributes of a given file.
The only difference between stat and fstat is that the first argument of stat is a file pathname, whereas the
first argument of fstat is a file descriptor.
The prototypes of these functions are
#include <sys/stat.h>
#include <unistd.h>
int stat(const char *path_name, struct stat *statv);
int fstat(int fdesc, struct stat *statv);
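A minimal sketch of how the attributes might be used: the first helper classifies a file from the st_mode field that stat fills in, and the second reads the size through an open descriptor with fstat. Both helper names are hypothetical. Because stat follows symbolic links, this sketch never reports a symbolic link; lstat would be needed for that.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: returns 'r' regular, 'd' directory, 'c' character
 * device, 'b' block device, 'p' FIFO, '?' for anything else or on error. */
char file_type_char(const char *path_name)
{
    struct stat statv;

    if (stat(path_name, &statv) == -1)
        return '?';
    if (S_ISREG(statv.st_mode))  return 'r';
    if (S_ISDIR(statv.st_mode))  return 'd';
    if (S_ISCHR(statv.st_mode))  return 'c';
    if (S_ISBLK(statv.st_mode))  return 'b';
    if (S_ISFIFO(statv.st_mode)) return 'p';
    return '?';
}

/* Hypothetical helper: size in bytes of an already opened file, or -1. */
off_t size_of_open_file(int fdesc)
{
    struct stat statv;

    return fstat(fdesc, &statv) == -1 ? (off_t)-1 : statv.st_size;
}
```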
access
The access system call checks the existence and access permission of user to a named file.
The prototype of access function is:
#include<unistd.h>
int access(const char *path_name, int flag);
On success access returns 0, on failure it returns –1.
The first argument is the pathname of a file.
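The second argument, flag, contains one or more of the following bit flags.
Mode   Description
F_OK   Checks whether the named file exists
R_OK   Checks whether the calling process has read permission
W_OK   Checks whether the calling process has write permission
X_OK   Checks whether the calling process has execute permission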
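As a brief sketch of combining the bit flags, the helper below ORs R_OK and W_OK (standard constants from <unistd.h>) into one call. The helper name can_read_write is hypothetical. Note that access performs its check with the real user and group IDs of the process.

```c
#include <unistd.h>

/* Hypothetical helper: returns 1 if the calling process may both read and
 * write path_name, 0 otherwise (including when the file does not exist). */
int can_read_write(const char *path_name)
{
    /* Flags are ORed together; F_OK alone would test existence only. */
    return access(path_name, R_OK | W_OK) == 0;
}
```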
if (UID == (uid_t)-1)
    cerr << "Invalid user name";
else
    for (int i = 2; i < argc; i++)
        if (stat(argv[i], &statv) == 0)
        {
            if (chown(argv[i], UID, statv.st_gid))
                perror("chown");
        }
        else
            perror("stat");
return 0;
}
utime Function
The utime function modifies the access time and the modification time stamps of a file.
The prototype of utime function is
#include<sys/types.h>
#include<unistd.h>
#include<utime.h>
F_SETLK    Sets a file lock; does not block if this cannot succeed immediately.
F_SETLKW   Sets a file lock and blocks the process until the lock is acquired.
F_GETLK    Queries as to which process locked a specified region of a file.
For file locking purposes, the third argument to fcntl is the address of a struct flock type variable.
This variable specifies a region of a file where a lock is to be set, unset or queried.
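struct flock
{
    short l_type;    /* type of lock: F_RDLCK, F_WRLCK or F_UNLCK */
    short l_whence;  /* reference address for l_start */
    off_t l_start;   /* offset from the l_whence reference address */
    off_t l_len;     /* number of bytes in the locked region */
    pid_t l_pid;     /* process that holds the lock (set by F_GETLK) */
};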
The l_whence, l_start & l_len define a region of a file to be locked or unlocked.
The possible values of l_whence and their uses are
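Example Program
#include <unistd.h>
#include <fcntl.h>
int main()
{
    int fd;
    struct flock lock;
    fd = open("divya", O_RDONLY);
    lock.l_type = F_RDLCK;      /* read (shared) lock */
    lock.l_whence = SEEK_SET;   /* l_start is relative to the beginning of the file */
    lock.l_start = 10;
    lock.l_len = 15;
    fcntl(fd, F_SETLK, &lock);
    return 0;
}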
}
else
{
    fd = open(argv[1], O_WRONLY);
    write(fd, argv[2], strlen(argv[2]));
}
close(fd);
}
/* filebase.h */
#ifndef FILEBASE_H
#define FILEBASE_H
#include <fstream.h>
#include <iostream.h>
#include <sys/types.h>
#include <string.h>
#include <unistd.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <utime.h>
typedef enum
{
    REG_FILE = 'r', DIR_FILE = 'd', CHAR_FILE = 'c', PIPE_FILE = 'p',
    SYM_FILE = 's', BLK_FILE = 'b', UNKNOWN_FILE = '?'
}
FILE_TYPE_ENUM;
class filebase : public fstream
{
    protected:
        char *filename;
        friend ostream& operator<<(ostream& os, filebase& fobj)
            { return os; };
    public:
        filebase() { filename = 0; };
        filebase(const char *fn, int flags, int prot = filebuf::openprot)
            : fstream(fn, flags, prot)
        {
            filename = new char[strlen(fn) + 1];
            strcpy(filename, fn);
        };
        virtual ~filebase() { delete [] filename; };
        virtual int create(const char *fn, mode_t mode)
        {
            return ::creat(fn, mode);
        };
        int fileno()
        {
            return rdbuf()->fd();
        };
        int chmod(mode_t mode)
        {
            return ::chmod(filename, mode);
        };
        int chown(uid_t uid, gid_t gid)
        {
            return ::chown(filename, uid, gid);
        };
        int link(const char *new_link)
        {
            return ::link(filename, new_link);
        };
        int utime(const struct utimbuf *timbuf_ptr)
        {
            return ::utime(filename, timbuf_ptr);
        };
        virtual int remove()
#include "filebase.h"
#include "dirfile.h"
#include "symfile.h"
void show_list(ostream &ofs, const char *fname, int deep);
extern void long_list(ostream &ofs, char *fn);
void show_dir(ostream &ofs, const char *fname)
{
    dirfile dirobj(fname);
    char buf[256];
    ofs << "Directory :" << fname;
    while (dirobj.read(buf, 256))
    {
        filebase fobj(buf, ios::in, 0755);
        if (fobj.file_type() == DIR_FILE)
Summary
The inheritance hierarchy of all the file classes defined in the chapter is:
UNIX PROCESSES
INTRODUCTION
main FUNCTION
A C program starts execution with a function called main. The prototype for the main function is
int main(int argc, char *argv[]);
where argc is the number of command-line arguments, and argv is an array of pointers to the arguments.
When a C program is executed by the kernel by one of the exec functions, a special start-up routine is called before the
main function is called. The executable program file specifies this routine as the starting address for the
program; this is set up by the link editor when it is invoked by the C compiler. This start-up routine takes values from
the kernel, the command-line arguments and the environment and sets things up so that the main function is called.
PROCESS TERMINATION
There are eight ways for a process to terminate. Normal termination occurs in five ways:
Return from main
Calling exit
Calling _exit or _Exit
Return of the last thread from its start routine
Calling pthread_exit from the last thread
Abnormal termination occurs in three ways:
Calling abort
Receipt of a signal
Response of the last thread to a cancellation request
Exit Functions
Three functions terminate a program normally: _exit and _Exit, which return to the kernel immediately, and
exit, which performs certain cleanup processing and then returns to the kernel.
#include <stdlib.h>
void exit(int status);
void _Exit(int status);
#include <unistd.h>
void _exit(int status);
All three exit functions expect a single integer argument, called the exit status. Returning an integer value from the
main function is equivalent to calling exit with the same value. Thus
exit(0);
is the same as
return(0);
from the main function.
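#include "apue.h"
static void my_exit1(void);
static void my_exit2(void);
int main(void)
{
    if (atexit(my_exit2) != 0)
        err_sys("can't register my_exit2");
    if (atexit(my_exit1) != 0)
        err_sys("can't register my_exit1");
    if (atexit(my_exit1) != 0)      /* registered twice, so called twice */
        err_sys("can't register my_exit1");
    printf("main is done\n");
    return(0);
}
static void
my_exit1(void)
{
    printf("first exit handler\n");
}
static void
my_exit2(void)
{
    printf("second exit handler\n");
}
Output:
$ ./a.out
main is done
first exit handler
first exit handler
second exit handler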
The below figure summarizes how a C program is started and the various ways it can terminate.
COMMAND-LINE ARGUMENTS
When a program is executed, the process that does the exec can pass command-line arguments to the new
program.
Example: Echo all command-line arguments to standard output
#include "apue.h"
ENVIRONMENT LIST
Each program is also passed an environment list. Like the argument list, the environment list is an array of character
pointers, with each pointer containing the address of a null-terminated C string. The address of the array of pointers is
contained in the global variable environ:
extern char **environ;
Figure : Environment consisting of five C character strings
SHARED LIBRARIES
Nowadays most UNIX systems support shared libraries. Shared libraries remove the common library routines from the
executable file, instead maintaining a single copy of the library routine somewhere in memory that all processes
reference. This reduces the size of each executable file but may add some runtime overhead, either when the program
is first executed or the first time each shared library function is called. Another advantage of shared libraries is that
library functions can be replaced with new versions without having to re-link every program that uses the library.
Most cc compilers link against shared libraries by default; an option such as -static (in gcc) requests static linking instead.
MEMORY ALLOCATION
ISO C specifies three functions for memory allocation:
malloc, which allocates a specified number of bytes of memory. The initial value of the memory is
indeterminate.
calloc, which allocates space for a specified number of objects of a specified size. The space is initialized to all
0 bits.
realloc, which increases or decreases the size of a previously allocated area. When the size increases, it may
involve moving the previously allocated area somewhere else, to provide the additional room at the end. Also,
when the size increases, the initial value of the space between the old contents and the end of the new area is
indeterminate.
#include <stdlib.h>
void *malloc(size_t size);
void *calloc(size_t nobj, size_t size);
void *realloc(void *ptr, size_t newsize);
All three return: non-null pointer if OK, NULL on error
void free(void *ptr);
The pointer returned by the three allocation functions is guaranteed to be suitably aligned so that it can be used for any
data object. Because the three alloc functions return a generic void * pointer, if we #include
<stdlib.h> (to obtain the function prototypes), we do not explicitly have to cast the pointer returned by these functions
when we assign it to a pointer of a different type.
The function free causes the space pointed to by ptr to be deallocated. This freed space is usually put into a pool of
available memory and can be allocated in a later call to one of the three alloc functions.
The realloc function lets us increase or decrease the size of a previously allocated area. For example, if we allocate room
for 512 elements in an array that we fill in at runtime but find that we need room for more than 512 elements, we can
call realloc. If there is room beyond the end of the existing region for the requested space, then realloc doesn't have to
move anything; it simply allocates the additional area at the end and returns the same pointer that we passed it. But if
there isn't room at the end of the existing region, realloc allocates another area that is large enough, copies the existing
512-element array to the new area, frees the old area, and returns the pointer to the new area.
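Because the area may move, and because realloc returns NULL without freeing the old area when it fails, the result is usually assigned to a temporary pointer first. A minimal sketch of that pattern follows; grow_array is a hypothetical helper name.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: resize an int array to new_cap elements.
 * On success *arr points to the (possibly moved) array and 0 is returned.
 * On failure the original array is left intact and -1 is returned. */
int grow_array(int **arr, size_t new_cap)
{
    /* Reject a zero size and a byte count that would overflow size_t. */
    if (new_cap == 0 || new_cap > SIZE_MAX / sizeof **arr)
        return -1;

    /* Assign to a temporary first: overwriting *arr directly would lose
     * the only pointer to the old block if realloc returned NULL. */
    int *tmp = realloc(*arr, new_cap * sizeof **arr);
    if (tmp == NULL)
        return -1;
    *arr = tmp;     /* may or may not equal the old pointer */
    return 0;
}
```

Any other pointers into the old area must be treated as invalid after a successful call, since the contents may have been copied elsewhere.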
libmalloc
SVR4-based systems, such as Solaris, include the libmalloc library, which provides a set of interfaces matching the ISO
C memory allocation functions. The libmalloc library includes mallopt, a function that allows a process to set certain
variables that control the operation of the storage allocator. A function called mallinfo is also available to provide
statistics on the memory allocator.
vmalloc
Vo describes a memory allocator that allows processes to allocate memory using different techniques for different
regions of memory. In addition to the functions specific to vmalloc, the library also provides emulations of the ISO C
memory allocation functions.
quick-fit
Historically, the standard malloc algorithm used either a best-fit or a first-fit memory allocation strategy. Quick-fit is
faster than either, but tends to use more memory. Free implementations of malloc and free based on quick-fit are readily
available from several FTP sites.
alloca Function
The function alloca has the same calling sequence as malloc; however, instead of allocating memory from the heap, the
memory is allocated from the stack frame of the current function. The advantage is that we don't have to free the
space; it goes away automatically when the function returns. The alloca function increases the size of the stack frame.
The disadvantage is that some systems can't support alloca, if it's impossible to increase the size of the stack frame
after the function has been called.
ENVIRONMENT VARIABLES
The environment strings are usually of the form: name=value. The UNIX kernel never looks at these strings; their
interpretation is up to the various applications. The shells, for example, use numerous environment variables. Some,
such as HOME and USER, are set automatically at login, and others are for us to set. We normally set environment
variables in a shell start-up file to control the shell’s actions. The functions that we can use to set and fetch values
from the variables are setenv, putenv, and getenv functions. The prototype of these functions are
#include <stdlib.h>
char *getenv(const char *name);
Returns: pointer to value associated with name, NULL if not found.
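A short sketch of handling the NULL return: the helper below falls back to a default when a variable is not set. The name env_or_default is hypothetical.

```c
#include <stdlib.h>

/* Hypothetical helper: fetch an environment variable, falling back to a
 * default when it is not set. The returned pointer refers to storage owned
 * by the environment (or to fallback), so the caller must not modify or
 * free it. */
const char *env_or_default(const char *name, const char *fallback)
{
    const char *value = getenv(name);

    return value != NULL ? value : fallback;
}
```

For example, env_or_default("HOME", "/") yields the login directory when HOME is set and "/" otherwise.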
#define TOK_ADD 5
jmp_buf jmpbuffer;
int main(void)
{
char line[MAXLINE];
if (setjmp(jmpbuffer) != 0)
printf("error");
while (fgets(line, MAXLINE, stdin) != NULL)
    do_line(line);
exit(0);
}
...
void cmd_add(void)
{
int token;
token = get_token();
if (token < 0) /* an error has occurred */
longjmp(jmpbuffer, 1);
The setjmp function always returns ‘0’ on its success when it is called directly in a process (for the first time).
The longjmp function is called to transfer a program flow to a location that was stored in the env argument.
The program code marked by the env must be in a function that is among the callers of the current function.
When the process is jumping to the target function, all the stack space used in the current function and its
callers, up to the target function, is discarded by the longjmp function.
The process resumes execution by re-executing the setjmp statement in the target function that is marked by
env. The return value of the setjmp function is the value (val) specified in the longjmp function call.
The ‘val’ should be nonzero, so that it can be used to indicate where and why the longjmp function was
invoked and process can do error handling accordingly.
Note: After a longjmp, the values of automatic and register variables that were modified after the setjmp call are
indeterminate, but static and global variables keep the values they had when longjmp was called. Automatic variables
that we don't want rolled back after longjmp should be declared with the keyword 'volatile'.
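The points above can be put together in a small self-contained sketch. The functions worker and run_worker are hypothetical; the volatile counter shows a local variable whose latest value is still reliable after the jump.

```c
#include <setjmp.h>

static jmp_buf jmpbuffer;   /* marks the location to return to */

/* Nested worker: reports an error by jumping back to its caller. */
static void worker(int fail_code)
{
    if (fail_code != 0)
        longjmp(jmpbuffer, fail_code);   /* does not return */
}

/* Hypothetical driver: returns 0 if worker finished normally, 1 if it
 * jumped back with val == 1, and 2 for any other nonzero val.
 * *steps_done receives how far the normal path got. */
int run_worker(int fail_code, int *steps_done)
{
    volatile int steps = 0;     /* volatile: not rolled back by longjmp */

    switch (setjmp(jmpbuffer)) {
    case 0:                     /* direct call of setjmp */
        steps = 1;
        worker(fail_code);
        steps = 2;              /* reached only if worker returned */
        *steps_done = steps;
        return 0;
    case 1:                     /* came back through longjmp(..., 1) */
        *steps_done = steps;
        return 1;
    default:                    /* came back with some other nonzero val */
        *steps_done = steps;
        return 2;
    }
}
```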
int main(void)
{
#ifdef RLIMIT_AS
doit(RLIMIT_AS);
#endif
doit(RLIMIT_CORE);
doit(RLIMIT_CPU);
doit(RLIMIT_DATA);
doit(RLIMIT_FSIZE);
#ifdef RLIMIT_LOCKS
doit(RLIMIT_LOCKS);
#endif
#ifdef RLIMIT_MEMLOCK
doit(RLIMIT_MEMLOCK);
#endif
doit(RLIMIT_NOFILE);
#ifdef RLIMIT_NPROC
doit(RLIMIT_NPROC);
#endif
#ifdef RLIMIT_RSS
doit(RLIMIT_RSS);
#endif
#ifdef RLIMIT_SBSIZE
doit(RLIMIT_SBSIZE);
#endif
doit(RLIMIT_STACK);
#ifdef RLIMIT_VMEM
doit(RLIMIT_VMEM);
#endif
exit(0);
}
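The doit helper used above is not shown in these notes. One plausible sketch, assuming it is a macro that passes both the name and the value of each constant to a printing function (pr_limits is a hypothetical name here), is:

```c
#include <stdio.h>
#include <sys/resource.h>

/* Stringify the constant so its name can be printed next to its value. */
#define doit(name) pr_limits(#name, name)

/* Print the soft (current) and hard (maximum) limit of one resource.
 * Returns 0 on success, -1 if getrlimit fails. */
int pr_limits(const char *name, int resource)
{
    struct rlimit limit;

    if (getrlimit(resource, &limit) < 0)
        return -1;
    printf("%-14s  ", name);
    if (limit.rlim_cur == RLIM_INFINITY)
        printf("(infinite)  ");
    else
        printf("%10llu  ", (unsigned long long)limit.rlim_cur);
    if (limit.rlim_max == RLIM_INFINITY)
        printf("(infinite)");
    else
        printf("%10llu", (unsigned long long)limit.rlim_max);
    putchar('\n');
    return 0;
}
```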
All processes in a UNIX system, except the process that is created by the system boot code, are created by the fork
system call. After the fork system call, once the child process is created, both the parent and child processes resume
execution. When a process is created by fork, it contains duplicated copies of the text, data and stack segments of its
parent as shown in the Figure below. It also has a file descriptor table that references the same opened
files as the parent, such that they both share the same file pointer to each opened file.
UNIT 5
PROCESS CONTROL
INTRODUCTION
Process control is concerned about creation of new processes, program execution, and process termination.
PROCESS IDENTIFIERS
#include <unistd.h>
pid_t getpid(void);
Returns: process ID of calling process
pid_t getppid(void);
Returns: parent process ID of calling process
uid_t getuid(void);
Returns: real user ID of calling process
uid_t geteuid(void);
Returns: effective user ID of calling process
gid_t getgid(void);
Returns: real group ID of calling process
gid_t getegid(void);
Returns: effective group ID of calling process
fork FUNCTION
An existing process can create a new one by calling the fork function.
#include <unistd.h>
pid_t fork(void);
Returns: 0 in child, process ID of child in parent, -1 on error.
The new process created by fork is called the child process.
This function is called once but returns twice.
The only difference in the returns is that the return value in the child is 0, whereas the return value in the
parent is the process ID of the new child.
The reason the child's process ID is returned to the parent is that a process can have more than one child, and
there is no function that allows a process to obtain the process IDs of its children.
The reason fork returns 0 to the child is that a process can have only a single parent, and the child can always
call getppid to obtain the process ID of its parent. (Process ID 0 is reserved for use by the kernel, so it's not
possible for 0 to be the process ID of a child.)
Both the child and the parent continue executing with the instruction that follows the call to fork.
The child is a copy of the parent.
For example, the child gets a copy of the parent's data space, heap, and stack.
Note that this is a copy for the child; the parent and the child do not share these portions of memory.
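A minimal sketch of this copy behaviour: the child increments its own copy of a variable and the parent, after waiting for the child, still sees the original value. The function name parent_value_after_fork is hypothetical.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: returns the parent's value of var after the child has
 * modified its own copy, or -1 if fork or waitpid fails. */
int parent_value_after_fork(void)
{
    int var = 88;
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {                     /* child: fork returned 0 */
        var++;                          /* changes the child's copy only */
        _exit(var == 89 ? 0 : 1);
    }
    /* parent: fork returned the process ID of the child */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return var;                         /* unchanged in the parent */
}
```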
Example programs:
Program 1
/* Program to demonstrate fork function. Program name: fork1.c */
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
int main( )
{
    fork( );
    printf("\n hello USP");
    return 0;
}
Output :
$ cc fork1.c
$ ./a.out
 hello USP
 hello USP
Note : The statement hello USP is executed twice as both the child and parent have executed that instruction.
Program 2
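/* Program name: fork2.c */
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
int main( )
{
    printf("\n 6 sem ");
    fflush(stdout);     /* flush before fork so the child does not inherit buffered text */
    fork( );
    printf("\n hello USP");
    return 0;
}
Output :
$ cc fork2.c
$ ./a.out
 6 sem
 hello USP
 hello USP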
Note: The statement 6 sem is executed only once by the parent because it is called before fork and statement hello
USP is executed twice by child and parent. [Also refer lab program 3.sh]
File Sharing
Consider a process that has three different files opened for standard input, standard output, and standard error. On return
from fork, we have the arrangement shown in Figure 8.2.
Figure 8.2 Sharing of open files between parent and child after fork
It is important that the parent and the child share the same file offset.
Consider a process that forks a child, then waits for the child to complete.
Assume that both processes write to standard output as part of their normal processing.
If the parent has its standard output redirected (by a shell, perhaps) it is essential that the parent's file
offset be updated by the child when the child writes to standard output.
In this case, the child can write to standard output while the parent is waiting for it; on completion of the child,
the parent can continue writing to standard output, knowing that its output will be appended to whatever the
child wrote.
If the parent and the child did not share the same file offset, this type of interaction would be more difficult to
accomplish and would require explicit actions by the parent.
There are two normal cases for handling the descriptors after a fork.
The parent waits for the child to complete. In this case, the parent does not need to do anything with its
descriptors. When the child terminates, any of the shared descriptors that the child read from or wrote to will
have their file offsets updated accordingly.
Both the parent and the child go their own ways. Here, after the fork, the parent closes the descriptors that it
doesn't need, and the child does the same thing. This way, neither interferes with the other's open descriptors.
This scenario is often the case with network servers.
There are numerous other properties of the parent that are inherited by the child:
o Real user ID, real group ID, effective user ID, effective group ID
o Supplementary group IDs
o Process group ID
o Session ID
vfork FUNCTION
The function vfork has the same calling sequence and same return values as fork.
The vfork function is intended to create a new process when the purpose of the new process is to exec a new
program.
The vfork function creates the new process, just like fork, without copying the address space of the parent into
the child, as the child won't reference that address space; the child simply calls exec (or exit) right after the
vfork.
Instead, while the child is running and until it calls either exec or exit, the child runs in the address space of the
parent. This optimization provides an efficiency gain on some paged virtual-memory implementations of the
UNIX System.
Example of vfork function
#include "apue.h"
int glob = 6; /* external variable in initialized data */
int main(void)
{
    int var; /* automatic variable on the stack */
    pid_t pid;
    var = 88;
    printf("before vfork\n"); /* we don't flush stdio */
    if ((pid = vfork()) < 0) {
        err_sys("vfork error");
    } else if (pid == 0) { /* child */
        glob++; /* modify parent's variables */
        var++;
        _exit(0); /* child terminates */
    }
    /*
     * Parent continues here.
     */
    printf("pid = %d, glob = %d, var = %d\n", getpid(), glob, var);
    exit(0);
}
Output:
$ ./a.out
before vfork
pid = 29039, glob = 7, var = 89
exit FUNCTIONS
A process can terminate normally in five ways:
Executing a return from the main function.
Calling the exit function.
Calling the _exit or _Exit function.
In most UNIX system implementations, exit(3) is a function in the standard C library, whereas _exit(2) is a system call.
Executing a return from the start routine of the last thread in the process. When the last thread returns from its
start routine, the process exits with a termination status of 0.
Calling the pthread_exit function from the last thread in the process.
The three forms of abnormal termination are as follows:
Calling abort. This is a special case of the next item, as it generates the SIGABRT signal.
When the process receives certain signals. Examples of signals generated by the kernel include the process
referencing a memory location not within its address space or trying to divide by 0.
The last thread responds to a cancellation request. By default, cancellation occurs in a deferred manner: one
thread requests that another be canceled, and sometime later, the target thread terminates.
int main(void)
{
pid_t pid;
int status;
exit(0);
}
Macro                  Description
WIFEXITED(status)      True if status was returned for a child that terminated normally. In this
                       case, we can execute
                           WEXITSTATUS(status)
                       to fetch the low-order 8 bits of the argument that the child passed to
                       exit, _exit, or _Exit.
WIFSIGNALED(status)    True if status was returned for a child that terminated abnormally, by
                       receipt of a signal that it didn't catch. In this case, we can execute
                           WTERMSIG(status)
                       to fetch the signal number that caused the termination.
                       Additionally, some implementations (but not the Single UNIX Specification)
                       define the macro
                           WCOREDUMP(status)
                       that returns true if a core file of the terminated process was generated.
WIFSTOPPED(status)     True if status was returned for a child that is currently stopped. In this
                       case, we can execute
                           WSTOPSIG(status)
                       to fetch the signal number that caused the child to stop.
WIFCONTINUED(status)   True if status was returned for a child that has been continued after a job
                       control stop.
The waitpid function provides three features that aren't provided by the wait function.
The waitpid function lets us wait for one particular process, whereas the wait function returns the status of any
terminated child. We'll return to this feature when we discuss the popen function.
The waitpid function provides a nonblocking version of wait. There are times when we want to fetch a
child's status, but we don't want to block.
The waitpid function provides support for job control with the WUNTRACED and WCONTINUED options.
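The first two features can be illustrated with a minimal sketch. The function name poll_then_wait and the pipe used to keep the child alive are assumptions made for this illustration; the WNOHANG option itself is the standard nonblocking flag of waitpid.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical demo: the child blocks on a pipe until the parent closes it,
 * so the first waitpid(..., WNOHANG) has nothing to report and returns 0.
 * Stores the result of that first poll in *first_poll and returns the
 * child's exit status, or -1 on error.
 */
int poll_then_wait(int *first_poll)
{
    int fd[2], status;
    char c;
    pid_t pid;

    if (pipe(fd) < 0)
        return -1;
    if ((pid = fork()) < 0)
        return -1;
    if (pid == 0) {                        /* child */
        close(fd[1]);
        if (read(fd[0], &c, 1) < 0)        /* returns 0 (EOF) once the parent closes fd[1] */
            _exit(1);
        _exit(3);
    }
    close(fd[0]);

    /* The child is still running: waitpid returns 0 at once instead of blocking. */
    *first_poll = (int)waitpid(pid, &status, WNOHANG);

    close(fd[1]);                          /* release the child */
    if (waitpid(pid, &status, 0) != pid)   /* now block for this particular child */
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Between the two waitpid calls the parent is free to do other work, which is the point of the nonblocking form.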
Program to avoid zombie processes by calling fork twice
#include "apue.h"
#include <sys/wait.h>
int main(void)
{
pid_t pid;
/*
* We're the parent (the original process); we continue executing,
* knowing that we're not the parent of the second child.
*/
exit(0);
}
Output:
$ ./a.out
$ second child, parent pid = 1
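Because the listing above is abbreviated, the following is a self-contained sketch of the same idea that does not rely on apue.h. The function name fork_twice and the pipe used to report the second child's new parent are assumptions for this illustration, not part of the original program. On many systems the new parent is PID 1 (init), but it may be a different reaper process, so the sketch reports whatever getppid returns.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork twice so that the second child is not a child
 * of the caller. Stores the first child's PID in *first_child and the
 * parent PID seen by the second child, once orphaned, in *new_parent.
 * Returns 0 on success, -1 on error.
 */
int fork_twice(pid_t *first_child, pid_t *new_parent)
{
    int fd[2];
    ssize_t n;
    pid_t pid;

    if (pipe(fd) < 0)
        return -1;
    if ((pid = fork()) < 0)
        return -1;
    if (pid == 0) {                             /* first child */
        pid_t me = getpid();
        struct timespec ts = { 0, 1000000L };   /* 1 ms */

        close(fd[0]);
        if ((pid = fork()) < 0)
            _exit(1);
        if (pid > 0)
            _exit(0);                           /* first child exits at once */

        /* Second child: wait until we have been handed to a new parent. */
        while (getppid() == me)
            nanosleep(&ts, NULL);
        pid = getppid();
        if (write(fd[1], &pid, sizeof pid) != (ssize_t)sizeof pid)
            _exit(1);
        _exit(0);
    }

    close(fd[1]);
    *first_child = pid;
    if (waitpid(pid, NULL, 0) != pid)           /* reap the first child: no zombie */
        return -1;

    /* Only for reporting: we never have to wait for the second child itself. */
    n = read(fd[0], new_parent, sizeof *new_parent);
    close(fd[0]);
    return n == (ssize_t)sizeof *new_parent ? 0 : -1;
}
```

The original process reaps only the first child, which exits immediately; the second child is inherited by another process that reaps it when it finishes, so it never lingers as a zombie of the caller.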
waitid FUNCTION
The waitid function is similar to waitpid, but provides extra flexibility.
The options argument is a bitwise OR of the flags as shown below: these flags indicate which state changes the caller
is interested in.
Constant      Description
WCONTINUED    Wait for a process that has previously stopped and has been continued, and whose status has not yet
              been reported.
WEXITED       Wait for processes that have exited.
WNOHANG       Return immediately instead of blocking if there is no child exit status available.
WNOWAIT       Don't destroy the child exit status. The child's exit status can be retrieved by a subsequent call to
              wait, waitid, or waitpid.
WSTOPPED      Wait for a process that has stopped and whose status has not yet been reported.
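As a rough illustration of the WEXITED and WNOWAIT flags, the sketch below collects the same child twice. The function name waitid_demo is hypothetical; the P_PID selector and the siginfo_t fields si_code and si_status are taken from the POSIX definition of waitid, which these notes do not reproduce.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that terminates with _exit(code), then
 * collect it with waitid. *peeked gets si_status from a WNOWAIT call that
 * leaves the child waitable; *reaped gets si_status from the final call.
 * Returns the si_code of the final call, or -1 on error.
 */
int waitid_demo(int code, int *peeked, int *reaped)
{
    siginfo_t info;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);                /* child */

    /* First look: WNOWAIT keeps the exit status available for a later call. */
    memset(&info, 0, sizeof info);
    if (waitid(P_PID, (id_t)pid, &info, WEXITED | WNOWAIT) < 0)
        return -1;
    *peeked = info.si_status;

    /* Second call without WNOWAIT actually reaps the child. */
    memset(&info, 0, sizeof info);
    if (waitid(P_PID, (id_t)pid, &info, WEXITED) < 0)
        return -1;
    *reaped = info.si_status;
    return info.si_code;            /* CLD_EXITED for a normal exit */
}
```

Both calls see the same exit status, which shows that WNOWAIT only inspects the child's state without consuming it.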
RACE CONDITIONS
A race condition occurs when multiple processes are trying to do something with shared data and the final outcome
depends on the order in which the processes run.
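#include "apue.h"
static void charatatime(char *);
int main(void)
{
    pid_t pid;
    TELL_WAIT();                /* set up parent-child synchronization */
    if ((pid = fork()) < 0) {
        err_sys("fork error");
    } else if (pid == 0) {
        WAIT_PARENT();          /* parent goes first */
        charatatime("output from child\n");
    } else {
        charatatime("output from parent\n");
        TELL_CHILD(pid);        /* let the child proceed */
    }
    exit(0);
}
static void charatatime(char *str)
{
    char *ptr;
    int c;
    setbuf(stdout, NULL);       /* set unbuffered */
    for (ptr = str; (c = *ptr++) != 0; )
        putc(c, stdout);
}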
When we run this program, the output is as we expect; there is no intermixing of output from the two processes.
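The program above depends on the TELL_WAIT, WAIT_PARENT and TELL_CHILD routines from apue.h. As a self-contained sketch of the same ordering, assuming a pipe is used for the synchronization, the parent can write one byte after it has finished its output and the child can block on reading that byte. The names charatatime_fd and ordered_output are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Write a string one character at a time to the given descriptor. */
static void charatatime_fd(int fd, const char *str)
{
    for (; *str != '\0'; str++)
        if (write(fd, str, 1) != 1)
            return;
}

/*
 * Hypothetical demo: parent and child both write to out_fd, and a pipe
 * forces the parent to go first. Returns 0 on success, -1 on error.
 */
int ordered_output(int out_fd)
{
    int sync_fd[2];
    char token = 'p';
    pid_t pid;

    if (pipe(sync_fd) < 0)                      /* plays the role of TELL_WAIT */
        return -1;
    if ((pid = fork()) < 0)
        return -1;
    if (pid == 0) {                             /* child */
        close(sync_fd[1]);
        if (read(sync_fd[0], &token, 1) != 1)   /* plays the role of WAIT_PARENT */
            _exit(1);
        charatatime_fd(out_fd, "output from child\n");
        _exit(0);
    }
    close(sync_fd[0]);
    charatatime_fd(out_fd, "output from parent\n");
    if (write(sync_fd[1], &token, 1) != 1)      /* plays the role of TELL_CHILD */
        return -1;
    close(sync_fd[1]);
    if (waitpid(pid, NULL, 0) != pid)
        return -1;
    return 0;
}
```

Calling ordered_output(STDOUT_FILENO) prints the parent's line in full before the child's line, because the child cannot start writing until the parent has sent the token.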
exec FUNCTIONS
When a process calls one of the exec functions, that process is completely replaced by the new program, and the new
program starts executing at its main function. The process ID does not change across an exec, because a new process is
not created; exec merely replaces the current process - its text, data, heap, and stack segments - with a brand new
program from disk.
There are 6 exec functions:
#include <unistd.h>
int execl(const char *pathname, const char *arg0,... /* (char *)0 */ );
int execv(const char *pathname, char *const argv []);
int execle(const char *pathname, const char *arg0, ... /* (char *)0, char *const envp[] */ );
int execve(const char *pathname, char *const argv[], char *const envp[]);
int execlp(const char *filename, const char *arg0, ... /* (char *)0 */ );
int execvp(const char *filename, char *const argv []);
All six return: -1 on error, no return on success.
The first difference in these functions is that the first four take a pathname argument, whereas the last two take
a filename argument. When a filename argument is specified:
If filename contains a slash, it is taken as a pathname.
Otherwise, the executable file is searched for in the directories specified by the PATH environment
variable.
The next difference concerns the passing of the argument list (l stands for list and v stands for vector). The
functions execl, execlp, and execle require each of the command-line arguments to the new program to be
specified as separate arguments. For the other three functions (execv, execvp, and execve), we have to build an
array of pointers to the arguments, and the address of this array is the argument to these three functions.
The final difference is the passing of the environment list to the new program. The two functions whose
names end in an e (execle and execve) allow us to pass a pointer to an array of pointers to the environment
strings. The other four functions, however, use the environ variable in the calling process to copy the existing
environment for the new program.
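The three differences described above can be compared side by side in a small sketch. It assumes that a POSIX shell is installed at /bin/sh and that sh can be found through PATH; the function name exec_variant and the GREETING variable are hypothetical and exist only for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child, replace it with /bin/sh using the
 * exec variant selected by `variant`, and return the shell's exit status
 * (or -1 on error).
 *   'l' execl:  pathname, arguments as a list
 *   'v' execv:  pathname, arguments as a vector
 *   'p' execvp: filename searched for in PATH, arguments as a vector
 *   'e' execle: pathname, argument list, explicit environment
 */
int exec_variant(char variant)
{
    char *argv[] = { "sh", "-c", "exit 7", NULL };
    char *envp[] = { "GREETING=hello", NULL };
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {                 /* child: same PID before and after exec */
        switch (variant) {
        case 'l':
            execl("/bin/sh", "sh", "-c", "exit 7", (char *)0);
            break;
        case 'v':
            execv("/bin/sh", argv);
            break;
        case 'p':
            execvp("sh", argv);
            break;
        case 'e':
            /* The shell exits with 0 only if it sees GREETING from envp. */
            execle("/bin/sh", "sh", "-c", "test \"$GREETING\" = hello",
                   (char *)0, envp);
            break;
        }
        _exit(127);                 /* reached only if exec failed */
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The 'l', 'v' and 'p' variants all run the same command and return 7; the 'e' variant returns 0 because the new program receives only the environment passed in envp.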
int main(void)
{
pid_t pid;
exit(0);
}
Output:
$ ./a.out
argv[0]: echoall
argv[1]: myarg1
argv[2]: MY ARG2
USER=unknown