DumbGeeks: Fork And exec

The use of fork and exec exemplifies the spirit of
UNIX in that it provides a very simple way to start new
processes.
The fork call makes a duplicate of the current
process, identical in almost every way. Not everything
is carried over (pending signals and file locks, for
example, are not inherited) but the idea is to create
as close a copy as possible.
The new process (child) gets a different process ID
(PID) and has the PID of the old process (parent)
as its parent PID (PPID). Because the two processes
are now running exactly the same code, they can tell
which is which by the return code of fork: the child
gets 0, the parent gets the PID of the child. This is all,
of course, assuming the fork call works. If it fails, no
child is created and the parent gets an error code.
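As a rough sketch of the return-value check described above, here it is in Python, whose os.fork wraps the system call directly. The function name fork_and_report is made up for this example, and in Python a failed fork shows up as an OSError rather than a negative return value.

```python
import os
import sys

def fork_and_report():
    """Fork once; in the parent, return (child_pid, child_exit_code)."""
    sys.stdout.flush()  # otherwise buffered output could be written twice
    try:
        pid = os.fork()
    except OSError as err:
        # fork failed: no child exists, and only the parent sees the error
        print(f"fork failed: {err}", file=sys.stderr)
        raise
    if pid == 0:
        # Child: fork returned 0, and our PPID is the original process
        print(f"child:  pid={os.getpid()} ppid={os.getppid()}")
        sys.stdout.flush()
        os._exit(0)  # exit at once, without running inherited cleanup handlers
    # Parent: fork returned the PID of the new child
    print(f"parent: pid={os.getpid()} child={pid}")
    _, status = os.waitpid(pid, 0)
    return pid, os.WEXITSTATUS(status)
```

Calling fork_and_report() prints one line from each process. The order of the two lines is not guaranteed, since both processes run independently after the fork.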
The exec call replaces the entire current process
image with a new program. It loads the program into
the current process space and runs it from the entry
point. The PID does not change.
So, fork and exec are often used in sequence to get
a new program running as a child of a current
process. Shells typically do this whenever you try to
run a program like find - the shell forks, then the
child loads the find program into memory, setting up
all command line arguments, standard I/O and so
forth.
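A minimal sketch of what a shell does for each command might look like this. The names run and stdout_path are invented for the example; real shells do a good deal more (job control, signal handling, pipelines). The exit codes 127 and 128+signal follow the usual shell convention.

```python
import os
import sys

def run(argv, stdout_path=None):
    """Run argv as a child process, shell-style; return its exit code."""
    sys.stdout.flush()
    pid = os.fork()
    if pid == 0:
        # Child: set up standard I/O, then become the new program
        try:
            if stdout_path is not None:
                fd = os.open(stdout_path,
                             os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
                os.dup2(fd, 1)  # file descriptor 1 is standard output
                os.close(fd)
            os.execvp(argv[0], argv)  # on success this never returns
        except OSError as err:
            print(f"{argv[0]}: {err}", file=sys.stderr)
        os._exit(127)  # only reached if the setup or the exec failed
    # Parent: wait for the child, as a shell does for a foreground command
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return 128 + os.WTERMSIG(status)  # child was killed by a signal
```

For example, run(["find", ".", "-name", "*.txt"]) runs find as a child and returns its exit code once it finishes.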
But they're not required to be used together. It's
perfectly acceptable for a program to fork itself
without exec-ing if, for example, the program contains
both parent and child code (be careful what you do in
the child, as each implementation may have
restrictions). This was used quite a lot (and still is)
for daemons which simply listen on a TCP port and
fork a copy of themselves to process a specific
request while the parent goes back to listening.
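The listening-daemon pattern can be sketched roughly as follows. Everything here is illustrative: handle, serve and the max_requests limit are hypothetical names, and the "request" is just upper-casing one message. A real daemon would loop forever and reap its children as they exit instead of at the end.

```python
import os
import socket

def handle(conn):
    """Child-side work: read one message and send it back in upper case."""
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())

def serve(listener, max_requests):
    """Accept max_requests connections, forking one child per connection."""
    children = []
    for _ in range(max_requests):
        conn, _addr = listener.accept()
        pid = os.fork()
        if pid == 0:
            # Child: it inherited both sockets but only needs the connection
            listener.close()
            try:
                handle(conn)
            finally:
                os._exit(0)
        # Parent: the child owns the request now, so go back to listening
        conn.close()
        children.append(pid)
    # Reap the children so none are left behind as zombies
    for pid in children:
        os.waitpid(pid, 0)
```

Note that there is no exec anywhere: the child runs code that was already part of the parent program.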
Similarly, programs that know they're finished and
just want to run another program don't need to fork,
exec and then wait for the child. They can just load
the new program directly into their own process space
with exec.
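A small sketch of exec without fork, assuming a wrapper-style program that does some setup and then hands over to another program. The name hand_over is invented for this example.

```python
import os
import sys

def hand_over(argv):
    """Replace the current process with argv; does not return on success."""
    sys.stdout.flush()  # unflushed output would vanish with the old program
    os.execvp(argv[0], argv)
    # Not reached: if the exec fails, execvp raises OSError instead
```

A wrapper script could end with hand_over(["ls", "-l"]). Nothing after that call runs, because the process is now ls, and whoever was waiting on the wrapper simply sees the exit code of ls.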
Some UNIX implementations have an optimized fork
which uses what they call copy-on-write. This is a
trick to delay the copying of the process space in
fork until one of the processes attempts to change
something in that space. This is useful for programs
using only fork and not exec, in that they don't have
to copy an entire process space up front.
If exec is called following fork (and this is what
happens mostly), the saving is even greater: the child
discards its view of the process space when the new
program is loaded, so most of that space is never
copied at all.
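Copy-on-write itself is invisible to the program. What can be observed is the guarantee it preserves: a write made by the child after fork never shows up in the parent. The sketch below (the function name is made up) has the child change a value and report it back through a pipe.

```python
import os

def child_write_is_private():
    """Return (parent_view, child_view) of a value the child changes after fork."""
    data = [0] * 1000
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:
        os.close(r)
        data[0] = 42  # this write lands in the child's copy only
        os.write(w, str(data[0]).encode())
        os._exit(0)
    os.close(w)
    child_view = int(os.read(r, 16).decode())
    os.close(r)
    os.waitpid(pid, 0)
    return data[0], child_view  # the parent still sees its original value
```

The parent's value stays 0 while the child reports 42. Whether the kernel copied the whole process space at fork time or only the pages touched later makes no difference to this result, only to how much work the fork costs.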
Note that there is a whole family of exec calls
(execl, execle, execve and so on) but exec in
context here means any of them.
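To give a feel for how the family members differ, here is a sketch using Python's wrappers of the same names. The letters in each name describe how arguments are passed: "l" takes them one by one, "v" takes a list, "p" searches PATH, and "e" takes an explicit environment. The helpers in_child and exec_family_demo are made up for this example, which also assumes /bin/sh exists.

```python
import os

def in_child(exec_call):
    """Fork, run exec_call() in the child, return the child's exit code."""
    pid = os.fork()
    if pid == 0:
        try:
            exec_call()
        finally:
            os._exit(127)  # only reached if the exec failed
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)

def exec_family_demo():
    """Run the same tiny shell command through four exec variants."""
    return [
        # "l": arguments listed one by one, full path to the program
        in_child(lambda: os.execl("/bin/sh", "sh", "-c", "exit 0")),
        # "v": arguments passed as a single list (vector)
        in_child(lambda: os.execv("/bin/sh", ["sh", "-c", "exit 0"])),
        # "p": program name looked up on PATH
        in_child(lambda: os.execvp("sh", ["sh", "-c", "exit 0"])),
        # "e": the new program gets exactly this environment
        in_child(lambda: os.execve("/bin/sh",
                                   ["sh", "-c", 'exit "$CODE"'],
                                   {"CODE": "5"})),
    ]
```

The last call exits with 5 because the shell reads CODE from the environment it was handed; the first three exit with 0.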
The following diagram illustrates the typical
fork/exec operation where the bash shell is used to
list a directory with the ls command:
+--------+
| pid=7  |
| ppid=4 |
| bash   |
+--------+
    |
    | calls fork
    V
+--------+             +--------+
| pid=7  |    forks    | pid=22 |
| ppid=4 | ----------> | ppid=7 |
| bash   |             | bash   |
+--------+             +--------+
    |                      |
    | waits for pid 22     | calls exec to run ls
    |                      V
    |                  +--------+
    |                  | pid=22 |
    |                  | ppid=7 |
    |                  | ls     |
    V                  +--------+
+--------+                 |
| pid=7  |                 | exits
| ppid=4 | <---------------+
| bash   |
+--------+
    |
    | continues
    V

(Draft In progress)


Thursday, April 30, 2015
