Tuesday, July 16, 2013

Basic Thread Concepts



Threads

In shared memory multiprocessor architectures, threads can be used to implement parallelism.

A thread is an independently schedulable entity.

Technically, a thread is defined as an independent stream of instructions that can be scheduled to run by the operating system.



  • To the software developer, the concept of a "procedure" that runs independently from its main program may best describe a thread.
  • To go one step further, imagine a main program (a.out) that contains a number of procedures. Then imagine all of these procedures being able to be scheduled to run simultaneously and/or independently by the operating system. That would describe a "multi-threaded" program.

Before understanding a thread, one first needs to understand a UNIX process. A process is created by the operating system and requires a fair amount of "overhead".


Processes contain information about program resources and program execution state, including:
  • Process ID, process group ID, user ID, and group ID
  • Environment (in which it is running)
  • Working directory (where it is running)
  • Program instructions (code segment) and program counter
  • Registers
  • Stack (stack segment) and stack pointer
  • Heap (heap segment)
  • File descriptors
  • Signal actions
  • Shared libraries
  • Inter-process communication tools (such as message queues, pipes, semaphores, or shared memory).

Process-thread relationship

[Figures: a UNIX process; threads within a UNIX process]

  • Threads use and exist within these process resources, yet are able to be scheduled by the operating system and run as independent entities largely because they duplicate only the bare essential resources that enable them to exist as executable code.

  • This independent flow of control is accomplished because a thread maintains its own:
    • Stack pointer
    • Registers
    • Scheduling properties (such as policy or priority)
    • Set of pending and blocked signals
    • Thread-specific data
  • So, in summary, in the UNIX environment a thread:
    • Exists within a process and uses the process resources
    • Has its own independent flow of control as long as its parent process exists and the OS supports it
    • Duplicates only the essential resources it needs to be independently schedulable
    • May share the process resources with other threads that act equally independently (and dependently)
    • Dies when its process terminates (for example, when the process exits or is killed)
    • Is "lightweight" because most of the overhead has already been accomplished through the creation of its process.
  • Because threads within the same process share resources:
    • Changes made by one thread to shared system resources (such as closing a file) will be seen by all other threads.
    • Two pointers having the same value point to the same data.
    • Reading and writing to the same memory locations is possible, and therefore requires explicit synchronization by the programmer.
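
The last point can be illustrated with a minimal sketch. Python's `threading` module is used here only for illustration and is not part of the original notes; the names `counter`, `counter_lock`, `add_many` and `run_workers` are hypothetical. Several threads update one shared variable, and a lock provides the explicit synchronization mentioned above.

```python
import threading

counter = 0                      # shared by every thread in the process
counter_lock = threading.Lock()  # guards all updates to `counter`

def add_many(times):
    """Increment the shared counter `times` times under the lock."""
    global counter
    for _ in range(times):
        with counter_lock:       # only one thread at a time runs this block
            counter += 1

def run_workers(num_threads=4, times=10_000):
    """Start the workers, wait for them, and return the final counter."""
    global counter
    counter = 0
    workers = [threading.Thread(target=add_many, args=(times,))
               for _ in range(num_threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return counter

if __name__ == "__main__":
    print(run_workers())         # 4 threads x 10,000 increments each
```

Without the lock, the read-modify-write in `counter += 1` could interleave between threads and some increments might be lost.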


A thread is the fundamental unit of execution within an application.

A running application consists of at least one thread.

Each thread has its own stack and runs independently from the application’s other threads.
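
As a rough sketch of what "its own stack" means in practice (Python is an assumption here, and `countdown` and `run_countdowns` are hypothetical names): two threads run the same function, yet each keeps its own copy of the local variables.

```python
import threading

def countdown(start, results):
    # `remaining` and `steps` are local variables: they live on the calling
    # thread's own stack, so the two threads never see each other's values.
    remaining = start
    steps = 0
    while remaining > 0:
        remaining -= 1
        steps += 1
    # The `results` dict is shared; each thread writes under its own name.
    results[threading.current_thread().name] = steps

def run_countdowns():
    """Run the same function in two threads with different arguments."""
    results = {}
    a = threading.Thread(target=countdown, args=(3, results), name="a")
    b = threading.Thread(target=countdown, args=(5, results), name="b")
    a.start()
    b.start()
    a.join()
    b.join()
    return results
```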

Threads share the resources used by the application as it runs, such as file handles
or memory, which is why problems can occur.

Data corruption is a common side effect of having two threads simultaneously write data
to the same block of memory.

Threads can be implemented in different ways.

On most systems, threads are created and managed by the operating system.
Such threads are often called native or kernel-level threads.

On Linux, for example, such threads are created with the clone() system call.

The number of threads that can execute at any given instant is limited by the number of processors (cores)
in the computer.

The operating system rapidly switches from thread to thread, giving each
thread a small window of time in which to run. This is known as preemptive threading.

A cooperative model, on the other hand, requires a thread to explicitly suspend its own execution in order to let other threads run.

Swapping one thread out and another in is referred to as a context switch.
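
The cooperative model can be sketched without real threads at all. The example below is an illustration under assumptions, not how any particular operating system works: Python generators stand in for threads, `yield` is the explicit suspension point, and the names `task` and `run_round_robin` are hypothetical.

```python
from collections import deque

def task(name, steps, log):
    """A cooperative task: do one step of work, then give up control."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield                      # explicit suspension; nothing preempts us

def run_round_robin(tasks):
    """Resume each task in turn until all of them have finished."""
    ready = deque(tasks)
    while ready:
        current = ready.popleft()
        try:
            next(current)          # switch into the task until it yields
            ready.append(current)  # it yielded, so queue it again
        except StopIteration:
            pass                   # the task finished; drop it

log = []
run_round_robin([task("A", 2, log), task("B", 3, log)])
print(log)  # ['A:0', 'B:0', 'A:1', 'B:1', 'B:2']
```

If a task never reached its `yield`, no other task would ever run, which is the main weakness of the cooperative model compared with preemptive threading.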


System Threads versus User Threads

A system thread is created and managed by the system. The first (main) thread of an application is a system thread, and the application often exits when the first thread terminates.

General example: UI applications.
In applications that display user interfaces, the main thread is usually called the event thread.
Additional threads (user threads) are created by the application to handle time-consuming operations like network access.
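
A minimal sketch of this pattern, assuming Python and a plain loop standing in for a real UI toolkit (`slow_fetch` and `event_loop` are hypothetical names, and `time.sleep` only imitates a slow network call):

```python
import queue
import threading
import time

def slow_fetch(job_id, events):
    """Stand-in for a slow network call; posts its result as an event."""
    time.sleep(0.05)                       # pretend to wait on the network
    events.put(("done", job_id))

def event_loop(num_jobs):
    """Run on the main (event) thread: start workers, then handle events."""
    events = queue.Queue()                 # thread-safe hand-off to this loop
    for job_id in range(num_jobs):
        threading.Thread(target=slow_fetch, args=(job_id, events)).start()
    finished = []
    while len(finished) < num_jobs:
        kind, job_id = events.get()        # blocks until a worker reports in
        if kind == "done":
            finished.append(job_id)
    return sorted(finished)
```

The event thread never performs the slow operation itself; it only reacts to results that the worker threads post to the queue.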

Thread synchronization is what prevents this kind of data corruption.

The two fundamental thread synchronization constructs are monitors and semaphores.
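
Both constructs can be sketched in a few lines. This is an illustration under assumptions rather than a reference implementation: Python's `threading.Semaphore` and `threading.Condition` are real library classes, while `BoundedBuffer`, `semaphore_demo` and `monitor_demo` are hypothetical names made up for the example.

```python
import threading
import time

class BoundedBuffer:
    """Monitor-style buffer: one lock plus a condition guard all state."""
    def __init__(self, capacity):
        self._items = []
        self._capacity = capacity
        self._cond = threading.Condition()

    def put(self, item):
        with self._cond:
            while len(self._items) >= self._capacity:
                self._cond.wait()          # releases the lock while waiting
            self._items.append(item)
            self._cond.notify_all()

    def get(self):
        with self._cond:
            while not self._items:
                self._cond.wait()
            item = self._items.pop(0)
            self._cond.notify_all()
            return item

def semaphore_demo(num_threads=5, permits=2):
    """Return the most threads seen inside the guarded region at once."""
    sem = threading.Semaphore(permits)
    state = {"inside": 0, "peak": 0}
    state_lock = threading.Lock()

    def worker():
        with sem:                          # take a permit, return it on exit
            with state_lock:
                state["inside"] += 1
                state["peak"] = max(state["peak"], state["inside"])
            time.sleep(0.01)
            with state_lock:
                state["inside"] -= 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["peak"]

def monitor_demo(count=10):
    """Pass `count` items from one producer to one consumer, in order."""
    buf = BoundedBuffer(capacity=2)
    received = []
    producer = threading.Thread(
        target=lambda: [buf.put(i) for i in range(count)])
    consumer = threading.Thread(
        target=lambda: [received.append(buf.get()) for _ in range(count)])
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
    return received
```

The semaphore limits how many threads may be inside a region at once, while the monitor bundles the shared data with the lock and the wait/notify conditions that protect it.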

DEF:
A thread is a sequence of instructions executed by a processor for a particular job. It has its own program start address, hence its own program counter, and its own stack space, hence its own stack pointer.
Each thread has a state: {processor registers, PC, SP, and all other flags and control registers}.

Threads vs. Processes

Where threads are better

1. Sharing data between threads is easy, whereas between processes it is not (it requires creating shared memory or a pipe).
2. Creation: threads are faster to create (less context has to be generated for a thread).
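
The first point can be sketched as follows, assuming Python on a POSIX system (`os.fork` is not available on Windows); `share_with_thread` and `share_with_process` are hypothetical names. The thread writes straight into the caller's list, while the forked child has to send its result back through a pipe.

```python
import os
import threading

def share_with_thread():
    """A thread appends to the caller's list directly: same address space."""
    data = []
    t = threading.Thread(target=data.append, args=("from thread",))
    t.start()
    t.join()
    return data

def share_with_process():
    """A forked child gets its own copy of memory, so it replies via a pipe."""
    data = []
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:                            # child process
        try:
            os.close(read_fd)
            data.append("lost")             # changes only the child's copy
            os.write(write_fd, b"from process")
            os.close(write_fd)
        finally:
            os._exit(0)                     # never fall through to parent code
    os.close(write_fd)                      # parent keeps only the read end
    with os.fdopen(read_fd, "rb") as pipe_in:
        message = pipe_in.read().decode()   # reads until the child closes
    os.waitpid(pid, 0)                      # reap the child
    return data, message
```

In the parent, `data` is still empty after the child exits; only the bytes written to the pipe cross the process boundary.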

Where processes are better

Thread safety: In a multithreaded program, every function called must be thread-safe or must be called in a thread-safe manner. A multi-process design need not bother about this.
One to all: One bug in a thread (say, an invalid pointer access) can damage all the threads in the process. Why? They are all attached to the same address space. Processes are more isolated from each other.
VM space: Each thread competes for the finite virtual address space of its host process. In particular, every thread's stack and thread-local storage consume part of the process's virtual address space, which is then unavailable to other threads.

Though the process address space is typically huge (3 GB on x86-32), this can be a limitation if a process requires a large number of threads or a thread requires a large amount of memory.

In general, a separate process can have the total virtual address space to itself.

Other factors
Dealing with signals in a multithreaded application is tricky; it is desirable to avoid them.
All threads execute the same program code.
Apart from process data, threads also share other information such as file descriptors, signal dispositions, current working directory, UID, GID, etc. This has both advantages and disadvantages.
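
The shared working directory is easy to observe. The sketch below assumes Python, and `cwd_is_shared` is a hypothetical name: one thread changes directory and the main thread sees the change, because the working directory belongs to the process rather than to the thread.

```python
import os
import tempfile
import threading

def cwd_is_shared():
    """Change directory in one thread and observe it from the main thread."""
    original = os.getcwd()
    with tempfile.TemporaryDirectory() as tmp:
        t = threading.Thread(target=os.chdir, args=(tmp,))
        t.start()
        t.join()
        seen_by_main = os.getcwd()          # reflects the other thread's chdir
        os.chdir(original)                  # restore before tmp is deleted
        return os.path.realpath(seen_by_main) == os.path.realpath(tmp)
```

This is one of the disadvantages hinted at above: a library that calls `chdir` from a worker thread silently changes how every other thread resolves relative paths.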






