Thursday, April 30, 2015

Difference between message queue and shared memory?

Both shared memory and message queues can be used to exchange information between processes. The difference is in how they are used.

Shared memory is exactly what you'd think: it's an area of storage that can be read and written by more than one process. It provides no inherent synchronization; in other words, it's up to the programmer to ensure that one process doesn't clobber another's data. But it's efficient in terms of throughput: reading and writing are relatively fast operations.
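As a minimal sketch of the idea, assuming a Linux-like system, the function below shares one int between a parent and a child process. It uses an anonymous shared mapping from mmap, which is only one of several ways to get shared memory (shm_open and shmget are others); the name sharedMemoryDemo is made up for this example.

```cpp
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

// Returns the value the parent reads after the child has written to the
// shared region, or -1 on error. (Hypothetical helper for illustration.)
int sharedMemoryDemo() {
    // Anonymous shared mapping: visible to this process and to any child
    // created by fork() afterwards.
    void* region = mmap(nullptr, sizeof(int), PROT_READ | PROT_WRITE,
                        MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED) return -1;

    int* shared = static_cast<int*>(region);
    *shared = 0;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(region, sizeof(int));
        return -1;
    }
    if (pid == 0) {
        *shared = 42;  // child writes straight into the shared region
        _exit(0);
    }

    // The only synchronization here is waiting for the child to exit.
    // Without it, the parent could read before the child has written.
    waitpid(pid, nullptr, 0);
    int seen = *shared;
    munmap(region, sizeof(int));
    return seen;
}
```

If mmap and fork both succeed, the parent sees the 42 written by the child. Nothing in the memory itself enforces that ordering; the waitpid call does.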

A message queue is a one-way pipe: one process writes to the queue, and another reads the data in the order it was written until an end-of-data condition occurs. When the queue is created, the message size (bytes per message, usually fairly small) and queue length (maximum number of pending messages) are set. Access is slower than shared memory because each read/write operation typically transfers a single message. But the queue guarantees that each operation will either process an entire message successfully or fail without altering the queue. So the writer can never fail after writing only a partial message, and the reader will either retrieve a complete message or nothing at all.
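A rough sketch of this with POSIX message queues follows, assuming Linux (older glibc versions may need linking with -lrt). The queue name, the sizes, and the function name messageQueueRoundTrip are all made up for the example, and both ends live in one process just to keep it short.

```cpp
#include <fcntl.h>
#include <mqueue.h>
#include <sys/stat.h>
#include <string>

// Sends one message and receives it back through a POSIX message queue.
// Returns the received text, or an empty string on any error.
std::string messageQueueRoundTrip(const std::string& text) {
    const char* name = "/demo_queue_example";  // hypothetical queue name

    mq_attr attr{};
    attr.mq_maxmsg = 4;    // queue length: at most 4 pending messages
    attr.mq_msgsize = 64;  // message size: at most 64 bytes per message

    mqd_t q = mq_open(name, O_CREAT | O_RDWR, 0600, &attr);
    if (q == (mqd_t)-1) return "";

    std::string result;
    // mq_send enqueues the whole message or fails; a message larger than
    // mq_msgsize is rejected and the queue is left unchanged.
    if (mq_send(q, text.data(), text.size(), 0) == 0) {
        char buffer[64];  // must be at least mq_msgsize bytes
        ssize_t n = mq_receive(q, buffer, sizeof(buffer), nullptr);
        if (n >= 0) result.assign(buffer, static_cast<std::size_t>(n));
    }

    mq_close(q);
    mq_unlink(name);  // remove the name so the queue does not persist
    return result;
}
```

A short message comes back intact, while a 100-byte message is refused by mq_send because it exceeds the 64-byte message size chosen above.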

Shared memory can be regarded as faster than queues (low overhead, high volume of data passed). Queues, on the other hand, carry higher overhead (for example, the setup needed to make a queue permanent) and suit a low volume of data.

The onus with shared memory is that you have to implement synchronization in order to be thread safe. Have a look at the excellent article by Beej on IPC.

Message passing is useful for exchanging smaller amounts of data, because no conflicts need to be avoided. It is also much easier to implement than shared memory for communication between computers. Also, as you've already noticed, message passing has the advantage that application developers don't need to worry about protection details, as they do with shared memory.

Shared memory allows maximum speed and convenience of communication, as it can be done at memory speeds when within a computer. Shared memory is usually faster than message passing, as message passing is typically implemented using system calls and thus requires the more time-consuming task of kernel intervention. In contrast, in shared-memory systems, system calls are required only to establish shared-memory regions. Once established, all accesses are treated as normal memory accesses, without extra assistance from the kernel.

Off the top of my head, and assuming you are talking about POSIX message queues (not the SysV ones):

Pipes aren't limited in size; message queues are.
Pipes can be integrated in systems using file descriptors; message queues have their own set of functions, though Linux supports select(), poll(), epoll() and friends on the mqd_t.
Pipes, once closed, require some amount of cooperation on both sides to reestablish them; message queues can be closed and reopened on either side without the cooperation of the other side.
Pipes are flat, much like a stream, so to impose a message structure you would have to implement a protocol on both sides; message queues are message oriented already, so no care has to be taken to get, say, the fifth message in the queue.
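To illustrate the last point, here is a small sketch of the kind of protocol a pipe needs before it can carry distinct messages: each message is sent as a 4-byte length followed by the payload. The length-prefix scheme and all function names are assumptions chosen for the example, not the only way to do it, and both ends of the pipe sit in one process for brevity.

```cpp
#include <unistd.h>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Writes one message as a 4-byte length followed by the payload.
bool writeFramed(int fd, const std::string& msg) {
    std::uint32_t len = static_cast<std::uint32_t>(msg.size());
    if (write(fd, &len, sizeof(len)) != static_cast<ssize_t>(sizeof(len))) return false;
    return write(fd, msg.data(), msg.size()) == static_cast<ssize_t>(msg.size());
}

// Reads exactly n bytes, looping because read() may return fewer.
bool readExact(int fd, void* out, std::size_t n) {
    char* p = static_cast<char*>(out);
    while (n > 0) {
        ssize_t got = read(fd, p, n);
        if (got <= 0) return false;  // error or end of stream
        p += got;
        n -= static_cast<std::size_t>(got);
    }
    return true;
}

// Reads one length-prefixed message; false at end of stream or on error.
bool readFramed(int fd, std::string& msg) {
    std::uint32_t len = 0;
    if (!readExact(fd, &len, sizeof(len))) return false;
    msg.resize(len);
    return len == 0 || readExact(fd, &msg[0], len);
}

// Sends small messages through a pipe and returns them as read back.
// Assumes the total data fits in the pipe buffer, since the writes all
// happen before the reads.
std::vector<std::string> pipeRoundTrip(const std::vector<std::string>& msgs) {
    std::vector<std::string> out;
    int fds[2];
    if (pipe(fds) != 0) return out;
    for (const auto& m : msgs) writeFramed(fds[1], m);
    close(fds[1]);  // closing the write end gives the reader end-of-file
    std::string m;
    while (readFramed(fds[0], m)) out.push_back(m);
    close(fds[0]);
    return out;
}
```

With a message queue none of the framing code would be needed, since each receive call already returns exactly one message.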

C++ Notes

abstract classes

An interface describes the behavior or capabilities of a C++ class without committing to a particular implementation of that class.

A class is made abstract by declaring at least one of its functions as a pure virtual function. A pure virtual function is specified by placing "= 0" in its declaration as follows:

class Box
{
public:
    // pure virtual function
    virtual double getVolume() = 0;

private:
    double length; // Length of a box
};

The purpose of an abstract class (often referred to as an ABC) is to provide an appropriate base class from which other classes can inherit. Abstract classes cannot be used to instantiate objects and serve only as an interface. Attempting to instantiate an object of an abstract class causes a compilation error.

Thus, if a subclass of an ABC needs to be instantiated, it has to implement each of the pure virtual functions, which means that it supports the interface declared by the ABC. Failure to override a pure virtual function in a derived class, then attempting to instantiate objects of that class, is a compilation error. Classes that can be used to instantiate objects are called concrete classes.
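A small sketch of an abstract class and a concrete class derived from it is shown below. Box here is a simplified version of the class above (no data member, plus a virtual destructor), and RectangularBox and volumeThroughInterface are hypothetical names invented for the example.

```cpp
#include <memory>
#include <type_traits>

// Abstract base class: declares the interface only.
class Box {
public:
    virtual ~Box() = default;        // allows safe deletion through a Box*
    virtual double getVolume() = 0;  // pure virtual function
};

// Concrete class (hypothetical): overrides every pure virtual function.
class RectangularBox : public Box {
public:
    RectangularBox(double l, double w, double h)
        : length(l), width(w), height(h) {}

    double getVolume() override { return length * width * height; }

private:
    double length;
    double width;
    double height;
};

// The compiler itself can confirm which class is abstract.
static_assert(std::is_abstract<Box>::value, "Box cannot be instantiated");
static_assert(!std::is_abstract<RectangularBox>::value,
              "RectangularBox is a concrete class");

double volumeThroughInterface() {
    // Box b;  // would not compile: Box is abstract
    std::unique_ptr<Box> box(new RectangularBox(2.0, 3.0, 4.0));
    return box->getVolume();  // dispatches to RectangularBox::getVolume
}
```

The caller only ever sees the Box interface; which implementation runs is decided by the concrete type behind the pointer.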

Designing Strategy:

An object-oriented system might use an abstract base class to provide a common and standardized interface appropriate for all the external applications. Then, through inheritance from that abstract base class, derived classes are formed that all operate similarly.

The capabilities (i.e., the public functions) offered by the external applications are provided as pure virtual functions in the abstract base class. The implementations of these pure virtual functions are provided in the derived classes that correspond to the specific types of the application.

This architecture also allows new applications to be added to a system easily, even after the system has been defined.

Explain container class and its types in C++.

Answer
A container stores many entities and provides sequential or direct access to them. List, vector and string are such containers in the Standard Template Library. The string class is a container that holds chars. All container classes access the contained elements safely and efficiently by using iterators. A container class is a class that holds a group of the same or mixed objects in memory. It can be heterogeneous or homogeneous: a heterogeneous container class holds mixed objects in memory, whereas one holding objects of the same kind is called a homogeneous container class.

What is a container class? What are the types of container classes?

A container class is a class that is used to hold objects in memory or on persistent media. A generic class plays the role of a generic holder. A container class is a good blend of predefined behavior and a well-known interface. The purpose of a container class is to hide the topology used to maintain the list of objects in memory. A container class is known as a heterogeneous container when it contains a set of different objects, and as a homogeneous container when it contains a set of similar objects.

What are the C++ standardized container classes?

The following are the standardized container classes:

std::map: An associative container of key-value pairs; it can be used, for example, to handle a sparse array or a sparse matrix.
std::vector: Like an array, but this standard container class offers additional features such as bounds checking through the at() member function (which throws an exception on an out-of-range index), inserting or removing elements, and automatic memory management.
std::string: A better replacement for arrays of chars.
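The three containers listed above can be tried out with a short sketch like the following. The function names are made up for the example; the container calls themselves are standard library members.

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// std::map as a sparse array: only the indices actually set are stored.
std::size_t sparseEntries() {
    std::map<long, double> sparse;
    sparse[5] = 1.5;
    sparse[1000000] = 2.5;
    return sparse.size();  // number of stored entries, not the index range
}

// std::vector::at() checks bounds and throws std::out_of_range.
bool atThrowsOutOfRange() {
    std::vector<int> v{10, 20, 30};
    try {
        v.at(3);  // valid indices are 0..2
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}

// std::string manages its own memory as it grows.
std::string joinWords() {
    std::string s = "shared";
    s += " memory";
    return s;
}
```

Note that operator[] on a vector does no bounds checking; only at() does.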

What are shallow and deep copy?

A shallow copy just copies the values of the data as they are. Even if there is a pointer that points to dynamically allocated memory, the pointer in the copy will point to the same dynamically allocated object.

A deep copy creates a copy of the dynamically allocated objects too. You would need to use a copy constructor and overload an assignment operator for this.

A shallow copy of an object copies all of the member field values. This works well if the fields are values, but may not be what you want for fields that point to dynamically allocated memory. The pointer will be copied, but the memory it points to will not be copied: the field in both the original object and the copy will then point to the same dynamically allocated memory, which is not usually what you want. The default copy constructor and assignment operator make shallow copies.

A deep copy copies all fields, and makes copies of dynamically allocated memory pointed to by the fields. To make a deep copy, you must write a copy constructor and overload the assignment operator, otherwise the copy will point to the original, with disastrous consequences.

If an object has pointers to dynamically allocated memory, and the dynamically allocated memory needs to be copied when the original object is copied, then a deep copy is required.

A class that requires deep copies generally needs:

A constructor to either make an initial allocation or set the pointer to NULL.
A destructor to delete the dynamically allocated memory.
A copy constructor to make a copy of the dynamically allocated memory.
An overloaded assignment operator to make a copy of the dynamically allocated memory.
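The four pieces listed above might look like this in a small class that owns a dynamically allocated array. IntBuffer is a hypothetical class written for this example, a sketch rather than a production-ready container.

```cpp
#include <algorithm>
#include <cstddef>

class IntBuffer {
public:
    // Constructor: makes the initial allocation (or a null pointer if empty).
    explicit IntBuffer(std::size_t n)
        : size_(n), data_(n ? new int[n]() : nullptr) {}

    // Destructor: deletes the dynamically allocated memory.
    ~IntBuffer() { delete[] data_; }

    // Copy constructor: allocates new memory and copies the contents.
    IntBuffer(const IntBuffer& other)
        : size_(other.size_),
          data_(other.size_ ? new int[other.size_] : nullptr) {
        std::copy(other.data_, other.data_ + size_, data_);
    }

    // Assignment operator: copies first, then releases the old memory.
    IntBuffer& operator=(const IntBuffer& other) {
        if (this != &other) {  // guard against self-assignment
            int* fresh = other.size_ ? new int[other.size_] : nullptr;
            std::copy(other.data_, other.data_ + other.size_, fresh);
            delete[] data_;
            data_ = fresh;
            size_ = other.size_;
        }
        return *this;
    }

    int& operator[](std::size_t i) { return data_[i]; }
    std::size_t size() const { return size_; }

private:
    std::size_t size_;
    int* data_;
};
```

Because each copy owns its own array, changing an element in a copy leaves the original untouched. With the compiler-generated shallow copy, both objects would share one array and both destructors would try to delete it.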

Useful blogs

http://www.bogotobogo.com/cplusplus/multithreading_ipc.php

http://hzqtc.github.io/2012/07/linux-ipc-with-pipes.html

http://www.bogotobogo.com/cplusplus/quiz_multithreading.php

http://stackoverflow.com/questions/860339/difference-between-private-public-and-protected-inheritance

https://computing.llnl.gov/tutorials/pthreads/#Thread (Very good POSIX thread programming)

Virtual Memory

Real, or physical, memory exists on RAM chips inside the computer. Virtual memory, as its name suggests, doesn’t physically exist on a memory chip.

It is an optimization technique and is implemented by the operating system in order to give an application program the impression that it has more memory than actually exists.

Virtual memory is implemented by various operating systems such as Windows, Mac OS X, and Linux.

So how does virtual memory work?

Let’s say that an operating system needs 120 MB of memory in order to hold all the running programs, but there’s currently only 50 MB of available physical memory stored on the RAM chips. The operating system will then set up 120 MB of virtual memory, and will use a program called the virtual memory manager (VMM) to manage that 120 MB.

The VMM will create a file on the hard disk that is 70 MB (120 – 50) in size to account for the extra memory that’s needed. The O.S. will now proceed to address memory as if there were actually 120 MB of real memory stored on the RAM, even though there’s really only 50 MB. So, to the O.S., it now appears as if the full 120 MB actually exists.

It is the responsibility of the VMM to deal with the fact that there is only 50 MB of real memory.

The paging file and the RAM

Now, how does the VMM function? As mentioned before, the VMM creates a file on the hard disk that holds the extra memory that is needed by the O.S., which in our case is 70 MB in size.

This file is called a paging file (also known as a swap file), and plays an important role in virtual memory. The paging file combined with the RAM accounts for all of the memory.

Whenever the O.S. needs a ‘block’ of memory that’s not in the real (RAM) memory, the VMM takes a block from the real memory that hasn’t been used recently, writes it to the paging file, and then reads the block of memory that the O.S. needs from the paging file.

The VMM then takes the block of memory from the paging file, and moves it into the real memory – in place of the old block. This process is called swapping (also known as paging), and the blocks of memory that are swapped are called pages.
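The replacement step can be simulated with a small sketch. It assumes a strict least-recently-used policy, which is one reading of "hasn't been used recently"; real operating systems use approximations. The function name countPageFaults is invented for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <list>
#include <vector>

// Counts how often a referenced page is not in RAM, given a sequence of
// page numbers and a fixed number of RAM frames (simulation only).
std::size_t countPageFaults(const std::vector<int>& references,
                            std::size_t frames) {
    if (frames == 0) return references.size();  // nothing can stay resident

    std::list<int> resident;  // front = most recently used page
    std::size_t faults = 0;

    for (int page : references) {
        auto it = std::find(resident.begin(), resident.end(), page);
        if (it != resident.end()) {
            resident.erase(it);       // hit: page is already in RAM
        } else {
            ++faults;                 // miss: page comes from the paging file
            if (resident.size() == frames) {
                resident.pop_back();  // evict the least recently used page
            }
        }
        resident.push_front(page);    // this page is now the most recent
    }
    return faults;
}
```

For the reference sequence 1, 2, 3, 1, 4, 2 this gives 5 misses with 3 frames and 4 misses with 4 frames, which shows how less real memory means more traffic to the paging file.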

The group of pages that currently exist in RAM, and that are dedicated to a specific process, is known as the working set for that process.

As mentioned earlier, virtual memory allows us to make an application program think that it has more memory than actually exists. There are two reasons why one would want this: the first is to allow the use of programs that are too big to physically fit in memory. The other reason is to allow for multitasking – multiple programs running at once.

Before virtual memory existed, a word processor, e-mail program, and browser couldn’t be run at the same time unless there was enough memory to hold all three programs at once. This would mean that one would have to close one program in order to run the other, but now with virtual memory, multitasking is possible even when there is not enough memory to hold all executing programs at once.

Virtual Memory Can Slow Down Performance

However, virtual memory can slow down performance. If the size of virtual memory is quite large in comparison to the real memory, then more swapping to and from the hard disk will occur as a result. Accessing the hard disk is far slower than using system memory.

Using too many programs at once in a system with an insufficient amount of RAM results in constant disk swapping – also called thrashing, which can really slow down a system’s performance.

What is the purpose of swapping in virtual memory?

Swapping is exchanging data between the hard disk and the RAM.

The goal of the virtual memory technique is to make an application think that it has more memory than actually exists. If you read above then you know that the virtual memory manager (VMM) creates a file on the hard disk called a swap file. Basically, the swap file (also known as a paging file) allows the application to store any extra data that can’t be stored in the RAM – because the RAM has limited memory.

Keep in mind that an application program can only use the data when it’s actually in the RAM. Data can be stored in the paging file on the hard disk, but it is not usable until that data is brought into the RAM. Together, the data being stored on the hard disk combined with the data being stored in the RAM comprise the entire data set needed by the application program.

So, the way virtual memory works is that whenever a piece of data needed by an application program cannot be found in the RAM, the data must be in the paging file on the hard disk.

But in order for the program to be able to access that data, the operating system must transfer that data from the hard disk into the RAM. This also means that a piece of existing data in the RAM must be moved to the hard disk in order to make room for the data being brought in from the hard disk. So, you can think of this process as a trade in which an old piece of data is moved from the RAM to the hard disk in exchange for a ‘new’ piece of data to bring into the RAM from the hard disk. This trade is known as swapping or paging.

A related term is a ‘page fault’, which is the event that occurs when an application program tries to access a piece of data that is not currently in the RAM, but is in the paging file on the hard disk. The page fault is what triggers the swap.

Remember that page faults are not desirable since they cause expensive accesses to the hard disk. Expensive in this context means that accessing the hard disk is slow and takes time.

The Purpose Of Swapping

So, we can say that the purpose of swapping, or paging, is to access data being stored on the hard disk and to bring it into the RAM so that it can be used by the application program. Remember that swapping is only necessary when that data is not already in the RAM.

Excessive Swapping Causes Thrashing

Excessive use of swapping is called thrashing and is undesirable because it lowers overall system performance, mainly because hard drives are far slower than RAM.

Fork And exec

The use of fork and exec exemplifies the spirit of UNIX in that it provides a very simple way to start new processes.

The fork call basically makes a duplicate of the current process, identical in almost every way (not everything is copied over, for example, resource limits in some implementations, but the idea is to create as close a copy as possible).

The new process (child) gets a different process ID (PID) and has the PID of the old process (parent) as its parent PID (PPID). Because the two processes are now running exactly the same code, they can tell which is which by the return code of fork: the child gets 0, the parent gets the PID of the child. This is all, of course, assuming the fork call works; if not, no child is created and the parent gets an error code.

The exec call is a way to basically replace the entire current process with a new program. It loads the program into the current process space and runs it from the entry point.

So, fork and exec are often used in sequence to get a new program running as a child of a current process. Shells typically do this whenever you try to run a program like find: the shell forks, then the child loads the find program into memory, setting up all command line arguments, standard I/O and so forth.

But they're not required to be used together. It's perfectly acceptable for a program to fork itself without calling exec if, for example, the program contains both parent and child code (you need to be careful what you do, as each implementation may have restrictions). This was used quite a lot (and still is) for daemons which simply listen on a TCP port and fork a copy of themselves to process a specific request while the parent goes back to listening.

Similarly, programs that know they're finished and just want to run another program don't need to fork, exec and then wait for the child. They can just load the child directly into their process space.

Some UNIX implementations have an optimized fork which uses what they call copy-on-write. This is a trick to delay the copying of the process space in fork until the program attempts to change something in that space. This is useful for those programs using only fork and not exec in that they don't have to copy an entire process space.

If exec is called following fork (and this is what happens mostly), the process space is replaced by the new program, so with copy-on-write most of the parent's process space never has to be copied for the child at all.

Note that there is a whole family of exec calls (execl, execle, execve and so on) but exec in context here means any of them.

The following diagram illustrates the typical fork/exec operation where the bash shell is used to list a directory with the ls command:
+--------+
| pid=7  |
| ppid=4 |
| bash   |
+--------+
    |
    | calls fork
    V
+--------+             +--------+
| pid=7  |    forks    | pid=22 |
| ppid=4 | ----------> | ppid=7 |
| bash   |             | bash   |
+--------+             +--------+
    |                      |
    | waits for pid 22     | calls exec to run ls
    |                      V
    |                  +--------+
    |                  | pid=22 |
    |                  | ppid=7 |
    |                  | ls     |
    V                  +--------+
+--------+                 |
| pid=7  |                 | exits
| ppid=4 | <---------------+
| bash   |
+--------+
    |
    | continues
    V
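
The sequence in the diagram can be sketched in a few lines, assuming a POSIX system. The helper name runAndWait is made up for this example, and execvp is used as one member of the exec family; a real shell does considerably more (redirections, job control and so on).

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <string>
#include <vector>

// Runs a program as a child process and returns its exit status,
// or -1 if the child could not be created or did not exit normally.
int runAndWait(const std::vector<std::string>& args) {
    if (args.empty()) return -1;

    // Build the null-terminated argv array that the exec family expects.
    std::vector<char*> argv;
    for (const auto& a : args) argv.push_back(const_cast<char*>(a.c_str()));
    argv.push_back(nullptr);

    pid_t pid = fork();
    if (pid < 0) return -1;  // fork failed: no child was created

    if (pid == 0) {
        // Child: fork returned 0. Replace this process with the program.
        execvp(argv[0], argv.data());
        _exit(127);  // reached only if exec failed
    }

    // Parent: fork returned the child's PID. Wait for the child to exit.
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, runAndWait({"ls", "-l"}) would play the role of bash in the diagram: the parent waits while the child becomes ls, then continues once ls exits.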

(Draft In progress)

DumbGeeks