Posted by Michael Heliso
As you know, every program uses resources. These resources can be files, database resources, network connections, memory buffers, objects, and so on. Using any resource requires memory to be allocated. This is achieved through the following steps:
1.Allocate memory for the type that represents the resource by calling the newobj intermediate language (IL) instruction. This instruction is emitted when you use the new operator.
2.Initialize the memory to make the resource available. The initial state of the resource is created by the constructor of the type.
3.Use the resource by accessing type members.
4.Clean up the resource state.
5.Free the memory occupied by the resource. This task is performed by the garbage collector.
When these steps are carried out by hand, they are the source of two major kinds of bugs. First, programmers often forget to free memory when it is no longer needed. Second, programmers try to access memory after it has been freed. These are among the worst bugs an application can contain, because the behavior of the application becomes unpredictable and the bugs are very difficult to track down.
Resource management is a difficult task, and it distracts you, the programmer, from the main problems you have to solve. This is the reason the garbage collector was created: it takes the burden of memory handling off your shoulders. Keep in mind, however, that the garbage collector knows nothing about the resource that a type in memory represents, so it cannot clean up the resource properly. That step must be performed by you, by writing the code that cleans up the resource. There are two methods you can use to perform the cleanup: Finalize and Dispose.
Most types in the .NET Framework, such as Int32, String, and ArrayList, do not require resource cleanup. But there are also types that wrap an unmanaged resource such as a network connection, a database connection, or an icon.
The Common Language Runtime (CLR) requires that all resources be allocated from the managed heap. The managed heap resembles the C runtime heap with one major difference: you never free objects from the managed heap yourself; objects are freed automatically when the application no longer needs them. Now let's see what happens when a process is initialized. The CLR reserves a contiguous region of address space. This region is the managed heap. The managed heap maintains a pointer into this address space, which we'll call NextObjectPtr. This pointer indicates where in the managed heap the next object will be allocated. Initially, NextObjectPtr points to the base address of the reserved address space. As you would expect, the newobj instruction creates a new object. As mentioned earlier, this instruction is emitted when you use the new keyword. The newobj instruction causes the CLR to perform the following steps:
1.Calculate the number of bytes required by the type for which memory is being allocated, including all of its base types.
2.Add the bytes required for the object overhead. Each object has two overhead fields, a method table pointer and a SyncBlockIndex. On a 32-bit system each field requires 32 bits, which adds 8 bytes to each object. On a 64-bit system each field requires 64 bits, which adds 16 bytes to each object.
3.The CLR then checks whether the bytes required to allocate the object are available in the reserved address space (the managed heap). If they fit, the object is allocated at the address pointed to by NextObjectPtr, the constructor is called with NextObjectPtr passed as the this parameter, and the new operator returns the address of the object. NextObjectPtr is then advanced past the newly allocated object and indicates the address at which the next object will be allocated in the managed heap.
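The three allocation steps above can be illustrated with a small simulation. This is only a sketch in Python, not CLR code: the ManagedHeap class, its method names, the base address, and the heap size are all made up for illustration, and the model tracks addresses only, without touching real memory.

```python
# Toy model of the allocation steps above; this is not CLR code.
POINTER_SIZE = 4                 # bytes per pointer on a 32-bit system (8 on 64-bit)
OVERHEAD = 2 * POINTER_SIZE      # method table pointer + SyncBlockIndex


class ManagedHeap:
    """A reserved address range with a single bump pointer (hypothetical model)."""

    def __init__(self, base, size):
        self.base = base
        self.limit = base + size          # first address past the reserved range
        self.next_object_ptr = base       # where the next object will be placed

    def fits(self, field_bytes):
        """Check whether an object with this many field bytes still fits."""
        return self.next_object_ptr + field_bytes + OVERHEAD <= self.limit

    def allocate(self, field_bytes):
        """Return the address of the new object and advance the pointer."""
        if not self.fits(field_bytes):
            raise MemoryError("heap full; a real runtime would collect garbage here")
        address = self.next_object_ptr
        self.next_object_ptr += field_bytes + OVERHEAD
        return address


heap = ManagedHeap(base=0x1000, size=64)
first = heap.allocate(12)    # 12 field bytes + 8 overhead = 20 bytes
second = heap.allocate(4)    # 4 field bytes + 8 overhead = 12 bytes
print(hex(first), hex(second), hex(heap.next_object_ptr))
# prints: 0x1000 0x1014 0x1020
```

Note that allocation is nothing more than a bounds check and an addition, which is why allocating from a heap organized this way is so cheap.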
When an application calls the new operator to create a new object, there might not be enough space for it. The managed heap detects this by adding the bytes required by the object to the address in NextObjectPtr. If the resulting value is beyond the end of the reserved address space, the managed heap is full and a garbage collection must be performed.
The garbage collector checks whether there are any objects in the managed heap that the application no longer uses. If such objects exist, the memory they occupy can be reclaimed. If the heap still has no room for the object after the collection, the new operator throws an OutOfMemoryException.
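The "collect first, fail only if that does not help" behavior described above might look like the following sketch. It is a simplified Python model, not the CLR's implementation: the TinyHeap class and its release helper are hypothetical, an explicit in_use set stands in for the real reachability analysis, and Python's MemoryError plays the role of OutOfMemoryException.

```python
# Toy model: sizes are plain numbers, no real memory is touched.
class TinyHeap:
    def __init__(self, capacity):
        self.capacity = capacity
        self.objects = {}      # name -> size in bytes
        self.in_use = set()    # names the application still references

    def used(self):
        return sum(self.objects.values())

    def collect(self):
        """Reclaim every object the application no longer uses."""
        self.objects = {n: s for n, s in self.objects.items() if n in self.in_use}

    def new(self, name, size):
        if self.used() + size > self.capacity:
            self.collect()                      # heap is full: try to reclaim space
        if self.used() + size > self.capacity:
            raise MemoryError(f"cannot allocate {size} bytes for {name}")
        self.objects[name] = size
        self.in_use.add(name)

    def release(self, name):
        """Hypothetical helper: the application drops its reference."""
        self.in_use.discard(name)


heap = TinyHeap(capacity=100)
heap.new("a", 60)
heap.new("b", 30)
heap.release("a")
heap.new("c", 50)        # does not fit at first; collecting "a" makes room
try:
    heap.new("d", 40)    # "b" and "c" are still in use, so nothing can be reclaimed
    outcome = "allocated"
except MemoryError:
    outcome = "out of memory"
print(sorted(heap.objects), outcome)
# prints: ['b', 'c'] out of memory
```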
An application has a set of roots. A root is a storage location that contains a pointer to an object of a reference type. This pointer either refers to an object in the managed heap or is set to null. All global and static reference type variables are roots, and so are the local variables and parameters of reference types on a thread's stack. When the garbage collector starts running, it assumes that all objects in the managed heap are garbage, in other words that none of the roots refers to any object. (A collection starts when generation 0 is full; the generation mechanism exists to improve performance, and I'm not going to discuss it in this article.) The garbage collector then walks through all the roots and builds a graph of every object that can be reached from them. In the image, objects A and B are directly referenced by the roots, so they are added to the graph. When the garbage collector reaches object C, it observes that this object references another object in the managed heap, object D, so object D is also added to the graph. The entire walk is performed recursively.
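The marking walk described above can be sketched as a plain graph traversal. The Python code below is an illustration only: the function name and the object layout are assumptions that loosely follow the A, B, C, D example (here C is assumed to be referenced by a root, and E and F are invented unreachable objects). It uses an explicit stack instead of recursion, which visits the same objects.

```python
def reachable_objects(roots, references):
    """Return the set of objects reachable from the roots.

    roots: iterable of object names held by roots (None models a null root).
    references: dict mapping each object name to the names it refers to.
    """
    marked = set()
    pending = [r for r in roots if r is not None]
    while pending:
        obj = pending.pop()
        if obj in marked:
            continue                  # already visited; also guards against cycles
        marked.add(obj)
        pending.extend(references.get(obj, []))
    return marked


# Hypothetical heap: E and F refer to each other, but no root reaches them.
references = {
    "A": [], "B": [], "C": ["D"], "D": [],
    "E": ["F"], "F": ["E"],
}
roots = ["A", "B", "C", None]

live = reachable_objects(roots, references)
garbage = set(references) - live
print(sorted(live), sorted(garbage))
# prints: ['A', 'B', 'C', 'D'] ['E', 'F']
```

E and F show why reachability matters more than reference counts: they still refer to each other, yet they are garbage because no root leads to them.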
Once the graph is complete, it contains all the objects that are reachable from your application. All objects that are not part of this graph are considered garbage. The garbage collector then walks the heap linearly, looking for large contiguous blocks of garbage objects, whose memory is now considered free space. It shifts the non-garbage objects down in memory using the memcpy function to compact the heap. Moving the objects invalidates every pointer to them, so the garbage collector corrects all of these pointers to refer to the new locations. After the managed heap is compacted, NextObjectPtr points immediately after the last non-garbage object.
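The compaction and pointer correction described above can be modeled as follows. This Python sketch works on addresses and sizes only and copies no real memory; the function names, the heap layout, and the sizes are invented for illustration and are not taken from the CLR.

```python
def compact(heap, live, base=0):
    """Slide live objects toward the base address.

    heap: list of (name, address, size) tuples in address order.
    live: set of names that survived the marking phase.
    Returns (new_heap, forwarding, next_object_ptr), where forwarding maps
    the old address of each surviving object to its new address.
    """
    new_heap = []
    forwarding = {}
    next_object_ptr = base
    for name, address, size in heap:
        if name not in live:
            continue                        # garbage: its space is reused
        forwarding[address] = next_object_ptr
        new_heap.append((name, next_object_ptr, size))
        next_object_ptr += size
    return new_heap, forwarding, next_object_ptr


def fix_pointers(pointers, forwarding):
    """Rewrite stale addresses (roots or object fields) to the new locations."""
    return [forwarding[p] if p is not None else None for p in pointers]


heap = [("A", 0, 16), ("E", 16, 24), ("C", 40, 8), ("F", 48, 32), ("D", 80, 16)]
roots = [0, 40, None]                       # roots point at A and C

new_heap, forwarding, next_object_ptr = compact(heap, live={"A", "C", "D"})
roots = fix_pointers(roots, forwarding)
print(new_heap, roots, next_object_ptr)
# prints: [('A', 0, 16), ('C', 16, 8), ('D', 24, 16)] [0, 16, None] 40
```

After the pass, the survivors sit back to back at the start of the heap and next_object_ptr is 40, the first address after D, so the next allocation is again a simple pointer bump.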
Now that you have a general overview of how the garbage collector works, you can design your applications properly.
In the next article I will go into more detail about object generations and about the Finalize and Dispose methods and how they should be used.