Disposal and Garbage Collection in C#

Some objects require explicit tear-down code to release resources such as open files,
locks, operating system handles, and unmanaged objects. In .NET parlance, this is
called disposal, and it is supported through the IDisposable interface. The managed
memory occupied by unused objects must also be reclaimed at some point; this
function is known as garbage collection and is performed by the CLR.


Disposal differs from garbage collection in that disposal is usually explicitly instigated;
garbage collection is totally automatic. In other words, the programmer takes
care of such things as releasing file handles, locks, and operating system resources
while the CLR takes care of releasing memory.


This article discusses both disposal and garbage collection, also describing C#
finalizers and the pattern by which they can provide a backup for disposal. Lastly,
we discuss the intricacies of the garbage collector and other memory management
options.


IDisposable, Dispose, and Close
The .NET Framework defines a special interface for types requiring a tear-down
method:
public interface IDisposable
{
    void Dispose();
}
C#’s using statement provides a syntactic shortcut for calling Dispose on objects
that implement IDisposable, using a try/finally block. For example:
using (FileStream fs = new FileStream ("myFile.txt", FileMode.Open))
{
    // ... Write to the file ...
}
The compiler converts this to:
FileStream fs = new FileStream ("myFile.txt", FileMode.Open);
try
{
    // ... Write to the file ...
}
finally
{
    if (fs != null) ((IDisposable)fs).Dispose();
}
The finally block ensures that the Dispose method is called even when an exception
is thrown or the code exits the block early.
In simple scenarios, writing your own disposable type is just a matter of implementing
IDisposable and writing the Dispose method:
sealed class Demo : IDisposable
{
    public void Dispose()
    {
        // Perform cleanup / tear-down.
        ...
    }
}


Standard Disposal Semantics

The Framework follows a de facto set of rules in its disposal logic. These rules are
not hard-wired to the Framework or C# language in any way; their purpose is to
define a consistent protocol to consumers. Here they are:

1. Once disposed, an object is beyond redemption. It cannot be reactivated, and
calling its methods or properties throws an ObjectDisposedException.
2. Calling an object’s Dispose method repeatedly causes no error.
3. If disposable object x contains or “wraps” or “possesses” disposable object y,
x’s Dispose method automatically calls y’s Dispose method—unless instructed
otherwise.

These rules are also helpful when writing your own types, though not mandatory.
Nothing prevents you from writing an “Undispose” method, other than, perhaps,
the flak you might cop from colleagues!
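
The three rules above can be sketched in a small type. This is only an illustration under stated assumptions: the names BufferedLog and WriteByte are hypothetical, and the MemoryStream simply stands in for any disposable object that the type owns.

```csharp
using System;
using System.IO;

// Hypothetical type illustrating the three de facto disposal rules.
public sealed class BufferedLog : IDisposable
{
    // An owned ("wrapped") disposable object.
    readonly MemoryStream _buffer = new MemoryStream();

    public bool IsDisposed { get; private set; }

    public void WriteByte (byte value)
    {
        // Rule 1: members throw once the object has been disposed.
        if (IsDisposed) throw new ObjectDisposedException ("BufferedLog");
        _buffer.WriteByte (value);
    }

    public void Dispose()
    {
        // Rule 2: repeated calls cause no error.
        if (IsDisposed) return;
        IsDisposed = true;

        // Rule 3: dispose the object that this instance owns.
        _buffer.Dispose();
    }
}
```

A consumer can wrap a BufferedLog in a using statement; calling WriteByte after the block ends throws ObjectDisposedException, while a second Dispose call does nothing.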

According to rule 3, a container object automatically disposes its child objects.

A good example is a Windows container control such as a Form or Panel. The container
may host many child controls, yet you don’t dispose every one of them explicitly:
closing or disposing the parent control or form takes care of the whole lot.
Another example is when you wrap a FileStream in a DeflateStream. Disposing the
DeflateStream also disposes the FileStream—unless you instructed otherwise in the
constructor.

Close and Stop

Some types define a method called Close in addition to Dispose. The Framework is
not completely consistent on the semantics of a Close method, although in nearly
all cases it’s either:

• Functionally identical to Dispose
• A functional subset of Dispose

An example of the latter is IDbConnection: a Closed connection can be re-Opened; a
Disposed connection cannot. Another example is a Windows Form activated with
ShowDialog: Close hides it; Dispose releases its resources.
Some classes define a Stop method (e.g., Timer or HttpListener), which may release
unmanaged resources, like Dispose, but unlike Dispose, it allows for re-Starting.

When to Dispose

A safe rule to follow (in nearly all cases) is “If in doubt, dispose.” A disposable
object—if it could talk—would say the following:

When you’ve finished with me, let me know. If simply abandoned, I might
cause trouble for other object instances, the application domain, the computer,
the network, or the database!

Objects wrapping an unmanaged resource handle will nearly always require disposal,
in order to free the handle. Examples include Windows Forms controls, file
or network streams, network sockets, GDI+ pens, brushes, and bitmaps. Conversely,
if a type is disposable, it will often (but not always) reference an unmanaged
handle, directly or indirectly. This is because unmanaged handles provide the gateway
to the “outside world” of operating system resources, network connections,
database locks—the primary means by which objects can create trouble outside of
themselves if improperly abandoned.

There are, however, three scenarios for not disposing:

• When obtaining a shared object via a static field or property
• When an object’s Dispose method does something that you don’t want
• When an object’s Dispose method is unnecessary by design, and disposing that
object would add complexity to your program

The first category is rare. The main cases are in the System.Drawing namespace: the
GDI+ objects obtained through static fields or properties (such as Brushes.Blue) must
never be disposed because the same instance is used throughout the life of the application.
Instances that you obtain through constructors, however (such as new
SolidBrush), should be disposed, as should instances obtained through static methods
(such as Font.FromHdc).
The second category is more common.


One example is a context object to which lazily evaluated queries might still be
connected: disposing the context would prevent those queries from running later.
Another is MemoryStream, whose Dispose method disables only the object; it doesn’t
perform any critical cleanup because a MemoryStream holds no unmanaged handles or
other such resources.

The third category includes the following classes: WebClient, StringReader, StringWriter,
and BackgroundWorker (in System.ComponentModel). These types are disposable
under the duress of their base class rather than through a genuine need to perform
essential cleanup. If you happen to instantiate and work with such an object entirely
in one method, wrapping it in a using block adds little inconvenience. But if the
object is longer-lasting, keeping track of when it’s no longer used so that you can
dispose of it adds unnecessary complexity. In such cases, you can simply ignore
object disposal.

Opt-in Disposal

Because IDisposable makes a type tractable with C#’s using construct, there’s a
temptation to extend the reach of IDisposable to nonessential activities. For
instance:


public sealed class HouseManager : IDisposable
{
    public void Dispose()
    {
        CheckTheMail();
    }
    ...
}


The idea is that a consumer of this class can choose to circumvent the nonessential
cleanup—simply by not calling Dispose. This, however, relies on the consumer
knowing what’s inside HouseManager’s Dispose method. It also breaks if essential cleanup
activity is later added:

public void Dispose()
{
    CheckTheMail();   // Nonessential
    LockTheHouse();   // Essential
}

The solution to this problem is the opt-in disposal pattern:
public sealed class HouseManager : IDisposable
{
    public readonly bool CheckMailOnDispose;
    public HouseManager (bool checkMailOnDispose)
    {
        CheckMailOnDispose = checkMailOnDispose;
    }
    public void Dispose()
    {
        if (CheckMailOnDispose) CheckTheMail();
        LockTheHouse();
    }
    ...
}
The consumer can then always call Dispose—providing simplicity and avoiding the
need for special documentation or reflection. An example of where this pattern is
implemented is in the DeflateStream class, in System.IO.Compression. Here’s its
constructor:

public DeflateStream (Stream stream, CompressionMode mode, bool leaveOpen)
The nonessential activity is closing the inner stream (the first parameter) upon disposal.
There are times when you want to leave the inner stream open and yet still
dispose the DeflateStream to perform its essential tear-down activity (flushing buffered
data).
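
As a rough sketch of this in practice, the following compresses data into a MemoryStream while asking the DeflateStream to leave that inner stream open. The class and method names (LeaveOpenDemo, Compress) are hypothetical; only the DeflateStream constructor shown above is taken from the text.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class LeaveOpenDemo
{
    // Compresses data into a MemoryStream and returns the compressed length.
    public static long Compress (byte[] data)
    {
        using (var inner = new MemoryStream())
        {
            // Passing true for leaveOpen asks DeflateStream not to close 'inner'.
            using (var deflate = new DeflateStream (inner, CompressionMode.Compress, true))
                deflate.Write (data, 0, data.Length);

            // The DeflateStream has been disposed (flushing its buffered data),
            // yet the inner stream is still open and can be queried.
            return inner.Length;
        }
    }
}
```

Had leaveOpen been false, the inner MemoryStream would have been closed along with the DeflateStream, and reading its Length would throw an ObjectDisposedException.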

This pattern might look simple, yet it escaped StreamReader and StreamWriter (in the
System.IO namespace). The result is messy: StreamWriter must expose another
method (Flush) to perform essential cleanup for consumers not calling Dispose. The
CryptoStream class in System.Security.Cryptography suffers a similar problem and
requires that you call FlushFinalBlock to tear it down while keeping the inner stream
open.

Clearing Fields in Disposal

In general, you don’t need to clear an object’s fields in its Dispose method. However,
it is good practice to unsubscribe from events that the object has subscribed to internally
over its lifetime. Unsubscribing from such events avoids receiving unwanted event
notifications—and avoids unintentionally keeping the object alive in the eyes of the
garbage collector (GC).
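
A minimal sketch of unsubscribing in Dispose follows. Clock and TickCounter are hypothetical types invented for this illustration; the point is only that the publisher's event holds a reference to the subscriber until the handler is removed.

```csharp
using System;

// Hypothetical long-lived publisher.
public class Clock
{
    public event EventHandler Tick;

    public void RaiseTick()
    {
        EventHandler handler = Tick;
        if (handler != null) handler (this, EventArgs.Empty);
    }
}

// Hypothetical subscriber that unsubscribes when disposed.
public sealed class TickCounter : IDisposable
{
    readonly Clock _clock;
    public int Count { get; private set; }

    public TickCounter (Clock clock)
    {
        _clock = clock;
        _clock.Tick += OnTick;   // The clock now references this object.
    }

    void OnTick (object sender, EventArgs e) { Count++; }

    public void Dispose()
    {
        _clock.Tick -= OnTick;   // Removes that reference and stops notifications.
    }
}
```

Removing a handler that is no longer subscribed has no effect, so this Dispose method also tolerates repeated calls.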

It’s also worth setting a field to indicate that the object is disposed so that you can
throw an ObjectDisposedException if a consumer later tries to call members on the
object. A good pattern is to use a publicly readable automatic property for this:
public bool IsDisposed { get; private set; }

Although technically unnecessary, it can also be good to clear an object’s own event
handlers (by setting them to null) in the Dispose method. This eliminates the possibility
of those events firing during or after disposal.

Occasionally, an object holds high-value secrets, such as encryption keys. In these
cases, it can make sense to clear such data from fields during disposal (to avoid
discovery by less privileged assemblies or malware). The SymmetricAlgorithm class
in System.Security.Cryptography does exactly this, by calling Array.Clear on the
byte array holding the encryption key.
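
The same idea can be sketched for a custom type. KeyHolder and its IsCleared property are hypothetical and exist only for illustration; this is not how SymmetricAlgorithm itself is written.

```csharp
using System;

// Hypothetical holder of a secret key.
public sealed class KeyHolder : IDisposable
{
    readonly byte[] _key;

    public KeyHolder (byte[] key)
    {
        _key = (byte[]) key.Clone();   // Keep a private copy of the secret.
    }

    // True once every byte of the key has been zeroed.
    public bool IsCleared
    {
        get { return Array.TrueForAll (_key, b => b == 0); }
    }

    public void Dispose()
    {
        Array.Clear (_key, 0, _key.Length);   // Overwrite the secret with zeros.
    }
}
```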

Automatic Garbage Collection

Regardless of whether an object requires a Dispose method for custom tear-down
logic, at some point the memory it occupies on the heap must be freed. The CLR
handles this side of it entirely automatically, via the GC. You never
deallocate managed memory yourself. For example, consider the following method:
public void Test()
{
    byte[] myArray = new byte[1000];
    ...
}


Garbage Collection and Memory Consumption

The GC tries to strike a balance between the time it spends doing garbage collection
and the application’s memory consumption (working set). Consequently,
applications can consume more memory than they need, particularly if large temporary
arrays are constructed.

The problem can look worse than it is, though, if you judge memory consumption
by the “Memory Usage” figure reported by the Task Manager in Windows XP.
Unlike with later versions of Windows (which report private working set), the XP
figure includes memory that a process has internally deallocated and is willing to
rescind immediately to the operating system should another process need it. (It
doesn’t return the memory to the operating system immediately to avoid the overhead
of asking for it back, should it be required a short while later. It reasons: “If
the computer has plenty of free memory, why not use it to lessen allocation/deallocation
overhead?”)

You can determine your process’s real memory consumption by querying a performance
counter (System.Diagnostics):
string procName = Process.GetCurrentProcess().ProcessName;
using (PerformanceCounter pc = new PerformanceCounter
       ("Process", "Private Bytes", procName))
    Console.WriteLine (pc.NextValue());
Reading performance counters requires administrative privileges.

Returning to the Test method shown earlier: when it executes, an array to hold 1,000 bytes is allocated on the memory heap.
The array is referenced by the variable myArray, stored on the local variable stack.
When the method exits, this local variable myArray pops out of scope, meaning that
nothing is left to reference the array on the memory heap. The orphaned array then
becomes eligible to be reclaimed in garbage collection.

Garbage collection does not happen immediately after an object is orphaned. Rather
like garbage collection on the street, it happens periodically, although (unlike garbage
collection on the street) not to a fixed schedule. The CLR bases its decision on
when to collect upon a number of factors, such as the available memory, the amount
of memory allocation, and the time since the last collection. This means that there’s an indeterminate delay between an object being orphaned and being released from
memory. Theoretically, it can range from nanoseconds to days.

Roots

A root is something that keeps an object alive. If an object is not directly or indirectly
referenced by a root, it will be eligible for garbage collection.

A root is one of the following:
• A local variable or parameter in an executing method (or in any method in its
call stack)
• A static variable
• An object on the queue that stores objects ready for finalization (see the next
section)
It’s impossible for code to execute in a deleted object, so if there’s any possibility of
an (instance) method executing, its object must somehow be referenced in one of
these ways.
Note that a group of objects that reference each other cyclically is considered dead
without a root referee. To put it another way, objects that cannot
be accessed by following the arrows (references) from a root object are
unreachable—and therefore subject to collection.
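
The point about cyclic references can be observed with a weak reference, which tracks an object without acting as a root. The following is a rough sketch: Node and CycleDemo are hypothetical names, WeakReference is introduced here purely as an observation tool, and the collection is forced only for demonstration. After the forced collection the cycle is typically reported as dead, although the exact timing of collection is not guaranteed.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical type whose instances can refer to each other.
public class Node
{
    public Node Other;
}

public static class CycleDemo
{
    // Creates two objects that reference each other and returns only a weak
    // reference, so no root refers to either object once this method exits.
    [MethodImpl (MethodImplOptions.NoInlining)]
    public static WeakReference CreateCycle()
    {
        var a = new Node();
        var b = new Node();
        a.Other = b;
        b.Other = a;
        return new WeakReference (a);
    }

    // Forces a collection and reports whether the cycle survived it.
    public static bool IsCycleAliveAfterCollection()
    {
        WeakReference weak = CreateCycle();
        GC.Collect();
        GC.WaitForPendingFinalizers();
        return weak.IsAlive;
    }
}
```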

Finalizers

Prior to an object being released from memory, its finalizer runs, if it has one. A
finalizer is declared in the same way as a constructor, but it is prefixed by the ~
symbol:

class Test
{
    ~Test()
    {
        // Finalizer logic...
    }
}
Finalizers are possible because garbage collection works in distinct phases. First, the
GC identifies the unused objects ripe for deletion. Those without finalizers are deleted right away. Those with pending (unrun) finalizers are kept alive (for now)
and are put onto a special queue.
At that point, garbage collection is complete, and your program continues executing.
The finalizer thread then kicks in and starts running in parallel to your program,
picking objects off that special queue and running their finalization methods. Prior
to each object’s finalizer running, it’s still very much alive—that queue acts as a root
object. Once it’s been dequeued and the finalizer executed, the object becomes orphaned
and will get deleted in the next collection (for that object’s generation).
Finalizers can be useful, but they come with some provisos:
• Finalizers slow the allocation and collection of memory (the GC needs to keep
track of which finalizers have run).
• Finalizers prolong the life of the object and any referred objects (they must all
await the next garbage truck for actual deletion).
• It’s impossible to predict in what order the finalizers for a set of objects will be
called.
• You have limited control over when the finalizer for an object will be called.
• If code in a finalizer blocks, other objects cannot get finalized.
• Finalizers may be circumvented altogether if an application fails to unload
cleanly.

In summary, finalizers are somewhat like lawyers—although there are cases in which
you really need them, in general you don’t want to use them unless absolutely necessary.
If you do use them, you need to be 100% sure you understand what they are
doing for you.

Calling Dispose from a Finalizer


A finalizer can act as a backup for cases when a consumer forgets to
call Dispose on a disposable object; it’s usually better to have an object disposed late
than never! There’s a standard pattern for implementing this, as follows:
class Test : IDisposable
{
    public void Dispose() // NOT virtual
    {
        Dispose (true);
        GC.SuppressFinalize (this); // Prevent finalizer from running.
    }
    protected virtual void Dispose (bool disposing)
    {
        if (disposing)
        {
            // Call Dispose() on other objects owned by this instance.
            // You can reference other finalizable objects here.
            // ...
        }
        // Release unmanaged resources owned by (just) this object.
        // ...
    }
    ~Test()
    {
        Dispose (false);
    }
}

Dispose is overloaded to accept a bool disposing flag. The parameterless version is
not declared as virtual and simply calls the enhanced version with true.

The enhanced version contains the actual disposal logic and is protected and
virtual; this provides a safe point for subclasses to add their own disposal logic.
The disposing flag means it’s being called “properly” from the Dispose method
rather than in “last-resort mode” from the finalizer. The idea is that when called
with disposing set to false, this method should not, in general, reference other objects with finalizers (because such objects may themselves have been finalized and
so be in an unpredictable state). This rules out quite a lot! Here are a couple of tasks
it can still perform in last-resort mode, when disposing is false:
• Releasing any direct references to operating system resources (obtained, perhaps,
via a P/Invoke call to the Win32 API)
• Deleting a temporary file created on construction
To make this robust, any code capable of throwing an exception should be wrapped
in a try/catch block, and the exception, ideally, logged. Any logging should be as
simple and robust as possible.
Notice that we call GC.SuppressFinalize in the parameterless Dispose method—this
prevents the finalizer from running when the GC later catches up with it. Technically,
this is unnecessary, as Dispose methods must tolerate repeated calls. However,
doing so improves performance because it allows the object (and its referenced objects)
to be garbage-collected in a single cycle.
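
To show how a subclass might hook into this pattern, here is a minimal sketch. BaseResource and DerivedResource are hypothetical names, and the boolean properties are stand-ins for real cleanup work; the assumption is simply that the subclass overrides the protected method and chains to the base implementation.

```csharp
using System;

// Hypothetical base class following the pattern described above.
public class BaseResource : IDisposable
{
    public bool BaseCleanedUp { get; private set; }

    public void Dispose()
    {
        Dispose (true);
        GC.SuppressFinalize (this);
    }

    protected virtual void Dispose (bool disposing)
    {
        BaseCleanedUp = true;   // Stand-in for releasing the base class's resources.
    }

    ~BaseResource() { Dispose (false); }
}

// Hypothetical subclass adding its own disposal logic.
public class DerivedResource : BaseResource
{
    bool _disposed;
    public bool DerivedCleanedUp { get; private set; }

    protected override void Dispose (bool disposing)
    {
        if (!_disposed)
        {
            _disposed = true;
            if (disposing)
            {
                // Safe to call Dispose on other managed objects here.
            }
            DerivedCleanedUp = true;   // Stand-in for the subclass's own cleanup.
        }
        base.Dispose (disposing);      // Always chain to the base implementation.
    }
}
```

The subclass does not redeclare the parameterless Dispose method or the finalizer; both are inherited and route through the overridden method.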

Resurrection

Suppose a finalizer modifies a living object such that it refers back to the dying object.
When the next garbage collection happens (for the object’s generation), the CLR
will see the previously dying object as no longer orphaned—and so it will evade
garbage collection. This is an advanced scenario, and is called resurrection.
To illustrate, suppose we want to write a class that manages a temporary file. When
an instance of that class is garbage-collected, we’d like the finalizer to delete the
temporary file. It sounds easy:

public class TempFileRef
{
    public readonly string FilePath;
    public TempFileRef (string filePath) { FilePath = filePath; }
    ~TempFileRef() { File.Delete (FilePath); }
}

Unfortunately, this has a bug: File.Delete might throw an exception (due to a lack
of permissions, perhaps, or the file being in use). Such an exception would take down
the whole application (as well as preventing other finalizers from running). We could
simply “swallow” the exception with an empty catch block, but then we’d never
know that anything went wrong. Calling some elaborate error reporting API would
also be undesirable because it would burden the finalizer thread, hindering garbage collection for other objects. We want to restrict finalization actions to those that are
simple, reliable, and quick.
A better option is to record the failure to a static collection as follows:

public class TempFileRef
{
    static ConcurrentQueue<TempFileRef> _failedDeletions
        = new ConcurrentQueue<TempFileRef>();
    public readonly string FilePath;
    public Exception DeletionError { get; private set; }
    public TempFileRef (string filePath) { FilePath = filePath; }
    ~TempFileRef()
    {
        try { File.Delete (FilePath); }
        catch (Exception ex)
        {
            DeletionError = ex;
            _failedDeletions.Enqueue (this); // Resurrection
        }
    }
}

Enqueuing the object to the static _failedDeletions collection gives the object another
referee, ensuring that it remains alive until the object is eventually dequeued.

GC.ReRegisterForFinalize

A resurrected object’s finalizer will not run a second time—unless you call
GC.ReRegisterForFinalize.


In the following example, we try to delete a temporary file in a finalizer (as in the
last example). But if the deletion fails, we reregister the object so as to try again in
the next garbage collection:

public class TempFileRef
{
    public readonly string FilePath;
    int _deleteAttempt;
    public TempFileRef (string filePath) { FilePath = filePath; }
    ~TempFileRef()
    {
        try { File.Delete (FilePath); }
        catch
        {
            if (_deleteAttempt++ < 3) GC.ReRegisterForFinalize (this);
        }
    }
}

After the third retry fails (the fourth failed attempt overall), our finalizer will silently give up trying to delete the
file. We could enhance this by combining it with the previous example; in other
words, adding the object to the _failedDeletions queue once the retries are exhausted.
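
One way the combined version might look is sketched below. It retries the deletion in later collections and, once the retries are exhausted, resurrects the object by enqueuing it. The static TryTakeFailure method is a hypothetical addition, included so that the application has some way to inspect the failures.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;

public class TempFileRef
{
    static readonly ConcurrentQueue<TempFileRef> _failedDeletions
        = new ConcurrentQueue<TempFileRef>();

    public readonly string FilePath;
    public Exception DeletionError { get; private set; }
    int _deleteAttempt;

    public TempFileRef (string filePath) { FilePath = filePath; }

    // Hypothetical helper: hands back one object whose file could not be deleted.
    public static bool TryTakeFailure (out TempFileRef failed)
    {
        return _failedDeletions.TryDequeue (out failed);
    }

    ~TempFileRef()
    {
        try { File.Delete (FilePath); }
        catch (Exception ex)
        {
            if (_deleteAttempt++ < 3)
            {
                // Run this finalizer again in a later collection.
                GC.ReRegisterForFinalize (this);
            }
            else
            {
                // Retries exhausted: record the failure (resurrection).
                DeletionError = ex;
                _failedDeletions.Enqueue (this);
            }
        }
    }
}
```

Application code could call TryTakeFailure periodically, for instance from a maintenance routine, and log FilePath and DeletionError for each object it returns.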

How the Garbage Collector Works

The CLR uses a generational mark-and-compact GC that performs automatic memory
management for objects stored on the managed heap. The GC is considered to
be a tracing garbage collector in that it doesn’t interfere with every access to an object,
but rather wakes up intermittently and traces the graph of objects stored on the
managed heap to determine which objects can be considered garbage and therefore
collected.

The GC initiates a garbage collection upon performing a memory allocation (via the
new keyword) either after a certain threshold of memory has been allocated, or at
other times to reduce the application’s memory footprint. This process can also be
initiated manually by calling System.GC.Collect. During a garbage collection, all
threads may be frozen (more on this in the next section).
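
As a small illustration of initiating a collection manually, the sketch below forces one and uses GC.CollectionCount to confirm that it took place. CollectDemo and ForceCollection are hypothetical names; forcing collections like this is shown for demonstration and is rarely needed in ordinary code.

```csharp
using System;

public static class CollectDemo
{
    // Forces a full collection and returns how many generation-0
    // collections were recorded while doing so.
    public static int ForceCollection()
    {
        int before = GC.CollectionCount (0);
        GC.Collect();                    // Collects all generations.
        GC.WaitForPendingFinalizers();   // Waits for the finalizer thread to drain its queue.
        return GC.CollectionCount (0) - before;
    }
}
```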


The GC begins with its root object references, and walks the object graph, marking
all the objects it touches as reachable. Once this process is complete, all objects that
have not been marked are considered unused, and are subject to garbage collection.
Unused objects without finalizers are immediately discarded; unused objects with
finalizers are enqueued for processing on the finalizer thread after the GC is complete.
These objects then become eligible for collection in the next GC for the object’s
generation (unless resurrected).

The remaining “live” objects are then shifted to the start of the heap (compacted),
freeing space for more objects. This compaction serves two purposes: it avoids memory fragmentation, and it allows the GC to employ a very simple strategy when
allocating new objects, which is to always allocate memory at the end of the heap.
This avoids the potentially time-consuming task of maintaining a list of free memory
segments.

If there is insufficient space to allocate memory for a new object after garbage
collection, and the operating system is unable to grant further memory, an
OutOfMemoryException is thrown.