Written on February 8th, 2007.

Manual memory management using malloc() and free() works nicely, but can become tiresome. If you forget a call to free(), you leak memory; if you free memory too much you crash.

One technique which can make memory management a lot easier is reference counting, which this article is about. I’ll explain what it is, how it works, and why it is useful.

Warning: for clarity, this code in this tutorial does not contain a lot of error handling. In real-world applications, you definitely need to perform some error checking.

A Simple Dynamic Array

For this example, I’ve written a very simple dynamic array. (A dynamic array automatically expands when adding new elements, unlike a normal C array.) Here’s the interface:

/* array type */
typedef struct array_s array_t;

/* creates an array */
array_t *array_create(void);

/* deallocates an array */
void array_destroy(array_t *a_array);

/* adds a pointer to the array */
void array_add(array_t *a_array, void *a_data);

/* gets the pointer at the given index */
void *array_get(array_t *a_array, unsigned a_index);

/* removes the pointer at the given index */
void array_remove(array_t *a_array, unsigned a_index);

For flexibility reasons, this array only holds pointers in its internal storage. It doesn’t do anything with them, though — it doesn’t allocate memory for the elements we add to it, and obviously it doesn’t free any memory either.

Because of this, adding an element will require us to allocate memory for it manually, before adding it. Similarly, we’ll need to free its memory manually after we’re done with it. (You could also simply store pointers without allocating them first in this dynamic array, but then there’s the risk of dangling pointers in case you free the memory for the array element elsewhere…)

Some Memory Management Ideas

There’s a few things you can do in order to make memory management easier. Here’s a few ideas that don’t work:

First idea: let array_remove() free the pointer. Using this technique, you’d only have to allocate memory; freeing memory would happen automatically.

However, a pointer contained in more than one array cannot easily be removed; removing it from one array will render the pointer invalid, and you’d end up with a dangling pointer. Even worse, trying to remove the pointer from a second array will probably cause a crash, because non-allocated memory is being freed.

Second idea: let array_add() make a copy of the element that is being added, and add that copy instead. array_add() would need to have a third argument representing the element’s size; the function needs to know how many bytes to allocate for it after all. array_remove() would then free the copy of the element.

This approach also has some drawbacks. Making a copy of an element takes time, and uses a lot of memory. Updating a single element is not easy; you’d have to search all copies of this element and update those. Lastly, comparing elements isn’t as simple as comparing pointers anymore — you’d need a comparator function to compare two copies.

The third idea is called reference counting, and it works quite well.

Reference Counting, At Last

The basic idea behind reference counting is this: every object is given a reference count, which indicates the number of “things” using this object. For example, a “user” object which is in two arrays while being processed in some function would have reference count 3 (3 things: 2 arrays, 1 function).

When a “thing” starts using a reference-counted object, it increments the object’s reference count; this is called retaining. When this “thing” is done using this reference-counted object, it decrements the object’s reference count; this is called releasing.

As long as this object’s reference count is higher than zero, some “things” are using it. When it reaches zero, nothing is using the object anymore, and it can safely be deallocated.

When a reference-counted object is created, its reference count is set to 1; the idea is that the function creating the object is going to use this object.

Objects

The easiest way to add reference counting to existing code is to add a reference_count variable to your structs. For example, this is a user struct which has a new reference count field:

struct user_s {
    char     *first_name;
    char     *last_name;

    /* ... */

    unsigned  reference_count;
}

However, this is not a flexible solution. You’d need to update all object types to allow reference counting. You’d probably need to write retain and release functions for every object type in your application.

A nicer solution is introducing a new object structure, which will wrap around existing objects. This new objwrapper_t type will require only one retain and release function, and it’s much nicer to work with in arrays as well. Here’s the interface:

/* object wrapper type */
typedef struct objwrapper_s objwrapper_t;

/* object wrapper destructor type */
typedef void (*objwrapper_destructor_t)(void *a_data);

/* creates an object wrapper */
objwrapper_t *objwrapper_create(
    void *a_data,
    objwrapper_destructor_t a_destructor
);

/* returns the object in the object wrapper */
void *objwrapper_get_object(objwrapper_t *a_objwrapper);

/* retains the object inside a wrapper */
void objwrapper_retain(objwrapper_t *a_objwrapper);

/* releases the object inside a wrapper */
void objwrapper_release(objwrapper_t *a_objwrapper);

The struct objwrapper_s contains a pointer to the actual object, as well as the destructor and, obviously, the reference count:

struct objwrapper_s {
    void                    *data;
    unsigned                reference_count;
    objwrapper_destructor_t destructor;
};

Without the object wrapper, you’d create an object and pass the pointer to this object around to other functions, add it to arrays, etc. With the object wrapper, you’d create an object and an object wrapper, and pass the object wrapper around instead.

Here’s what these functions do, under the hood:

With this object wrapper implementation in place, let’s take a look at adding reference counting to the array implementation from the beginning of the article.

A Dynamic Reference-Counting Array

The array implementation won’t change too much. Most importantly, arrays will keep a collection of object wrappers, not pointers, in memory; the interface will need to be changed to reflect this. Here’s the new one:

/* array type */
typedef struct array_s array_t;

/* creates an array */
array_t *array_create(void);

/* deallocates an array */
void array_destroy(array_t *a_array);

/* adds an object wrapper to the array */
void array_add(array_t *a_array, objwrapper_t *a_objwrapper);

/* gets the object wrapper at the given index */
objwrapper_t *array_get(array_t *a_array, unsigned a_index);

/* removes the object wrapper at the given index */
void array_remove(array_t *a_array, unsigned a_index);

Here’s what these functions do to the object wrappers, under the hood:

Because of reference counting, array_destroy() and array_remove() don’t deallocate anything. They just release the object using the objwrapper_release function — which will deallocate the object and its wrapper if its retain count reaches zero.

The Implementation

As you may have noticed, I haven’t actually implemented anything of the above; there’s only some typedefs, structs and function declarations. This is mostly because I think that the interface is much more important than the implementation — the latter is simply filling in the blanks, which is (more or less) easy.

Still, I’ve written the implementation and put it into a sample project, which you can download here.

A Look Ahead

This post only scratches the surface of reference counting. I’ve deliberately kept this article entry-level; I’m probably not qualified to dig much deeper into algorithms and optimizations and related techniques.

There is one more advanced topic worth mentioning, though, and that’s the problem of retain cycles, which occur when two objects retain each other. Imagine this situation: a “user” object has a retained reference to an “e-mail address” object, and an “e-mail address” object has a retained reference to a “user” object. Both objects retain each other — the result is a memory leak, because neither object will ever be de-allocated.

Reference counting can make memory management much easier, but only when done right. If even reference counting is too tedious for you, then maybe you want garbage collection, which almost completely eliminates manual memory management — but garbage collection is way outside the scope of this article.

Learn More

Wikipedia has a fairly dry and theoretical description of reference counting, which you may want to read.

Cocoa, Mac OS X’s set of awesome frameworks, uses reference counting quite a lot. The Memory Management Guide is mostly oriented at Objective-C, but is a very useful read anyway if you want to know more about reference counting in general.

The Final Word

I’ve enjoyed writing this article; I do hope you’ll enjoy reading it. If you have questions or comments, please do .