No Raw Pointers

The Problem
Consider native pointers of the form “T* p”. The main difficulty with such pointers is that memory is

  • sometimes deleted more than once leading to memory corruption or
  • sometimes not deleted at all leading to memory leaks.

Furthermore, when a function is passed a pointer, it is not obvious whether the function must delete that pointer or whether it is deleted elsewhere.

The C convention is that the function that created the pointer must delete it. This is a good discipline to follow where feasible. However, in event-driven applications such as a Graphical User Interface (GUI) or a web server, one function allocates resources for other functions to use. Yet even here there is some kind of order, and every resource is ‘owned’ by one object only. For example, in a GUI a window may allocate memory to store fonts. Sub-windows may access this data but rarely modify it and never delete it. The parent window must delete those resources. Sometimes, however, we may not wish to recreate the data store and instead hand over the resource from one window to another. Hence the developer has to keep a ledger of some sort to track which resource has to be deallocated and when. This is the kind of problem that garbage collection was meant to solve. C and C++ do not have that option, not yet at least. Another solution to the problem is to use reference-counted pointers, also called smart pointers, or shared pointers in C++ parlance.
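
The hand-over between windows described above can be written down in the type system instead of a ledger. What follows is only a minimal sketch, assuming C++14 and two hypothetical types, `font_store` and `window`, that are not part of the original discussion. Note that Parent would still count `unique_ptr` as a raw pointer; the point here is only that single ownership is made explicit.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical resource: a store of font names (illustration only).
struct font_store {
    std::vector<std::string> fonts;
};

// Hypothetical window that may or may not own a font store.
class window {
public:
    window() = default;
    explicit window(std::unique_ptr<font_store> s) : store_(std::move(s)) {}

    // Sub-windows get read-only access; they cannot delete the store.
    const font_store* fonts() const { return store_.get(); }

    // Hand the store over to another window. Afterwards this window
    // owns nothing, so the store cannot be deleted twice.
    void hand_over_to(window& other) { other.store_ = std::move(store_); }

private:
    std::unique_ptr<font_store> store_;  // deleted exactly once, automatically
};

// Usage: returns true when ownership moved from `a` to `b`.
inline bool hand_over_example()
{
    auto store = std::make_unique<font_store>();
    store->fonts = {"serif", "mono"};
    window a(std::move(store));
    window b;
    a.hand_over_to(b);
    assert(a.fonts() == nullptr);  // `a` gave the store away
    assert(b.fonts() != nullptr);  // `b` is now the single owner
    return b.fonts()->fonts.size() == 2;
}
```

The store is released when the owning window is destroyed, whichever window that happens to be at the time.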

Raw Pointers
A raw pointer is “a pointer to an object with implied ownership and reference semantics” [1], such as

  • T* p
  • unique_ptr
  • shared_ptr

As Parent suggests, raw pointers that share memory between distinct pieces of code are no better than global variables. Here is why. It is difficult to reason about memory that is shared through pointers. Normally we expect a function to be time invariant. ‘sin(x)’ should return the same value for a given value of ‘x’ no matter when it is called, and so it does. But if a piece of memory is shared, then two runs of the same function with the same input can return different results, because the underlying memory could have been modified by another module. Let us call this problem the curse of sharing. This problem does not arise with raw pointers to const objects. Functional programmers have been saying that for ages. In fact, Hoare [2] preferred message passing to shared memory a long while back. “Accelerated C++” contains a lot of good code before the idea of pointers is broached. In fact, the Water Buckets problem, the BrainVita problem and the Sudoku solver, all written in C++, do not use raw pointers other than those used internally by STL containers.
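
The curse of sharing can be shown in a few lines. This is an illustrative sketch only; the `settings` type and the function names are hypothetical and chosen for the example.

```cpp
#include <cassert>
#include <memory>

// Hypothetical settings object shared between modules (illustration only).
struct settings {
    double scale = 1.0;
};

// Looks like a pure function of `x`, but it also depends on shared state.
inline double scaled(const std::shared_ptr<settings>& s, double x)
{
    return s->scale * x;
}

// Some other module that holds the same pointer and writes through it.
inline void other_module(const std::shared_ptr<settings>& s)
{
    s->scale = 2.0;
}

// Same function, same argument, two different answers.
inline bool sharing_changes_result()
{
    auto shared = std::make_shared<settings>();
    const double first = scaled(shared, 10.0);
    other_module(shared);
    const double second = scaled(shared, 10.0);
    assert(first == 10.0);
    assert(second == 20.0);
    return first != second;
}

// With a pointer to const, nobody can write through this pointer type,
// so a function that receives it cannot be surprised by another holder.
inline double scaled_const(const std::shared_ptr<const settings>& s, double x)
{
    return s->scale * x;
}
```

If every holder only ever sees `shared_ptr<const settings>`, the object behaves like a value and `scaled_const` is time invariant again.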

Current Uses
So why are pointers still used? Optimization is one reason. Passing large containers by value is sub-optimal. The other very common reason is polymorphism. Let’s take a simple example where a document is a collection of shapes. Drawing the document consists of drawing the shapes. Let us assume a shape can be just a triangle, rectangle or circle. In other words, triangle, rectangle and circle would inherit from shape. The drawing code would look like this

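#include <cstddef>
#include <iostream>
#include <memory>
#include <string>
#include <vector>
using namespace std; // for brevity in these snippets

// Abstract base class: every drawable type must inherit from it.
class shape {
public:
	virtual ~shape() { }
	virtual void draw(ostream&, size_t) const = 0;
};
using document_t = vector<shared_ptr<shape>>;
// Draws each shape, indented two columns inside the document tags.
void draw(const document_t& x, ostream& out, size_t position)
{
	out << string(position, ' ') << "<document>" << endl;
	for (const auto& e : x) e->draw(out, position + 2);
	out << string(position, ' ') << "</document>" << endl;
}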

Here a document_t would be populated with raw pointers (shared_ptr, in the sense defined above) to concrete objects such as circle, rectangle or triangle. The issue here is the curse of inheritance. Just to indicate that a type implements a function with a specific signature, each concrete type is required to inherit from a specific class and override its virtual functions. Every virtual call goes through the object’s virtual table pointer, a level of indirection that carries a performance cost.
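
To make the inheritance-based design concrete, here is a sketch of how such a document might be filled and drawn. The `circle` and `rectangle` classes and their output text are assumptions made for illustration; only `shape`, `document_t` and the document-level `draw` come from the snippet above.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

class shape {
public:
    virtual ~shape() { }
    virtual void draw(std::ostream&, std::size_t) const = 0;
};

// Hypothetical concrete shapes: each must inherit from shape
// just to say "I can be drawn".
class circle : public shape {
public:
    void draw(std::ostream& out, std::size_t position) const override
    {
        out << std::string(position, ' ') << "circle\n";
    }
};

class rectangle : public shape {
public:
    void draw(std::ostream& out, std::size_t position) const override
    {
        out << std::string(position, ' ') << "rectangle\n";
    }
};

using document_t = std::vector<std::shared_ptr<shape>>;

inline void draw(const document_t& x, std::ostream& out, std::size_t position)
{
    out << std::string(position, ' ') << "<document>\n";
    for (const auto& e : x) e->draw(out, position + 2);  // virtual dispatch
    out << std::string(position, ' ') << "</document>\n";
}

// Usage: build a document and render it to a string.
inline std::string render_example()
{
    document_t doc;
    doc.push_back(std::make_shared<circle>());
    doc.push_back(std::make_shared<rectangle>());
    assert(doc.size() == 2);
    std::ostringstream out;
    draw(doc, out, 0);
    return out.str();
}
```

Every new drawable type has to be edited to derive from `shape`, which is exactly the intrusion the text objects to.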

More importantly, inheritance introduces software engineering nightmares. Usually drawing depends on an external library such as Windows GDI, DirectX or OpenGL. Hence, instead of ostream, we would require a different graphics object for each library. It does not seem elegant to compile all the implementations when only one of them is going to be used. Hence we resort to #define’s to exclude what we do not want. Not pretty. Next consider archiving the object. Archiving is done in different formats: JSON, XML, binary and so on. We would have to create load and save methods for each format. This type of intrusive modification of classes is not good, and the STL has helped us move away from it.

Making draw or save an external function means that the shape abstraction is not corrupted. Thus, instead of overriding virtual functions, we use function overloading. We can achieve that using templates. Since document_t has to contain objects of different types, we have to store them using pointers to a common type. But now, instead of inheritance, we use containment. “There’s nothing in computing that can’t be broken by another level of indirection.” (Rob Pike). Parent suggests the following alternative [1].

class object_t {
public:
	template <typename T>
	object_t(T x) : self_(make_shared<model<T>>(move(x))) { }
	friend void draw(const object_t& x, ostream& out, size_t position)
	{
		x.self_->draw_(out, position);
	}
private:
	struct concept_t {
		virtual ~concept_t() = default;
		virtual void draw_(ostream&, size_t) const = 0;
	};
	template <typename T>
	struct model : concept_t {
		model(T x) : data_(move(x)) { }
		void draw_(ostream& out, size_t position) const
		{
			draw(data_, out, position);
		}
		T data_;
	};
	shared_ptr<const concept_t> self_;
};
using document_t = vector<object_t>;
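
A usage sketch for the class above may help. It assumes, as in Parent's talk, that the document holds object_t by value, and it introduces a hypothetical `circle` struct and a free `draw` overload for it; neither is part of the original listing.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical value type: no base class, no virtual functions.
struct circle {
    double radius;
};

// Drawing is a free function selected by overload resolution.
inline void draw(const circle& c, std::ostream& out, std::size_t position)
{
    out << std::string(position, ' ') << "circle r=" << c.radius << '\n';
}

class object_t {
public:
    template <typename T>
    object_t(T x) : self_(std::make_shared<model<T>>(std::move(x))) { }

    friend void draw(const object_t& x, std::ostream& out, std::size_t position)
    {
        x.self_->draw_(out, position);
    }
private:
    struct concept_t {
        virtual ~concept_t() = default;
        virtual void draw_(std::ostream&, std::size_t) const = 0;
    };
    template <typename T>
    struct model : concept_t {
        model(T x) : data_(std::move(x)) { }
        void draw_(std::ostream& out, std::size_t position) const override
        {
            draw(data_, out, position);  // picks the overload for T
        }
        T data_;  // private copy: nobody else refers to it
    };
    std::shared_ptr<const concept_t> self_;
};

using document_t = std::vector<object_t>;

inline void draw(const document_t& x, std::ostream& out, std::size_t position)
{
    out << std::string(position, ' ') << "<document>\n";
    for (const auto& e : x) draw(e, out, position + 2);
    out << std::string(position, ' ') << "</document>\n";
}

// Usage: circles are stored by value, wrapped in object_t on insertion.
inline std::string render_example()
{
    document_t doc;
    doc.emplace_back(circle{1.5});
    doc.emplace_back(circle{2.5});
    assert(doc.size() == 2);
    std::ostringstream out;
    draw(doc, out, 0);
    return out.str();
}
```

Adding another drawable type only requires a new free `draw` overload; `object_t` and the existing types stay untouched.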

Thus, even though we have inheritance and virtual functions, all of that code is localised to a single class. We do not have an abstract shape class, just a generic model. Notice that the object is copied and there is no external reference to it, thus avoiding the curse of sharing, and that ‘draw(const circle&, ostream&, size_t)’ is not a member function. Neither, for that matter, is ‘draw(const object_t&...)’. Hence, if we want to draw using DirectX, we would put all the drawing code in a single file. If we wanted to use OpenGL instead, we would have no need to include the files for DirectX. But even if we wanted both, we could rewrite ‘draw’ as follows.

using document_t = vector<shared_ptr<shape>>;
template<class T>
void draw(const document_t& x, T& out, size_t position)
{
	out << string(position, ' ') << "<document>" << endl;
	for (const auto& e : x) e->draw(out, position + 2);
	out << string(position, ' ') << "</document>" << endl;
}

In other words, templatize the draw function so that the drawing tool is passed as a template parameter. While this is rarely required for drawing, it comes in handy when doing IO.
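
As a rough illustration of passing the drawing tool as a template parameter, the sketch below draws the same object into two unrelated sinks. The `circle` struct and the `counting_sink` type are hypothetical; the only requirement assumed of a sink is that it supports `operator<<`.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical shape used only for this illustration.
struct circle {
    double radius;
};

// The sink type is a template parameter: anything that supports
// operator<< for the values written below will do.
template <class Sink>
void draw(const circle& c, Sink& out, std::size_t position)
{
    out << std::string(position, ' ') << "circle r=" << c.radius << '\n';
}

// Hypothetical second sink that only counts what it is given.
struct counting_sink {
    std::size_t writes = 0;
};

template <class T>
counting_sink& operator<<(counting_sink& s, const T&)
{
    ++s.writes;
    return s;
}

// Usage: the same draw function targets a stream and a counter.
inline std::string draw_to_string(const circle& c)
{
    std::ostringstream out;
    draw(c, out, 2);

    counting_sink counter;
    draw(c, counter, 2);
    assert(counter.writes == 4);  // indent, label, radius, newline

    return out.str();
}
```

Neither sink knows about the other, and only the sinks that are actually used need to be compiled in.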

One final note
Containers holding polymorphic objects should, almost always, contain only immutable objects. Let’s consider a counterexample: a progress bar in a container window. The length of the progress bar depends on an external event, such as the fraction of the file that has been copied. Wouldn’t the drawn object, the progress bar in this case, have to be updated? Not really. The progress bar should refer to an external object that represents the progress status. This external object can be updated by some other function. Thus the drawing window cannot modify the progress bar, because the object_t is constant, but the progress bar can display its value based on an external value. In other words, the document must be separate from the view. What about the document itself? Doesn’t that contain mutable objects? Again, no. The recommendation here is to replace a modified object rather than change it in place. This does seem counterintuitive, but it helps, especially if there is a need to undo or backtrack.
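
The "replace, do not modify" recommendation pairs naturally with undo. The following is a simplified sketch in that spirit, not Parent's implementation: the document is reduced to a vector of strings, and `history_t`, `replace_entry` and `undo` are names assumed for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical document of immutable entries (plain strings here).
using document_t = std::vector<std::string>;

// Every committed state is kept; the last element is the current document.
using history_t = std::vector<document_t>;

inline const document_t& current(const history_t& h)
{
    assert(!h.empty());
    return h.back();
}

// Replace entry `i` by building a new document instead of
// modifying the existing one.
inline void replace_entry(history_t& h, std::size_t i, std::string value)
{
    document_t next = current(h);   // copy the current state
    next.at(i) = std::move(value);  // replace the object in the copy
    h.push_back(std::move(next));   // commit the copy as the new state
}

// Undo simply drops the latest state (the initial state is kept).
inline void undo(history_t& h)
{
    if (h.size() > 1) h.pop_back();
}
```

Because earlier states are never touched, backtracking needs no inverse operations; it only needs the previous state to still exist.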

References
[1] Sean Parent, “C++ Seasoning,” Going Native 2013, http://channel9.msdn.com/Events/GoingNative/2013/Cpp-Seasoning.
[2] C.A.R. Hoare, “Communicating Sequential Processes,” Prentice Hall, 1985, http://www.usingcsp.com/cspbook.pdf.

About The Sunday Programmer

Joe is an experienced C++/C# developer on Windows. Currently looking out for an opening in C/C++ on Windows or Linux.