Cppy smart pointer

CPython relies on reference counting to manage object lifetime. A major pitfall when writing C extensions is correctly incrementing (incref) and decrementing (decref) reference counts. Cppy aims to simplify this by providing a smart pointer class. Before diving into the details of how it helps, let's start with a crash course on CPython reference counting.

CPython reference counting crash course

Each object allocated by Python has a reference count, indicating how many references to the object are currently held. When the reference count of an object drops to zero, the object is deallocated. Outside of C extensions, one does not need to manage reference counts manually.

When a function of the Python C-API returns a Python object, it returns a pointer to it. At the time the function returns, the referenced object is alive and its reference count is above zero. Depending on the function, your responsibility with respect to that object's reference count differs:

  • Owned references: Most functions return a new reference, which means that you are responsible for decrefing the object when you are done with it (in effect, the function increfed the object before returning). In this situation you own a reference.

  • Borrowed references: Some functions (PyList_GetItem, PyTuple_GetItem, PyDict_GetItem, …) do not incref the object before returning. In that case you only have a borrowed reference, and you are not responsible for decrefing the object.
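To make the two kinds of references concrete, here is a minimal sketch in plain C++ that models only the reference count. Everything in the toy namespace (Object, List, make_object, list_append, list_get_item, demo) is hypothetical and merely mimics the ownership behaviour described above. It is not the Python C-API.

```cpp
#include <cstddef>
#include <vector>

namespace toy
{

// Toy stand-in for a Python object: only the reference count is modeled.
struct Object
{
    long refcnt;
};

Object* incref( Object* ob ) { ++ob->refcnt; return ob; }
void decref( Object* ob ) { if( --ob->refcnt == 0 ) delete ob; }

// Returns a NEW reference: the caller owns it and must decref it.
Object* make_object() { return new Object{ 1 }; }

// A container that owns one reference to each item it holds.
struct List
{
    std::vector<Object*> items;
    ~List() { for( Object* ob : items ) decref( ob ); }
};

// The list takes its own reference; the caller keeps theirs.
void list_append( List& list, Object* ob ) { list.items.push_back( incref( ob ) ); }

// Returns a BORROWED reference: no incref, the list stays the owner.
Object* list_get_item( const List& list, std::size_t i ) { return list.items[i]; }

// Records the reference count observed after each step.
std::vector<long> demo()
{
    std::vector<long> counts;
    List list;
    Object* owned = make_object();
    counts.push_back( owned->refcnt );     // we own the only reference
    list_append( list, owned );
    counts.push_back( owned->refcnt );     // the list holds a second one
    Object* borrowed = list_get_item( list, 0 );
    counts.push_back( borrowed->refcnt );  // borrowing did not change the count
    decref( owned );
    counts.push_back( borrowed->refcnt );  // only the list keeps the object alive
    return counts;                         // the List destructor drops the last reference
}

}  // namespace toy
```

Under these assumptions, demo() yields the counts 1, 2, 2, 1: fetching the item through list_get_item does not change the count, which is why the borrowed pointer is only valid while the list keeps its own reference.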

Borrowed references avoid the cost of an incref/decref pair, which is nice. However, since you do not own the reference, if the object is removed from its owner (the list, tuple or dict for the functions mentioned above) it may be deallocated, and your reference becomes invalid. If the object must outlive the container, or the time it spends in the container, you have to incref it manually. Let's now discuss the convention used when calling a function.

When calling a function, the caller is expected to own a reference to each of the arguments passed to the callee. The callee does not own the references, it only borrows them. As a consequence, it should not decref them, and if it needs to store an object, for example in a C structure, it should incref it. Note that you generally do not need to do this yourself for Python containers, since they are manipulated through functions that take care of it. There are, however, some exceptions that steal a reference, meaning that you no longer own the reference after the call. PyList_SetItem, for example, steals a reference to the item.
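The difference between the usual calling convention and a stealing function can be sketched with the same kind of toy model. The names below (toy::Object, toy::List, append, append_steal, demo) are illustrative assumptions written in plain C++, not functions of the Python C-API.

```cpp
#include <vector>

namespace toy
{

// Toy stand-in for a Python object: only the reference count is modeled.
struct Object
{
    long refcnt;
};

Object* make_object() { return new Object{ 1 }; }
Object* incref( Object* ob ) { ++ob->refcnt; return ob; }
void decref( Object* ob ) { if( --ob->refcnt == 0 ) delete ob; }

// A container that owns one reference to each item it holds.
struct List
{
    std::vector<Object*> items;
    ~List() { for( Object* ob : items ) decref( ob ); }
};

// Usual convention: the callee borrows `ob` and increfs it in order to store it.
void append( List& list, Object* ob ) { list.items.push_back( incref( ob ) ); }

// Stealing convention: the callee takes over the caller's reference (no incref).
void append_steal( List& list, Object* ob ) { list.items.push_back( ob ); }

// Returns the reference count seen right after each kind of call.
std::vector<long> demo()
{
    List list;

    Object* a = make_object();
    append( list, a );
    long after_borrow = a->refcnt;  // caller's reference + list's reference
    decref( a );                    // the caller still has to release its own

    Object* b = make_object();
    append_steal( list, b );
    long after_steal = b->refcnt;   // list's reference only; caller must NOT decref

    return { after_borrow, after_steal };
}

}  // namespace toy
```

In this sketch demo() returns the counts 2 and 1. After append the caller still owns a reference and must release it, whereas after append_steal an extra decref by the caller would release the reference that now belongs to the list.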

An easy way to get reference counts wrong is to forget to decref an intermediate object before leaving a function. This is particularly likely when the function has early exit points, for example because an exception must be raised. A good practice is to have a single exit point. However, this is not always possible or practical, and even then it is possible to miss references. This is typically where cppy can help.
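The early-exit problem can be reproduced with a small stand-alone model. In the sketch below, toy::Object, live_objects, leaky and careful are hypothetical names; live_objects simply counts how many toy objects are still allocated, so a leak shows up as a non-zero value.

```cpp
namespace toy
{

// Toy stand-in for a Python object: only the reference count is modeled.
struct Object
{
    long refcnt;
};

long live_objects = 0;  // number of toy objects currently allocated

Object* make_object() { ++live_objects; return new Object{ 1 }; }

void decref( Object* ob )
{
    if( --ob->refcnt == 0 )
    {
        --live_objects;
        delete ob;
    }
}

// Buggy: the early exit forgets to release the intermediate object.
bool leaky( bool fail )
{
    Object* tmp = make_object();
    if( fail )
        return false;  // BUG: `tmp` is never decrefed on this path
    decref( tmp );
    return true;
}

// Fixed by hand: every exit path releases the reference it owns.
bool careful( bool fail )
{
    Object* tmp = make_object();
    if( fail )
    {
        decref( tmp );
        return false;
    }
    decref( tmp );
    return true;
}

}  // namespace toy
```

Calling toy::leaky( true ) leaves live_objects at 1, while toy::careful( true ) brings it back to 0. The manual fix works, but it has to be repeated on every exit path, which is easy to forget as a function grows.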

This is a very brief introduction to reference counting. You can read a bit more in the official Python documentation and in the Python API documentation.

Cppy smart pointer class

Cppy smart pointer (cppy::ptr) can be initialized with a pointer to a Python object as follows:

cppy::ptr obj_ptr( PyUnicode_FromString("test") );

When created, the class assumes that you own the reference. If that is not the case, you should incref it first, either explicitly or by passing true as the second argument:

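PyObject* function( PyObject* obj )
{
    // obj is borrowed from the caller: incref it before handing it to cppy::ptr
    cppy::ptr obj_ptr( cppy::incref( obj ) );
    // equivalent: ask the constructor to incref for us
    cppy::ptr obj_ptr2( obj, true );
    Py_RETURN_NONE;
}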

Note

Cppy provides convenient inline functions for common reference manipulations:

  • cppy::incref, cppy::xincref, cppy::decref, cppy::xdecref use the similarly named Python macros and return the input value.

  • cppy::clear, cppy::replace are similar but return void.

You can also initialize a cppy::ptr from another cppy::ptr in which case the reference count will always be incremented.

The main advantage of cppy::ptr is that it implements a destructor that is invoked automatically when the cppy::ptr goes out of scope. The destructor decrefs the reference for you. As a consequence, you can be sure that your reference will always be decremented when you leave the function.
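The mechanism can be sketched with a toy owning pointer written in plain C++. This is not cppy's actual implementation: toy::Object, toy::ptr, live_objects and work are hypothetical names that only mimic the behaviour described above (the constructor assumes ownership, copying takes a new reference, the destructor decrefs).

```cpp
namespace toy
{

// Toy stand-in for a Python object: only the reference count is modeled.
struct Object
{
    long refcnt;
};

long live_objects = 0;  // number of toy objects currently allocated

Object* make_object() { ++live_objects; return new Object{ 1 }; }
Object* incref( Object* ob ) { ++ob->refcnt; return ob; }

void decref( Object* ob )
{
    if( --ob->refcnt == 0 )
    {
        --live_objects;
        delete ob;
    }
}

// Minimal owning pointer: a sketch of the idea only.
class ptr
{
public:
    // Assumes the caller already owns the reference being passed in.
    explicit ptr( Object* ob ) : m_ob( ob ) {}

    // Copying always takes a new reference.
    ptr( const ptr& other ) : m_ob( other.m_ob ? incref( other.m_ob ) : nullptr ) {}

    // Leaving the scope releases the managed reference.
    ~ptr() { if( m_ob ) decref( m_ob ); }

    ptr& operator=( const ptr& ) = delete;  // omitted to keep the sketch short

    Object* get() const { return m_ob; }

private:
    Object* m_ob;
};

// Early exits no longer need manual cleanup.
bool work( bool fail )
{
    ptr first( make_object() );
    if( fail )
        return false;     // the destructor of `first` decrefs here
    ptr second( first );  // the copy holds a second reference
    return second.get()->refcnt == 2;
}

}  // namespace toy
```

Whether work returns early or runs to completion, live_objects is back to 0 afterwards: each toy::ptr drops its reference when it goes out of scope.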

Sometimes, however, that is not what you want, because you want to return the reference that the cppy::ptr manages. You can ask the cppy::ptr to give back the reference using its release method. Let's illustrate this with a tiny example:

PyObject* function( PyObject* obj )
{
    cppy::ptr repr_ptr( PyObject_Repr( obj ) );
    return repr_ptr.release();
}

Functions that are part of the Python C-API are not aware of cppy::ptr, so when calling them you need to provide the original PyObject*. To access it, simply call the get method of the cppy::ptr object.

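PyObject* function( PyObject* obj )
{
    cppy::ptr l_ptr( PyList_New( 0 ) );  // PyList_New requires the list size
    if( !l_ptr.get() )
        return 0;  // allocation failed, an exception is already set
    if( PyList_Append( l_ptr.get(), obj ) != 0 )
        return 0;  // the destructor of l_ptr decrefs the list
    return l_ptr.release();
}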

Here we see that because we use cppy::ptr to manage the list, we do not have to worry about decrefing it if PyList_Append fails and sets an exception: the destructor does it for us when we return early. If no exception occurs, we release the reference, so the cppy::ptr stops managing it, and return it to the caller.

Using cppy does not eliminate all the pitfalls of writing C-extensions. For example if you release too early (for example when passing the object to a function that may fail), you can still leak references. However it does alleviate some of the complexity.
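The "release too early" pitfall can be illustrated with a toy model. All names below (toy::Object, toy::ptr, fallible_step, release_too_early, release_last) are hypothetical stand-ins written in plain C++. They only mimic the get/release semantics described above and are not cppy's implementation.

```cpp
namespace toy
{

// Toy stand-in for a Python object: only the reference count is modeled.
struct Object
{
    long refcnt;
};

long live_objects = 0;  // number of toy objects currently allocated

Object* make_object() { ++live_objects; return new Object{ 1 }; }

void decref( Object* ob )
{
    if( --ob->refcnt == 0 )
    {
        --live_objects;
        delete ob;
    }
}

// Minimal owning pointer with get() and release(): a sketch of the idea only.
class ptr
{
public:
    explicit ptr( Object* ob ) : m_ob( ob ) {}
    ~ptr() { if( m_ob ) decref( m_ob ); }
    ptr( const ptr& ) = delete;
    ptr& operator=( const ptr& ) = delete;

    Object* get() const { return m_ob; }

    // Stop managing the reference and hand it back to the caller.
    Object* release() { Object* ob = m_ob; m_ob = nullptr; return ob; }

private:
    Object* m_ob;
};

// Stand-in for a call that borrows its argument and may fail.
bool fallible_step( Object*, bool fail ) { return !fail; }

// Buggy: ownership is given up before the fallible call.
Object* release_too_early( bool fail )
{
    ptr ob( make_object() );
    Object* raw = ob.release();  // `ob` no longer manages the reference
    if( !fallible_step( raw, fail ) )
        return nullptr;          // BUG: nobody decrefs `raw`
    return raw;
}

// Safer: keep managing the reference until nothing can fail anymore.
Object* release_last( bool fail )
{
    ptr ob( make_object() );
    if( !fallible_step( ob.get(), fail ) )
        return nullptr;          // the destructor releases the reference
    return ob.release();         // hand the owned reference to the caller
}

}  // namespace toy
```

In this sketch a failing release_too_early leaves one object allocated, while a failing release_last cleans up after itself. The pattern suggested by the model is to pass get() to intermediate calls and to call release() only in the final return statement.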

cppy::ptr methods

All methods that take a PyObject* also accept a cppy::ptr. Most names should be self-explanatory, and apart from the is_ methods, most of them rely on the similarly named PyObject_ functions:

bool is_none() const
bool is_true() const
bool is_false() const
bool is_bool() const
bool is_int() const
bool is_float() const
bool is_list() const
bool is_dict() const
bool is_set() const
bool is_bytes() const
bool is_str() const
bool is_unicode() const
bool is_callable() const
bool is_iter() const
bool is_type( PyTypeObject* cls ) const
int is_truthy() const
int is_instance( PyObject* cls ) const
int is_subclass( PyObject* cls ) const
PyObject* iter() const
PyObject* next() const
PyObject* repr() const
PyObject* str() const
PyObject* bytes() const
PyObject* unicode() const
Py_ssize_t length() const
PyTypeObject* type() const
int richcmp( PyObject* other, int opid ) const
long hash() const
bool hasattr( PyObject* attr ) const
bool hasattr( const char* attr ) const
bool hasattr( const std::string& attr ) const
PyObject* getattr( PyObject* attr ) const
PyObject* getattr( const char* attr ) const
PyObject* getattr( const std::string& attr ) const
bool setattr( PyObject* attr, PyObject* value ) const
bool setattr( const char* attr, PyObject* value ) const
bool setattr( const std::string& attr, PyObject* value ) const
bool delattr( PyObject* attr ) const
bool delattr( const char* attr ) const
bool delattr( const std::string& attr ) const
PyObject* getitem( PyObject* key ) const
bool setitem( PyObject* key, PyObject* value ) const
bool delitem( PyObject* key )
PyObject* call( PyObject* args, PyObject* kwargs = 0 ) const