__del__ without the gc issues

I’ve been meaning to post about this for a while, but over the years I’ve wound up in some code situations where the best solution for a resource reclamation/finalization was a __del__ method- to be clear, not all situations are addressable via context managers, nor necessarily atexit.register. There are some cases where a __del__ really is the best solution and they’re damn annoying to deal with in a way that doesn’t involve bad compromises and having to leave warnings in the docstrings about the potential.

That said, as most folk know, this however means that object must never participate in a cycle- doing so means that cpython’s garbage collector can’t automatically break that cycle and do reclamation (details here, look for object.__del__), leaving it up to the developer to explicitly break the cycle.

Frankly this situation sucks from my standpoint (although I fully understand why it is the way it is and agree w/ the __del__ limitation, even if I dislike said limitation)- developers are fallible thus trying to rely on them to always do something is suboptimal. Further, for some cases I’ve dealt with __del__ was the only sane option.

Getting to the point, people know that weakref finalization is the best alternative, but anyone who has tried it knows that you wind up having to do some nasty seperation of your finalizer (and the data it needs) from the object that you’re waiting for to die. In reality you wind up having to implement a solution per usage usually.

The alternative is to tweak the class specifying the attributes that must be available to the finalizer- activate state has several such attempts. I find some faults with these attempts-

  • For the attempts that rely on binding a set of values to the finalizer, that’s a partial solution that can bite you in the ass if you ever inadvertantly replace that reference.
  • Said attempts also make it a serious pain in the ass if you’re deriving from such a class, and need to add one more attribute into what’s bound to the finalizer.
  • The next evolution of this is a class attribute listing what all must be bound in. Step in the right direction, but critically, it’s reliant on people maintaining it perfectly. People screw up, and if you have a complex finalizer pathway this can quickly prove to bite you in the ass.

In snakeoil, we’ve got a nasty bit of voodoo in snakeoil.obj that is designed for basically transparent proxying to another object. This includes slot methods (which most implementations miss), lieing about class to isinstance, and a whole bunch of other things that I’m reasonably sure people will hate me for. Either way, the sucker works, and at least for native classes you have to go out of your way to spot it’s presence.

Leading into the point of this blog, snakeoil.weakrefs.WeakRefFinalizer. This metaclass works by rewriting the class slightly (primarily shifting it’s __del__ to a method named __finalizer__ to avoid the gc marking it as unbreakable if somehow cyclic), and abusing the proxy I’d mentioned.

The trick behind this is that when you create an instance of a target class, you’re not actually getting that instance- you get the proxy. The real instance is jammed into a strong ref mapping  hidden away on the class object itself.   The trick is that the weakref is created for the proxy– when the proxy falls out of memory, the weakref finalizer fires invoking the real instances __finalizer__ method.   Since that instance is still strongly ref’d, the original __del__ has access to all attributes of the instance- you don’t have to track what you want during finalization. After that’s invoked, it then wipes the classes strong reference to it- meaning the instance falls out of memory.

Basically, best I can tell in a fair bit of experimenting with this, you get __del__ w/out the gc issues, at the cost of a slightly increased attribute/method access, the inability to resurrect instances from deletion (by the time the __del__ fires, the proxy is dead- thus you can’t really resurrect it anywhere), and one caveat.

The caveat’s an annoyance I’ve not yet figured out how to address, nor frankly have I decided if it’s worth the time to do so- if you have a method that returns another method, you’re not returning the proxied method- you’re returning the real instances method. This means it’s possible for the proxy to be deleted from memory while a ref effectively exists, leading to an early finalization invocation. I’ve yet to see this in any real usage, but thought experiment wise, I know it’s possible.

Finally, I apologize to anyone who looks in obj. Read the docstrings, there is a very good reason it has some voodoo in it- it’s the only way I could come up with to ensure that the proxy behaved exactly like the proxied target, leading to the vm executing the same codepaths.


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s

%d bloggers like this: