Wednesday, February 29, 2012

The hierarchy of Mystical Arts in Programming

I was responding to an incredibly detailed answer on StackOverflow on Python metaclasses, when I wrote the following on my whiteboard:

I thought this was clever, so I posted an (I thought equally clever) response on StackOverflow, which read:
I read this and think of the famous "There are lies, damned lies, and then there is statistics", but instead think of it as "there are hacks, tricks, voodoo magic, dark arts, and then there are Python metaclasses".
This was good. I then left for lunch, and when I came back a co-worker had modified/added to my list:

Thursday, February 16, 2012

Python HTMLParser and super()

So I have a class that inherits from HTMLParser, and I want to call the superclass constructor (the __init__ of HTMLParser). I would think I should do:

class MyParser(HTMLParser):
    def __init__(self):
        super(MyParser, self).__init__()

But this causes a problem:

myparser = MyParser()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in __init__
TypeError: must be type, not classobj

What's with that? The super(class, instance).__init__ idiom is supposedly the proper way of calling a parent class constructor, and it is -- if the class is a "new-style" Python class (one which inherits from object, or from another class which itself inherits from object).
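If you want to check which kind of class you are dealing with, here is a rough sketch. The helper name is_new_style and the two toy classes are my own inventions for illustration, not anything from the standard library. It relies on the fact that new-style classes are instances of type, which is also why the error message above complains that it wants a "type" and not a "classobj":

```python
def is_new_style(cls):
    """Hypothetical helper: True if cls is a new-style class.

    New-style classes are instances of `type`. In Python 2, old-style
    classes are instances of `types.ClassType` ("classobj") instead.
    """
    return isinstance(cls, type)


class Plain:          # old-style in Python 2, new-style in Python 3
    pass


class New(object):    # new-style in both Python 2 and Python 3
    pass


print(is_new_style(New))    # True
print(is_new_style(Plain))  # depends on the Python version, see above
```

In Python 3 every class is new-style, so the check only ever says no under Python 2.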

And therein is the problem: HTMLParser inherits from markupbase.ParserBase, and markupbase.ParserBase is defined as:

class ParserBase:
    """Parser base class which provides some common support methods used
    by the SGML/HTML and XHTML parsers."""

That is, as an *old* style class. One definitely wonders why in Python 2.7+ the classes that form part of the standard library wouldn't all be new-style classes, *especially* when the class is intended as being something you inherit from (like HTMLParser). Anywho, to fix:

class MyParser(HTMLParser):
    def __init__(self):
        # Old style way of doing super()