One of the things I dislike about Java is the need to declare exceptions as part of an interface or class definition. But perhaps Java got this right...
I've written an application that uses urllib2, urlparse, robotparser and some other modules in the battery pack. One day my app failed with a urllib2.HTTPError. So I catch that. But then I get a urllib2.URLError, so I catch that too. The next day, it encounters a urllib2.HTTPError, then an IOError, a socket.timeout, httplib.InvalidURL,...
How do you program robustly with these modules throwing all those different (and sometimes undocumented) exceptions at you?
A catchall seems like a bad idea, since it also catches AttributeErrors and other bugs in the program.
Rene Pijlman wrote:
> One of the things I dislike about Java is the need to declare exceptions
> as part of an interface or class definition. But perhaps Java got this
> right...
>
> I've written an application that uses urllib2, urlparse, robotparser and
> some other modules in the battery pack. One day my app failed with a
> urllib2.HTTPError. So I catch that. But then I get a urllib2.URLError, so
> I catch that too. The next day, it encounters a urllib2.HTTPError, then an
> IOError, a socket.timeout, httplib.InvalidURL,...
>
> How do you program robustly with these modules throwing all those
> different (and sometimes undocumented) exceptions at you?
>
> A catchall seems like a bad idea, since it also catches AttributeErrors
> and other bugs in the program.
The relevant lines of urllib2, for example, look as such:
    class URLError(IOError):
    class HTTPError(URLError, addinfourl):
    class GopherError(URLError):
This suggests that catching URLError should have caught your HTTPError, so you might have the chronology backwards above.
E.g.:
py> class BobError(Exception): pass
...
py> class CarolError(BobError): pass
...
py> try:
...     raise CarolError
... except BobError:
...     print 'got it'
...
got it
Now,
% cat httplib.py | grep -e '^\s*class'
produces the following at one point in its output:
    class HTTPException(Exception):
    class NotConnected(HTTPException):
    class InvalidURL(HTTPException):
    class UnknownProtocol(HTTPException):
    class UnknownTransferEncoding(HTTPException):
    class UnimplementedFileMode(HTTPException):
    class IncompleteRead(HTTPException):
    class ImproperConnectionState(HTTPException):
    class CannotSendRequest(ImproperConnectionState):
    class CannotSendHeader(ImproperConnectionState):
    class ResponseNotReady(ImproperConnectionState):
    class BadStatusLine(HTTPException):
Which suggests that "try: except HTTPException:" will be specific enough as a catchall for this module.
The following, then, should catch everything you mentioned except the socket timeout:
In article <4408db38$0$21898$5a62a...@per-qv1-newsreader-01.iinet.net.au>, Ben Caradoc-Davies <b...@wintersun.org> wrote:
> James Stroud wrote:
> > except URLError, HTTPException:
>
> Aieee! This catches only URLError and binds the name HTTPException to
> the detail of that error. You must write
>
> except (URLError, HTTPException):
>
> to catch both.
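For anyone reading this on Python 3: the comma form is a syntax error there, and the detail is bound with `as`. A minimal sketch of the tuple form follows, assuming the Python 3 module names (`urllib.error` and `http.client` are where `urllib2.URLError` and `httplib.HTTPException` ended up). `NETWORK_ERRORS` and `describe_failure` are made-up names for illustration, not anything from the modules under discussion.

```python
import socket
from http.client import HTTPException, InvalidURL
from urllib.error import HTTPError, URLError

# Hypothetical grouping: a tuple in an except clause matches every
# listed class and all of their subclasses.
NETWORK_ERRORS = (URLError, HTTPException, socket.timeout)


def describe_failure(action):
    """Run action() and name the network error it raised, if any."""
    try:
        action()
    except NETWORK_ERRORS as err:
        return type(err).__name__
    return None


def raise_url_error():
    raise URLError("connection refused")


def raise_invalid_url():
    # InvalidURL derives from HTTPException, so the tuple covers it.
    raise InvalidURL("nonnumeric port")


print(describe_failure(raise_url_error))
print(describe_failure(raise_invalid_url))
# HTTPError derives from URLError, so it needs no entry of its own.
print(issubclass(HTTPError, URLError))
```

Anything outside the tuple, an AttributeError for instance, still propagates as a traceback, which is the point of not using a bare catchall.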
This exact issue came up just within the past week or so. I think that qualifies it as a wart, but I think it's a double wart.
It's certainly a wart that the try statement syntax allows for such ambiguity. But, I think it's also a wart in how the exceptions were defined. I like to create a top-level exception class to encompass all the possible errors in a given module, then subclass that. This way, if you want to catch anything that goes wrong in a call, you can catch the top-level exception class without having to enumerate them all.
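The top-level-class idea could look roughly like this. It is only a sketch in Python 3 syntax: `FetchError`, `RobotsDenied`, `BadAddress`, `fetch` and `safe_fetch` are invented names, and `fetch` is a stand-in that does no real network work.

```python
class FetchError(Exception):
    """Base class for every error this hypothetical module raises on purpose."""


class RobotsDenied(FetchError):
    """robots.txt forbids fetching the URL."""


class BadAddress(FetchError):
    """The URL is not something this module can fetch."""


def fetch(url):
    # Stand-in for real work; it only decides which error to raise.
    if not url.startswith(("http://", "https://")):
        raise BadAddress(url)
    if "/private/" in url:
        raise RobotsDenied(url)
    return "fetched " + url


def safe_fetch(url):
    """Catch the whole family through its base class."""
    try:
        return fetch(url)
    except FetchError as err:
        return "failed: %s(%s)" % (type(err).__name__, err)


print(safe_fetch("http://example.com/"))
print(safe_fetch("ftp://example.com/"))
```

A caller who cares about one case can still catch `RobotsDenied` specifically; a caller who does not can catch `FetchError` and stay correct when a new subclass is added later.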
Rene Pijlman wrote:
> One of the things I dislike about Java is the need to declare exceptions
> as part of an interface or class definition. But perhaps Java got this
> right...
>
> I've written an application that uses urllib2, urlparse, robotparser and
> some other modules in the battery pack. One day my app failed with a
> urllib2.HTTPError. So I catch that. But then I get a urllib2.URLError, so
> I catch that too. The next day, it encounters a urllib2.HTTPError, then an
> IOError, a socket.timeout, httplib.InvalidURL,...
>
> How do you program robustly with these modules throwing all those
> different (and sometimes undocumented) exceptions at you?
I do it by not micromanaging things. Presumably if you plan to catch an exception, you have a specific procedure in mind for handling the problem. Maybe a retry, maybe an alternate way of attempting the same thing? Look to the code that you are putting in those except: statements (or that you think you want to put in them) to decide what to do about this situation. If each type of exception will be handled in a different manner, then you definitely want to identify each type by looking at the source or the docs, or doing it empirically.
Most of the time there isn't a whole lot of real "handling" going on in an exception handler, but merely something like logging and/or reporting it onscreen in a cleaner fashion than a traceback, then failing anyway. This is one reason Java does get it wrong: 95% of exceptions don't need and shouldn't have special handling anyway.
Good code should probably have a very small set of real exception handling cases, and one or two catchalls at a higher level to avoid barfing a traceback at the user.
> A catchall seems like a bad idea, since it also catches AttributeErrors > and other bugs in the program.
Generally speaking this won't be a problem if you have your catchalls at a fairly high level and have proper unit tests for the lower level code which is getting called. You are doing unit testing, aren't you? ;-)
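A high-level catchall of the kind described might be sketched like this, in Python 3 syntax. The `run` wrapper and the "app" logger name are assumptions for illustration; `logging.exception` is the standard library call that records the current traceback.

```python
import logging
import sys

log = logging.getLogger("app")  # hypothetical logger name


def run(main):
    """Call main(); turn any uncaught error into a log entry and exit status 1."""
    try:
        main()
    except Exception:
        # The full traceback goes to the log for the maintainer...
        log.exception("unhandled error")
        # ...while the user sees something shorter than a traceback.
        print("Something went wrong; details were logged.", file=sys.stderr)
        return 1
    return 0
```

Usage would be along the lines of `sys.exit(run(main))` at the bottom of the script, so that lower-level code stays free of defensive except clauses.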
On Sat, 04 Mar 2006 00:10:17 +0100, Rene Pijlman wrote:
> I've written an application that uses urllib2, urlparse, robotparser and
> some other modules in the battery pack. One day my app failed with a
> urllib2.HTTPError. So I catch that. But then I get a urllib2.URLError, so
> I catch that too. The next day, it encounters a urllib2.HTTPError, then an
> IOError, a socket.timeout, httplib.InvalidURL,...
>
> How do you program robustly with these modules throwing all those
> different (and sometimes undocumented) exceptions at you?
How robust do you want to be? Do you want to take a leaf out of Firefox and Windows XP by generating an error report and transmitting it back to the program maintainer?
> A catchall seems like a bad idea, since it also catches AttributeErrors > and other bugs in the program.
try:
    process_things()
except ExpectedErrors:
    recover_from_error_gracefully()
except ErrorsThatCantHappen:
    print "Congratulations! You have found a program bug!"
    print "For a $327.68 reward, please send the following " \
          "traceback to Professor Donald Knuth."
    raise
except:
    print "An unexpected error occurred."
    print "This probably means the Internet is broken."
    print "If the bug still occurs after fixing the Internet, " \
          "it may be a program bug."
    log_error()
    sys.exit()
Steven D'Aprano <st...@REMOVETHIScyber.com.au> writes:
> try:
>     process_things()
> except ExpectedErrors:
>     recover_from_error_gracefully()
> except ErrorsThatCantHappen:
>     print "Congratulations! You have found a program bug!"
>     print "For a $327.68 reward, please send the following " \
>           "traceback to Professor Donald Knuth."
>     raise
> except:
>     print "An unexpected error occurred."
>     print "This probably means the Internet is broken."
But this isn't good, it catches asynchronous exceptions like the user hitting ctrl-C, which you might want to handle elsewhere. What you want is a way to catch only actual exceptions raised from inside the try block.
On Fri, 03 Mar 2006 21:10:22 -0800, Paul Rubin wrote:
> Steven D'Aprano <st...@REMOVETHIScyber.com.au> writes:
>> try:
>>     process_things()
>> except ExpectedErrors:
>>     recover_from_error_gracefully()
>> except ErrorsThatCantHappen:
>>     print "Congratulations! You have found a program bug!"
>>     print "For a $327.68 reward, please send the following " \
>>           "traceback to Professor Donald Knuth."
>>     raise
>> except:
>>     print "An unexpected error occurred."
>>     print "This probably means the Internet is broken."
>
> But this isn't good, it catches asynchronous exceptions like the user
> hitting ctrl-C, which you might want to handle elsewhere. What you
> want is a way to catch only actual exceptions raised from inside the
> try block.
It will only catch the KeyboardInterrupt exception if the user actually hits ctrl-C during the time the code running inside the try block is executing. It certainly won't catch random ctrl-Cs happening at other times.
The way to deal with it is to add another except clause to deal with the KeyboardInterrupt, or to have recover_from_error_gracefully() deal with it. The design pattern still works. I don't know if it has a fancy name, but it is easy to describe:-
catch specific known errors that you can recover from, and recover from them whatever way you like (including, possibly, re-raising the exception and letting higher-level code deal with it);
then catch errors that cannot possibly happen unless there is a bug, and treat them as a bug;
and lastly catch unexpected errors that you don't know how to handle and die gracefully.
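The three steps above, plus the extra KeyboardInterrupt clause, can be sketched as follows in Python 3 syntax. The contents of `EXPECTED` and `CANT_HAPPEN` are placeholders chosen for the example, and `guarded`, `recover` and `log_error` are hypothetical names passed in by the caller.

```python
import sys

# Placeholder groupings; a real program would choose its own.
EXPECTED = (OSError, ValueError)
CANT_HAPPEN = (AssertionError, AttributeError)


def guarded(process_things, recover, log_error):
    try:
        return process_things()
    except KeyboardInterrupt:
        # Listed first so the final catchall never sees ctrl-C.
        raise
    except EXPECTED as err:
        # Known, recoverable errors.
        return recover(err)
    except CANT_HAPPEN:
        # Only a bug can cause these: report it, then let it propagate.
        log_error("program bug")
        raise
    except:
        # Anything else: log it and die gracefully.
        log_error("unexpected error: %r" % (sys.exc_info()[1],))
        sys.exit(1)
```

The order of the clauses matters: Python tries them top to bottom and runs the first one that matches, so the most specific handling has to come before the catchall.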
My code wasn't meant as production level code, nor was ExpectedErrors meant as an exhaustive list. I thought that was too obvious to need commenting on.
Oh, in case this also wasn't obvious, Donald Knuth won't really pay $327.68 for bugs in your Python code. He only pays for bugs in his own code. *wink*
Steven D'Aprano <st...@REMOVETHIScyber.com.au> writes: > The way to deal with it is to add another except clause to deal with the > KeyboardInterrupt, or to have recover_from_error_gracefully() deal with > it.
I think adding another except clause for KeyboardInterrupt isn't good because maybe in Python 2.6 or whatever there will be some additional exceptions like that and your code will break. For example, proposals have floated for years of adding ways for threads to raise exceptions in other threads.
I put up a proposal for adding an AsynchronousException class to contain all of these types of exceptions, so you can check for that.
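As a point of reference rather than a description of that proposal: from Python 2.5 on (PEP 352), KeyboardInterrupt and SystemExit derive from BaseException instead of Exception, so an `except Exception:` clause no longer swallows ctrl-C. A small Python 3 sketch, with `run_quietly` and the two helper functions being made-up names:

```python
def run_quietly(action):
    """Catch ordinary errors only; interrupts pass straight through."""
    try:
        action()
    except Exception as err:
        return "caught " + type(err).__name__
    return "ok"


def bad_value():
    raise ValueError("not a number")


def interrupted():
    raise KeyboardInterrupt


print(run_quietly(bad_value))                    # caught ValueError
print(issubclass(KeyboardInterrupt, Exception))  # False

try:
    run_quietly(interrupted)
except KeyboardInterrupt:
    print("ctrl-C was not swallowed")
```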
> Oh, in case this also wasn't obvious, Donald Knuth won't really pay > $327.68 for bugs in your Python code. He only pays for bugs in his own > code. *wink*
The solution to that one is obvious. We have to get Knuth using Python. Anyone want to write a PEP? ;-)
>I like to create a top-level exception class to encompass all the
>possible errors in a given module, then subclass that. This way, if you
>want to catch anything that goes wrong in a call, you can catch the top-level
>exception class without having to enumerate them all.
What do you propose to do with exceptions from modules called by the given module?
>Which suggests that "try: except HTTPException:" will be specific enough >as a catchall for this module.
>The following, then, should catch everything you mentioned except the >socket timeout:
Your conclusion may be (almost) right in this case. I just don't like this approach. Basically this is reverse engineering the interface from the source at the time of writing the app. Even if you get it right, it may fail next week when someone added an exception to a module.
>But it seems to me that working with the internet as you are doing is >fraught with peril anyway.
>Good code should probably have a very small set of real exception >handling cases, and one or two catchalls at a higher level to avoid >barfing a traceback at the user.
Good point.
>> A catchall seems like a bad idea, since it also catches AttributeErrors >> and other bugs in the program.
>Generally speaking this won't be a problem if you have your catchalls at >a fairly high level and have proper unit tests for the lower level code >which is getting called. You are doing unit testing, aren't you? ;-)
With low coverage, yes. But unit testing isn't the answer for this particular problem. For example, yesterday my app was surprised by an httplib.InvalidURL since I hadn't noticed this could be raised by robotparser (this is undocumented). If that fact goes unnoticed when writing the exception handling, it will also go unnoticed when designing test cases. I probably wouldn't have thought of writing a test case with a first url with some external domain (that triggers robots.txt-fetching) that's deemed invalid by httplib.
>try:
>    process_things()
>except ExpectedErrors:
>    recover_from_error_gracefully()
>except ErrorsThatCantHappen:
>    print "Congratulations! You have found a program bug!"
>    print "For a $327.68 reward, please send the following " \
>          "traceback to Professor Donald Knuth."
>    raise
>except:
>    print "An unexpected error occurred."
>    print "This probably means the Internet is broken."
>    print "If the bug still occurs after fixing the Internet, " \
>          "it may be a program bug."
>    log_error()
>    sys.exit()
Yes, I think I'll do something like this. Perhaps combined with Peter's advice to not micromanage, like so:
Reraise = (LookupError, ArithmeticError, AssertionError) # And then some
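The rest of that snippet is not reproduced here, so what follows is only a guess at how a tuple like `Reraise` might be used, written in Python 3 syntax. `crawl_one`, `fetch` and `log_error` are hypothetical names, not part of the original code.

```python
Reraise = (LookupError, ArithmeticError, AssertionError)  # And then some


def crawl_one(url, fetch, log_error):
    """Fetch one URL; let likely bugs propagate, log and skip everything else."""
    try:
        return fetch(url)
    except Reraise:
        # These most likely point at a programming mistake, so no recovery.
        raise
    except Exception as err:
        # Everything else is treated as trouble with the outside world.
        log_error("%s: %s: %s" % (url, type(err).__name__, err))
        return None
```

This inverts the usual approach: instead of enumerating every network error the library modules might raise, it enumerates the bug-like exceptions and treats the remainder as environmental. The tuple would presumably also need entries such as AttributeError, TypeError and NameError to cover the bugs mentioned earlier in the thread.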
Rene Pijlman <reply.in.the.newsgr...@my.address.is.invalid> writes: > With low coverage, yes. But unit testing isn't the answer for this > particular problem. For example, yesterday my app was surprised by an > httplib.InvalidURL since I hadn't noticed this could be raised by > robotparser (this is undocumented). If that fact goes unnoticed when
It isn't undocumented in my module. From 'pydoc httplib':
Rene Pijlman <reply.in.the.newsgr...@my.address.is.invalid> wrote: > A catchall seems like a bad idea, since it also catches AttributeErrors > and other bugs in the program.
All of the things like AttributeError are subclasses of StandardError. You can catch those first, and then catch everything else. In theory, all exceptions which represent problems with the external environment (rather than programming mistakes) should derive from Exception, but not from StandardError. In practice, some very old code may raise things which do not derive from Exception, which complicates things somewhat.
try:
    raise "I'm a string pretending to be an exception"
except StandardError, foo:
    print "Caught a StandardError: ", foo
except Exception, foo:
    print "Caught something else: ", foo
--------------------------------------------------
Roy-Smiths-Computer:play$ ./ex.py
Caught a StandardError: list index out of range
Caught something else: (43, 'Protocol not supported')
Traceback (most recent call last):
  File "./ex.py", line 21, in ?
    raise "I'm a string pretending to be an exception"
I'm a string pretending to be an exception
>Rene Pijlman: >> my app was surprised by an >> httplib.InvalidURL since I hadn't noticed this could be raised by >> robotparser (this is undocumented).
>It isn't undocumented in my module. From 'pydoc httplib':
>In theory, all exceptions which represent problems with the external >environment (rather than programming mistakes) should derive from >Exception, but not from StandardError.
Are you sure?
""" The class hierarchy for built-in exceptions is:
> Roy Smith: > >In theory, all exceptions which represent problems with the external > >environment (rather than programming mistakes) should derive from > >Exception, but not from StandardError.
> Are you sure?
> """ > The class hierarchy for built-in exceptions is:
I do agree with you that there is some value in Java's "must catch or re-export all exceptions" semantics, and this would be one of those places where it would be useful. In general, however, I've always found it to be a major pain in the butt, to the point where I sometimes just punt and declare all my methods to "throw Exception" (or whatever the correct syntax is). Not to mention that with a dynamic language like Python, it's probably impossible to implement.
I think the real problem here is that the on-line docs are incomplete because they don't list all the exceptions that this module can raise. The solution to that is to open a bug on sourceforge against the docs.
>>>my app was surprised by an >>>httplib.InvalidURL since I hadn't noticed this could be raised by >>>robotparser (this is undocumented).
>>It isn't undocumented in my module. From 'pydoc httplib':
> That's cheating: pydoc is reading the source :-)
Yes, and that's the Right Thing(tm) to do. Source code don't lie. Source code don't get out of sync. So source code *is* the best documentation (or at least the most accurate).
>>try:
>>    process_things()
>>except ExpectedErrors:
>>    recover_from_error_gracefully()
>>except ErrorsThatCantHappen:
>>    print "Congratulations! You have found a program bug!"
>>    print "For a $327.68 reward, please send the following " \
>>          "traceback to Professor Donald Knuth."
>>    raise
>>except:
>>    print "An unexpected error occurred."
>>    print "This probably means the Internet is broken."
>>    print "If the bug still occurs after fixing the Internet, " \
>>          "it may be a program bug."
>>    log_error()
>>    sys.exit()
>
> Yes, I think I'll do something like this. Perhaps combined with Peter's
> advice to not micromanage, like so:
> Reraise = (LookupError, ArithmeticError, AssertionError) # And then some