Wednesday, April 6, 2011

Python list.index question

Why does list.index throws an exception instead of returning a value like say -1? What's the idea behind this?

To me it looks cleaner to deal with special values, etc than exceptions.

EDIT: I didn't realize -1 to be usable, but in that case, how about a value of None?

From stackoverflow
  • Because -1 is itself a valid index. It could use a different value, such as None, but that wouldn't be useful, which -1 can be in other situations (thus str.find()), and would amount simply to error-checking, which is exactly what exceptions are for.

  • In Python, -1 is a valid index, meaning a number from the end of the list (instead of the beginning), so if you were to do

    idx = mylist.index(someval)
    nextval = mylist[idx+1]
    

    then you would get the first value of the array, instead of realising there was an error. This way you can catch the exception and deal with it. If you don't want exceptions, then just check beforehand:

    if someval in mylist:
        idx = mylist.index(someval)
    

    Edit: There's not really any point in returning None, because if you're going to test for None, you might as well test whether the value is in the list as above!

  • Well, the special value would actually have to be None, because -1 is a valid index (meaning the last element of a list).

    You can emulate this behavior by:

    idx = l.index(x) if x in l else None
    
    John Fouhy : As long as you don't mind searching through the list twice..
    vartec : well, it's that or try/except.
  • One simple idea: -1 is perfectly usable as an index (as are other negative values).

  • To add to Devin's response: This is an old debate between special return values versus exceptions. Many programming gurus prefer an exception because on an exception, I get to see the whole stacktrace and immediate infer what is wrong.

    Joan Venge : This was what I was wondering: special return values versus exceptions.
    S.Lott : +1: Exceptions don't depend on some value being "sacred". Exceptions are often simpler to work with.
  • It's a semantic argument. If you want to know the index of an element, you are claiming that it already exists in the list. If you want to know whether or not it exists, you should use in.

    Georg : … and if you claim that it exists an exception is the correct way to handle it.
  • I agree with Devin Jeanpierre, and would add that dealing with special values may look good in small example cases but (with a few notable exceptions, e.g. NaN in FPUs and Null in SQL) it doesn't scale nearly as well. The only time it works is where:

    • You've typically got lots of nested homogeneous processing (e.g. math or SQL)
    • You don't care about distinguishing error types
    • You don't care where the error occurred
    • The operations are pure functions, with no side effects.
    • Failure can be given a reasonable meaning at the higher level (e.g. "No rows matched")
  • It's mainly to ensure that errors are caught as soon as possible. For example, consider the following:

    l = [1, 2, 3]
    x = l.index("foo")  #normally, this will raise an error
    l[x]  #However, if line 2 returned None, it would error here
    

    You'll notice that an error would get thrown at l[x] rather than at x = l.index("foo") if index were to return None. In this example, that's not really a big deal. But imagine that the third line is in some completely different place in a million-line program. This can lead to a bit of a debugging nightmare.

  • The "exception-vs-error value" debate is partly about code clarity. Consider code with an error value:

    idx = sequence.index(x)
    if idx == ERROR:
        # do error processing
    else:
        print '%s occurred at position %s.' % (x, idx)
    

    The error handling ends up stuffed in the middle of our algorithm, obscuring program flow. On the other hand:

    try:
        idx = sequence.index(x)
        print '%s occurred at position %s.' % (x, idx)
    except IndexError:
        # do error processing
    

    In this case, the amount of code is effectively the same, but the main algorithm is unbroken by error handling.

0 comments:

Post a Comment