This morning I was writing some Python code that needed to perform lookups on a very large map.
1 2 3 4 | mapSize = 65000 amap = {} for i in range(0,mapSize): amap['k%d' % (i)] = i |
If a key did not exist in the map I needed to take some action. My initial effort performed a pre-emptive check to see whether the key existed in the map
1 2 3 4 5 6 7 8 9 | query = ['k%d' % (x) for x in xrange(100)] query.extend(['k%d' % (x) for x in xrange(mapSize+1, mapSize+100)]) def permission(): for q in query: if q in amap.keys(): val = amap[q] else: pass |
Looking up 200 keys (half of which are present and half are absent from the map) took 496 ms (based on Pythons’ timeit module with default settings). On the other hand, if we just try and go ahead and access the key and deal with its absence by handling with the exception, we get a significant improvement.
1 2 3 4 5 6 | def forgiveness(): for q in query: try: val = amap[q] except: pass |
Specifically, 150 us – an improvement of 3306x.
So, as with many other things in life, it’s better to ask Python for forgiveness than permission.
“q in amap.keys()” constructs a new list containing all of the keys then does a linear search of that list to find q. This is slow.
You want “q in amap”. This tests to see if q is a key in the amap dictionary.
If failures are rare then look-before-you-leap should be faster than the exception version because building an exception in CPython is heavier than doing the dictionary check twice.