Wednesday, October 28, 2009

Block Scoping

Unlike Java, C++ and Perl, Python does not have block scoping. That is, if you define a variable inside a loop in Python, it is still in scope after the loop ends, and it rebinds any earlier binding of that name in the same scope. Python instead delimits scope at the level of the module, class and function. I defer to the authoritative source for the gory details. Note that, according to the definitions used therein, Python does have block-level scoping, but only if you define the blocks to be modules, classes and functions :-)
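To make the point about loops concrete, here is a minimal sketch. The function and variable names are made up for illustration; the only thing it relies on is that a for loop does not open a new scope.

```python
def loop_scope_demo():
    i = "before the loop"
    for i in range(3):
        # No new scope is opened here: `i` and `squared` both live
        # in the scope of the enclosing function.
        squared = i * i
    # The earlier binding of `i` has been replaced, and `squared`
    # is still reachable after the loop has finished.
    return i, squared


result = loop_scope_demo()
print(result)  # (2, 4)
```

In a block-scoped language, the names introduced inside the loop body would no longer be visible at the return statement.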

Python scoping is a drawback in the sense that, if you are used to block scoping (Java/C++/Perl), you are likely to introduce bugs by accidentally reusing the same variable name at different block levels. I've introduced a few such bugs. On the other hand, Python scoping eliminates the need to pre-declare variables that are set inside a loop or if block but are needed afterward. So, even though I've been burned by this style of scoping, I find that it helps me write better (i.e. more readable) code.
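Both sides of that trade-off can be sketched in a few lines. The function names below are hypothetical, and the bug is a simplified version of the kind described, not one taken from real code.

```python
def shadowed_index_demo(n):
    """The pitfall: the inner loop reuses the outer loop's variable name."""
    seen = []
    for i in range(n):
        for i in range(2):  # rebinds the outer `i`; there is no inner scope
            pass
        seen.append(i)      # records the inner loop's last value, not the outer index
    return seen


def parity(n):
    """The benefit: `label` needs no declaration before the if block."""
    if n % 2 == 0:
        label = "even"
    else:
        label = "odd"
    return label            # still in scope after the if/else


print(shadowed_index_demo(3))  # [1, 1, 1]
print(parity(4))               # even
```

The outer loop still runs n times, because it draws its values from its own iterator; only the name is clobbered, which is what makes this kind of bug easy to miss.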

Tuesday, October 27, 2009

The Global Interpreter Lock

Python technically has threading capabilities, and they can work quite well if the threads are I/O-bound. However, Python threading doesn't work so well when the threads are CPU-bound, because the Global Interpreter Lock in CPython lets only one thread execute Python bytecode at a time. The following hour-long video explains why. Read the slides if you are impatient.

http://blip.tv/file/2232410
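A rough way to see the I/O-bound versus CPU-bound difference for yourself is sketched below. It is not taken from the talk: the helper names are made up, time.sleep stands in for blocking I/O (it releases the GIL while waiting), and the timings will vary by machine and interpreter build.

```python
import threading
import time


def io_task(seconds):
    # Stand-in for blocking I/O: the GIL is released while sleeping.
    time.sleep(seconds)


def cpu_task(iterations):
    # Pure-Python arithmetic: the thread needs the GIL the whole time.
    total = 0
    for i in range(iterations):
        total += i * i
    return total


def timed_threads(target, args, count):
    """Run `count` threads of `target` and return the wall-clock time taken."""
    threads = [threading.Thread(target=target, args=args) for _ in range(count)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


io_elapsed = timed_threads(io_task, (0.2,), 4)
cpu_elapsed = timed_threads(cpu_task, (200000,), 4)
print("4 sleeping threads:  %.2fs" % io_elapsed)
print("4 computing threads: %.2fs" % cpu_elapsed)
```

The four sleeping threads should overlap and finish in roughly the time of one sleep. To see the CPU-bound effect, compare the second figure with the time taken to call cpu_task four times in a row on your own machine.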

One observation that David Beazley makes is that only the "main" thread can deal with signals like Control-C. However, if this thread is blocked in a join(), the signal will not get handled. So, it may be worth creating a thread separate from the "main" thread to spawn and join the worker threads. I haven't tested this yet, though...
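One possible shape for that idea is sketched below. It is only an assumption about how such a supervisor thread could look: the function names are hypothetical, the polling interval is arbitrary, and how signals interact with join() varies by Python version and platform.

```python
import threading
import time


def worker(results, index):
    time.sleep(0.05)                  # stand-in for real work
    results[index] = index * index


def supervisor(results):
    """Spawn the workers and block in join() here, off the main thread."""
    threads = [threading.Thread(target=worker, args=(results, i))
               for i in range(len(results))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def run(num_workers=4):
    results = [None] * num_workers
    # Daemon thread, so an interrupted main thread can still exit;
    # the workers inherit the daemon flag from the thread that creates them.
    manager = threading.Thread(target=supervisor, args=(results,), daemon=True)
    manager.start()
    # The main thread never sits in an indefinite join(); it wakes up
    # regularly, which gives a pending Control-C a chance to be handled.
    while manager.is_alive():
        manager.join(0.1)
    return results


results = run()
print(results)  # [0, 1, 4, 9]
```

If Control-C arrives, KeyboardInterrupt is raised in the polling loop of the main thread, where it can be caught or allowed to end the program.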

Monday, October 19, 2009

Controlling Printing in Numpy

Numpy has numpy.set_printoptions for controlling the printing of arrays. See the doc for full details. Now that I know about it, I'll be using settings that display fewer digits of precision, allow a larger line width and don't summarize until the array is substantially larger:
import numpy

numpy.set_printoptions(precision=4,
                       threshold=10000,
                       linewidth=150)
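
To see what those three options change, here is a small sketch. The array contents and variable names are arbitrary; note that the settings are global and affect every array printed afterward.

```python
import numpy

numpy.set_printoptions(precision=4,
                       threshold=10000,
                       linewidth=150)

# precision: floats are shown with at most 4 digits after the decimal point.
thirds_text = str(numpy.array([1.0 / 3.0, 2.0 / 3.0]))
print(thirds_text)

# threshold: 2000 elements is below 10000, so the array is printed in full
# rather than being summarized with "...".
# linewidth: each printed row may be up to 150 characters wide.
long_text = str(numpy.arange(2000))
print(long_text)
```

numpy.get_printoptions() returns the current settings as a dict, which is handy for checking what is in effect.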