Tuesday, August 27, 2013

Aligning PDB structures with Biopython

You can use the Bio.PDB module in Biopython to align PDB files. This is how I did it. The code should be pretty much self-explanatory.

In this example I align the crystal structure of Ubiquitin (PDB code: 1UBQ) to the first structure of a corresponding NMR ensemble (PDB code: 1D3Z, see picture below).
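A minimal sketch of how such an alignment could look with Bio.PDB. It is not necessarily the exact code behind the picture. The file names 1UBQ.pdb and 1D3Z.pdb, the output name, and the helper functions ca_atoms(), superimpose() and align_files() are assumptions made for illustration. It also assumes both structures contain the same number of C-alpha atoms in the same order; if they do not, Superimposer refuses to work.

```python
from Bio.PDB import PDBIO, PDBParser, Superimposer


def ca_atoms(model):
    """Collect the C-alpha atoms of a model, skipping residues without one (e.g. water)."""
    return [residue["CA"] for residue in model.get_residues() if "CA" in residue]


def superimpose(ref_atoms, alt_atoms):
    """Fit alt_atoms onto ref_atoms; both lists must have the same length."""
    sup = Superimposer()
    sup.set_atoms(ref_atoms, alt_atoms)
    return sup


def align_files(ref_path, alt_path, out_path):
    """Align the first model in alt_path onto the first model in ref_path."""
    parser = PDBParser(QUIET=True)
    ref_model = parser.get_structure("reference", ref_path)[0]
    alt_structure = parser.get_structure("sample", alt_path)
    alt_model = alt_structure[0]

    # Compute the rotation/translation from the C-alpha atoms only ...
    sup = superimpose(ca_atoms(ref_model), ca_atoms(alt_model))
    # ... and apply it to every atom of the structure being moved.
    sup.apply(list(alt_structure.get_atoms()))

    # Write the moved structure to a new PDB file.
    io = PDBIO()
    io.set_structure(alt_structure)
    io.save(out_path)
    return sup.rms
```

Assuming the two files have been downloaded, something like align_files("1D3Z.pdb", "1UBQ.pdb", "1UBQ_aligned.pdb") would return the C-alpha RMSD and write the aligned crystal structure.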


Friday, August 16, 2013

Numpy vs. cPickles (Python, ofc)

I've been using cPickle to store data in a Python-friendly format for some time. See my earlier blog post for more on cPickle.
http://combichem.blogspot.dk/2013/02/saving-into-data-into-cpickle-format-in.html

I have also been using Numpy's save function to do the same thing. However, numpy.save() and numpy.load() are much simpler, and I really recommend using them over cPickle for most purposes.

I always thought cPickle was much, much faster than Numpy, but I guess I was wrong, according to this Stack Overflow question I just saw. Below are loading and saving times for a large array. Practically no difference between Numpy and cPickle!

Source: http://stackoverflow.com/questions/16833124/pickle-faster-than-cpickle-with-numeric-data








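To save an array called my_array into my_file.npy (a list is converted to an array first; a dictionary or other arbitrary object is pickled inside the file and needs allow_pickle=True when loading with recent Numpy versions):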

  numpy.save("my_file", my_array)

Note that Numpy appends .npy to the filename automatically.


To load the stored data, simply do:

  my_array = numpy.load("my_file.npy")
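A small round-trip sketch of the two calls above. The temporary directory, the example array and the example dictionary are made up for illustration. The dictionary part shows that non-array objects need allow_pickle=True and .item() on the way back in recent Numpy versions.

```python
import os
import tempfile

import numpy

# Example data (made up for this sketch).
my_array = numpy.arange(12, dtype=float).reshape(3, 4)

# Save into a temporary directory; Numpy appends ".npy" to the name.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "my_file")
numpy.save(path, my_array)

# Load it back and check that nothing changed.
loaded = numpy.load(path + ".npy")
print(numpy.array_equal(my_array, loaded))  # True

# A dictionary is stored as a pickled 0-d object array.
my_dict = {"a": 1, "b": [2, 3]}
dict_path = os.path.join(tmp_dir, "my_dict")
numpy.save(dict_path, my_dict)
loaded_dict = numpy.load(dict_path + ".npy", allow_pickle=True).item()
print(loaded_dict == my_dict)  # True
```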

Really py-fragging-thonicly easy!

Saturday, August 10, 2013

You know what really grinds my gears? (In Python)

I can never correctly remember when things are passed as references or copied as local variables inside functions.


Take these two innocuous-looking functions. Because both appear to do the same thing (namely set the contents of a vector, P, to [1, 1]), I call them 1 and a, respectively, since one is not better than the other.


def implementation_1(P):
    P = [1, 1]


def implementation_a(P):
    P[0] = 1
    P[1] = 1



What you would expect is one of the following two options:
  1. Both functions change P to [1, 1] (permanently).
  2. Both functions take a local copy of P and change it to [1, 1], and after the function returns, the local [1, 1] array is forgotten.

A simple test is to do this:


P = [0, 0]
print(P)

implementation_1(P)
print(P)

implementation_a(P)
print(P)


which prints:

[0, 0]
[0, 0]
[1, 1]


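So clearly implementation_a() is different from implementation_1(), although they seemingly do the same thing. The assignment P = [1, 1] only rebinds the local name P to a new list, whereas P[0] = 1 modifies the list object that the caller passed in.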
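To make the difference easier to see, here is a small sketch that runs each function on its own fresh list. The third function, implementation_slice(), is a hypothetical extra variant that is not part of the test above; it uses slice assignment, which replaces the contents of the caller's list in one statement.

```python
def implementation_1(P):
    # Rebinds the local name P to a brand-new list; the caller's list is untouched.
    P = [1, 1]


def implementation_a(P):
    # Modifies the list object that the caller also refers to.
    P[0] = 1
    P[1] = 1


def implementation_slice(P):
    # Hypothetical variant: slice assignment overwrites the contents in place.
    P[:] = [1, 1]


first = [0, 0]
implementation_1(first)
print(first)  # [0, 0]

second = [0, 0]
implementation_a(second)
print(second)  # [1, 1]

third = [0, 0]
implementation_slice(third)
print(third)  # [1, 1]
```

So a function only changes the caller's list when it modifies the object it was handed, not when it assigns a new object to the parameter name.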