Installing programs and modules needed by Selenoprofiles

This page covers the installation of the programs and modules used by selenoprofiles (profile-based gene prediction in genomes). Since I ran into problems installing or running them on some computers, I wrote this page to help anyone who hits the same problems, whether or not the goal is to install selenoprofiles. All installations here refer to Unix systems. I will go through blastall, exonerate, genewise and the Python modules networkx, fpconst and SOAPpy.
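Before installing anything, it can help to see which of the three programs are already reachable on your PATH. The following is a minimal sketch and not part of selenoprofiles: missing_tools is a hypothetical helper name, and it assumes the executables are called blastall, exonerate and genewise, as they are on this page.

```shell
# Sketch: print the name of every command that cannot be found on the PATH.
# missing_tools is a hypothetical helper, not something shipped with selenoprofiles.
missing_tools() {
    for tool in "$@"; do
        # command -v succeeds only if the shell can resolve the name
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Example call (prints nothing when all three programs are installed):
# missing_tools blastall exonerate genewise
```

Any name printed by the call is a program that still needs to be installed or added to the PATH.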

Blastall

Selenoprofiles uses blastall from the NCBI package. All 2.2.x versions are expected to work; the blast+ versions are not compatible. You can obtain the latest blastall release at ftp://ftp.ncbi.nlm.nih.gov/blast/executables/release/LATEST/ in the form of compiled binaries. The last version I personally installed is 2.2.25. There is a single problem that I experienced: if you run the blastpgp binary through links, which is what selenoprofiles does, you may get this error:

[blastpgp] WARNING: Unable to open BLOSUM62

[blastpgp] WARNING: BlastScoreBlkMatFill returned non-zero status

[blastpgp] WARNING: SetUpBlastSearch failed.

This may happen for other binaries as well. The problem is that blast cannot find the BLOSUM62 matrix. The fix is easy: you need to tell it where the matrix is. Just edit (or create) the file ~/.ncbirc and add something like this to its content:

[NCBI]

data=/path_to_blast_installation/blast-2.2.25/data
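The edit above can also be scripted. This is a minimal sketch under stated assumptions: write_ncbirc is a hypothetical helper name, the data directory is whatever path your own installation uses, and the check for a file named BLOSUM62 inside it is only a sanity check based on the warning message shown above.

```shell
# Sketch: add an [NCBI] section with a data= entry to an ncbirc file.
# write_ncbirc is a hypothetical helper; both arguments are placeholders.
write_ncbirc() {
    data_dir=$1                    # e.g. /path_to_blast_installation/blast-2.2.25/data
    rc_file=${2:-"$HOME/.ncbirc"}  # defaults to the per-user configuration file

    # Leave an existing [NCBI] section alone instead of duplicating it.
    if [ -f "$rc_file" ] && grep -q '^\[NCBI\]' "$rc_file"; then
        echo "$rc_file already has an [NCBI] section; edit it by hand" >&2
        return 1
    fi

    # Warn (but continue) if the matrix is not where we expect it.
    if [ ! -f "$data_dir/BLOSUM62" ]; then
        echo "warning: no BLOSUM62 file found in $data_dir" >&2
    fi

    printf '[NCBI]\ndata=%s\n' "$data_dir" >> "$rc_file"
}

# Example call:
# write_ncbirc /path_to_blast_installation/blast-2.2.25/data
```

The function refuses to touch a file that already has an [NCBI] section, so running it twice does not produce duplicate entries.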

Exonerate

Exonerate is an excellent suite of programs for alignments. It covers many purposes. Selenoprofiles uses its protein-to-genome mode, which allows gene prediction by homology. You can find exonerate executables here:

http://www.ebi.ac.uk/about/vertebrate-genomics/software/exonerate

Both source code and precompiled binaries are available. I never had problems installing or running exonerate.

Genewise

Genewise is a homology-based prediction tool that aligns a protein query to a nucleotide target. It comes from the Wise2 package available at EBI: go to http://www.ebi.ac.uk/Tools/Wise2/ and click "Download software", and you will be taken to the EBI ftp site. The version I last downloaded and installed is 2.2.0. Unless things have been moved around, you can download it here:

ftp://ftp.ebi.ac.uk/pub/software/unix/wise2/wise2.2.0.tar.gz

You will have to build the program. Unfortunately, there is a bug in the build, which stops with this error message:

sqio.c:232: error: conflicting types for 'getline'

/usr/include/stdio.h:653: note: previous declaration of 'getline' was here

make[1]: *** [sqio.o] Error 1

make[1]: Leaving directory `/PATH/src/HMMer2'

make: *** [realall] Error 2

The problem is the declaration of a function named getline in the file HMMer2/sqio.c: a function with this name is already declared in the system header stdio.h, as the compiler note shows. Here's how I solved it. After unpacking the .tar.gz, type:

cd wise2.2.0/src/HMMer2/

sed 's/getline/getline_new/g' sqio.c > a && mv a sqio.c
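The rename can also be wrapped so that it keeps a backup and is safe to run twice. This is only a sketch: rename_getline is a hypothetical helper name, the .orig backup suffix is my own choice, and the substitution uses the g flag so that every occurrence on a line is renamed.

```shell
# Sketch: rename getline to getline_new in one source file, keeping a backup.
# rename_getline is a hypothetical helper; pass it the path to sqio.c.
rename_getline() {
    src=$1

    # A second run would produce getline_new_new, so stop if already patched.
    if grep -q 'getline_new' "$src"; then
        echo "$src looks already patched; nothing done" >&2
        return 0
    fi

    cp "$src" "$src.orig"                                # keep the untouched file
    sed 's/getline/getline_new/g' "$src.orig" > "$src"   # rename every occurrence
}

# Example call, from wise2.2.0/src/HMMer2/:
# rename_getline sqio.c
```

If the build still fails afterwards, the original file is available as sqio.c.orig.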

The getline function is used only in sqio.c, so after this single operation everything works fine (at least it did for me). Now you can go back to wise2.2.0/src/ and type "make all". Pay attention to the final message it shows: you need to set the environment variable WISECONFIGDIR to point to the right place for genewise to work. If you forget to, you will get the following error:

Warning Error

    Could not open human.gf as a genefrequency file

Warning Error

    Could not read a GeneFrequency file in human.gf

Fatal Error

    Could not build objects!

To take care of this, I suggest adding to your bash configuration file ~/.bashrc something like this line:

export WISECONFIGDIR=/path_to_installation/wise2.2.0/wisecfg/

This way the variable is set for every new bash instance you start, with no more problems (it won't affect those already running, though).
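Adding the export line can be scripted so that it is written only once. This is a minimal sketch: add_wiseconfigdir is a hypothetical helper name, the path is a placeholder for your own installation, and the check for human.gf assumes that file sits directly in the wisecfg directory, as the error message above suggests.

```shell
# Sketch: append an export of WISECONFIGDIR to a bash startup file, once.
# add_wiseconfigdir is a hypothetical helper; both arguments are placeholders.
add_wiseconfigdir() {
    cfg_dir=$1                     # e.g. /path_to_installation/wise2.2.0/wisecfg/
    rc_file=${2:-"$HOME/.bashrc"}  # defaults to the per-user bash configuration

    # human.gf is the file genewise reports when the variable is missing or wrong.
    if [ ! -f "$cfg_dir/human.gf" ]; then
        echo "warning: $cfg_dir does not contain human.gf" >&2
    fi

    # Do nothing if an export line for the variable is already present.
    if [ -f "$rc_file" ] && grep -q '^export WISECONFIGDIR=' "$rc_file"; then
        return 0
    fi

    printf 'export WISECONFIGDIR=%s\n' "$cfg_dir" >> "$rc_file"
}

# Example call:
# add_wiseconfigdir /path_to_installation/wise2.2.0/wisecfg/
```

Shells that are already open still need the variable exported by hand, or the startup file sourced again.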

Python module: networkx

This module is needed by selenoprofiles to use the GO extension. This is fundamental if you want to run the built-in profiles. The easiest way to obtain it is through svn:

svn co https://networkx.lanl.gov/svn/networkx/trunk networkx

After running this, simply cd into the networkx directory and run:

python setup.py install

See http://networkx.lanl.gov/ for more information.

Python module: fpconst

This module is also needed by selenoprofiles to use the GO extension. You can find it here: http://pypi.python.org/pypi/fpconst. The latest version I installed is this: http://pypi.python.org/packages/source/f/fpconst/fpconst-0.7.2.tar.gz (works fine with selenoprofiles)

As with the previous module, installation is pretty straightforward. Just unpack it, cd into the directory and type:

python setup.py install

Python module: SOAPpy

This module is also needed by selenoprofiles to use the GO extension. Here's the download page: http://sourceforge.net/projects/pywebsvcs/files/SOAP.py. The version I last installed is here: http://sourceforge.net/projects/pywebsvcs/files/SOAP.py/0.12.0_rc1/SOAPpy-0.12.0.tar.gz/download

The installation of this module has a serious bug. I don't understand how it could have ever worked at all. As soon as you try the good old:

python setup.py install

you get this error message:

Traceback (most recent call last):

  File "setup.py", line 8, in <module>

    from SOAPpy.version import __version__

  File "/path/SOAPpy-0.12.0/SOAPpy/__init__.py", line 5, in <module>

    from Client      import *

  File "/path/SOAPpy-0.12.0/SOAPpy/Client.py", line 46

    from __future__ import nested_scopes

SyntaxError: from __future__ imports must occur at the beginning of the file

I tried some previous versions of SOAPpy but I got the same message. Basically, you have to move these __future__ statements to the beginning of the file in several .py modules. Here's a command line to do it right away:

cd SOAPpy-0.12.0/SOAPpy/

for i in Client.py GSIServer.py NS.py Server.py Types.py

do gawk 'BEGIN{print "from __future__ import nested_scopes"}!/from __future__ import nested_scopes/' "$i" > a && mv a "$i"

done

cd -

Now you're ready to go. From the SOAPpy-0.12.0 directory, run:

python setup.py install
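For reference, the step that moves the __future__ statements can also be written as a small function that uses plain awk and skips files that never contained the statement. This is a sketch, not the original command: hoist_future_import is a hypothetical helper name and the .tmp suffix is my own choice.

```shell
# Sketch: move "from __future__ import nested_scopes" to the first line of a file.
# hoist_future_import is a hypothetical helper; pass it one .py file.
hoist_future_import() {
    file=$1
    stmt='from __future__ import nested_scopes'

    # Leave files that do not contain the statement untouched.
    grep -qF "$stmt" "$file" || return 0

    # Print the statement first, then every line that does not contain it.
    awk -v stmt="$stmt" 'BEGIN { print stmt } index($0, stmt) == 0' "$file" > "$file.tmp" &&
        mv "$file.tmp" "$file"
}

# Example call, from SOAPpy-0.12.0/SOAPpy/:
# for i in Client.py GSIServer.py NS.py Server.py Types.py; do hoist_future_import "$i"; done
```

Running the function again on an already fixed file leaves it unchanged, since the statement is removed from wherever it is and printed once at the top.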