Computational Biology of RNA Processing

Group leader:  Roderic Guigó

Research in our group focuses on the signals involved in gene specification in genomic sequences (promoter elements, splice sites, translation initiation sites, etc.). We are interested both in the mechanism of their recognition and processing and in their evolution. Related to this basic component of our research, our group also develops software for gene prediction and annotation in genomic sequences. We actively participate in the analysis of many eukaryotic genomes and are involved in the NIH-funded ENCODE project. Furthermore, we are members of two large cancer-study consortia: chronic lymphocytic leukemia (CLL) and breast cancer (Hospital del Mar/CRG/Roche). These are some of the other projects we are currently involved in:

    • Gene Prediction/Genome Annotation

    • Genome-Wide Search for Selenoproteins

    • Methods for RNAseq (NGS) data analysis

    • Development of methods to analyze the relationship between chromatin and splicing

    • Long Noncoding RNAs with Enhancer-like Function in Human Cells

SECISearch3/Seblastian published in NAR

Selenoproteins are a peculiar class of proteins containing an in-frame UGA codon (normally a stop codon) recoded to selenocysteine, the 21st amino acid. Gene prediction programs normally fail to predict selenoproteins correctly, so selenoproteins are generally misannotated in public protein databases. Until now, their correct prediction was a task typically carried out by only a few experts in the field.
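To make the recoding problem concrete, here is a toy sketch that lists in-frame UGA codons occurring before the last codon of a coding sequence. These are the positions where a naive predictor would stop translation early. The function name and the example sequence are made up for illustration, and this is not how SECISearch3 or Seblastian work.

```python
def in_frame_uga_positions(cds):
    """Return 0-based nucleotide offsets of in-frame UGA codons.

    The last complete codon is skipped, since a UGA there is an
    ordinary stop. Accepts DNA (T) or RNA (U) input.
    """
    rna = cds.upper().replace("T", "U")
    usable = len(rna) - len(rna) % 3  # ignore a trailing partial codon
    codons = [rna[i:i + 3] for i in range(0, usable, 3)]
    return [3 * i for i, codon in enumerate(codons[:-1]) if codon == "UGA"]


if __name__ == "__main__":
    # Hypothetical sequence: AUG GCU UGA GGC UAA
    print(in_frame_uga_positions("AUGGCUUGAGGCUAA"))
```

An in-frame UGA found this way is only a candidate: deciding whether it really encodes selenocysteine needs further evidence that this sketch does not look at.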

Next Generation Molecular Cloning Course Taught by Guigo Lab

The Guigo group wet lab members (Carme Arnan, Alessandra Breschi and Rory Johnson) taught a one-day course on Next Generation Molecular Cloning techniques to CRG staff and PhD students.

Grape road show

Yesterday, David Gonzalez and Maik Röder gave a talk in Paris about "RNA-Seq data analysis on the ENCODE project using the Grape pipeline".

We were invited to come to the lab of Daniel Gautheret, professor at the Université Paris-Sud, whose group is working on RNA sequence, structure and function.

Grape hands-on tutorial

Yesterday, we offered a training session on Grape, one of the tools we are developing at the CRG in the Bioinformatics and Genomics group led by Roderic Guigó Serra.

Grape is a tool designed to automate some of the standard analyses frequently required for RNA-Seq data such as alignment and quantification. It also allows for visualization of the data through a web interface.


The tutorial was given by David Gonzalez Knowles and Maik Röder to 20 participants, who were provided with laptops. They followed the steps needed to set up their own RNA-Seq pipeline with Grape and then install a web server to visualize the results.

Tomato genome deciphered


Roderic Guigó, Francisco Câmara, Tyler Alioto and Paolo Ribeca (the latter two now at the CNAG in Barcelona) have participated in a seven-year study that has led to the sequencing and annotation of the commercially available tomato species (Solanum lycopersicum). Our work at the CRG has contributed to finding the ~35,000 genes contained in the tomato genome. We have also helped annotate the potato genome, which was found to differ from that of tomato by only ~8.5% in its euchromatic regions.

This multinational effort, which included groups from fourteen countries, has recently been published in the journal Nature.


There are now -50- geneid (1.4) species-specific parameter files

tree of life showing geneid-trained species

We have recently added three new species-specific parameter files to our ab initio gene predictor (geneid) home page. This has allowed us to reach a new milestone: we now have fifty parameter files for geneid, with which we can predict genes in a wide range of species spanning all four "classical Kingdoms" (Animalia, Plantae, Fungi and Protista). This tree-of-life rectangular cladogram portrays all the species for which we now have a geneid parameter file.

With the latest additions to our list of geneid parameter files we can now also predict protein-coding genes in the model cereal/grass/monocot Brachypodium distachyon, in the ciliated protozoan Tetrahymena thermophila and in the potentially plant-pathogenic fungus Fusarium oxysporum.

RECOMB 2012 (16th International Conference on Research in Computational Molecular Biology)


RECOMB 2012 (the 16th International Conference on Research in Computational Molecular Biology) will take place in vibrant and beautiful Barcelona from 21 to 24 April 2012. This year's conference will have a strong focus on the computational challenges arising from the extraordinary developments in high-throughput technologies. Participants will enjoy a full agenda of keynote talks, paper presentations and poster sessions, with ample opportunities to sample some of the city life of Barcelona. The meeting also overlaps with Sant Jordi (Saint George's Day) on 23 April, the feast of the patron saint of Catalonia and one of the most important civic holidays in the country. For more information on RECOMB 2012 and how to register, please click here.


New ab initio gene predictor geneid parameter files available

We have recently added four new species-specific parameter files to our ab initio gene predictor (geneid) home page.

These four new parameter files allow for gene predictions in Bombus terrestris and Bombus impatiens (two insects of the order Hymenoptera), Plasmodium vivax (a protozoan parasite that is one of the causative agents of malaria) and Phaseolus vulgaris (common bean).

We currently have forty-four parameter files for our ab initio gene prediction program geneid, which allow us to predict genes in a wide range of species spanning all four "classical Kingdoms" (Animalia, Plantae, Fungi and Protista).

X CRG Annual Symposium "Computational Biology of Molecular Sequences" (10-11 November 2011)

On 10-11 November 2011 our group, headed by Roderic Guigo, will be organizing a two-day CRG symposium on “Computational Biology of Molecular Sequences”, which will bring together renowned computational biologists from around the world, including both pioneers in the field and promising young scientists. Presentations, discussions and dialogue during the symposium will help survey the state of a discipline that, at the intersection of biology and computation, will have an enormous impact on the world of the 21st century.

You may obtain more information on the objectives of the symposium, the current list of confirmed speakers and instructions on how to register by clicking here. The poster of the symposium can be downloaded by clicking here.

Installing programs and modules needed by Selenoprofiles

This page covers the installation of programs and modules used by selenoprofiles (profile-based gene prediction in genomes; you can find it here). Since I encountered some problems installing or running them on some computers, I created this page to help anyone who runs into the same problems, whether or not they are installing selenoprofiles. All installations here refer to Unix systems. I will go through: blastall, exonerate, genewise and the Python modules networkx, fpconst and SOAPpy.
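Before installing anything, it can help to see which of these dependencies are already present. The sketch below is only an illustration and is not part of selenoprofiles. It assumes that the executables are named exactly as listed above and are expected on the PATH, and that the Python modules are imported under the names given. The helper name find_missing is made up for this example.

```python
import importlib.util
import shutil

# Names taken from the list above (assumed executable and import names).
PROGRAMS = ["blastall", "exonerate", "genewise"]
MODULES = ["networkx", "fpconst", "SOAPpy"]


def find_missing(programs, modules):
    """Return (missing_programs, missing_modules) for this environment."""
    # shutil.which returns None when no executable of that name is on PATH.
    missing_programs = [p for p in programs if shutil.which(p) is None]
    # find_spec returns None when a top-level module cannot be located.
    missing_modules = [m for m in modules
                       if importlib.util.find_spec(m) is None]
    return missing_programs, missing_modules


if __name__ == "__main__":
    programs, modules = find_missing(PROGRAMS, MODULES)
    print("Missing programs:", ", ".join(programs) or "none")
    print("Missing Python modules:", ", ".join(modules) or "none")
```

Note that the module check only reflects the Python interpreter that runs this script, which may not be the interpreter selenoprofiles itself uses.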

Roderic Guigo  group photo 2013