Self-training software to help fight crop disease
Alison Testa from Curtin University has developed sophisticated, self-training software that can help to identify which fungal genes are responsible for crop disease.
“There are several fungal pathogens that attack important crops,” she says.
“Ideally, we want to get the pathogen’s genome sequence, predict the genes really accurately, and then analyse which genes are important to how the pathogen kills the crop.”
Ms Testa and her co-workers are currently working on barley, which is attacked by the fungal pathogen that causes the disease net blotch.
“It kind of looks like little brown hashtags on the leaves, and these stop the plant from photosynthesising as well as it should, leading to a decrease in yield,” she explains.
The software, CodingQuarry, uses a novel method to incorporate a relatively new type of data called RNA-seq, which comes from sequencing the RNA transcripts that are translated into proteins inside the pathogen’s cells.
“We had this really great RNA-seq data, which can help you to identify where the genes are within the genome, but we didn’t have a really good way of incorporating it into the gene prediction programs that were available,” says Ms Testa.
“We could also see that a lot of the software was intended to be used on the human, mouse or insect genomes, and we were seeing some particular problems with how it was working for fungi.
“So we wanted to have a really good go at making a new gene prediction software that was specific to fungi, and that could also incorporate RNA-seq.”
CodingQuarry uses RNA-seq data aligned to the fungal genome, which provides insight into where the introns or ‘breakpoints’ are within a gene and helps to predict the protein coding sequence.
“Without knowing accurately where those breakpoints are, we end up with the wrong protein sequence. When we go to try and figure out how the protein functions, we’re basing it on something incorrect,” explains Ms Testa.
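To see why a misplaced breakpoint matters, here is a small illustrative sketch in Python. It is not CodingQuarry's algorithm: the toy gene, the intron coordinates and the helper names `splice` and `translate` are all made up for this example. It only shows that moving an intron boundary by two bases shifts the reading frame and yields a different protein.

```python
from itertools import product

# Standard genetic code, listed in TCAG order ("*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def splice(gene, introns):
    """Remove introns, given as 0-based half-open (start, end) intervals."""
    exons = []
    pos = 0
    for start, end in sorted(introns):
        exons.append(gene[pos:start])
        pos = end
    exons.append(gene[pos:])
    return "".join(exons)


def translate(cds):
    """Translate a coding sequence codon by codon, stopping at a stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical toy gene: exon 1, an 11-base intron, then exon 2.
EXON1 = "ATGGCT"
INTRON = "GTAAGTTTCAG"
EXON2 = "GAAAAATAA"
GENE = EXON1 + INTRON + EXON2

correct = translate(splice(GENE, [(6, 17)]))  # true intron boundaries
wrong = translate(splice(GENE, [(6, 15)]))    # intron end placed 2 bases early

print(correct)  # MAEK
print(wrong)    # MARKN
```

In this toy case the two predictions agree only on the first two amino acids. The mispredicted version also runs through the real stop codon, which is the kind of error that can cause neighbouring genes to be joined together.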
“We found that this is really important for fungi because they have very tight-packed genomes – all the genes are really close together. Other gene prediction software tends to miss these breakpoints and accidentally join genes together.”
CodingQuarry is about 90% accurate, and relatively quick and easy to use. It is also self-training, says Ms Testa.
“Usually if you’re doing gene prediction you need to have a set of known genes for that organism that you give to the software to train the parameters from. But obviously this is quite tricky for researchers who are working on a completely new organism,” she says. “CodingQuarry self-trains using the RNA-seq data.”
The new software can also be broadly applied to gene prediction for many different fungi. This is particularly useful for the food and medical industries. Yeasts, for instance, are widely used in beer and bread production.
“It is giving us a platform to really address a lot of the issues facing fungal researchers,” says Ms Testa.
CodingQuarry is freely available at http://sourceforge.net/projects/codingquarry/