A second poster from our working group will be presented at ECCB 2016. Daniela Beisser will present a poster on “Taxonomic assignment of protist metatranscriptome sequences”. She will also present the topic at the ECCB workshop “W11 – Recent Computational Advances in Metagenomics (RCAM’16)” on 4 September. See the workshop website for more information.
Taxonomic assignment of protist metatranscriptome sequences
Daniela Beisser, Nadine Graupner, Lars Grossmann, Jens Boenigk and Sven Rahmann
Next-generation sequencing (NGS) technologies are increasingly applied to analyse complex microbial ecosystems by mRNA sequencing of whole communities, also known as metatranscriptome sequencing. In principle, each sequenced mRNA makes it possible both to identify the species of origin and to assign a function to the transcribed gene. While the functional information is sufficiently covered by databases such as UniProt, NCBI, KEGG and many others, species identification is currently limited by incomplete reference databases. Inferring the community composition from metatranscriptomic samples is thus still a difficult problem. At the moment, most analyses are restricted to prokaryotic communities, which enjoy better database coverage, or to communities of a few known species with sequenced genomes, or to a combination of rRNA and mRNA sequencing. However, the latter approach does not allow taxonomic and functional information to be linked directly.
Our approach focuses on the accurate assignment of taxonomic groups to metatranscriptomic reads. We constructed a custom database that comprises all major eukaryotic groups, developed a stand-alone tool that assigns reads with a low false discovery rate, and created a workflow for complete metatranscriptome analysis. The workflow covers all bioinformatic steps: preprocessing of the raw data, taxonomic and functional assignment, and visualisation of the results.