A second poster from our working group will be presented at ECCB 2016. Daniela Beisser will present a poster about Taxonomic assignment of protist metatranscriptome sequences. She will also present the topic during the ECCB workshop “W11 – Recent Computational Advances in Metagenomics (RCAM’16)” on 4th September. See the workshop website for more information.
Taxonomic assignment of protist metatranscriptome sequences
Daniela Beisser, Nadine Graupner, Lars Grossmann, Jens Boenigk and Sven Rahmann
Next generation sequencing (NGS) technologies are increasingly applied to analyse complex microbial ecosystems by mRNA sequencing of whole communities, also known as metatranscriptome sequencing. In principle, each sequenced mRNA makes it possible both to identify the species of origin and to assign a function to the transcribed gene. While the functional information is sufficiently covered by databases such as UniProt, NCBI, KEGG and many others, species identification is currently limited by incomplete reference databases. Inferring the community composition from metatranscriptomic samples is thus still a difficult problem. At the moment, most analyses are restricted to prokaryotic communities, which enjoy better database coverage, or to communities of a few known species with sequenced genomes, or to a combination of rRNA and mRNA sequencing. However, the latter approach does not allow taxonomic and functional information to be linked directly.
Our approach focuses on an accurate assignment of taxonomic groups to metatranscriptomic reads. We constructed a custom database that comprises all major eukaryotic groups, developed a stand-alone tool to assign reads with a low false discovery rate and created a workflow for complete metatranscriptome analysis. The workflow covers all bioinformatic steps: preprocessing of the raw data, taxonomic and functional assignment, and visualisation of the results.
A poster about the Exome Analysis GraphicaL Environment (EAGLE) was accepted for ECCB 2016 in The Hague. Felix Mölder will present the poster there.
EAGLE: an easy-to-use web-based exome analysis environment
Christopher Schröder, Felix Mölder, Christoph Stahl and Sven Rahmann
High-throughput exome sequencing is a widely used technology for deciphering mutations in the coding regions of a genome at relatively low cost. While bioinformatics analyses of exome sequencing data mostly agree on best practices regarding the analysis steps, the genomic variants that are called depend on the chosen parameters and the filtering applied. We present EAGLE, a software tool that combines a best-practices variant calling workflow with a web frontend. By storing the called variant information in HDF5 files (instead of SQL databases), EAGLE allows filtering and parameter tuning in almost real time. This enables medical PIs to iteratively tune thresholds, or to select different samples for filtering, via the web interface. The web interface presents metadata, annotations, quality control data and statistics to facilitate a comprehensive data analysis on different levels.
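To illustrate why re-filtering stored variant calls can be fast, here is a minimal sketch, not EAGLE's actual code. It assumes the variants are kept as per-field columns (as one might lay them out in HDF5 datasets) and uses plain in-memory arrays as a stand-in. The field names `quality`, `depth` and `allele_freq`, the values, and the function `filter_variants` are all hypothetical.

```python
from array import array

# Hypothetical per-variant columns. In an HDF5 file, each of these
# could be stored as one dataset; here they are plain in-memory arrays.
columns = {
    "quality": array("d", [50.0, 12.5, 99.0, 30.0]),
    "depth": array("i", [40, 8, 120, 15]),
    "allele_freq": array("d", [0.48, 0.10, 0.51, 0.02]),
}


def filter_variants(columns, min_quality, min_depth, min_allele_freq):
    """Return the indices of variants that pass all three thresholds."""
    n = len(columns["quality"])
    return [
        i
        for i in range(n)
        if columns["quality"][i] >= min_quality
        and columns["depth"][i] >= min_depth
        and columns["allele_freq"][i] >= min_allele_freq
    ]


# Changing thresholds only re-scans the stored columns;
# variant calling itself is not repeated.
strict = filter_variants(columns, min_quality=40.0, min_depth=30, min_allele_freq=0.2)
relaxed = filter_variants(columns, min_quality=20.0, min_depth=10, min_allele_freq=0.01)
print(strict, relaxed)
```

Under these assumptions, each threshold change is a single pass over a few numeric columns, which is what makes interactive tuning plausible.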
Target identification for metabolic engineering
Christopher Schröder, Sven Rahmann
In metabolic engineering by gene knockouts, one searches for genes controlling metabolic reactions that should be removed from a metabolic network in order to optimize the yield of a desired metabolite.
Conventionally, this is done by undirected mutagenesis followed by selection of the population with the best efficiency.
Unrean et al. developed a simple algorithm that predicts reaction targets directly, avoiding the high cost of this uncontrolled process. It is based on elementary modes, that is, non-decomposable sequences of metabolite transformation flows in the network.
We substantially improved the algorithm and applied it to a metabolic network of Escherichia coli to demonstrate the improved results.
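The general idea of selecting knockout targets from elementary modes can be sketched as follows. This is an illustrative greedy heuristic under stated assumptions, not the algorithm of Unrean et al. and not our improved version. It assumes each elementary mode is given as the set of reactions it uses plus a product yield, and that knocking out a reaction removes every mode that uses it. The reaction names, yields, and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ElementaryMode:
    reactions: frozenset  # reactions carrying flux in this mode
    product_yield: float  # product per unit substrate (toy values)


def remaining_modes(modes, knockouts):
    """Modes that survive: those using none of the knocked-out reactions."""
    return [m for m in modes if not (m.reactions & knockouts)]


def mean_yield(modes):
    return sum(m.product_yield for m in modes) / len(modes)


def greedy_knockouts(modes, max_knockouts):
    """Greedily pick knockouts that raise the mean yield of surviving modes."""
    if not modes:
        raise ValueError("at least one elementary mode is required")
    knockouts = []
    current = list(modes)
    for _ in range(max_knockouts):
        best, best_score = None, mean_yield(current)
        candidates = sorted(set().union(*(m.reactions for m in current)))
        for reaction in candidates:
            survivors = remaining_modes(current, frozenset([reaction]))
            if not survivors:
                continue  # this knockout would eliminate every mode
            score = mean_yield(survivors)
            if score > best_score:
                best, best_score = reaction, score
        if best is None:
            break  # no single knockout improves the mean yield further
        knockouts.append(best)
        current = remaining_modes(current, frozenset([best]))
    return sorted(knockouts), current


# Toy network with four hypothetical elementary modes.
toy_modes = [
    ElementaryMode(frozenset({"R1", "R2", "R5"}), 0.9),
    ElementaryMode(frozenset({"R1", "R3", "R5"}), 0.2),
    ElementaryMode(frozenset({"R1", "R4"}), 0.0),
    ElementaryMode(frozenset({"R1", "R2", "R4"}), 0.5),
]
targets, survivors = greedy_knockouts(toy_modes, max_knockouts=2)
print(targets, mean_yield(survivors))
```

In this toy example the heuristic removes the reactions used only by low-yield modes, leaving the single high-yield mode. A real application would also have to preserve growth-supporting modes, which this sketch ignores.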