Christopher Schröder and Sven Rahmann
Algorithms for Molecular Biology
Christopher Schröder*, Elsa Leitão*, Stefan Wallner, Gerd Schmitz, Ludger Klein-Hitpass, Anupam Sinha, Karl-Heinz Jöckel, Stefanie Heilmann-Heimbach, Per Hoffmann, Markus M. Nöthen, Michael Steffens, Peter Ebert, Sven Rahmann and Bernhard Horsthemke
* Contributed equally
Epigenetics & Chromatin 2017
There is increasing evidence for inter-individual methylation differences at CpG dinucleotides in the human genome, but the regional extent and function of these differences have not yet been studied in detail. For identifying regions of common methylation differences, we used whole genome bisulfite sequencing data of monocytes from five donors and a novel bioinformatic strategy.
We identified 157 differentially methylated regions (DMRs) with four or more CpGs, almost none of which has been described before. The DMRs fall into different chromatin states, where methylation is inversely correlated with active, but not repressive histone marks. However, methylation is not correlated with the expression of associated genes. High-resolution single nucleotide polymorphism (SNP) genotyping of the five donors revealed evidence for a role of cis-acting genetic variation in establishing methylation patterns. To validate this finding in a larger cohort, we performed genome-wide association studies (GWAS) using SNP genotypes and 450k array methylation data from blood samples of 1128 individuals. Only 30/157 (19%) DMRs include at least one 450k CpG, which shows that these arrays miss a large proportion of DNA methylation variation. In most cases, the GWAS peak overlapped the CpG position, and these regions are enriched for CREB group, NF-1, Sp100 and CTCF binding motifs. In two cases, there was tentative evidence for a trans-effect by KRAB zinc finger proteins.
Allele-specific DNA methylation occurs in discrete chromosomal regions and is driven by genetic variation in cis and trans, but in general has little effect on gene expression
A poster about the Exome Analysis GraphicaL Environment (EAGLE) was accepted for ECCB 2016 in The Hague. Felix Mölder will present the poster there.
EAGLE: an easy-to-use web-based exome analysis environment
Christopher Schröder, Felix Mölder, Christoph Stahl and Sven Rahmann
High throughput exome sequencing is a widely used technology for deciphering mutations in the coding regions of a genome at relatively low cost. While bioinformatics analyses of exome sequencing data mostly agree on best practices regarding the analysis steps, called genomic variants depend on the set of parameters and applied filtering. We present EAGLE, a software that combines a best practices variant calling workflow with a web frontend. By storing the called variant information in HDF5 files (instead of SQL databases), EAGLE allows filtering and parameter tuning in almost real time. This enables iterative tuning of thresholds, or the selection of different samples for filtering by medical PIs via the web interface. The web interface presents metadata, annotations, quality control data and statistics to facilitate a comprehensive data analysis on different levels.
Christopher Schröder, Daniela Beißer and Sven Rahmann from the Genome Informatics group contributed to novel insights about epigenetic changes during cell differentiation. The article will appear soon in the renowned journal “Epigenetics & Chromatin” (IF 4.873), published by BioMed Central.
Epigenetic dynamics of monocyte to macrophage differentiation
by Stefan Wallner, Christopher Schröder, Elsa Leitão, Tea Berulava, Claudia
Haak, Daniela Beißer, Sven Rahmann, Andreas S Richter, Thomas Manke,
Ulrike Böhnisch, Laura Arrigoni, Sebastian Fröhler, Filippos Klironomos,
Wei Chen, Nikolaus Rajewsky, Fabian Müller, Peter Ebert, Thomas
Lengauer, Matthias Barann, Philip Rosenstiel, Gilles Gasparoni, Karl
Nordström, Jörn Walter, Benedikt Brors, Gideon Zipprich, Bärbel Felder,
Ludger Klein-Hitpass, Corinna Attenberger, Gerd Schmitz, Bernhard Horsthemke
Monocyte to macrophage differentiation involves major biochemical and
structural changes. In order to elucidate the role of gene regulatory
changes during this process, we used high-throughput sequencing to
analyze the complete transcriptome and epigenome of human monocytes that
were differentiated in vitro by addition of colony stimulating factor 1
(CSF1) in serum-free medium. Numerous mRNAs and miRNAs were
significantly up- or downregulated. More than 100 discrete DNA regions,
most often far away from transcription start sites, were rapidly
demethylated by the ten-eleven translocation (TET) enzymes, became
nucleosome-free and gained histone marks indicative of active enhancers.
These regions were unique for macrophages and associated with genes
involved in the regulation of the actin cytoskeleton, phagocytosis and
innate immune response. In summary, we have discovered a phagocytic gene
network that is repressed by DNA methylation in monocytes and rapidly
de-repressed after the onset of macrophage differentiation.
An article by Christopher Schröder and Sven Rahmann about estimating parameters of beta mixture models, which has applications in determining the methylation state of genomic regions, has been accepted at WABI 2016 and will be presented at the conference in Aarhus (Denmark), August 22-24, 2016. The paper will be available in the WABI 2016 proceedings (LNBI series, Springer Verlag) in August 2016.
A hybrid parameter estimation algorithm for beta mixtures and applications to methylation state classification
by Christopher Schröder and Sven Rahmann
Mixtures of beta distributions have previously been shown to be a flexible tool for modeling data with values on the unit interval, such as methylation levels. However, maximum likelihood parameter estimation with beta distributions suffers from problems because of singularities in the log-likelihood function if some observations take the values 0 or 1. While ad-hoc corrections have been proposed to mitigate this problem, we propose a different approach to parameter estimation for beta mixtures where such problems do not arise in the first place. Our algorithm has significant computational advantages over the maximum-likelihood-based EM algorithm. As an application, we demonstrate that methylation state classification is more accurate when using adaptive thresholds from beta mixtures than non-adaptive thresholds on observed methylation levels.
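To illustrate why boundary values are a problem for maximum likelihood but not for every estimator, here is a minimal sketch that fits a single beta distribution by matching its mean and variance. Moment matching is used here as an assumption for illustration only; the abstract above does not say which technique the hybrid algorithm relies on. The function name `beta_from_moments` and the example data are hypothetical.

```python
from statistics import fmean, pvariance


def beta_from_moments(levels):
    """Estimate (alpha, beta) of a beta distribution from sample moments.

    Only the sample mean and variance are used, so observations equal to
    0 or 1 cause no trouble. A log-likelihood, in contrast, contains
    log(x) and log(1 - x) terms that are undefined at those values.
    """
    m = fmean(levels)
    v = pvariance(levels, mu=m)
    # A beta distribution requires 0 < variance < mean * (1 - mean).
    if not 0.0 < v < m * (1.0 - m):
        raise ValueError("sample moments are not compatible with a beta distribution")
    common = m * (1.0 - m) / v - 1.0
    return m * common, (1.0 - m) * common


# Hypothetical methylation levels, including an exact 0.
levels = [0.0, 0.1, 0.2, 0.3, 0.4]
alpha, beta = beta_from_moments(levels)
print(alpha, beta)
```

In a mixture setting, one could apply the same formulas to weighted means and variances per component. That extension is not shown here.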
Exome Analysis GraphicaL Environment (EAGLE)
The Exome Analysis GraphicaL Environment (EAGLE) combines a best practices variant calling workflow with a web frontend. By storing the called variant information in specifically structured HDF5 files, EAGLE allows filtering and parameter tuning in almost real time. This enables iterative tuning of thresholds, or the selection of different samples for filtering, by users who are not computer scientists via the web interface.
Bioinformatics Analysis of Heterogenous Data Reveals Characteristic Mutational Landscapes of Neuroblastoma Relapses, GCB 2015 in Dortmund
Neuroblastoma is a malignancy of the developing sympathetic nervous system that causes 15% of childhood cancer-related mortality. However, in the vast majority of cases, death results not from the initial disease manifestation but from metastasis or recurrence.
Systematic search for genomic alterations in primary neuroblastomas has shown low genetic complexity, with significant mutations in only a very few genes. This study explored the genomic landscape of relapsing neuroblastoma in order to evaluate ‘driver’ mutations to be exploited as therapeutic targets.
Evolutionary Origin and Methylation Status of Human Intronic CpG Islands that Are Not Present in Mouse
Rademacher, K., Schröder, C., Kanber, D., Klein-Hitpass, L., Wallner, S., Zeschnigk, M., Horsthemke, B.
Genome Biol Evol 6, 1579–1588 (2014), doi:10.1093/gbe/evu125
Imprinting of the human RB1 gene is due to the presence of a differentially methylated CpG island (CGI) in intron 2, which is part of a retrocopy derived from the PPP1R26 gene on chromosome 9. The murine Rb1 gene does not have this retrocopy and is not imprinted. We have investigated whether the RB1/Rb1 locus is unique with respect to these differences.
Christopher Schröder, Johannes Köster, Christoph Stahl, Sebastian Venier, Sven Rahmann, Marcel Martin
Exomate is an exome-sequencing pipeline with a web frontend. It automates most steps needed to go from FASTQ files to variant calls, puts the calls and metadata about patients, samples, etc. into a database and then allows interactive analysis via a web frontend. It is primarily designed for easy use and has already been used in various studies [1,2,3].
[1] Martin, M. et al., 2013. Exome sequencing identifies recurrent somatic mutations in EIF1AX and SF3B1 in uveal melanoma with disomy 3. Nat. Genet. 45, 933–936.
[2] Czeschik, J.C. et al., 2013. Clinical and mutation data in 12 patients with the clinical diagnosis of Nager syndrome. Hum. Genet. 132, 885–898.
[3] Voigt, C. et al., 2013. Oto-facial syndrome and esophageal atresia, intellectual disability and zygomatic anomalies – expanding the phenotypes associated with EFTUD2 mutations. Orphanet J. Rare Dis. 8, 110.
Target identification for metabolic engineering
In metabolic engineering by gene knockouts, one searches for genes controlling metabolic reactions that should be removed from a metabolic network in order to optimize the yield of a desired metabolite.
Conventionally, this is done by undirected mutagenesis followed by selection of the population with the best efficiency.
To avoid the high cost of this uncontrolled process, Unrean et al. developed a simple algorithm that predicts reaction targets directly. It is based on elementary modes, which are non-decomposable sequences of metabolite transformation flows in the network.
We substantially improved the algorithm and applied it to a network of Escherichia coli to show the improved results.
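The general idea of choosing knockout targets from elementary modes can be sketched with a toy greedy procedure. This is an illustration under stated assumptions, not the algorithm of Unrean et al. or its improved version: each mode is represented as a set of reaction names with a product yield, and the reaction names, yields, threshold and function names below are all hypothetical.

```python
def best_knockout(modes, min_yield):
    """Pick one reaction whose removal eliminates the most low-yield modes.

    modes: list of (set_of_reaction_names, product_yield) pairs.
    Reactions used by any mode with yield >= min_yield are protected,
    because removing them would also destroy a desired mode.
    Returns None if no admissible knockout exists.
    """
    keep = [rxns for rxns, y in modes if y >= min_yield]
    drop = [rxns for rxns, y in modes if y < min_yield]
    protected = set().union(*keep)
    candidates = set().union(*drop) - protected
    if not candidates:
        return None
    scores = {r: sum(r in rxns for rxns in drop) for r in candidates}
    # Iterate in sorted order so that ties are broken alphabetically.
    return max(sorted(scores), key=scores.get)


def knockout_plan(modes, min_yield):
    """Greedily collect knockouts until no further admissible target is found."""
    plan = []
    remaining = list(modes)
    while True:
        target = best_knockout(remaining, min_yield)
        if target is None:
            return plan
        plan.append(target)
        # A mode that uses the knocked-out reaction can no longer operate.
        remaining = [(rxns, y) for rxns, y in remaining if target not in rxns]


# Hypothetical toy network with four elementary modes.
toy_modes = [
    ({"R1", "R2", "R5"}, 0.9),
    ({"R1", "R3", "R4"}, 0.2),
    ({"R1", "R3", "R6"}, 0.1),
    ({"R1", "R4", "R7"}, 0.0),
]
print(knockout_plan(toy_modes, min_yield=0.5))
```

Protecting every reaction of every high-yield mode is a deliberately strict simplification. A more realistic scheme might tolerate losing some efficient modes in exchange for removing many inefficient ones.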