Software

Popular
Snakemake Johannes Köster a Python-like and make-like programming language for data analysis workflows, especially popular in the NGS community
cutadapt Marcel Martin primary processing of NGS reads (454, Illumina, SOLiD, …): quality trimming, adapter cutting, name mangling, etc.
skylign Jody Clements, Travis Wheeler, Robert Finn, Benjamin Schuster-Böckler, Sven Rahmann a server for generating HMM logos, a generalization of sequence logos that uses stack width to visualize insertion/deletion probabilities, based on original ideas from Benjamin Schuster-Böckler’s Bachelor thesis.

Recent
warp Christoph Stahl a web-based frontend for argparse. It allows launching and running Python programs through a web browser.
SimLoRD Bianca Stöcker, Sven Rahmann a read simulator for third generation sequencing reads focused on the Pacific Biosciences SMRT error model.
dupre Christopher Schröder, Sven Rahmann a tool to estimate the duplicate rate of a sequencing library at a given sequencing depth N, when the occupancy vector of a (small) subsample is known. This is useful when deciding which sequencing depth to aim for, weighing the potential for new discoveries against cost.
amplikyzer Sven Rahmann, Marcel Bargull a versatile and fast tool for the analysis of amplicon reads, especially for methylation studies with CpG bisulfite sequencing and NOME-seq.
dinopy Henning Timm, Till Hartmann a DNA input and output library for Python, providing readers and writers for FASTA and FASTQ files, along with support for samtools faidx files, and generators for solid and gapped q-grams (k-mers).
MoSDi Tobias Marschall methods for (DNA) motif statistics, e.g. computing the exact occurrence count distribution of a motif, exact motif discovery, extraction of motifs with provably optimal p-value, and analysis of pattern matching algorithms (computing, for a given algorithm and pattern, the exact distribution of the number of character accesses caused by searching a random text).
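
The dinopy entry above mentions solid and gapped q-grams. As a minimal illustration of the two notions, here is a plain-Python sketch; it does not use dinopy and does not reflect its API. The function names and the shape notation (`#` for a kept position, `_` for a skipped one) are assumptions made for this example only.

```python
def solid_qgrams(seq, q):
    """Yield every contiguous substring of length q (solid q-grams, i.e. k-mers)."""
    for i in range(len(seq) - q + 1):
        yield seq[i:i + q]


def gapped_qgrams(seq, shape):
    """Yield gapped q-grams for a shape string.

    The shape notation is hypothetical: '#' keeps a position, '_' skips it.
    """
    keep = [j for j, c in enumerate(shape) if c == "#"]
    span = len(shape)  # window width, including skipped positions
    for i in range(len(seq) - span + 1):
        yield "".join(seq[i + j] for j in keep)


if __name__ == "__main__":
    print(list(solid_qgrams("ACGTAC", 3)))
    print(list(gapped_qgrams("ACGTAC", "#_#")))
```

A solid q-gram is the special case of a gapped q-gram whose shape consists only of kept positions, so `gapped_qgrams(seq, "###")` yields the same strings as `solid_qgrams(seq, 3)`.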

Older
PEAX Marianna D’Addario, Dominik Kopczynski, Sven Rahmann a collection of fully automated peak extraction methods for MCC/IMS datasets, provided as a modular extensible framework backed by an open source implementation.
FLowG Marcel Martin, Sven Rahmann a proof-of-concept implementation of flowgram-string alignment as described in our GCB 2013 paper “Aligning Flowgrams to DNA Sequences”.
StylPyl Sven Rahmann a command line utility that scans a plain-text or TeX input file for patterns that indicate bad writing style (e.g., according to Strunk and White’s classic “The Elements of Style”) and reports each occurrence together with suggestions for improvement. Beta version.