ReproPhylo is a reproducible phylogenomics pipeline written in Python that makes use of Biopython and other open tools. It is open source software (CC0 public domain) for you to use, modify and distribute as you see fit. This is community software, and we welcome your contributions.
ReproPhylo is described in this publication.
We are still largely missing the benefits of reproducibility in phylogenomics. This makes our lives unnecessarily difficult and makes us particularly poorly prepared to confront modern data-rich phylogenomics.
Our ideas are outlined in a series of reproducible phylogenetics blog posts: part 1, part 2a, part 2b.
ReproPhylo allows you to carry out a phylogenomic analysis using pre-written or self-written commands. This programmatic approach ensures that all stages of the analysis are explicitly recorded and can be exactly reproduced if required.
- Input sequence data and generated intermediate data files (e.g. alignments, metadata) are tracked and held in version control, so there can be no doubt which version of which file was used for any analysis.
- ReproPhylo writes a detailed, human-readable graphical report for each experiment.
- ReproPhylo creates a .zip archive containing the entire experiment, ready to upload to FigShare or an equivalent repository.
- ReproPhylo is best deployed as a Docker container, which includes not just the experimental components but also the phylogenetics programs and any dependencies, ensuring that it can be run exactly as it was on the previous phylogeneticist's machine.
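The idea of knowing exactly which version of each file went into an analysis can be illustrated with a minimal sketch. This is not ReproPhylo's API: the function names `file_digest` and `record_step`, the file names, and the log layout are all hypothetical, and ReproPhylo itself relies on version control rather than a hand-rolled log like this one. The sketch only shows the principle of identifying input and output files by their content.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def record_step(log, name, inputs, outputs):
    """Append one analysis step to the log, identifying each file by digest.

    Hypothetical helper for illustration only; not part of ReproPhylo.
    """
    log.append({
        "step": name,
        "inputs": {Path(p).name: file_digest(p) for p in inputs},
        "outputs": {Path(p).name: file_digest(p) for p in outputs},
    })
    return log


def demo():
    """Record a toy 'alignment' step on two small files in a temp directory."""
    with tempfile.TemporaryDirectory() as tmp:
        seqs = Path(tmp) / "seqs.fasta"
        aln = Path(tmp) / "aln.fasta"
        seqs.write_text(">a\nACGT\n>b\nACG\n")
        # Stand-in for a real alignment program: pad the short sequence.
        aln.write_text(">a\nACGT\n>b\nACG-\n")
        return record_step([], "align", [seqs], [aln])


if __name__ == "__main__":
    print(json.dumps(demo(), indent=2))
```

Because each entry stores a digest of the file contents rather than just a file name, any later change to an input or an intermediate file would show up as a different digest, which is the property that version control provides in a more complete form.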
Authors and Contributors
ReproPhylo was developed by Amir Szitenberg (@szitenberg) and Dave Lunt (@davelunt) with contributions and suggestions from a number of other people. We really welcome your bug reports, comments, and contributions to the code, design or documentation. Follow @ReproPhylo on Twitter for updates and usage hints.
Documentation can be found at http://goo.gl/yW6J1J
Instructions for Docker, Vagrant, WinPython or local installation here
- Google groups - Support for users and developers.
- The DockerHub repository - A Docker image of the Python module in a Jupyter notebook environment.
- The Dockerfile repository - For a local Docker image build or a local installation on Linux.
- The ReproPhyloVagrant repository - A Vagrant VM builder for any OS.
- The Galaxy distribution repository - A Galaxy distribution including the Galaxy ReproPhylo tools, their dependencies and an INSTALL file.