Here you can download the source code of CoMuS and CoMuStats, as well as R scripts that can be used for different analyses. The R scripts implement specific examples that were used in the CoMuS manuscript or that I consider generally interesting. You can download them and modify them according to your needs.
You can download:
- CoMuS (current version 2.0, April 2016). See below for previous versions and bugs.
- Please find CoMuStats inside the CoMuS folder.
Comment: It can calculate total and per-species/population summary statistics.
- The manual can also be found in the CoMuS folder.
- Code and data to generate the inference of the mutation rate and TMRCA for the human-chimp data (here).
- Code and data to generate Supplementary Figure 1 (here)
- Code and data to infer parameters using simulated pseudo-observed data of 10 species and 10 individuals per species (here)
- Code and data to infer parameters using simulated pseudo-observed data of 2 species and 10 individuals per species (here)
You can also download the most up-to-date version of the code from bitbucket.org:
git clone firstname.lastname@example.org:idaios/comus.git
- download the code from the previous link
- tar xvfz comus.tar.gz
- cd comus
- to compile: make -f Makefile.gcc (you may need to remove the *.o files first,
i.e. run rm *.o and then make -f Makefile.gcc)
- now you should have the comus executable
- IMPORTANT: there is a pre-compiled executable in comus.tar.gz. However, depending on your system, you may not be able to execute it. Instead, you will see the error message: comus: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by comus). In that case, please first clean: make clean -f Makefile.gcc
and then recompile the code: make -f Makefile.gcc
Scripts and demonstration
In the CoMuS manuscript we used several scripts, simulations and inference examples to demonstrate the usage of CoMuS. These scripts are provided here for demonstration and testing purposes.
- Ancestral sampling
CoMuS allows the simultaneous simulation of both modern and ancestral samples. To facilitate this, the CoMuS implementation starts at the present day with the whole dataset (modern and ancestral). However, because time proceeds backwards, all events that involve ancestral samples or their population are forbidden until the sampling time is reached. Beyond the sampling time (further back in time), evolutionary processes (e.g. recombination, mutation, coalescence, migration) take place as usual according to the model parameters.
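The idea of keeping ancestral samples inactive until their sampling time can be sketched in a few lines of Python. This is only a toy illustration, not the CoMuS implementation: it assumes a single panmictic population, time measured in coalescent units, and no recombination, mutation or migration. The function name heterochronous_coalescent and the sampling times used below are hypothetical.

```python
import random


def heterochronous_coalescent(sampling_times, rng):
    """Simulate one coalescent genealogy backwards in time.

    sampling_times[i] is the time (0 = present) at which lineage i was sampled.
    Returns a list of (time, child_a, child_b, parent) coalescence events.
    Assumption: one panmictic population, time in coalescent units.
    """
    # Lineages waiting to be sampled, ordered by sampling time.
    pending = sorted(range(len(sampling_times)), key=lambda i: sampling_times[i])
    active = []  # lineages that are currently allowed to take part in events
    events = []
    next_id = len(sampling_times)
    t = 0.0
    while pending or len(active) > 1:
        k = len(active)
        # Waiting time to the next coalescence among the k active lineages.
        wait = rng.expovariate(k * (k - 1) / 2) if k > 1 else float("inf")
        if pending and t + wait >= sampling_times[pending[0]]:
            # The next sampling time comes first: activate that lineage.
            # Discarding the drawn waiting time is valid because the
            # exponential distribution is memoryless.
            t = sampling_times[pending[0]]
            active.append(pending.pop(0))
        else:
            # Coalesce two active lineages chosen uniformly at random.
            t += wait
            a, b = rng.sample(active, 2)
            active.remove(a)
            active.remove(b)
            active.append(next_id)
            events.append((t, a, b, next_id))
            next_id += 1
    return events


rng = random.Random(42)
# Five modern lineages (ids 0-4) and five ancestral lineages (ids 5-9)
# sampled at time 0.2; both values are arbitrary illustration choices.
times = [0.0] * 5 + [0.2] * 5
events = heterochronous_coalescent(times, rng)
print(len(events), "coalescence events; root at time", events[-1][0])
```

By construction, no event involving lineages 5 to 9 can occur before time 0.2, while the modern lineages are free to coalesce among themselves from time 0 onwards.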
Here, we demonstrate the usage of CoMuS to infer potential ancestral gene flow between an extant population and an extinct sample (fossil). The simulation scenario is as follows: we assume a sample of 10 sequences from species A sampled at present, and a sample of 10 sequences from an extinct species B sampled at time 0.2 (phylogenetic time units). The time of the MRCA has been set to 0.5 (phylogenetic time units). We assume no gene flow between species A and B after speciation. We illustrate the dendrogram for this scenario in Figure 1 (below). Assuming that the above scenario represents the true evolutionary history of extant species A and extinct species B, our goal is to infer: (i) whether gene flow between A and B is absent or present, and (ii) the time of sampling for species B. The time of the MRCA (= 0.5 phylogenetic time units) as well as the value of θ (= 100) are assumed to be known. The length of the simulated region is 1 kb, and we assume a mutation model with equal mutation rates between each pair of bases.
Download the script here
Figure 1: an example coalescent tree comprising both present-day and ancestral samples
Testing species delimitation software
- Testing species delimitation with gene flow
An obvious use of CoMuS is to test species delimitation software. More specifically, various parameters such as
- changes of population size
- population subdivision
- ghost populations
can be examined and the performance of species delimitation software can be assessed. Here, we test the PTP software developed by Jiajie Zhang et al. (including myself) in the group of Alexandros Stamatakis in Heidelberg. The manuscript is available from here.
- Testing species delimitation with various birth rates
Scripts can be found here. The ideas are similar to those presented above.
Inferring parameter values
CoMuS can be used to infer parameters within the Approximate Bayesian Computation (ABC) framework. We have used two scenarios: (i) 2 species, with 10 sequences sampled from each, inferring the birth rate b; (ii) 10 species, with 10 sequences sampled from each, inferring the birth rate b and the time of the most recent common ancestor, i.e. the time at which the populations find a common ancestor.
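For readers unfamiliar with ABC, a minimal rejection-sampling sketch in Python is given below. It is not the inference code used in the manuscript: the simulator is a toy stand-in (the mean of exponential waiting times with rate b) rather than a CoMuS run, and the uniform prior, tolerance and function names (simulate_summary, abc_rejection) are all assumptions made for illustration.

```python
import random
import statistics


def simulate_summary(b, n, rng):
    """Toy stand-in for a simulator run.

    Returns one summary statistic: the mean of n exponential waiting
    times with rate b. A real analysis would run the simulator and
    compute summary statistics from its output instead.
    """
    return statistics.fmean([rng.expovariate(b) for _ in range(n)])


def abc_rejection(observed, n, prior_low, prior_high, draws, tolerance, rng):
    """Keep prior draws whose simulated summary is close to the observed one."""
    accepted = []
    for _ in range(draws):
        b = rng.uniform(prior_low, prior_high)  # draw from a uniform prior
        if abs(simulate_summary(b, n, rng) - observed) <= tolerance:
            accepted.append(b)
    return accepted


rng = random.Random(1)
true_b = 2.0  # value used to generate the pseudo-observed data
observed = simulate_summary(true_b, 200, rng)
posterior = abc_rejection(observed, 200, 0.1, 10.0, 5000, 0.02, rng)
print(len(posterior), "accepted draws; posterior mean of b:",
      statistics.fmean(posterior))
```

The accepted values approximate the posterior distribution of b. A smaller tolerance gives a better approximation at the cost of fewer accepted draws.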
Examples and Manual
- Inside the comus directory you will find a directory called ‘manual’ which contains the manual, i.e. manual.pdf
- There is also a directory called examples. There is a run.sh file that contains commands as well as useful notes that explain most of the results. Please consult them first.
CoMuStats can be used to calculate summary statistics from single- or multi-alignment FASTA files. In multi-alignment FASTA files, the alignments should be separated by //.
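As an illustration of this input layout, the following Python sketch splits a multi-alignment FASTA text into separate alignments and counts segregating sites in each. It is not part of CoMuStats. It assumes that the separator is a line starting with //, and the function names split_alignments and segregating_sites are hypothetical.

```python
from typing import Dict, List


def split_alignments(text: str) -> List[Dict[str, str]]:
    """Split multi-alignment FASTA text into one {name: sequence} dict per alignment.

    Assumption: alignments are separated by a line that starts with '//'.
    """
    alignments: List[Dict[str, str]] = []
    current: Dict[str, str] = {}
    name = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("//"):
            # Separator: close the current alignment and start a new one.
            if current:
                alignments.append(current)
            current, name = {}, None
        elif line.startswith(">"):
            # Header line: start a new sequence record.
            name = line[1:].strip()
            current[name] = ""
        elif name is not None:
            # Sequence line (sequences may span several lines).
            current[name] += line
    if current:
        alignments.append(current)
    return alignments


def segregating_sites(alignment: Dict[str, str]) -> int:
    """Count alignment columns that contain more than one distinct character."""
    return sum(1 for column in zip(*alignment.values()) if len(set(column)) > 1)


example = """>s1
ACGT
>s2
ACGA
//
>s1
GGCC
>s2
GGCC
"""
blocks = split_alignments(example)
print(len(blocks), [segregating_sites(b) for b in blocks])  # prints: 2 [1, 0]
```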
Bugs and Previous versions
- CoMuS v1.0: the -oFormat option was not recognized.
- CoMuS v2.0, March 2016: CoMuStats was not running the sliding-window code properly.
- CoMuStats 1.0.1 accepts an outgroup. Run CoMuStats without arguments for further details.