Introduction
Here you can download the source code of CoMuS and CoMuStats, as well as R scripts that can be used for different analyses. The R scripts implement specific examples that were used in the CoMuS manuscript or that I consider interesting in general. You can download them and modify them according to your needs.
You can download the most recent version from GitHub:
https://github.com/idaios/comus
or type in a terminal: git clone https://github.com/idaios/comus.git
Instructions:
- download the code from the previous link
- tar xvfz comus.tar.gz
- cd comus
- to compile: make -f Makefile.gcc (you may first need to remove the *.o files, i.e. rm *.o and then make -f Makefile.gcc)
- now you should have the comus executable
- IMPORTANT: there is a pre-compiled executable in comus.tar.gz. However, depending on your system, you may not be able to execute it. Instead, you will see an error message such as: comus: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by comus). In that case, first clean with make clean -f Makefile.gcc and then recompile the code with make -f Makefile.gcc
Scripts and demonstration
In the CoMuS manuscript we used several scripts, simulations and inference examples to demonstrate the usage of CoMuS. These scripts are provided here for demonstration or testing purposes.
- ancestral sampling
CoMuS allows the simultaneous simulation of both modern and ancestral samples. To facilitate this, the CoMuS implementation starts at the present day with the whole dataset (modern and ancestral). However, all events that involve ancestral samples or their population are forbidden until the time of sampling is reached (time proceeds backwards). Beyond the sampling time (backwards in time), evolutionary processes (e.g. recombination, mutation, coalescence, migration) take place as usual according to the model parameters.
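The idea of keeping ancestral lineages inactive until their sampling time can be sketched in a few lines of Python. This is only an illustrative toy, not the CoMuS implementation: it assumes a single constant-size population, time measured in coalescent units, and no recombination, mutation or migration. The function name and its arguments are hypothetical.

```python
import random


def coalescent_with_ancient_samples(n_modern, ancient_times, rng):
    """Toy backward-in-time coalescent for one constant-size population.

    All lineages exist from time 0 (the present), but an ancient lineage
    is inactive, i.e. it cannot coalesce, until its sampling time.
    Returns a list of (time, kind) events.
    """
    pending = sorted(ancient_times)  # sampling times of inactive lineages
    active = n_modern                # lineages that are allowed to coalesce
    t = 0.0
    events = []
    while active > 1 or pending:
        if active > 1:
            rate = active * (active - 1) / 2.0
            wait = rng.expovariate(rate)
        else:
            wait = float("inf")      # nothing can coalesce yet
        if pending and t + wait >= pending[0]:
            # The next ancient sample is reached before any coalescence:
            # jump to its sampling time and activate the lineage. Redrawing
            # the waiting time afterwards is valid because the exponential
            # distribution is memoryless.
            t = pending.pop(0)
            active += 1
            events.append((t, "sample"))
        else:
            t += wait
            active -= 1
            events.append((t, "coalescence"))
    return events


# 10 modern sequences and 10 ancient sequences sampled at time 0.2
events = coalescent_with_ancient_samples(10, [0.2] * 10, random.Random(0))
for time, kind in events:
    print(f"{time:.4f}\t{kind}")
```

No coalescence involving an ancient lineage can appear in the output before time 0.2, which mirrors the restriction described above.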
Here, we demonstrate the usage of CoMuS to infer potential ancestral gene flow between an extant population and an extinct sample (fossil). The simulation scenario is as follows: we assume a sample of 10 sequences from species A sampled at present, and a sample of 10 sequences from an extinct species B sampled at time 0.2 (phylogenetic time units). The time of the MRCA has been set to 0.5 (phylogenetic time units). We assume no gene flow between species A and B after speciation. The dendrogram for this scenario is illustrated in Figure 1 (below). Assuming that the above scenario represents the true evolutionary history of extant species A and extinct species B, our goal is to infer: (i) whether gene flow between A and B is absent or present, and (ii) the time of sampling for species B. The time of the MRCA (= 0.5 phylogenetic time units) and the value of θ (= 100) are assumed to be known. The length of the simulated region is 1 kb, and we assume a mutation model with equal mutation rates between each pair of bases.
Download the script here
Figure 1: an example of a coalescent tree that comprises both present-day and ancestral samples
Testing species delimitation software
- Testing species delimitation with gene flow
An obvious use of CoMuS is to test species delimitation software. More specifically, the effect of various factors such as
- changes of population size
- population subdivision
- ghost populations
- etc
can be examined, and the performance of species delimitation software can be assessed. Here, we test the PTP software developed by Jiajie Zhang et al. (including myself) in the group of Alexandros Stamatakis in Heidelberg. The manuscript is available from here.
The scripts used for this demonstration can be found here. (NOTE: to run the full set of commands in the scripts you need to install RAxML; see raxml-github.)
- Testing species delimitation with various birth rates
Scripts can be found here. The ideas are similar to those presented above.
Inferring parameter values
CoMuS can be used to infer parameters within the Approximate Bayesian Computation (ABC) framework. We have used two scenarios: (i) 2 species, with 10 sequences sampled from each, and inference of the birth rate b; (ii) 10 species, with 10 sequences sampled from each, and inference of the birth rate b and the time of the most recent common ancestor, i.e. the time at which the two populations find a common ancestor.
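As a rough illustration of the ABC idea, the following Python sketch applies plain rejection ABC to a single parameter b. It is not the procedure used in the manuscript: toy_simulator is a hypothetical stand-in for "run CoMuS with birth rate b and compute a summary statistic", and the uniform prior, the tolerance and the number of draws are arbitrary assumptions chosen for the example.

```python
import random


def toy_simulator(b, rng, n_values=50):
    """Hypothetical stand-in for a CoMuS run plus a summary statistic.

    Returns the mean of n_values exponential waiting times with rate b,
    so the expected value of the summary is 1 / b.
    """
    return sum(rng.expovariate(b) for _ in range(n_values)) / n_values


def abc_rejection(observed, prior_low, prior_high, n_draws, tolerance, rng):
    """Keep prior draws whose simulated summary is close to the observed one."""
    accepted = []
    for _ in range(n_draws):
        b = rng.uniform(prior_low, prior_high)   # draw from the prior
        simulated = toy_simulator(b, rng)        # simulate under that draw
        if abs(simulated - observed) <= tolerance:
            accepted.append(b)                   # approximate posterior sample
    return accepted


rng = random.Random(1)
observed_summary = 0.5  # assumed "observed" value, consistent with b = 2
accepted = abc_rejection(observed_summary, 0.1, 10.0, 5000, 0.02, rng)
posterior_mean = sum(accepted) / len(accepted)
print(f"accepted {len(accepted)} of 5000 draws")
print(f"approximate posterior mean of b: {posterior_mean:.2f}")
```

In a real analysis, the simulator call would be replaced by an actual CoMuS simulation followed by the calculation of summary statistics, and the accepted values would approximate the posterior distribution of b.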
Examples and Manual
- Inside the comus directory you will find a directory called ‘manual’, which contains the manual, i.e. manual.pdf.
- There is also a directory called ‘examples’, with a run.sh file that contains commands as well as useful notes that explain most of the results. Please consult them first.
CoMuStats
This software can be used to calculate summary statistics from single- or multi-alignment FASTA files. In multi-alignment FASTA files, consecutive alignments should be separated by a line containing //.
For example:
>seq1
ACGTG
>seq2
ACGGG
//
>seq1
ACCTC
>seq2
ACCCC
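To make the layout concrete, here is a minimal Python sketch that splits such a file into its alignments and counts the variable columns in each one. It is a reader-side illustration only, not the CoMuStats parser; the function names are hypothetical, and the way CoMuStats itself treats gaps or missing data is not covered here.

```python
def split_alignments(text):
    """Split '//'-separated FASTA text into a list of alignments.

    Each alignment is a list of (name, sequence) tuples.
    """
    alignments = []
    current = []              # records of the alignment being read
    name, chunks = None, []   # record being read

    def flush_record():
        nonlocal name, chunks
        if name is not None:
            current.append((name, "".join(chunks)))
        name, chunks = None, []

    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "//":              # separator between alignments
            flush_record()
            if current:
                alignments.append(current)
            current = []
        elif line.startswith(">"):    # header of a new sequence
            flush_record()
            name = line[1:].strip()
        else:                         # sequence data (may span several lines)
            chunks.append(line)
    flush_record()
    if current:
        alignments.append(current)
    return alignments


def segregating_sites(alignment):
    """Count columns with more than one distinct character.

    Columns are compared up to the length of the shortest sequence.
    """
    sequences = [seq for _, seq in alignment]
    return sum(1 for column in zip(*sequences) if len(set(column)) > 1)


example = """>seq1
ACGTG
>seq2
ACGGG
//
>seq1
ACCTC
>seq2
ACCCC
"""

blocks = split_alignments(example)
for index, block in enumerate(blocks, start=1):
    print(index, block, segregating_sites(block))
```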
Bugs and Previous versions
- CoMuS v1.0: -oFormat unrecognized
- CoMuS v2.0, March 2016 (CoMuStats was not running the sliding-window code properly).
Log Report
- CoMuStats 1.0.1 accepts an outgroup. Run CoMuStats without arguments for further details.