OmegaPlus

A parallel tool for rapid & scalable detection of selective sweeps in whole-genome datasets

We have developed OmegaPlus, a scalable implementation of the omega-statistic (Kim and Nielsen 2004) to detect selective sweeps in whole-genome data based on linkage disequilibrium patterns.
OmegaPlus has been tested with fully phased data, and also with unphased data, where we can determine which diploid individual a SNP belongs to but cannot determine which of the two chromosomes carries it.
Outgroup information is not required. The program recognizes FASTA, Hudson’s ms-like, and MaCS-like (http://www-hsc.usc.edu/~garykche/) formats. OmegaPlus can scan the DPGP dataset (www.dpgp.org, reference release 1.0 September 2009, 37 sequences and ~340,000 SNPs) for positive selection in 55 seconds. In addition to the efficient sequential implementation, we provide three parallelized versions that use fine-, coarse-, and multi-grained parallelism.
Right now this is a pure command-line tool, available for Windows and Linux operating systems. We strongly recommend using the Linux version of OmegaPlus. Note that only limited support will be provided for the Windows version.
For compiling the code, GCC version 4.4 or greater is recommended. For GCC versions prior to 4.4, please remove the optimization flag (-O3) from the Makefiles before compiling the code. With older GCC versions, a build with -O3 enabled produces different results on identical input data than a build without it. This is most probably due to some overly aggressive optimizations under -O3. Many thanks to Stefan Laurent (LMU Munich) for pointing this out.
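For readers unfamiliar with the omega-statistic mentioned above, the following is a minimal Python sketch of the quantity for a single split of a window, following the definition in Kim and Nielsen (2004): the mean pairwise r² among SNPs on the same side of the split, divided by the mean r² between SNPs on opposite sides. It is not OmegaPlus code. The function names (r_squared, omega) are hypothetical, SNPs are assumed to be 0/1 columns over phased haplotypes, and the sketch ignores the grid positions, window limits, and optimizations that the real program uses.

```python
from itertools import combinations


def r_squared(a, b):
    """Squared correlation (r^2) between two 0/1 SNP columns."""
    n = len(a)
    pa = sum(a) / n
    pb = sum(b) / n
    pab = sum(x * y for x, y in zip(a, b)) / n
    denom = pa * (1 - pa) * pb * (1 - pb)
    if denom == 0:
        # Monomorphic column: r^2 is undefined; this sketch treats it as 0.
        return 0.0
    d = pab - pa * pb
    return d * d / denom


def omega(snps, l):
    """Omega for SNP columns split into left = snps[:l] and right = snps[l:]."""
    s = len(snps)
    if l < 2 or s - l < 2:
        raise ValueError("each side of the split needs at least two SNPs")
    left, right = snps[:l], snps[l:]

    # Sum of r^2 over pairs on the same side of the split.
    within = sum(r_squared(a, b) for a, b in combinations(left, 2))
    within += sum(r_squared(a, b) for a, b in combinations(right, 2))
    # Sum of r^2 over pairs that span the split.
    between = sum(r_squared(a, b) for a in left for b in right)

    n_within = l * (l - 1) // 2 + (s - l) * (s - l - 1) // 2
    n_between = l * (s - l)
    if between == 0:
        # Arbitrary choice for this sketch: no cross-side LD at all.
        return float("inf")
    return (within / n_within) / (between / n_between)


# Four haplotypes, four SNPs (each inner list is one SNP column).
snps = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 1, 1, 1],
]
# Within-side LD is perfect and every cross-side r^2 is 1/3,
# so the value printed is approximately 3.
print(omega(snps, 2))
```

A sweep scan would evaluate this ratio for many candidate splits around each genomic position and report the maximum; high values indicate strong linkage disequilibrium on each side of a position with little linkage disequilibrium across it.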
——————-
You can download the most recent version, 3.0.0, from GitHub
(git clone https://github.com/alachins/omegaplus.git)
3.0.0: novel parallelization with better scalability

Past Versions (Obsolete):

  • Linux version 2.3.0 here. It allows parsing a subset of samples from the VCF file.
  • 2.2.11 here. This version:
    – skips empty MS alignments and alignments (MS, FASTA, VCF, MaCS) that have fewer than 4 SNPs.
    – reads frequencies in the AF field given in scientific notation.
  • 2.2.10 here.
    This version allows the step size between grid points to be less than one. This may be useful when files with multiple alignments are processed.
  • 2.2.9 here.
    – allows “.” in the FILTER and INFO fields of VCF.
    – uses the fseek function to handle file-type recognition.
  • 2.2.8 here.
    – handles the ^M (carriage return) in VCF files.
  • 2.2.7 here.
    – reads the lowercase letters a, c, g, t using the toupper function.
    – reads multiple AF frequencies (comma-separated).
  • 2.2.6 here. It reads the lowercase DNA alphabet, i.e. a, c, g, t.
  • Linux version 2.2.5 here. It correctly handles the ‘.’ symbol in the ALT field.
  • Linux version 2.2.4 here. It includes the flag -noSeparator to remove the ‘//’, which is a separator between different datasets.
  • Linux version 2.2.3 here. It includes a bug fix in the ms file parser for reading msABC ms-like files.
  • Linux version 2.2.2 here. It includes a bug fix in the VCF file parser that was associated with handling missing data.
  • Linux version 2.2.1 here. It includes a minor bug fix in the ms parser.
  • Linux version 2.2 here. It includes a new command-line flag -no-singletons to exclude singletons from the analysis.
  • Linux version 2.1 here, which can now also parse the Variant Call Format (.vcf).
  • Linux version 2.0 here

Download GNU GPL Windows version here.
Download the manual here.
Examples are provided with the source code (see directory “examples”). If you have questions or would like to report a bug, please register at the OmegaPlus Google group (http://groups.google.com/group/omegaplus) or send an email to pavlidisp@gmail.com or n.alachiotis@gmail.com.
