CoNAn-SNV is a probabilistic framework for the discovery of single nucleotide variants in whole genome shotgun sequencing (WGSS) data. The software explicitly integrates the copy number state of different genomic segments into the inference of single nucleotide variants. As input, CoNAn-SNV requires a pileup file (in either Maq or Samtools format), model parameters, and a file demarcating the segmentation boundaries of copy number amplifications.

CoNAn-SNV is distributed as part of the SNVMixsuite, which can also run SNVMix and SNVMix2.


The source code, implemented in C, is available for distribution under an open source license. Supported platforms are Linux and Mac OS X. A working gcc compiler is needed; under Linux, libc >= 4.6.27 is also required.

Software download here.



> tar -xzvf SNVMixsuite-0.11.9.tar.gz
> cd SNVMixsuite-0.11.9/
> make
> ./SNVMixsuite -h

Input Files

Examples of the input files required by CoNAn-SNV, and of their formats, are available here. The model file listed below may be used for analysis; however, training the model on your own data is strongly recommended.

Classification Parameters for CoNAn-SNV

Training Parameters for CoNAn-SNV

Lobular Carcinoma CNA Segmentation file

(Note: to save the above links for personal use, right-click and select “Save as”.)

Pileup file: See the Samtools documentation for how to generate a pileup file. When generating the pileup, specify the -s option so that mapping qualities are included alongside the base qualities. We do not recommend using the -c option, because it creates additional columns that the CoNAn-SNV model does not handle. If -c is used, the additional columns must be stripped out before the file is passed to CoNAn-SNV.
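If a pileup has already been generated with the -c option, the extra columns can be removed with a small script. The following Python sketch is not part of the SNVMixsuite; it is a minimal illustration that assumes the legacy Samtools consensus pileup layout, in which four columns (consensus base, consensus quality, SNP quality, mapping quality) sit between the reference base and the read depth. The function name `strip_consensus_columns` is hypothetical, and the handling of indel lines (reference column `*`) is an assumption: they are simply dropped. Check the layout of your own file before relying on it.

```python
# Minimal sketch (assumption: legacy `samtools pileup -c` column layout).
# Consensus pileup:  chrom, pos, ref, cons, consQ, snpQ, mapQ, depth, bases, quals[, mapquals]
# Simple pileup:     chrom, pos, ref, depth, bases, quals[, mapquals]

EXTRA_START = 3  # 0-based index of the first consensus column
EXTRA_END = 7    # 0-based index of the read depth column in -c output


def strip_consensus_columns(lines):
    """Yield tab-separated pileup lines with the four -c columns removed.

    Lines with fewer than 10 fields are assumed to be simple pileup
    already and are passed through unchanged. Indel lines (reference
    column '*') are skipped, which is an assumption of this sketch.
    """
    for line in lines:
        text = line.rstrip("\n")
        if not text:
            continue  # ignore blank lines
        fields = text.split("\t")
        if len(fields) < 10:
            yield text  # already in the simple format
            continue
        if fields[2] == "*":
            continue  # indel line from -c output
        yield "\t".join(fields[:EXTRA_START] + fields[EXTRA_END:])
```

To convert a file, open it for reading, pass the file object to `strip_consensus_columns`, and write each yielded line (followed by a newline) to a new file that is then given to CoNAn-SNV.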


Full documentation is available in the following PDF.

Additional Scripts

The User Guide notes that some additional scripts are available for post-processing the CoNAn-SNV output. These scripts are not required to run CoNAn-SNV; they offer suggestions for how the output may otherwise be summarized.