SNVMix is a post-alignment tool designed to detect single nucleotide variants from next-generation sequencing data. Given a pileup file (in either Maq or Samtools format) and model parameters as input, SNVMix outputs the probability that each position has one of three genotypes: aa (homozygous for the reference allele, where the reference is the genome the reads were aligned to), ab (heterozygous) and bb (homozygous for a non-reference allele). A tool for fitting the model using expectation maximization is also supplied (use the -T option).
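The genotype probabilities described above can be illustrated with a small sketch. It assumes the simplest form of the model, a three-component binomial mixture over the reference-allele count at each position, and ignores base and mapping qualities. The function name genotype_posteriors and the values of MU and PI are hypothetical and chosen only for illustration; they are not the parameters shipped with SNVMix, and this is not the SNVMix source code.

```python
from math import exp, log


def genotype_posteriors(n_ref, depth, mu, pi):
    """Posterior over (aa, ab, bb) for one position under a binomial mixture.

    n_ref : number of reads matching the reference allele
    depth : total number of reads covering the position
    mu    : per-genotype probability that a read shows the reference allele
            (each value strictly between 0 and 1)
    pi    : prior probability of each genotype
    """
    # Log of prior times binomial likelihood for each genotype. The binomial
    # coefficient is identical for all genotypes, so it cancels on
    # normalization and is left out.
    log_w = [
        log(p) + n_ref * log(m) + (depth - n_ref) * log(1.0 - m)
        for m, p in zip(mu, pi)
    ]
    # Subtract the maximum before exponentiating to avoid underflow at
    # high read depth.
    top = max(log_w)
    w = [exp(x - top) for x in log_w]
    total = sum(w)
    return [x / total for x in w]


# Illustrative values only (assumption): NOT the parameters shipped with SNVMix.
MU = (0.99, 0.5, 0.01)
PI = (0.9, 0.05, 0.05)

# A position covered by 12 reads, 6 of which match the reference.
post = genotype_posteriors(n_ref=6, depth=12, mu=MU, pi=PI)
print(dict(zip(("aa", "ab", "bb"), post)))
```

With half of the reads supporting the reference allele, almost all of the probability mass falls on ab under these illustrative parameters.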
The source code, implemented in C, is available for distribution under an open source license. Supported platforms are Linux and Mac OS X. A working gcc compiler is needed, and under Linux libc >= 4.6.27 is required.
Software download here.
- 0.11.8-r5: fixed a problem present when compiling with math.h on Ubuntu.
- Fixed a parsing problem present when generating pileups for RNA-Seq data with samtools > 0.1.8.
- Error reporting when number of columns in pileup file is wrong.
Notes for version 0.11.8:
Alpha, beta and delta parameters can now be specified on the command line for training using three new flags:
-a #,#,# Provide alpha training parameters
-b #,#,# Provide beta training parameters
-d #,#,# Provide delta training parameters
You can also specify training parameters in a space-separated file using:
-M Provide a file containing training parameters
Updating to this version is also recommended because it fixes a bug that caused sporadic segmentation faults in older versions when dealing with model files.
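To show where alpha, beta and delta training parameters could enter the fit, here is a minimal sketch of expectation maximization for a three-component binomial mixture. It assumes, following the model description in Goya et al. (2010), a Beta(alpha, beta) prior on each mu and a Dirichlet(delta) prior on pi, and it uses maximum a posteriori updates. The function fit_em, the toy counts and all numeric values are hypothetical; the sketch ignores base and mapping qualities and is not the SNVMix implementation.

```python
from math import exp, log


def fit_em(counts, mu, pi, alpha, beta, delta, n_iter=50):
    """MAP-EM for a binomial mixture over reference-allele counts.

    counts : list of (n_ref, depth) pairs, one per position
    mu, pi : starting values, one entry per genotype (aa, ab, bb)
    alpha, beta : Beta prior parameters for each mu (assumed > 1)
    delta  : Dirichlet prior parameters for pi (assumed >= 1)
    """
    mu, pi = list(mu), list(pi)
    n_geno = len(mu)
    for _ in range(n_iter):
        # E-step: responsibility of each genotype for each position.
        resp = []
        for n_ref, depth in counts:
            log_w = [
                log(pi[k]) + n_ref * log(mu[k])
                + (depth - n_ref) * log(1.0 - mu[k])
                for k in range(n_geno)
            ]
            top = max(log_w)
            w = [exp(x - top) for x in log_w]
            total = sum(w)
            resp.append([x / total for x in w])
        # M-step: posterior modes under the Beta and Dirichlet priors.
        for k in range(n_geno):
            r_sum = sum(r[k] for r in resp)
            ref_sum = sum(r[k] * c[0] for r, c in zip(resp, counts))
            depth_sum = sum(r[k] * c[1] for r, c in zip(resp, counts))
            mu[k] = (ref_sum + alpha[k] - 1.0) / (
                depth_sum + alpha[k] + beta[k] - 2.0
            )
            pi[k] = r_sum + delta[k] - 1.0
        norm = sum(pi)
        pi = [p / norm for p in pi]
    return mu, pi


# Toy data and hyperparameters (assumptions, for illustration only).
counts = [(20, 20)] * 8 + [(10, 20)] + [(0, 20)]
mu_hat, pi_hat = fit_em(
    counts,
    mu=(0.9, 0.5, 0.1),
    pi=(1 / 3, 1 / 3, 1 / 3),
    alpha=(20, 10, 2),
    beta=(2, 10, 20),
    delta=(10, 2, 2),
)
print(mu_hat, pi_hat)
```

In this sketch the priors act as pseudo-counts: larger alpha and beta values pull each mu toward alpha / (alpha + beta), and larger delta values pull pi toward the prior genotype proportions.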
Download latest development version: SNVMix2-0.12.2-rc1.tar.gz
Notes for version 0.12.2-rc1:
This version can now read BAM files directly. It also adds extra columns with information about the context in which each SNV was detected; these fields are explained in the README file.
> tar -xzvf SNVMix2-0.11.8-r4.tar.gz
> cd SNVMix2-0.11.8-r4/
> make
> ./SNVMix2 -h
Model parameter file
In the absence of training data, a model file (input with -m) containing the mu and pi parameters of the model derived in Shah et al., Nature (2009) is provided here:
If you use SNVMix in your work, please cite the following papers:
Sohrab P. Shah, Ryan D. Morin, Jaswinder Khattra, Leah Prentice, Trevor Pugh, Angela Burleigh, Allen Delaney, Karen Gelmon, Ryan Giuliany, Janine Senz, Christian Steidl, Robert A. Holt, Steven Jones, Mark Sun, Gillian Leung, Richard Moore, Tesa Severson, Greg A. Taylor, Andrew E. Teschendorff, Kane Tse, Gulisa Turashvili, Richard Varhol, Rene L. Warren, Peter Watson, Yongjun Zhao, Carlos Caldas, David Huntsman, Martin Hirst, Marco A. Marra and Samuel Aparicio. Mutational evolution in a lobular breast tumour profiled at single nucleotide resolution. Nature. vol. 461, 809-813 (2009). [PDF]
Goya R, Sun MG, Morin RD, Leung G, Ha G, Wiegand KC, Senz J, Crisan A, Marra MA, Hirst M, Huntsman D, Murphy KP, Aparicio S, Shah SP. SNVMix: predicting single nucleotide variants from next-generation sequencing of tumors. Bioinformatics. 2010 Mar 15;26(6):730-6. [Link]