BisSNP is a utility built on the Genome Analysis Toolkit (GATK) map-reduce framework for genotyping bisulfite-treated massively parallel sequencing data (Bisulfite-seq, RRBS and NOMe-seq) from the Illumina platform.
BisSNP uses Bayesian inference with either manually specified or automatically estimated methylation probabilities for different cytosine contexts (not only CpG, CHH and CHG in Bisulfite-seq, but also GCH etc. in other bisulfite-treated sequencing) to determine genotypes and methylation levels simultaneously.
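To make the idea of combining a methylation probability with genotype inference concrete, here is a minimal toy sketch in Python. It is not BisSNP's actual model: the function name `genotype_posteriors`, the two-symbol (C/T) read model, the flat genotype prior and the single error rate `err` are all assumptions made for illustration, and strand information is ignored.

```python
import math

def genotype_posteriors(n_c, n_t, meth, err=0.01, priors=None):
    """Toy posterior over genotypes CC, CT, TT from counts of C and T reads.

    meth : assumed methylation probability for this cytosine context
    err  : assumed per-base error rate (must be > 0 to avoid log(0))
    """
    if priors is None:
        # Hypothetical flat prior; a real caller would use informed priors.
        priors = {"CC": 1 / 3, "CT": 1 / 3, "TT": 1 / 3}

    # Probability that a read shows 'C' given the true allele it came from.
    p_c_given_allele = {
        # A C allele reads as C if methylated (no conversion) and read correctly,
        # or if converted to T and then misread as C.
        "C": meth * (1 - err) + (1 - meth) * err,
        # A T allele reads as C only through a sequencing error.
        "T": err,
    }

    log_post = {}
    for genotype, prior in priors.items():
        # Each read is assumed to sample either allele with probability 1/2.
        p_c = sum(p_c_given_allele[a] for a in genotype) / 2
        log_post[genotype] = (
            math.log(prior) + n_c * math.log(p_c) + n_t * math.log(1 - p_c)
        )

    # Normalize in log space for numerical stability.
    top = max(log_post.values())
    weights = {g: math.exp(v - top) for g, v in log_post.items()}
    total = sum(weights.values())
    return {g: w / total for g, w in weights.items()}

if __name__ == "__main__":
    post = genotype_posteriors(n_c=8, n_t=12, meth=0.8)
    print(max(post, key=post.get), post)
```

In this toy model the same read counts can favor different genotypes depending on the assumed methylation probability, which is why the genotype and the methylation level have to be considered together.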
BisSNP works with both single-end and paired-end reads. Specificity and sensitivity have been validated against an Illumina 1M SNP array. At the default threshold (Phred-scaled score > 20), it detects 92.21% of heterozygous SNPs with a 0.14% false positive rate (90.88% sensitivity for C/T SNPs with a 0.16% false positive rate; 98.51% sensitivity for non-C/T SNPs with a 0.16% false positive rate).
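For readers unfamiliar with Phred scaling, the standard conversion between a Phred-scaled score Q and an error probability P is Q = -10 * log10(P), so a threshold of 20 corresponds to an error probability of 0.01. The helper names below are illustrative only and are not part of BisSNP.

```python
import math

def phred_to_error_prob(q):
    """Convert a Phred-scaled score to the corresponding error probability."""
    return 10 ** (-q / 10)

def error_prob_to_phred(p):
    """Convert an error probability (0 < p <= 1) to a Phred-scaled score."""
    return -10 * math.log10(p)

if __name__ == "__main__":
    # A Phred-scaled threshold of 20 corresponds to a 1% error probability.
    print(phred_to_error_prob(20))
```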
Because cytosine calling uses the inferred genotype rather than relying only on the reference context, it can detect cytosines in non-reference contexts.
- Add an option (-notEncrypt) to output raw read IDs into cpgreads files, which may make the files very large.