Description

This tool extracts a VCF file from an mpileup file, for instance one generated from a BAM file using samtools mpileup. The tool can also read from stdin, so the mpileup file does not have to be stored on disk. Mpileup files can be very large because they describe each covered base position in the genome on a per-read basis, so storing them is usually undesirable.
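To make the per-read layout concrete, the sketch below builds a single made-up mpileup record and measures its columns with awk. The record is illustrative only, and it assumes the usual samtools column order (chromosome, position, reference base, depth, read bases, base qualities).

```shell
# One made-up mpileup record (not real data), tab-separated:
# chromosome, position, reference base, depth, read bases, base qualities.
record=$(printf 'chr1\t100\tA\t5\t..,Tt\tIIIII')

# Print the depth, then the length of the read-bases and quality columns.
summary=$(printf '%s\n' "$record" | awk -F'\t' '{ print $4, length($5), length($6) }')
echo "$summary"
```

In this simple record, a position covered by five reads carries five characters in both the read-bases and the quality column, so the size of an mpileup file grows with coverage as well as with genome length.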

Installation

MpileupToVcf requires Java 8 to be installed on your device. Download Java 8 here or install via your distribution's package manager.

Download the latest version of MpileupToVcf here. To print the usage, run:

java -jar <MpileupToVcf_jar> --help

Manual

MpileupToVcf comes with various options. See the usage for more details. The tool can read from stdin or accept an mpileup file. An output file and the name of the sample are always required.

Example

To convert an mpileup file to VCF for a haploid organism with an expected sequencing error rate of 0.010:

java -jar <MpileupToVcf_jar> \
-I input.mpileup \
-o output.vcf \
--sample Yeast5302 \
--ploidy 1 \
--seqError 0.010

To convert mpileup data piped directly from the standard output of samtools:

samtools mpileup <bam> | java -jar <MpileupToVcf_jar> \
-o <output_vcf> \
--sample E.coli243
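After a conversion, a quick sanity check is to count the records in the output file. The sketch below does this on a tiny made-up VCF; the file contents are hypothetical and are not actual MpileupToVcf output. It relies only on the fact that VCF header lines start with '#'.

```shell
# A tiny made-up, sites-only VCF standing in for the output file (hypothetical content).
vcf=$(mktemp)
{
  printf '##fileformat=VCFv4.2\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
  printf 'chr1\t100\t.\tA\tT\t.\t.\t.\n'
  printf 'chr1\t250\t.\tG\tC\t.\t.\t.\n'
} > "$vcf"

# Header lines start with '#'; every other line is one record.
records=$(grep -vc '^#' "$vcf")
echo "$records"

# Clean up the temporary file.
rm -f "$vcf"
```

For a real run, point the grep command at the file passed to -o instead of the temporary file.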

Usage

Usage for MpileupToVcf:

| Option | Required | Can occur multiple times | Description |
| --- | --- | --- | --- |
| --log_level, -l | no | no | Level of log information printed. Possible levels: 'debug', 'info', 'warn', 'error' |
| --help, -h | no | no | Print usage |
| --version, -v | no | no | Print version |
| --input, -I | no | no | Input file, default is stdin |
| --output, -o | yes | no | Output file (required) |
| --sample, -s | yes | no | Sample name in the VCF file |
| --minDP | no | no | Minimum total depth |
| --minAP | no | no | Minimum alternative depth |
| --homoFraction | no | no | An allele with a fraction above this value is called homozygous. Default is 0.8 |
| --ploidy | no | no | Specify the ploidy as a number: '1' for haploid, '2' for diploid, etc. |
| --seqError | no | no | Expected sequencing error rate, default is 0.005 |
| --refCalls | no | no | If set, reference calls are also written. Warning: this will result in a very large VCF file |
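As a rough illustration of the --homoFraction description, the sketch below compares an allele fraction against the documented default of 0.8. The read counts and variable names are hypothetical, and this is only an assumption about how the threshold reads, not the tool's actual implementation.

```shell
# Hypothetical counts for one position (not produced by MpileupToVcf).
alt_depth=9          # reads supporting the alternative allele
total_depth=10       # all reads covering the position
homo_fraction=0.8    # documented default of --homoFraction

# "Above this fraction" is read here as a strict greater-than comparison (assumption).
call=$(awk -v a="$alt_depth" -v t="$total_depth" -v f="$homo_fraction" \
  'BEGIN { if (a / t > f) print "homozygous"; else print "not homozygous" }')
echo "$call"
```

With 9 of 10 reads supporting the allele, the fraction is 0.9, which is above 0.8, so the sketch prints "homozygous".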

About

MpileupToVcf is part of the BIOPET tool suite that is developed at LUMC by the SASC team. Each tool in the BIOPET tool suite is meant to offer a standalone function that can be used to perform a dedicated data analysis task or be added as part of a pipeline, for example the SASC team's biowdl pipelines.

All tools in the BIOPET tool suite are Free/Libre and Open Source Software.

Contributing

The source code of MpileupToVcf can be found here. We welcome any contributions. Bug reports, feature requests and feedback can be submitted at our issue tracker.

MpileupToVcf is built using sbt. Before submitting a pull request, make sure all tests pass by running sbt test from the project's root. We recommend using an IDE to work on MpileupToVcf. We have had good results with this IDE.

Contact

For any questions related to MpileupToVcf, please use the GitHub issue tracker or contact the SASC team directly at: sasc@lumc.nl.