This tool enables a user to filter VCF files, for example on sample depth and/or total depth. It can also filter out reference calls and/or require a minimum number of samples to pass the filters. A wide set of options is available to change the filter settings.
VcfFilter requires Java 8 to be installed on your device. Download Java 8 here or install via your distribution's package manager.
Download the latest version of VcfFilter here. To print the usage, run:
java -jar <VcfFilter_jar> --help
This tool filters VCF files on a number of values. For example, it can filter on sample depth and/or total depth. It can also filter out reference calls and/or require a minimum number of samples to pass. For more on the filtering options and how to set them, please refer to the usage help.
To filter a VCF for variants with a minimum quality score of 50:
java -jar <VcfFilter_jar> \
-I input.vcf \
-o output.vcf \
--minQualScore 50
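To make the effect of the quality threshold concrete, the following Python sketch mimics what a minimum-QUAL filter does on plain-text VCF lines. It is an illustration only, not VcfFilter's implementation: the function name `filter_by_qual` is hypothetical, and dropping records with a missing QUAL (`.`) is an assumption made for the example.

```python
def filter_by_qual(lines, min_qual):
    """Yield header lines and records whose QUAL (6th column) is >= min_qual."""
    for line in lines:
        if not line.strip():
            # Skip blank lines instead of failing on them.
            continue
        if line.startswith("#"):
            # Meta-information and header lines are always kept.
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        qual = fields[5]
        # Assumption for this sketch: a missing QUAL ('.') never passes.
        if qual == ".":
            continue
        if float(qual) >= min_qual:
            yield line


example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t60\tPASS\tDP=30",
    "chr1\t200\t.\tC\tT\t20\tPASS\tDP=12",
]

# Keeps both header lines and only the record with QUAL 60.
kept = list(filter_by_qual(example, 50))
```

For real data, use VcfFilter itself; the sketch ignores compressed input, indexing and every other option.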
Usage for VcfFilter:
Option | Required | Can occur multiple times | Description |
---|---|---|---|
--log_level, -l | no | no | Level of log information printed. Possible levels: 'debug', 'info', 'warn', 'error' |
--help, -h | no | no | Print usage |
--version, -v | no | no | Print version |
--inputVcf, -I | yes | no | Input vcf file |
--outputVcf, -o | yes | no | Output vcf file |
--invertedOutputVcf | no | no | Inverted output vcf file |
--minSampleDepth | no | no | Min value for DP in genotype fields |
--minTotalDepth | no | no | Min value of DP field in INFO fields |
--minAlternateDepth | no | no | Min value of AD field in genotype fields |
--minSamplesPass | no | no | Min number of samples to pass --minAlternateDepth, --minBamAlternateDepth and --minSampleDepth |
--resToDom | no | yes (unlimited) | Only shows variants where the child is homozygous and both parents are heterozygous |
--trioCompound | no | yes (unlimited) | Only shows variants where the child is a compound variant combined from both parents |
--deNovoInSample | no | no | Only show variants that contain unique alleles in complete set for given sample |
--deNovoTrio | no | yes (unlimited) | Only show variants that are denovo in the trio |
--trioLossOfHet | no | yes (unlimited) | Only show variants where a loss of heterozygosity is detected |
--mustHaveVariant | no | yes (unlimited) | Given sample must have 1 alternative allele |
--mustNotHaveVariant | no | yes (unlimited) | Given sample may not have alternative alleles |
--calledIn | no | yes (unlimited) | Must be called in this sample |
--mustHaveGenotype | no | yes (unlimited) | Must have genotype |
--diffGenotype | no | yes (unlimited) | Given samples must have a different genotype |
--filterHetVarToHomVar | no | yes (unlimited) | Variants are filtered if they are heterozygous in sample 1 and homozygous for the alternative allele in sample 2 |
--filterRefCalls | no | no | Filter when there are only ref calls |
--filterNoCalls | no | no | Filter when there are only no calls |
--uniqueOnly | no | no | Filter when more than 1 sample has this variant |
--sharedOnly | no | no | Filter when not all samples have this variant |
--minCalled | no | no | Number of samples where a call must be made |
--minQualScore | no | no | Min qual score |
--id | no | yes (unlimited) | Id that may pass the filter |
--idFile | no | yes (unlimited) | File that contains a list of IDs to get from the vcf file |
--minGenomeQuality | no | no | The minimum value in the Genotype Quality (GQ) field. |
--advancedGroups | no | yes (unlimited) | All members of groups separated with a ',' |
--minAvgVariantGQ | no | yes (unlimited) | Filter on the average GQ of variants |
--infoArrayMustContain | no | yes (unlimited) | Info field must be an array and should match the given regex |
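
According to the table, --minSamplesPass sets how many samples must pass per-sample thresholds such as --minSampleDepth. The Python sketch below shows one plausible reading of that counting logic for the DP genotype field only. It is not VcfFilter's code: the function names `samples_passing_depth` and `keep_record` are hypothetical, and treating a missing DP (`.`) as a failing sample is an assumption.

```python
def samples_passing_depth(record, min_sample_depth):
    """Count samples whose genotype-level DP is >= min_sample_depth."""
    fields = record.rstrip("\n").split("\t")
    format_keys = fields[8].split(":")  # FORMAT column, e.g. GT:DP
    if "DP" not in format_keys:
        return 0
    dp_index = format_keys.index("DP")
    passing = 0
    for sample in fields[9:]:
        values = sample.split(":")
        # Trailing genotype fields may be omitted, and DP may be '.';
        # assumption for this sketch: such samples do not pass.
        if dp_index < len(values) and values[dp_index].isdigit():
            if int(values[dp_index]) >= min_sample_depth:
                passing += 1
    return passing


def keep_record(record, min_sample_depth, min_samples_pass):
    """Keep the record if enough samples reach the depth threshold."""
    return samples_passing_depth(record, min_sample_depth) >= min_samples_pass


# Three samples with DP 25, DP 8 and a missing DP.
record = "chr1\t100\t.\tA\tG\t60\tPASS\tDP=45\tGT:DP\t0/1:25\t0/1:8\t./.:."

one_needed = keep_record(record, 10, 1)   # one sample reaches DP >= 10
two_needed = keep_record(record, 10, 2)   # only one does, so this fails
```

Raising the required number of passing samples makes the filter stricter without changing the per-sample threshold itself.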
VcfFilter is part of the BIOPET tool suite, which is developed at LUMC by the SASC team. Each tool in the BIOPET tool suite is meant to offer a standalone function that can be used to perform a dedicated data analysis task or be added as part of a pipeline, for example the SASC team's biowdl pipelines.
All tools in the BIOPET tool suite are Free/Libre and Open Source Software.
The source code of VcfFilter can be found here. We welcome any contributions. Bug reports, feature requests and feedback can be submitted at our issue tracker.
VcfFilter is built using sbt. Before submitting a pull request, make sure all tests pass by running
sbt test
from the project's root. We recommend using an IDE to work on VcfFilter. We have had good results with this IDE.
For any questions related to VcfFilter, please use the GitHub issue tracker or contact the SASC team directly at: sasc@lumc.nl.