Description

This tool converts a VCF file to a Tab Separated Values (TSV) file. For every key in the INFO column of the VCF file, a separate column is created with the corresponding values. The user can select which keys are written to the output TSV file. This is useful when a downstream program only accepts TSV input.
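To make the idea of "one column per INFO key" concrete, here is a minimal Python sketch. It is not VcfToTsv's implementation. The function names, the `true` value for flag keys, the empty cell for missing keys, and the bare key names used as column headers are all assumptions made for illustration. The sketch also collects keys from the records themselves, whereas the tool reads them from the VCF header.

```python
def parse_info(info):
    """Split a VCF INFO string such as 'DP=10;AF=0.5;DB' into a dict."""
    parsed = {}
    if info in ("", "."):
        return parsed
    for entry in info.split(";"):
        key, sep, value = entry.partition("=")
        # Flag-type keys have no '='; storing "true" for them is an assumption.
        parsed[key] = value if sep else "true"
    return parsed


def info_to_tsv(info_strings, keys=None):
    """Return TSV lines with one column per INFO key (hypothetical helper)."""
    records = [parse_info(s) for s in info_strings]
    if keys is None:
        # No selection given: use every key seen, in order of first appearance.
        keys = []
        for record in records:
            for key in record:
                if key not in keys:
                    keys.append(key)
    lines = ["\t".join(keys)]
    for record in records:
        # Keys missing from a record become empty cells (assumed behaviour).
        lines.append("\t".join(record.get(key, "") for key in keys))
    return lines


rows = info_to_tsv(["DP=10;AF=0.5", "DP=7;DB"])
print("\n".join(rows))
```

Passing `keys=["AF"]` keeps only the `AF` column, which mirrors selecting individual INFO keys for the output.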

Installation

VcfToTsv requires Java 8 to be installed on your device. Download Java 8 here or install via your distribution's package manager.

Download the latest version of VcfToTsv here. To print the usage, run:

java -jar <VcfToTsv_jar> --help

Manual

The output of this tool is a TSV file produced from the input VCF file. Depending on which options are enabled, some fields may be discarded. The field separator and the list separator can be configured.
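The difference between the two separators can be sketched in a few lines of Python. This is an illustration only, assuming that multi-valued fields are joined with the list separator before the cells are joined with the field separator. The function `format_row` and the sample values are hypothetical and are not taken from VcfToTsv.

```python
def format_row(values, separator="\t", list_separator=","):
    """Join cells with `separator`; join multi-valued cells with `list_separator`."""
    cells = []
    for value in values:
        if isinstance(value, (list, tuple)):
            # A multi-valued field is collapsed into a single cell.
            cells.append(list_separator.join(str(v) for v in value))
        else:
            cells.append(str(value))
    return separator.join(cells)


row = ["chr1", 12345, [0.25, 0.75]]
print(format_row(row))                                     # tab-delimited, comma lists
print(format_row(row, separator=",", list_separator="|"))  # comma-delimited, pipe lists
```

If the field separator is changed to a comma, choosing a different list separator keeps multi-valued cells unambiguous.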

Example

To run a simple conversion that includes all INFO fields in the resulting TSV, run:

java -jar <VcfToTsv_jar> \
--inputFile myVCF.vcf \
--outputFile my_tabDelimited_VCF.tsv \
--all_info

Usage

Usage for VcfToTsv:

| Option | Required | Can occur multiple times | Description |
| ------ | -------- | ------------------------ | ----------- |
| --log_level, -l | no | no | Level of log information printed. Possible levels: 'debug', 'info', 'warn', 'error' |
| --help, -h | no | no | Print usage |
| --version, -v | no | no | Print version |
| --inputFile, -I | yes | no | Input vcf file |
| --outputFile, -o | no | no | output file, default to stdout |
| --field, -f | no | yes (unlimited) | Genotype field to use |
| --info_field, -i | no | yes (unlimited) | Info field to use |
| --all_info | no | no | Use all info fields in the vcf header |
| --all_format | no | no | Use all genotype fields in the vcf header |
| --sample_field, -s | no | yes (unlimited) | Genotype fields to use in the tsv file |
| --disable_defaults, -d | no | no | Don't output the default columns from the vcf file |
| --separator | no | no | Optional separator. Default is tab-delimited |
| --list_separator | no | no | Optional list separator. By default, lists are separated by a comma |
| --max_decimals | no | no | Number of decimal places for numbers. Default is 2 |
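Options marked as repeatable are passed once per value. The following Python sketch assembles such a command line from the options listed above. The helper `build_command`, the jar name `VcfToTsv.jar`, and the INFO keys `DP` and `AF` are placeholders chosen for illustration and may not match your setup.

```python
import shlex


def build_command(jar, input_vcf, output_tsv=None, info_fields=(), separator=None):
    """Assemble a VcfToTsv argument list (hypothetical helper)."""
    cmd = ["java", "-jar", jar, "--inputFile", input_vcf]
    if output_tsv is not None:
        # Without --outputFile the tool writes to stdout.
        cmd += ["--outputFile", output_tsv]
    for field in info_fields:
        # --info_field may occur multiple times, once per INFO key.
        cmd += ["--info_field", field]
    if separator is not None:
        cmd += ["--separator", separator]
    return cmd


# Jar name and INFO keys below are placeholders.
cmd = build_command("VcfToTsv.jar", "myVCF.vcf", "out.tsv", info_fields=["DP", "AF"])
print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` to run the conversion from a script.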

About

VcfToTsv is part of the BIOPET tool suite, which is developed at LUMC by the SASC team. Each tool in the BIOPET tool suite is meant to offer a standalone function that can be used to perform a dedicated data analysis task or added as part of BIOPET pipelines.

All tools in the BIOPET tool suite are Free/Libre and Open Source Software.

Contributing

The source code of VcfToTsv can be found here. We welcome any contributions. Bug reports, feature requests and feedback can be submitted at our issue tracker.

VcfToTsv is built using sbt. Before submitting a pull request, make sure all tests pass by running sbt test from the project's root. We recommend using an IDE to work on VcfToTsv. We have had good results with this IDE.

Contact

For any questions related to VcfToTsv, please use the GitHub issue tracker or contact the SASC team directly at: sasc@lumc.nl.