Description

This tool generates FASTA files from variant (SNP) alignments or full (consensus) alignments. It is useful for producing the input required by follow-up tools, for example phylogenetic tree building.

Installation

BastyGenerateFasta requires Java 8 to be installed on your device. Download Java 8 here or install via your distribution's package manager.

Download the latest version of BastyGenerateFasta here. To print the usage, run:

java -jar <BastyGenerateFasta_jar> --help

Manual

BastyGenerateFasta has three modes:

  • outputVariants: This mode outputs only the variants from a VCF file in FASTA format.
  • outputConsensus: This mode outputs a consensus sequence from a BAM file.
  • outputConsensusVariants: This mode combines the two: it outputs a consensus sequence from a BAM file with the variants from a VCF file.
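
The relationship between modes and inputs can be sketched as a small shell helper. This is only an illustration: required_inputs is a hypothetical function, not part of BastyGenerateFasta. The VCF and BAM requirements follow the usage table in this document; listing --reference for the two consensus modes is an assumption based on the minimal examples, which pass it.

```shell
# Hypothetical helper (not part of the tool): print the input options
# that each mode appears to need.
required_inputs() {
  case "$1" in
    outputVariants)          echo "--inputVcf" ;;
    # --reference is an assumption taken from the minimal examples.
    outputConsensus)         echo "--bamFile --reference" ;;
    outputConsensusVariants) echo "--inputVcf --bamFile --reference" ;;
    *) echo "unknown mode: $1" >&2; return 1 ;;
  esac
}

required_inputs outputConsensusVariants
```

A wrapper script could call such a helper to decide which files to check before starting the JVM.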

Example

Minimal example for option: --outputVariants (VCF based)

java -jar <BastyGenerateFasta_jar> \
--inputVcf myVCF.vcf \
--outputName NiceTool \
--outputVariants myVariants.fasta

Minimal example for option: --outputConsensus (BAM based)

java -jar <BastyGenerateFasta_jar> \
--bamFile myBam.bam \
--outputName NiceTool \
--outputConsensus myConsensus.fasta \
--reference reference.fa

Minimal example for option: --outputConsensusVariants (Both)

java -jar <BastyGenerateFasta_jar> \
--inputVcf myVCF.vcf \
--bamFile myBam.bam \
--outputName NiceTool \
--outputConsensusVariants myConsensusVariants.fasta \
--reference reference.fa
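
When several samples need the same treatment, the minimal examples above can be generated in a loop. The following is a dry-run sketch that only prints the commands. The jar path, the file names, and the choice to derive --outputName from the VCF file name are all assumptions made for illustration.

```shell
# Dry-run sketch: print one outputVariants command per VCF file.
# Nothing is executed; JAR and the file names are placeholders.
JAR="BastyGenerateFasta.jar"   # hypothetical path to the downloaded jar

variants_cmd() {
  vcf="$1"
  # Assumption: use the file name without ".vcf" as the output name.
  sample="$(basename "$vcf" .vcf)"
  printf 'java -jar %s --inputVcf %s --outputName %s --outputVariants %s.fasta\n' \
    "$JAR" "$vcf" "$sample" "$sample"
}

for vcf in sampleA.vcf sampleB.vcf; do
  variants_cmd "$vcf"
done
```

Once the printed commands look right, they can be piped to sh or the printf can be replaced by the actual java call.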

Usage

Usage for BastyGenerateFasta:

Option                     Required  Can occur multiple times  Description
--log_level, -l            no        no                        Level of log information printed. Possible levels: 'debug', 'info', 'warn', 'error'
--help, -h                 no        no                        Print usage
--version, -v              no        no                        Print version
--inputVcf, -V             no        no                        vcf file, needed for outputVariants and outputConsensusVariants
--bamFile                  no        no                        bam file, needed for outputConsensus and outputConsensusVariants
--outputVariants           no        no                        fasta with only variants from vcf file
--outputConsensus          no        no                        Consensus fasta from bam, always reference bases else 'N'
--outputConsensusVariants  no        no                        Consensus fasta from bam with variants from vcf file, always reference bases else 'N'
--snpsOnly                 no        no                        Only use snps from vcf file
--sampleName               no        no                        Sample name in vcf file
--outputName               yes       no                        Output name in fasta file header
--minAD                    no        no                        min AD value in vcf file for sample. Defaults to: 8
--minDepth                 no        no                        min depth in bam file. Defaults to: 8
--reference                no        no                        Indexed reference fasta file
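
Because --reference expects an indexed FASTA file, it can help to check for the index before starting a run. The sketch below assumes the index is a .fai file next to the reference, as produced by samtools faidx; the tool's exact index requirements are not stated in this document. check_reference is a hypothetical helper name.

```shell
# Hypothetical pre-flight check: succeed only if the reference FASTA
# and a .fai index next to it both exist.
# Assumption: the index follows the samtools faidx naming (<ref>.fai).
check_reference() {
  ref="$1"
  if [ ! -f "$ref" ]; then
    echo "missing reference: $ref" >&2
    return 1
  fi
  if [ ! -f "$ref.fai" ]; then
    echo "missing index: $ref.fai (try: samtools faidx $ref)" >&2
    return 1
  fi
  return 0
}

# Usage: check_reference reference.fa && echo "reference looks indexed"
```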

About

BastyGenerateFasta is part of the BIOPET tool suite, which is developed at LUMC by the SASC team. Each tool in the BIOPET tool suite is meant to offer a standalone function that can be used to perform a dedicated data analysis task or be added as part of a pipeline, for example the SASC team's biowdl pipelines.

All tools in the BIOPET tool suite are Free/Libre and Open Source Software.

Contributing

The source code of BastyGenerateFasta can be found here. We welcome any contributions. Bug reports, feature requests and feedback can be submitted at our issue tracker.

BastyGenerateFasta is built using sbt. Before submitting a pull request, make sure all tests pass by running sbt test from the project's root. We recommend using an IDE to work on BastyGenerateFasta. We have had good results with this IDE.

Contact

For any questions related to BastyGenerateFasta, please use the GitHub issue tracker or contact the SASC team directly at: sasc@lumc.nl.