This tool takes an input GTF file and outputs a GTF file with renamed contigs. For example chr1 -> 1. This can be useful in a pipeline where tools have different naming standards for contigs.
ReplaceContigsGtfFile requires Java 8 to be installed on your device. Download Java 8 here or install via your distribution's package manager.
Download the latest version of ReplaceContigsGtfFile here. To generate the usage run:
java -jar <ReplaceContigsGtfFile_jar> --help
ReplaceContigsGtfFile needs a reference fasta file and an input GTF file. The reference fasta is needed to validate the contigs. The renaming of contigs can be specified in a contig mapping file. The contig mapping file should be in the following format.
chr1 1;I;one
chr2 2;II;two
Any contig found in the input GTF whose name appears in the second column will be renamed to the contig name in the corresponding first column.
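The file format above can be read into a simple lookup table. The following Python snippet is only an illustrative sketch, not the tool's actual implementation: the function name `parse_contig_mapping` is hypothetical, and it assumes the two columns are separated by whitespace (a tab in a `.tsv` file).

```python
def parse_contig_mapping(text):
    """Return a dict that maps each alternative name to its new name.

    Assumption: columns are whitespace separated, alternatives are
    separated by semicolons, as in the example file above.
    """
    mapping = {}
    for line in text.splitlines():
        parts = line.strip().split(None, 1)
        if len(parts) != 2:
            # Skip blank lines and lines without a second column.
            continue
        new_name, alternatives = parts
        for alt in alternatives.split(";"):
            alt = alt.strip()
            if alt:
                mapping[alt] = new_name
    return mapping


example = "chr1\t1;I;one\nchr2\t2;II;two\n"
mapping = parse_contig_mapping(example)
print(mapping["II"])
```

With the example file, every name in the second column (`1`, `I`, `one`, ...) becomes a key that points to the name in the first column.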
Alternatively, options can be specified on the command line. For example '1=chr1' will convert all contigs named '1' to 'chr1'.
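To make the `old=new` direction explicit, here is a small Python sketch of how such values could be interpreted. It is an assumption for illustration only; the helper `parse_contig_options` is hypothetical and is not part of ReplaceContigsGtfFile.

```python
def parse_contig_options(options):
    """Turn values such as '1=chr1' into a dict {old_name: new_name}."""
    mapping = {}
    for option in options:
        old_name, sep, new_name = option.partition("=")
        if not sep or not old_name or not new_name:
            raise ValueError("expected 'old=new', got: %r" % option)
        mapping[old_name] = new_name
    return mapping


cli_mapping = parse_contig_options(["1=chr1", "I=chr1", "2=chr2"])
print(cli_mapping)
```

Note that the order is the reverse of the contig mapping file: on the command line the existing name comes first and the new name second.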
Mappings are NOT case sensitive by default. If you need case sensitivity, use the --caseSensitive flag.
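The effect of case sensitivity can be sketched as follows. This Python example is a simplified assumption of the behaviour described above, not the tool's code: `rename_contig` is a hypothetical helper, the new name `GL000191.1` is made up for illustration, and contigs without a mapping are simply left unchanged here. It relies only on the fact that the contig name is the first tab-separated column of a GTF record.

```python
def rename_contig(gtf_line, mapping, case_sensitive=False):
    """Rename the contig (first tab-separated column) of one GTF line."""
    if gtf_line.startswith("#"):
        # Comment and header lines carry no contig column.
        return gtf_line
    fields = gtf_line.split("\t")
    contig = fields[0]
    if case_sensitive:
        fields[0] = mapping.get(contig, contig)
    else:
        # Compare lower-cased names so 'chr1_GL...' matches 'chr1_gl...'.
        lowered = {old.lower(): new for old, new in mapping.items()}
        fields[0] = lowered.get(contig.lower(), contig)
    return "\t".join(fields)


mapping = {"chr1_gl000191_random": "GL000191.1"}  # hypothetical mapping
line = 'chr1_GL000191_random\tsrc\texon\t1\t100\t.\t+\t.\tgene_id "g1";'

insensitive = rename_contig(line, mapping)
sensitive = rename_contig(line, mapping, case_sensitive=True)
print(insensitive.split("\t")[0])
print(sensitive.split("\t")[0])
```

In the default mode the record is renamed even though the case differs; with `case_sensitive=True` the same record is left untouched because the names do not match exactly.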
The output can also be written as a GFF file with the --writeAsGff flag.
To convert the contig names in a GTF file with case sensitivity and write the output as GFF, run:
java -jar <ReplaceContigsGtfFile_jar> \
-I input.gtf \
-o output.gff \
-R reference.fasta \
--contigMappingFile contignames.tsv \
--caseSensitive \
--writeAsGff
To convert the contig names using command line options, similar to the example contig mapping file given in the manual:
java -jar <ReplaceContigsGtfFile_jar> \
-I input.gtf \
-o output.gtf \
-R reference.fasta \
--contig 1=chr1 \
--contig I=chr1 \
--contig one=chr1 \
--contig 2=chr2 \
--contig II=chr2 \
--contig two=chr2
A contig mapping file and contigs can be used together:
java -jar <ReplaceContigsGtfFile_jar> \
-I input.gtf \
-o output.gtf \
-R reference.fasta \
--contigMappingFile contignames.tsv \
--contig 3=chr3 \
--contig III=chr3
Usage for ReplaceContigsGtfFile:
Option | Required | Can occur multiple times | Description |
---|---|---|---|
--log_level, -l | no | no | Level of log information printed. Possible levels: 'debug', 'info', 'warn', 'error' |
--help, -h | no | no | Print usage |
--version, -v | no | no | Print version |
--input, -I | yes | no | Input GTF file |
--output, -o | yes | no | Output GTF file |
--referenceFile, -R | yes | no | Reference fasta file |
--contig | no | yes (unlimited) | Specify contig mappings on the command line. Example '1=chr1' will convert contig '1' to 'chr1' |
--writeAsGff | no | no | Write as GFF file instead of GTF file. |
--contigMappingFile | no | no | File that describes how to map contig names: the first column is the new name, the second column is a semicolon-separated list of alternative names |
--caseSensitive | no | no | If set, the tool does not match names that differ only in case. Example: chr1_gl000191_random will not match chr1_GL000191_random |
ReplaceContigsGtfFile is part of the BIOPET tool suite, which is developed at LUMC by the SASC team. Each tool in the BIOPET tool suite is meant to offer a standalone function that can be used to perform a dedicated data analysis task or be added as part of BIOPET pipelines.
All tools in the BIOPET tool suite are Free/Libre and Open Source Software.
The source code of ReplaceContigsGtfFile can be found here. We welcome any contributions. Bug reports, feature requests and feedback can be submitted at our issue tracker.
ReplaceContigsGtfFile is built using sbt. Before submitting a pull request, make sure all tests pass by running sbt test from the project's root. We recommend using an IDE to work on ReplaceContigsGtfFile. We have had good results with this IDE.
For any questions related to ReplaceContigsGtfFile, please use the GitHub issue tracker or contact the SASC team directly at: sasc@lumc.nl.