This tool generates a contig map file from the information in an NCBI assembly report. An option selects which column of the NCBI report is used for the contig names.
NcbiReportToContigMap requires Java 8 to be installed on your device. Download Java 8 here or install via your distribution's package manager.
Download the latest version of NcbiReportToContigMap here. To print the usage information, run:
java -jar <NcbiReportToContigMap_jar> --help
NcbiReportToContigMap needs an NCBI assembly report, an output file, and the name of the report column to use for the contig names. All columns in the report can be used, but these are the most common fields to choose from:

- 'Sequence-Name': Name of the contig within the assembly
- 'UCSC-style-name': Name of the contig used by UCSC (like hg19)
- 'RefSeq-Accn': Unique name of the contig at RefSeq (default for NCBI)

Optionally, other columns in the report can be added to the contig map with the --names flag.
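To see which column names a particular report offers, its header line can be inspected before choosing values for --nameHeader or --names. The snippet below is a minimal sketch, assuming the report follows the usual NCBI layout in which the column header is a comment line starting with `# Sequence-Name` and the columns are tab-separated. `list_report_columns` is a hypothetical helper and not part of NcbiReportToContigMap.

```shell
# Hypothetical helper, not part of NcbiReportToContigMap.
# Usage: list_report_columns <assembly_report>
# Prints the column names of an NCBI assembly report, one per line.
# Assumes the header is the comment line starting with "# Sequence-Name"
# and that its columns are separated by tabs.
list_report_columns() {
  awk -F '\t' '
    /^# Sequence-Name/ {
      sub(/^# /, "")   # drop the comment marker
      sub(/\r$/, "")   # tolerate Windows line endings
      for (i = 1; i <= NF; i++) print $i
      exit
    }
  ' "$1"
}
```

Running `list_report_columns ncbi_assembly_report.txt` prints each header field on its own line, so the available names can be copied exactly as they are spelled in the report.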
To construct a contig map from an NCBI assembly report, using the UCSC-style-name column for the contig names and including the RefSeq-Accn column, run:
java -jar <NcbiReportToContigMap_jar> \
-a ncbi_assembly_report.txt \
-o contig_map.tsv \
--nameHeader UCSC-style-name \
--names RefSeq-Accn
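Because the values given to --nameHeader and --names must name columns that exist in the report, it can help to check them against the header before running the tool. The following is a sketch only, assuming the header is a tab-separated comment line starting with `# Sequence-Name`. `report_has_column` is a hypothetical helper, and its exact, case-sensitive comparison is an assumption about how strictly names should be checked, not documented behaviour of NcbiReportToContigMap.

```shell
# Hypothetical pre-flight check, not part of NcbiReportToContigMap.
# Usage: report_has_column <assembly_report> <column_name>
# Returns 0 if the header line (assumed to start with "# Sequence-Name" and
# to be tab-separated) contains exactly that column name, 1 otherwise.
report_has_column() {
  awk -F '\t' -v col="$2" '
    /^# Sequence-Name/ {
      sub(/^# /, "")   # drop the comment marker
      sub(/\r$/, "")   # tolerate Windows line endings
      for (i = 1; i <= NF; i++) if ($i == col) found = 1
      exit             # jump to END once the header has been read
    }
    END { exit (found ? 0 : 1) }
  ' "$1"
}
```

For example, `report_has_column ncbi_assembly_report.txt UCSC-style-name || echo "column not found"` reports a misspelled column name before the tool is started.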
Usage for NcbiReportToContigMap:
Option | Required | Can occur multiple times | Description |
---|---|---|---|
--log_level, -l | no | no | Level of log information printed. Possible levels: 'debug', 'info', 'warn', 'error' |
--help, -h | no | no | Print usage |
--version, -v | no | no | Print version |
--assembly_report, -a | yes | no | Assembly report from NCBI |
--output, -o | yes | no | Output contig map |
--nameHeader | yes | no | Which column of the NCBI report to use for the names of the contigs. All columns in the report can be used, but these are the most common fields to choose from: - 'Sequence-Name': Name of the contig within the assembly - 'UCSC-style-name': Name of the contig used by UCSC (like hg19) - 'RefSeq-Accn': Unique name of the contig at RefSeq (default for NCBI) |
--names | no | yes (unlimited) | Keys of the report to use in contig map |
NcbiReportToContigMap is part of the BIOPET tool suite, which is developed at LUMC by the SASC team. Each tool in the BIOPET tool suite is meant to offer a standalone function that can be used to perform a dedicated data analysis task or be added as part of a pipeline, for example the SASC team's biowdl pipelines.
All tools in the BIOPET tool suite are Free/Libre and Open Source Software.
The source code of NcbiReportToContigMap can be found here. We welcome any contributions. Bug reports, feature requests and feedback can be submitted at our issue tracker.
NcbiReportToContigMap is built using sbt. Before submitting a pull request, make sure all tests pass by
running sbt test
from the project's root. We recommend using an IDE to work on NcbiReportToContigMap. We have had good results with this IDE.
For any questions related to NcbiReportToContigMap, please use the GitHub issue tracker or contact the SASC team directly at: sasc@lumc.nl.