BamStats is a package that contains tools to generate stats from a BAM file, merge those stats for multiple samples, and validate the generated stats files.
Generate reports clipping stats, flag stats, insert size and mapping quality for a BAM file. It writes its output as a JSON file and can optionally also write the stats in TSV format.
The output of the JSON file is organized in a sample - library - readgroup tree structure.
If readgroups in the BAM file are not annotated with sample (SM) and library (LB) tags, an error will be thrown.
This can be fixed with a tool such as samtools addreplacerg.
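As a pre-check before running generate, the readgroup requirement can be tested on the BAM header itself. The sketch below is not part of BamStats: the function name `readgroups_missing_tags` is hypothetical, and it only relies on the SAM convention that `@RG` header lines are tab-separated `TAG:VALUE` fields.

```python
def readgroups_missing_tags(header_text, required=("SM", "LB")):
    """Map each @RG ID to the list of required tags it lacks."""
    missing = {}
    for line in header_text.splitlines():
        if not line.startswith("@RG"):
            continue  # only readgroup header lines are relevant
        fields = line.split("\t")[1:]
        tags = dict(f.split(":", 1) for f in fields if ":" in f)
        absent = [t for t in required if not tags.get(t)]
        if absent:
            missing[tags.get("ID", "<no ID>")] = absent
    return missing


# Hypothetical header text: rg1 is fully annotated, rg2 lacks the LB tag.
header = "@HD\tVN:1.6\n@RG\tID:rg1\tSM:s1\tLB:lib1\n@RG\tID:rg2\tSM:s1\n"
problems = readgroups_missing_tags(header)
```

The header text can be obtained with `samtools view -H file.bam`. An empty result means every readgroup carries both tags.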
Merge combines multiple BamStats files while keeping the sample/library/readgroup structure. Values for the same readgroup are added together. It also validates the resulting file.
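The merge behaviour described here can be pictured as a recursive sum over two nested trees. This is only an illustrative sketch, not the BamStats implementation: the function `merge_stats`, the flat sample/library/readgroup nesting and the `mapped` counter are all assumptions made for the example.

```python
def merge_stats(a, b):
    """Recursively merge two stats trees; numbers under the same path are added."""
    if isinstance(a, dict) and isinstance(b, dict):
        merged = dict(a)  # shallow copy so the inputs are not modified
        for key, value in b.items():
            if key in merged:
                merged[key] = merge_stats(merged[key], value)
            else:
                merged[key] = value  # readgroup present in only one file
        return merged
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        return a + b
    raise TypeError("cannot merge values of different shapes")


# Hypothetical trees: sample -> library -> readgroup -> counters.
first = {"s1": {"lib1": {"rg1": {"mapped": 10}}}}
second = {"s1": {"lib1": {"rg1": {"mapped": 5}, "rg2": {"mapped": 7}}}}
merged = merge_stats(first, second)
```

Readgroups that appear in both inputs have their counters summed, while readgroups unique to one input are carried over unchanged.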
Validate checks a BamStats file. If the aggregation values cannot be regenerated, the file is considered corrupt. This should only happen when the file has been manually edited.
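One way to read this check: the aggregated values stored in the file must match what is obtained by recomputing them from the per-readgroup values. The sketch below illustrates that idea under assumed names. The keys `aggregate` and `readgroups` and both functions are hypothetical and do not reflect the actual BamStats file layout.

```python
def regenerate_total(readgroups):
    """Sum the per-readgroup counters into one aggregate dict."""
    total = {}
    for counters in readgroups.values():
        for name, value in counters.items():
            total[name] = total.get(name, 0) + value
    return total


def is_consistent(library):
    """True when the stored aggregate equals the one regenerated from readgroups."""
    return library["aggregate"] == regenerate_total(library["readgroups"])


# Hypothetical library node whose aggregate matches its readgroups.
intact = {
    "aggregate": {"mapped": 15, "unmapped": 2},
    "readgroups": {"rg1": {"mapped": 10, "unmapped": 2}, "rg2": {"mapped": 5}},
}
# The same kind of node after a manual edit of the aggregate.
edited = {"aggregate": {"mapped": 99}, "readgroups": {"rg1": {"mapped": 10}}}
```

A manually edited aggregate no longer matches the regenerated one, which is the situation in which a file would be reported as corrupt.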
BamStats requires Java 8 to be installed on your machine. Download Java 8 or install it via your distribution's package manager.
Download the latest version of BamStats. To print the usage information, run:
java -jar <BamStats_jar> --help
Generate requires a BAM file and an output directory for its stats. Optionally, a reference FASTA file can be supplied, against which the BAM file is validated. There is also a flag to write the output in TSV format.
Generate requires BAM files in which all @RG groups are annotated with SM and LB tags. If they are not, an error is thrown.
When merging files, BamStats validates both the input files and the output file. If the aggregation values cannot be regenerated, the file is considered corrupt.
To generate stats from file.bam:
java -jar <Generate_jar> \
  -b file.bam \
  -o output_dir
To generate stats from file.bam, and output the result also as TSV:
java -jar <Generate_jar> \
  -o output_dir \
  -b file.bam \
  --tsvOutputs
To generate stats from certain regions in file.bam, validate the regions and BAM file with reference.fa and also include unmapped reads:
java -jar <Generate_jar> \
  -R reference.fa \
  -o output_dir \
  -b file.bam \
  --bedFile regions.bed
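When generate is called from a script or pipeline, it can help to assemble the argument list programmatically. The following is a minimal sketch and not part of BamStats: the helper `generate_command` is hypothetical, the jar path is a placeholder, and only the options that appear in the examples above (`-b`, `-o`, `-R`, `--bedFile`, `--tsvOutputs`) are used.

```python
def generate_command(jar, bam, output_dir, reference=None, bed_file=None, tsv=False):
    """Build the argument list for a generate run from the options shown above."""
    cmd = ["java", "-jar", jar, "-b", bam, "-o", output_dir]
    if reference:
        cmd += ["-R", reference]  # reference FASTA used for validation
    if bed_file:
        cmd += ["--bedFile", bed_file]  # restrict stats to these regions
    if tsv:
        cmd.append("--tsvOutputs")  # also write TSV output
    return cmd


# "generate.jar" is a placeholder for the actual jar file.
cmd = generate_command("generate.jar", "file.bam", "output_dir",
                       reference="reference.fa", bed_file="regions.bed")
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`, which avoids shell quoting issues with file names that contain spaces.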
To merge multiple files and write the result to an output file:
java -jar <BamStats_jar> merge \
  -i <bamstats file> \
  -i <bamstats file> \
  -o <output file>
To validate a BamStats file:
java -jar <BamStats_jar> validate \
  -i <input file>
Usage for BamStats:
| Option | Required | Can occur multiple times | Description |
| --- | --- | --- | --- |
| --log_level, -l | no | no | Level of log information printed. Possible levels: 'debug', 'info', 'warn', 'error' |
| --help, -h | no | no | Print usage |
| --version, -v | no | no | Print version |
| toolName | no | no | Name of the tool to execute |
| tool args | no | yes (unlimited) | Arguments for the tool |
BamStats is part of the BIOPET tool suite, which is developed at LUMC by the SASC team. Each tool in the BIOPET tool suite is meant to offer a standalone function that can be used to perform a dedicated data analysis task or be added as part of a pipeline, for example the SASC team's biowdl pipelines.
BamStats is built using sbt. Before submitting a pull request, make sure all tests pass by running sbt test from the project's root. We recommend using an IDE to work on BamStats. We have had good results with this IDE.