Description

ScatterRegions splits a reference or BED file into smaller scatter regions of equal size. The resulting regions can then be processed separately inside a pipeline.

Installation

ScatterRegions requires Java 8 to be installed on your device. Download Java 8 here or install via your distribution's package manager.

Download the latest version of ScatterRegions here. To print the usage, run:

java -jar <ScatterRegions_jar> --help

Manual

The tool always requires a reference FASTA with a dict file next to it. If a BED file is supplied, the tool validates it against the given reference.
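As a rough illustration of what validating a BED file against a reference involves, the sketch below checks a BED file against a sequence dictionary before ScatterRegions is run. It is not the tool's own validation logic: the function name `validate_bed` is hypothetical, and the sketch assumes the dict file uses the usual SAM-header layout (`@SQ` lines with `SN:` and `LN:` fields) and that the BED file is tab-separated.

```shell
# validate_bed DICT BED   (hypothetical helper, not part of ScatterRegions)
# Prints every BED line whose contig is missing from the dict or whose
# coordinates fall outside the contig. Returns 1 if any such line is found.
validate_bed() {
    awk -F '\t' '
        # First file: sequence dictionary. Collect contig lengths from @SQ lines.
        NR == FNR {
            if ($1 == "@SQ") {
                name = ""; len = ""
                for (i = 2; i <= NF; i++) {
                    if ($i ~ /^SN:/) name = substr($i, 4)
                    if ($i ~ /^LN:/) len = substr($i, 4)
                }
                if (name != "") length_of[name] = len + 0
            }
            next
        }
        # Second file: BED. Skip blank lines, comments, and track/browser lines.
        /^$/ || /^#/ || /^track/ || /^browser/ { next }
        {
            if (!($1 in length_of)) {
                printf "unknown contig: %s\n", $0
                bad = 1
            } else if ($2 + 0 < 0 || $3 + 0 <= $2 + 0 || $3 + 0 > length_of[$1]) {
                # Negative start, empty or reversed region, or end past the contig.
                printf "region out of bounds: %s\n", $0
                bad = 1
            }
        }
        END { exit (bad ? 1 : 0) }
    ' "$1" "$2"
}
```

Called as `validate_bed reference.dict regions.bed` (both file names are placeholders), the function prints nothing and returns 0 when every region fits inside a contig listed in the dict. An empty dict file is not handled by this sketch.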

Example

Default run:

java -jar <ScatterRegions_jar> \
-R <reference fasta> \
-o <output dir>

With scatter size:

java -jar <ScatterRegions_jar> \
-R <reference fasta> \
-o <output dir> \
-s 5000000
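The scatter size is approximate because the tool makes all scatters the same size. The sketch below shows one way such equal-size regions could be derived for a single contig. It is an illustration only and makes assumptions that are not taken from the ScatterRegions source: the helper name `equal_chunks` is hypothetical, the number of chunks is the contig length divided by the target size, rounded to the nearest integer, and regions are printed with BED-style coordinates (0-based start, exclusive end).

```shell
# equal_chunks CONTIG LENGTH TARGET   (hypothetical helper, illustration only)
# Splits one contig into equal-size regions close to TARGET bases each and
# prints them as tab-separated BED-style lines.
equal_chunks() {
    awk -v c="$1" -v len="$2" -v t="$3" 'BEGIN {
        # Assumed rule: round length/target to the nearest integer, minimum 1.
        n = int(len / t + 0.5)
        if (n < 1) n = 1
        for (i = 0; i < n; i++) {
            # Integer boundaries; the last region always ends at the contig length.
            start = int(i * len / n)
            end = int((i + 1) * len / n)
            printf "%s\t%d\t%d\n", c, start, end
        }
    }'
}
```

For example, `equal_chunks chr1 2500000 1000000` prints three regions of roughly 833 kb each rather than two regions of 1 Mb and a 500 kb remainder, which is why the requested size is only approximately met.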

Usage

Usage for ScatterRegions:

| Option | Required | Can occur multiple times | Description |
| --- | --- | --- | --- |
| --log_level, -l | no | no | Level of log information printed. Possible levels: 'debug', 'info', 'warn', 'error' |
| --help, -h | no | no | Print usage |
| --version, -v | no | no | Print version |
| --outputDir, -o | yes | no | Output directory |
| --referenceFasta, -R | yes | no | Reference fasta file (the dict file should be next to it) |
| --scatterSize, -s | no | no | Approximate scatter size; the tool will make all scatters the same size. Default = 1000000 |
| --regions, -L | no | no | If given, only regions in the given bed file will be used for scattering |
| --notCombineContigs | no | no | If set, each scatter can only contain 1 contig |
| --maxContigsInScatterJob | no | no | If set each scatter can only contain 1 contig |
| --bamFile | no | no | When given, the regions will be scattered based on the number of reads in the index file |
| --notSplitContigs | no | no | When this option is set, contigs are not split |

About

ScatterRegions is part of the BIOPET tool suite, which is developed at LUMC by the SASC team. Each tool in the BIOPET tool suite is meant to offer a standalone function that can be used to perform a dedicated data analysis task or added as part of a pipeline, for example the SASC team's biowdl pipelines.

All tools in the BIOPET tool suite are Free/Libre and Open Source Software.

Contributing

The source code of ScatterRegions can be found here. We welcome any contributions. Bug reports, feature requests and feedback can be submitted at our issue tracker.

ScatterRegions is built using sbt. Before submitting a pull request, make sure all tests pass by running sbt test from the project's root. We recommend using an IDE to work on ScatterRegions. We have had good results with this IDE.

Contact

For any questions related to ScatterRegions, please use the GitHub issue tracker or contact the SASC team directly at: sasc@lumc.nl.