Description

WipeReads is a tool for removing reads that fall inside a user-defined region from indexed BAM files. It takes pairing information into account and can be set to remove reads even if one read of the pair maps outside of the target region. An application example is removing reads that map to known ribosomal RNA regions (using a supplied BED file containing intervals for these regions).

Installation

WipeReads requires Java 8 to be installed on your device. Download Java 8 here or install via your distribution's package manager.

Download the latest version of WipeReads here. To print the usage information, run:

java -jar <WipeReads_jar> --help

Manual

This tool removes BAM records that overlap a set of given regions. By default, if the removed reads are also mapped to other regions outside the given ones, those alignments are removed as well. The tool outputs a BAM file containing all reads that are not inside the given regions (for example, ribosomal regions). It can optionally output a second BAM file with only the reads inside those regions.
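The default behaviour described above can be pictured with a toy example. The sketch below is not how WipeReads is implemented: it works on a small tab-separated text file instead of BAM, and the file names, read names and column layout are made up for illustration. It only shows the idea that a read is dropped everywhere once any of its alignments overlaps a target region.

```shell
#!/bin/sh
# Toy illustration only: plain-text records instead of BAM, hypothetical names.
set -eu

workdir=$(mktemp -d)

# Target regions in BED layout: chrom, 0-based start, exclusive end.
printf 'chr1\t100\t200\n' > "$workdir/regions.bed"

# Simplified alignments: read name, chrom, 0-based start, exclusive end.
printf '%s\t%s\t%s\t%s\n' \
  readA chr1 150 180 \
  readA chr2 500 530 \
  readB chr1 300 330 \
  readC chr1 190 220 > "$workdir/alignments.tsv"

# The alignment file is read twice:
#   pass 1 loads the regions,
#   pass 2 marks every read name with an alignment overlapping a region,
#   pass 3 prints only alignments whose read name was never marked.
kept=$(awk -F'\t' '
  FNR == 1 { pass++ }
  pass == 1 { chrom[FNR] = $1; lo[FNR] = $2 + 0; hi[FNR] = $3 + 0; n = FNR; next }
  pass == 2 {
    for (i = 1; i <= n; i++)
      if ($2 == chrom[i] && ($3 + 0) < hi[i] && ($4 + 0) > lo[i]) wipe[$1] = 1
    next
  }
  pass == 3 && !($1 in wipe) { print $0 }
' "$workdir/regions.bed" "$workdir/alignments.tsv" "$workdir/alignments.tsv")

printf '%s\n' "$kept"
rm -rf "$workdir"
```

In this toy data, readA and readC each have an alignment overlapping chr1:100-200, so all of their alignments are dropped, including the readA alignment on chr2. Only readB is kept.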

Example

An input file, interval file and output file are required. The output BAM can be indexed. Example:

java -jar <WipeReads_jar> \
--input_file myBam.bam \
--interval_file myRibosomal_regions.bed \
--output_file myFilteredBam.bam \
--make_index yes
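A variation on the example above, combining several of the optional flags from the usage table, might look like the following sketch. The file names, the MAPQ threshold of 20 and the read group IDs rg1 and rg2 are placeholders, and the jar location is read from a hypothetical WIPEREADS_JAR environment variable. The script only prints the assembled command so that it can be inspected first; check the exact flag syntax against the --help output of your version.

```shell
#!/bin/sh
set -eu

# Hypothetical jar location; override by exporting WIPEREADS_JAR.
wipereads_jar="${WIPEREADS_JAR:-WipeReads.jar}"

# Assemble the command as positional parameters so every argument stays intact.
# --discarded_file: also write the removed reads to their own BAM file
# --min_mapq:       illustrative threshold for reads in the target regions
# --read_group:     may be given more than once; rg1 and rg2 are made-up IDs
set -- java -jar "$wipereads_jar" \
  --input_file myBam.bam \
  --interval_file myRibosomal_regions.bed \
  --output_file myFilteredBam.bam \
  --discarded_file myDiscardedReads.bam \
  --min_mapq 20 \
  --read_group rg1 \
  --read_group rg2

# Dry run: print the command for inspection.
printf '%s ' "$@"
printf '\n'

# To actually execute it, replace the two printf lines with:  "$@"
```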

Usage

Usage for WipeReads:

| Option | Required | Can occur multiple times | Description |
| --- | --- | --- | --- |
| --log_level, -l | no | no | Level of log information printed. Possible levels: 'debug', 'info', 'warn', 'error' |
| --help, -h | no | no | Print usage |
| --version, -v | no | no | Print version |
| --input_file, -I | yes | no | Input BAM file |
| --interval_file, -r | yes | no | Interval BED file |
| --output_file, -o | yes | no | Output BAM file |
| --discarded_file, -f | no | no | Discarded reads BAM file (default: none) |
| --min_mapq, -Q | no | no | Minimum MAPQ of reads in target region to remove (default: 0) |
| --read_group, -G | no | yes (unlimited) | Read group IDs to be removed (default: remove reads from all read groups) |
| --limit_removal | no | no | Whether to remove multiple-mapped reads outside the target regions (default: yes) |
| --make_index | no | no | Whether to index output BAM file (default: no) |
| | no | no | GTF-only options: |
| --feature_type, -t | no | no | GTF feature containing intervals (default: exon) |
| | no | no | Advanced options: |
| --bloom_size | no | no | Expected maximum number of reads in target regions (default: 7e7) |
| --false_positive | no | no | False positive rate (default: 4e-7) |
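The names of the two advanced options suggest that they parameterise a Bloom filter, although the text above does not say so. Assuming that is the case, and assuming the textbook sizing formula applies (neither is confirmed here), the defaults can be turned into a rough memory estimate. The sketch below uses the standard relation bits = -n * ln(p) / (ln 2)^2; the variable names are illustrative.

```shell
#!/bin/sh
set -eu

expected_reads=70000000     # default of --bloom_size (7e7)
false_positive=0.0000004    # default of --false_positive (4e-7)

# Textbook Bloom filter sizing; an assumption, not taken from the WipeReads source.
bloom_estimate=$(awk -v n="$expected_reads" -v p="$false_positive" 'BEGIN {
  bits_per_item = -log(p) / (log(2) * log(2))
  bits = n * bits_per_item
  printf "%.0f %.0f %.0f", bits, bits / 8 / 1048576, bits_per_item * log(2)
}')

# Split the three numbers into positional parameters and report them.
set -- $bloom_estimate
printf 'bits: %s\napprox. MiB: %s\nhash functions: %s\n' "$1" "$2" "$3"
```

Under these assumptions the defaults come out at roughly 2 billion bits, or about 256 MiB. Lowering --bloom_size for small target regions, or raising --false_positive, would shrink that figure accordingly.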

About

WipeReads is part of the BIOPET tool suite, which is developed at LUMC by the SASC team. Each tool in the BIOPET tool suite is meant to offer a standalone function that can be used to perform a dedicated data analysis task or be added as part of a pipeline, for example the SASC team's biowdl pipelines.

All tools in the BIOPET tool suite are Free/Libre and Open Source Software.

Contributing

The source code of WipeReads can be found here. We welcome any contributions. Bug reports, feature requests and feedback can be submitted at our issue tracker.

WipeReads is built using sbt. Before submitting a pull request, make sure all tests pass by running sbt test from the project's root. We recommend using an IDE to work on WipeReads. We have had good results with this IDE.

Contact

For any questions related to WipeReads, please use the GitHub issue tracker or contact the SASC team directly at: sasc@lumc.nl.