Description

This tool validates a FASTQ file. When data is paired, it can also validate a pair of FASTQ files. ValidateFastq checks whether the input is in valid FASTQ format. This includes checking for duplicate reads and, for paired data, checking whether both FASTQ files contain the same number of reads and whether their headers match. It also checks whether the quality encodings are correct and outputs the most likely encoding format (Sanger, Solexa etc.).
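To give a feel for how a likely encoding can be narrowed down, here is a minimal Python sketch. It is not ValidateFastq's implementation: the function name `compatible_encodings` and the table `ENCODING_RANGES` are hypothetical, and the ASCII ranges are the commonly cited conventions for each format rather than values taken from the tool.

```python
# Illustrative sketch only. The ranges below are the commonly cited ASCII
# conventions, not values taken from ValidateFastq's source code.
ENCODING_RANGES = {
    "Sanger (Phred+33)": (33, 126),
    "Solexa (Solexa+64)": (59, 126),
    "Illumina 1.3+ (Phred+64)": (64, 126),
}


def compatible_encodings(quality_strings):
    """Return the encodings whose ASCII range covers every quality character."""
    codes = [ord(ch) for quality in quality_strings for ch in quality]
    if not codes:
        return []
    lowest, highest = min(codes), max(codes)
    return [
        name
        for name, (low, high) in ENCODING_RANGES.items()
        if low <= lowest and highest <= high
    ]
```

A file whose lowest quality character is `!` (ASCII 33) can only be Sanger-style in this table, while a file that uses only high characters stays ambiguous, which is why a tool can report only the most likely encoding.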

Installation

ValidateFastq requires Java 8 to be installed on your device. Download Java 8 here or install via your distribution's package manager.

Download the latest version of ValidateFastq here. To print the usage information, run:

java -jar <ValidateFastq_jar> --help

Manual

ValidateFastq validates the following things:

  • If paired: whether both FASTQ files have the same number of reads
  • If paired: whether sequence headers match
  • Whether the quality string has the same length as the sequence in each read
  • Whether the sequence consists only of the bases A, C, G, T and N, in upper or lower case. Regex: ([actgnACTGN+]+)
  • Whether the quality encoding is within a valid ASCII range
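The per-read checks in this list can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the tool's code: `check_read` is a hypothetical helper, the regex is copied from the list above, and the "valid ASCII range" is assumed here to be the printable range 33 to 126.

```python
import re

# Regex quoted in the list above. Note that the character class as written
# also admits a literal '+'.
SEQUENCE_PATTERN = re.compile(r"[actgnACTGN+]+")

# Assumption: printable ASCII from '!' (33) to '~' (126) counts as valid.
MIN_QUAL, MAX_QUAL = 33, 126


def check_read(sequence, quality):
    """Return a list of problems found in one read (empty if none)."""
    problems = []
    # The whole sequence must match the pattern, not just a prefix.
    if not SEQUENCE_PATTERN.fullmatch(sequence):
        problems.append("sequence contains unexpected characters")
    if len(sequence) != len(quality):
        problems.append("sequence and quality lengths differ")
    if any(not MIN_QUAL <= ord(ch) <= MAX_QUAL for ch in quality):
        problems.append("quality character outside ASCII 33-126")
    return problems
```

For example, `check_read("ACGTN", "IIIII")` returns an empty list, while a read whose quality string is shorter than its sequence is reported as a length mismatch.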

Example

To validate a single FASTQ file, use:

java -jar <ValidateFastq_jar> \
-i input.fastq


To validate a pair of FASTQ files, use:

java -jar <ValidateFastq_jar> \
-i input.fastq \
-j input2.fastq
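Conceptually, the paired checks compare the two files read by read. The Python sketch below shows one way this could look. It is only an illustration: `read_id` and `check_pair` are hypothetical names, and the matching rule (compare the first whitespace-delimited field after `@`, ignoring a trailing `/1` or `/2`) is an assumption, since how ValidateFastq compares headers is not described here.

```python
from itertools import zip_longest


def read_id(header):
    """Reduce a header line to a read name (assumed matching convention)."""
    fields = header[1:].split()  # drop the leading '@' and any description
    name = fields[0] if fields else ""
    if name.endswith(("/1", "/2")):
        name = name[:-2]  # drop the mate suffix
    return name


def check_pair(headers1, headers2):
    """Compare header lines of two FASTQ files and return the problems found."""
    problems = []
    pairs = zip_longest(headers1, headers2)
    for number, (first, second) in enumerate(pairs, start=1):
        if first is None or second is None:
            # One file ran out of reads before the other.
            problems.append("files contain a different number of reads")
            break
        if read_id(first) != read_id(second):
            problems.append(f"read {number}: headers do not match")
    return problems
```

Passing iterables of header lines rather than file paths keeps the sketch independent of how the files are opened or decompressed.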

Usage

Usage for ValidateFastq:

Option            Required   Can occur multiple times   Description
--log_level, -l   no         no                         Level of log information printed. Possible levels: 'debug', 'info', 'warn', 'error'
--help, -h        no         no                         Print usage
--version, -v     no         no                         Print version
--fastq1, -i      yes        no                         FASTQ file to be validated. (Required)
--fastq2, -j      no         no                         Second FASTQ to be validated if FASTQs are paired. (Optional)
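When ValidateFastq is called from a script, the options above can be assembled programmatically. The following Python sketch uses only the flags listed in the usage table. The helper name `build_command` and the jar path in the usage note are hypothetical.

```python
def build_command(jar, fastq1, fastq2=None, log_level=None):
    """Build the argument list for a ValidateFastq run.

    Only --fastq1 is required. --fastq2 and --log_level are added when given.
    """
    command = ["java", "-jar", jar, "--fastq1", fastq1]
    if fastq2 is not None:
        command += ["--fastq2", fastq2]
    if log_level is not None:
        command += ["--log_level", log_level]
    return command
```

The result can be passed to `subprocess.run`, for example `subprocess.run(build_command("ValidateFastq.jar", "input.fastq", "input2.fastq"))`. How the tool signals a failed validation (log message, exit status, or both) is not stated in this document, so check its output before relying on the return code.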

About

ValidateFastq is part of the BIOPET tool suite, which is developed at LUMC by the SASC team. Each tool in the BIOPET tool suite is meant to offer a standalone function that can be used to perform a dedicated data analysis task or be added as part of BIOPET pipelines.

All tools in the BIOPET tool suite are Free/Libre and Open Source Software.

Contributing

The source code of ValidateFastq can be found here. We welcome any contributions. Bug reports, feature requests and feedback can be submitted at our issue tracker.

ValidateFastq is built using sbt. Before submitting a pull request, make sure all tests pass by running sbt test from the project's root. We recommend using an IDE to work on ValidateFastq. We have had good results with this IDE.

Contact

For any questions related to ValidateFastq, please use the GitHub issue tracker or contact the SASC team directly at: sasc@lumc.nl.