This tool enables a user to create a full sample sheet in JSON format or YAML format, suitable for all Biopet Queue pipelines, from TSV file(s).
SamplesTsvToConfig requires Java 8 to be installed on your device. Download Java 8 here or install via your distribution's package manager.
Download the latest version of SamplesTsvToConfig here. To generate the usage run:
java -jar <SamplesTsvToConfig_jar> --help
A user provides a tab-separated (TSV) file with sample-specific properties, which the tool parses into JSON or YAML. For example, to add the treatment a sample received to its description, provide a TSV file with an extra column called treatment. The resulting file will then contain the 'treatment' property as well. The order of the columns does not affect the end result.
Tag files work the same way, except that their values are placed under the key 'tags'.
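As a rough illustration of the mapping described above, the Python sketch below turns a single TSV row into a nested entry. It is not the tool's implementation: the function name `row_to_entry` is hypothetical, and both the skipping of empty cells and the exact position of the 'tags' key (under the library when a 'library' column is present, otherwise under the sample) are assumptions made for this example.

```python
def row_to_entry(header, values, tags=False):
    """Turn one TSV row into a nested samples/libraries mapping (illustrative only)."""
    row = dict(zip(header, values))
    if "sample" not in row:
        raise ValueError("a 'sample' column is required")
    sample = row.pop("sample")
    library = row.pop("library", None)
    # Assumption: empty cells are skipped rather than stored as empty properties.
    props = {key: value for key, value in row.items() if value}
    if tags:
        # Assumption: tag-file values are grouped under a 'tags' sub key.
        props = {"tags": props}
    if library:
        return {"samples": {sample: {"libraries": {library: props}}}}
    return {"samples": {sample: props}}


# Column order does not matter: 'sample' may appear anywhere in the header.
print(row_to_entry(["sample", "treatment"], ["Sample_ID_1", "heatshock"]))
print(row_to_entry(["treatment", "sample"], ["heatshock", "Sample_ID_1"], tags=True))
```

Because each row is looked up by column name rather than by position, reordering the columns yields the same entry.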
To produce the example output below, provide two TSV files as follows:
sample | library | bam |
---|---|---|
Sample_ID_1 | Lib_ID_1 | MyFirst.bam |
Sample_ID_2 | Lib_ID_2 | MySecond.bam |
The second TSV file can contain as many properties as you like. Possible options include gender, age and family. Basically, anything you want to pass to your pipeline is possible.
sample | treatment |
---|---|
Sample_ID_1 | heatshock |
Sample_ID_2 | heatshock |
samples:
  Sample_ID_1:
    treatment: heatshock
    libraries:
      Lib_ID_1:
        bam: MyFirst.bam
  Sample_ID_2:
    treatment: heatshock
    libraries:
      Lib_ID_2:
        bam: MySecond.bam
{
  "samples" : {
    "Sample_ID_1" : {
      "treatment" : "heatshock",
      "libraries" : {
        "Lib_ID_1" : {
          "bam" : "MyFirst.bam"
        }
      }
    },
    "Sample_ID_2" : {
      "treatment" : "heatshock",
      "libraries" : {
        "Lib_ID_2" : {
          "bam" : "MySecond.bam"
        }
      }
    }
  }
}
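To make the relationship between the two TSV files and the combined output more concrete, here is a minimal Python sketch that parses both example files and deep-merges them per sample. It is an illustration under stated assumptions, not the tool's own code: the helper names `deep_merge` and `tsv_to_config` are hypothetical, and the key order of the printed JSON may differ from the example above.

```python
import csv
import io
import json


def deep_merge(base, extra):
    """Recursively merge `extra` into a copy of `base`."""
    merged = dict(base)
    for key, value in extra.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def tsv_to_config(text):
    """Parse one TSV document (header on the first line) into a nested mapping."""
    config = {}
    for row in csv.DictReader(io.StringIO(text), delimiter="\t"):
        sample = row.pop("sample")
        library = row.pop("library", None)
        # Properties go under the library when one is given, else under the sample.
        entry = {sample: {"libraries": {library: row}} if library else row}
        config = deep_merge(config, {"samples": entry})
    return config


# The two example TSV files from this section, written with real tab characters.
first = (
    "sample\tlibrary\tbam\n"
    "Sample_ID_1\tLib_ID_1\tMyFirst.bam\n"
    "Sample_ID_2\tLib_ID_2\tMySecond.bam\n"
)
second = (
    "sample\ttreatment\n"
    "Sample_ID_1\theatshock\n"
    "Sample_ID_2\theatshock\n"
)

config = deep_merge(tsv_to_config(first), tsv_to_config(second))
print(json.dumps(config, indent=2))
```

The deep merge is what lets the sample-level 'treatment' property from the second file sit next to the 'libraries' entry from the first file instead of replacing it.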
Usage for SamplesTsvToConfig:
Option | Required | Can occur multiple times | Description |
---|---|---|---|
--log_level, -l | no | no | Level of log information printed. Possible levels: 'debug', 'info', 'warn', 'error' |
--help, -h | no | no | Print usage |
--version, -v | no | no | Print version |
--inputFiles, -i | no | yes (unlimited) | Input must be a TSV file. The first line is treated as the header and must contain at least a 'sample' column; a 'library' column is optional. Multiple files can be specified by repeating the flag. |
--tagFiles, -t | no | yes (unlimited) | Works the same as a normal input file, except that the values are placed under a sub key 'tags' in the config file. |
--outputFile, -o | no | no | When the extension is .yml or .yaml the output is in yaml format, otherwise it is in json. When no file is given the output goes to stdout as yaml. |
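The output format rule for --outputFile can be summarised in a few lines of Python. This is only a sketch of the rule as stated in the table; the function name `output_format` is hypothetical, and treating the extension as case-sensitive is an assumption.

```python
from pathlib import Path


def output_format(output_file=None):
    """Return 'yaml' or 'json' following the --outputFile rule in the table."""
    if output_file is None:
        # No output file given: the output goes to stdout as YAML.
        return "yaml"
    # Assumption: the extension check is case-sensitive.
    suffix = Path(output_file).suffix
    return "yaml" if suffix in {".yml", ".yaml"} else "json"


for name in ("samples.yml", "samples.yaml", "samples.json", "samples.txt", None):
    print(name, "->", output_format(name))
```

Note that any extension other than .yml or .yaml, including none at all, falls through to JSON.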
SamplesTsvToConfig is part of the BIOPET tool suite that is developed at LUMC by the SASC team. Each tool in the BIOPET tool suite is meant to offer a standalone function that can be used to perform a dedicated data analysis task or be added as part of BIOPET pipelines.
All tools in the BIOPET tool suite are Free/Libre and Open Source Software.
The source code of SamplesTsvToConfig can be found here. We welcome any contributions. Bug reports, feature requests and feedback can be submitted at our issue tracker.
SamplesTsvToConfig is built using sbt. Before submitting a pull request, make sure all tests pass by
running sbt test
from the project's root. We recommend using an IDE to work on SamplesTsvToConfig. We have had
good results with this IDE.
For any questions related to SamplesTsvToConfig, please use the GitHub issue tracker or contact the SASC team directly at: sasc@lumc.nl.