Musings on Bioinformatics, Data Science, Python, R, and more.
by Amit Indap
I had a LinkedIn post a few weeks ago about design patterns in Nextflow described in this repo. To my surprise, it went kinda viral by my modest standards: 40 likes and 2,400 impressions.
One of the patterns in the repo is executing a process/workflow on each record in a CSV file. It uses the splitCsv operator, which parses the rows of a CSV file.
This week, though, I needed to parse a JSON file to launch my pipeline. I’m not a Nextflow / Groovy expert, but with the help of SeqeraAI, I found a solution that uses JsonSlurper to parse the JSON file.
Assuming I have a JSON file that looks like this:
{
  "organism": "human",
  "samples": [
    {
      "sample_id": "1234",
      "vcf_file": "/path/to/1234.vcf.gz",
      "vcf_index_file": "/path/to/1234.vcf.gz.tbi"
    },
    {
      "sample_id": "12345",
      "vcf_file": "/path/to/12345.vcf.gz",
      "vcf_index_file": "/path/to/12345.vcf.gz.tbi"
    }
  ]
}
It’s pretty straightforward to parse the JSON and launch my process:
import groovy.json.JsonSlurper

// Define the input parameters
params.json_file = 'file.json'

// Read and parse the JSON file
def jsonSlurper = new JsonSlurper()
def jsonData = jsonSlurper.parse(new File(params.json_file))

// Create a channel from the parsed JSON data
def samplesChannel = Channel.fromList(jsonData.samples)

// Process the samples
process processSamples {
    input:
    tuple val(sample_id), val(vcf_file), val(vcf_index_file)

    output:
    stdout

    script:
    """
    echo "Processing sample: ${sample_id}"
    echo "VCF file: ${vcf_file}"
    echo "VCF index file: ${vcf_index_file}"
    """
}

// Execute the workflow
workflow {
    samplesChannel
        .map { sample -> [sample.sample_id, sample.vcf_file, sample.vcf_index_file] }
        .set { samples }

    processSamples(samples)
        .view()
}
Running the above code will produce the following output (the order of the samples may vary between runs, since the tasks run in parallel):
Processing sample: 1234
VCF file: /path/to/1234.vcf.gz
VCF index file: /path/to/1234.vcf.gz.tbi
Processing sample: 12345
VCF file: /path/to/12345.vcf.gz
VCF index file: /path/to/12345.vcf.gz.tbi
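To see what the .map step in the workflow is actually working with, here is a minimal sketch in plain Groovy (no Nextflow needed). It parses the same JSON from an inline string with parseText instead of reading a file, then builds the same three-element lists that the workflow passes to the process. The requiredKeys check is a hypothetical addition for illustration and is not part of the pipeline above.

```groovy
import groovy.json.JsonSlurper

// Same structure as the JSON file above, inlined so the sketch is self-contained
def jsonText = '''
{
  "organism": "human",
  "samples": [
    {
      "sample_id": "1234",
      "vcf_file": "/path/to/1234.vcf.gz",
      "vcf_index_file": "/path/to/1234.vcf.gz.tbi"
    },
    {
      "sample_id": "12345",
      "vcf_file": "/path/to/12345.vcf.gz",
      "vcf_index_file": "/path/to/12345.vcf.gz.tbi"
    }
  ]
}
'''

// JsonSlurper turns JSON objects into Maps and JSON arrays into Lists
def jsonData = new JsonSlurper().parseText(jsonText)

// Hypothetical sanity check (not in the pipeline above):
// fail early if a sample record lacks one of the expected keys
def requiredKeys = ['sample_id', 'vcf_file', 'vcf_index_file']
jsonData.samples.each { sample ->
    def missing = requiredKeys.findAll { !sample.containsKey(it) }
    assert missing.isEmpty() : "Sample record is missing keys: ${missing}"
}

// Same transformation as the .map closure in the workflow:
// one [sample_id, vcf_file, vcf_index_file] list per sample
def tuples = jsonData.samples.collect { sample ->
    [sample.sample_id, sample.vcf_file, sample.vcf_index_file]
}

tuples.each { println it }
```

Since each sample is just a Map, anything else in the JSON (like the top-level organism field, available as jsonData.organism) can be pulled out the same way and added to the list if the process needs it.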
And with that, my Nextflow journey continues …
tags: SeqeraAI - Nextflow - Groovy