This repository contains a Nextflow pipeline that wraps the Bio2Byte tool ShiftCrypt, which is used for the prediction of NMR chemical shifts using a variety of atom sets. The pipeline runs ShiftCrypt on every NMR data file in an input directory, in either NEF or NMR-STAR format.
The pipeline is designed to run the ShiftCrypt prediction tool with the following key parameters:
- Input Directory: The pipeline processes files from the specified input directory, supporting both .nef and .nmr file formats.
- Model: The user can select one of three atom set models for prediction:
  - Model 1: Full atom set.
  - Model 2: H, HA, CA, N, CB, C atoms.
  - Model 3: CA, N, H atoms.
- Original Numbering: Whether to use the original sequence numbering in the prediction (default is true).
- File Type: Specify whether the input is in NMR-STAR format (default is false for NEF files).
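The parameter constraints listed above can be checked before a run. The following is a minimal sketch, not part of the pipeline: the helper names `valid_model`, `valid_bool`, and `valid_input_name` are hypothetical, and the checks only restate the accepted values described in this README.

```shell
# Hypothetical pre-flight checks; these helpers are not shipped with the pipeline.

# Succeeds only for the three documented model identifiers.
valid_model() {
  case "$1" in
    1|2|3) return 0 ;;
    *)     return 1 ;;
  esac
}

# Succeeds only for the literal strings "true" or "false",
# as used by --isModelStar and --originalNumbering.
valid_bool() {
  case "$1" in
    true|false) return 0 ;;
    *)          return 1 ;;
  esac
}

# Succeeds when the file name ends in .nef or .nmr.
valid_input_name() {
  case "$1" in
    *.nef|*.nmr) return 0 ;;
    *)           return 1 ;;
  esac
}
```

A wrapper script could call, for example, `valid_model "$MODEL" || echo "model must be 1, 2, or 3"` before invoking Nextflow.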
The core of the workflow is the PREDICT_SHIFTCRYPT module, which handles the execution of ShiftCrypt for each input file. The input files are automatically identified and processed, with results being output to a specified directory.
params.inputDir = params.inputDir ?: "./data"
params.outputDir = params.outputDir ?: "./results"
params.model = params.model ?: "1"
params.isModelStar = params.isModelStar ?: false
params.originalNumbering = params.originalNumbering ?: true
include { PREDICT_SHIFTCRYPT } from "$projectDir/modules/bio2byte"
input_files = Channel.fromPath("${params.inputDir}/*.{nef,nmr}")
    .map { file -> tuple(file.baseName, file) }

workflow {
    main:
    PREDICT_SHIFTCRYPT(
        params.model,
        params.isModelStar,
        params.originalNumbering,
        input_files
    )
}
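To preview which files the `*.{nef,nmr}` glob would pick up, the file discovery can be approximated in plain shell. This is only a sketch of the same matching rule: `list_inputs` is a hypothetical helper, and it prints the base name and path pairs that correspond to the `(file.baseName, file)` tuples built by the channel.

```shell
# Print "<baseName><TAB><path>" for every .nef or .nmr file directly inside
# a directory, mirroring the (baseName, file) tuples built by the channel.
list_inputs() {
  dir="$1"
  for path in "$dir"/*.nef "$dir"/*.nmr; do
    # An unmatched glob stays literal in POSIX sh, so skip anything
    # that is not a regular file.
    [ -f "$path" ] || continue
    name=$(basename "$path")
    # Strip the last extension only, as baseName does.
    printf '%s\t%s\n' "${name%.*}" "$path"
  done
}
```

Running `list_inputs ./data` before the pipeline gives a quick view of what will be processed; files with other extensions are ignored, as they are by the glob.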
- Nextflow: The pipeline is written in Nextflow DSL2 and requires Nextflow to be installed.
- Conda: Conda environments are required to run the pipeline.
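A quick way to confirm that both requirements are available is to look for them on the `PATH`. The sketch below assumes the executables are named `nextflow` and `conda`; `have_cmd` is a hypothetical helper, and the check says nothing about installed versions.

```shell
# Report whether a command is available on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Check the two tools this README lists as requirements.
for tool in nextflow conda; do
  if have_cmd "$tool"; then
    echo "found:   $tool"
  else
    echo "missing: $tool"
  fi
done
```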
To run the ShiftCrypt pipeline from this repository, follow these steps:
- Prepare your input data files. Place your .nef or .nmr files in the input directory, or specify a custom input directory using the `--inputDir` parameter.
- Run the pipeline with Nextflow:
nextflow run nf-b2btools-shiftcrypt \
--inputDir ./data/nef_examples \
--outputDir ./results \
--model 1 \
--isModelStar false \
--originalNumbering true
# `--inputDir`: Directory containing input files (.nef or .nmr).
# `--outputDir`: Directory to store the results.
# `--model`: Choose from model 1, 2, or 3 based on atom set.
# `--isModelStar`: Set to true if using NMR-STAR files, otherwise leave as false for NEF files.
# `--originalNumbering`: Use true to retain original sequence numbering.
- Monitor the pipeline execution through the console output or view logs in the working directory.
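To compare the three atom sets on the same data, the run command can be repeated once per model. The sketch below is a dry run that only prints the commands. The flags are the ones documented in this README, while `print_model_runs` and the per-model output layout (`model_1`, `model_2`, `model_3`) are assumptions made for illustration.

```shell
# Print one pipeline command per model instead of executing it (dry run).
# The per-model output directory naming is an assumption, not a pipeline feature.
print_model_runs() {
  input_dir="$1"
  output_root="$2"
  for model in 1 2 3; do
    echo "nextflow run nf-b2btools-shiftcrypt --inputDir $input_dir --outputDir $output_root/model_$model --model $model --isModelStar false --originalNumbering true"
  done
}

print_model_runs ./data/nef_examples ./results
```

Writing each model to its own directory keeps results from one run from being overwritten by the next; once the printed commands look right, they can be executed by hand.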
The pipeline will output the ShiftCrypt prediction results into the directory specified by the `--outputDir` parameter. The results for each input file will be stored in JSON format.
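After a run, a simple sanity check is to compare the number of JSON files in the output directory with the number of input files. The sketch below assumes the JSON files sit directly inside the output directory, which this README does not state. `count_json` is a hypothetical helper.

```shell
# Print the number of .json files directly inside a results directory.
# Assumes a flat layout; nested result folders would not be counted.
count_json() {
  dir="$1"
  n=0
  for f in "$dir"/*.json; do
    if [ -f "$f" ]; then
      n=$((n + 1))
    fi
  done
  echo "$n"
}
```

For example, `count_json ./results` prints the count for the default output directory.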
TBD