This repository has been archived by the owner on Feb 7, 2023. It is now read-only.

syntax error: Input and output files have to be specified as strings or lists of strings. #33

Open
peterthorpe5 opened this issue Apr 10, 2022 · 3 comments

@peterthorpe5

Dear Nanoporetech,

I am having an issue running this pipeline. I have altered the config.yaml (pasted below). I read in another issue that full paths are required, so I added them, but I have removed identifying names here. (Also, can the pipeline take compressed FASTQ files?) Line 15 is the transcriptome: "/PATH/TO/analysis/GRCh38.primary_assembly.genome.fa" line. I can't see what is wrong with it. Sorry! Can you please help?
Pete

I get the following error:

pipeline-transcriptome-de]$ snakemake --use-conda -j 24 all
SyntaxError:
Input and output files have to be specified as strings or lists of strings.
File "/PATH/analysis/pipeline-transcriptome-de/Snakefile", line 15, in
File "/PATH/analysis/pipeline-transcriptome-de/snakelib/utils.snake", line 15, in

General pipeline parameters:

Name of the pipeline:

pipeline: "pipeline-transcriptome-de_phe"

ABSOLUTE path to directory holding the working directory:

workdir_top: "/PATH/TO/analysis/"

Results directory:

resdir: "results"

Repository URL:

repo: "https://github.com/nanoporetech/pipeline-transcriptome-de"

Pipeline-specific parameters:

Transcriptome fasta

transcriptome: "/PATH/TO/analysis/GRCh38.primary_assembly.genome.fa"

Annotation GFF/GTF

annotation: "/PATH/TO/analysis/gencode.v39.annotation.gff3"

Control samples

control_samples:
    C1: "/PATH/TO/analysis/R1_.fastq.gz"
    C2: "/PATH/TO/analysis/R2_.fastq.gz"
    C3: "/PATH/TO/analysis/R3_.fastq.gz"

Treated samples

treated_samples:
    IR1: "/PATH/TO/analysis/R4_.fastq.gz"
    IR2: "/PATH/TO/analysis/R5_.fastq.gz"
    IR3: "/PATH/TO/analysis/R6_.fastq.gz"

Minimap2 indexing options

minimap_index_opts: ""

Minimap2 mapping options

minimap2_opts: ""

Maximum secondary alignments

maximum_secondary: 100

Secondary score ratio (-p for minimap2)

secondary_score_ratio: 1.0

Salmon library type

salmon_libtype: "U"

Count filtering options - customize these according to your experimental design:

Genes expressed in minimum this many samples

min_samps_gene_expr: 3

Transcripts expressed in minimum this many samples

min_samps_feature_expr: 1

Minimum gene counts

min_gene_expr: 10

Minimum transcript counts

min_feature_expr: 3

Threads

threads: 24
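Since the error message says inputs must be strings or lists of strings, one quick sanity check is to confirm that every path-like entry in the config really is a plain string. The sketch below assumes the config has already been parsed into a Python dict (for example by a YAML parser); the function name `check_path_entries` is hypothetical and not part of the pipeline. It would not diagnose a problem inside snakelib/utils.snake itself, but it does catch one easy mistake: if the sample entries are not nested under their group key, a YAML parser reads the group as empty.

```python
def check_path_entries(config):
    """Return a list of problems with path-like entries in a parsed config dict.

    Hypothetical helper for illustration; the key names are taken from the
    config.yaml pasted above.
    """
    problems = []
    # Top-level keys that should each hold a single path string.
    for key in ("transcriptome", "annotation"):
        value = config.get(key)
        if not isinstance(value, str) or not value:
            problems.append(f"{key}: expected a non-empty string, got {value!r}")
    # Sample groups should map sample names to path strings.
    for group in ("control_samples", "treated_samples"):
        samples = config.get(group)
        if not isinstance(samples, dict) or not samples:
            problems.append(f"{group}: expected a mapping of names to paths, got {samples!r}")
            continue
        for name, path in samples.items():
            if not isinstance(path, str) or not path:
                problems.append(f"{group}.{name}: expected a non-empty string, got {path!r}")
    return problems


# Shortened example using the placeholder paths from the config above.
example = {
    "transcriptome": "/PATH/TO/analysis/GRCh38.primary_assembly.genome.fa",
    "annotation": "/PATH/TO/analysis/gencode.v39.annotation.gff3",
    "control_samples": {"C1": "/PATH/TO/analysis/R1_.fastq.gz"},
    "treated_samples": {"IR1": "/PATH/TO/analysis/R4_.fastq.gz"},
}
print(check_path_entries(example))
```

An empty list means every checked entry is a non-empty string; it says nothing about whether the files exist or whether the pipeline accepts their format.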

@peterthorpe5
Author

Is it possible for someone to give me some guidance here?

@EnJun-Yang

Hi,

I've encountered the same issue when running the transcriptome-de pipeline, though I should note that I'm using the paired branch of the pipeline (https://github.com/nanoporetech/pipeline-transcriptome-de/tree/paired_dge_dtu).

In my case, it appears to be a rule implemented in the snakelib/utils.snake file that is preventing the pipeline from recognising the input files.

I've tried commenting out that line in the Snakefile with a hash (line 15, include: "snakelib/utils.snake"), and so far the pipeline appears to be running (it is now on the mapping step). Happy to discuss more, and I would love to hear feedback from the ONT side on what might need to be updated.

EnJun
PS: Also replied with something similar on the ONT community forums
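For anyone who wants to try the workaround described above without editing the file by hand, here is a minimal sketch that comments out an include line in Snakefile text. The function name `comment_out_include` and the sample Snakefile contents are hypothetical; only the include: "snakelib/utils.snake" line comes from this thread. Note that disabling the include also disables whatever rules that file defines, so treat this as a workaround rather than a fix.

```python
def comment_out_include(snakefile_text, included_path):
    """Prefix any `include:` line that names included_path with '# '.

    Hypothetical helper for illustration; lines that are already
    commented out are left untouched.
    """
    out = []
    for line in snakefile_text.splitlines():
        stripped = line.lstrip()
        if stripped.startswith("include:") and included_path in stripped:
            # Preserve the original indentation in front of the comment marker.
            indent = line[: len(line) - len(stripped)]
            line = f"{indent}# {stripped}"
        out.append(line)
    return "\n".join(out)


# Made-up two-line Snakefile; only the include line is taken from the thread.
sample = 'configfile: "config.yaml"\ninclude: "snakelib/utils.snake"\n'
print(comment_out_include(sample, "snakelib/utils.snake"))
```

Working on a copy of the Snakefile and keeping the original makes it easy to restore the include later.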

@peterthorpe5
Author

@EnJun-Yang thank you for your reply :)
