-
Notifications
You must be signed in to change notification settings - Fork 73
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposal: DSL2+ #312
Proposal: DSL2+ #312
Conversation
Signed-off-by: Ben Sherman <[email protected]>
Signed-off-by: Ben Sherman <[email protected]>
Signed-off-by: Ben Sherman <[email protected]>
Signed-off-by: Ben Sherman <[email protected]>
Signed-off-by: Ben Sherman <[email protected]>
Signed-off-by: Ben Sherman <[email protected]>
Signed-off-by: Ben Sherman <[email protected]>
Signed-off-by: Ben Sherman <[email protected]>
Signed-off-by: Ben Sherman <[email protected]>
Signed-off-by: Ben Sherman <[email protected]>
Signed-off-by: Ben Sherman <[email protected]>
Signed-off-by: Ben Sherman <[email protected]>
Signed-off-by: Ben Sherman <[email protected]>
Signed-off-by: Ben Sherman <[email protected]>
Signed-off-by: Ben Sherman <[email protected]>
Signed-off-by: Ben Sherman <[email protected]>
Closing in favor of #309
Based on discussions with the community, I have concluded that the other proposed changes (use params only in entry workflow, remove ext config, workflow outputs) will be much easier to do with static types, so I'm not going to push for it very much in the meantime. IMO it makes more sense to wait for static types and refactor your pipeline once, rather than refactor now with suboptimal syntax and then refactor again in a year, which wouldn't bring much benefit, especially now with the language server. I do encourage fetchngs to go ahead and update where appropriate, but we can pursue these changes in smaller pieces, in particular:
|
Spun off from #309 to showcase just the DSL2 parts:
Only use params in top-level workflow: can be done today. Pass params into processes and workflows as explicit inputs. Might have missed a few, but you get the idea
Replace ext config with params and process inputs: can be done today, see Refactor ext config as params #308. Formalizes ext args as process inputs. If an args input needs to be configurable, it can be a param.
Replace publishDir with workflow output definition: available in 24.04 as a preview feature (Workflow output definition nextflow-io/nextflow#4784). Moves publish definition to workflow level by publishing channels. Combined with the ext refactor, removes the need for most module config and process selectors.
Make config comply with strict config parser: coming sometime this year as an opt-in feature (Config parser (and loader) nextflow-io/nextflow#4744). Restricts the config syntax to assignment / block / include with the ability for values to be Groovy expressions. Mostly improves the error reporting at runtime, not much syntax changes are needed. Replace
check_max()
with theresourceLimits
directive, coming in 24.04 (Add resourceLimits directive nextflow-io/nextflow#2911).Use params schema as source of truth: Only define params in schema instead of config file. Convert schema to YAML for better readability. Config profiles can still override param default value. New config parser (above) will fix issue with params resolution in config (Allow custom configs
params
to be parsed before nextflow.config nextflow-io/nextflow#2662). Incorporate params validation from nf-validation into core Nextflow.Use eval output, topic channels to collect tool versions: can be done today, see 'versions' directive in process nextflow-io/nextflow#4386. Simplify the collection of tool versions, removes lots of boilerplate from processes and workflows.
Add workflow output schema: coming sometime this year in the final version of the workflow output definition. The output schema is essentially a collection of schemas for index files (like a samplesheet).
This schema can be used to launch a chain of pipelines in Seqera Platform -- when filling out the launch form for a downstream pipeline, you should be able to select "expected" outputs from the upstream pipeline as inputs, e.g. mapping an output samplesheet to an input samplesheet. The schema is used to verify whether an output and input can be connected.
For example, the schema for the fetchngs output samplesheet should somehow "match" the schema for the rnaseq input samplesheet, so that you can select it, then Seqera Platform should launch the rnaseq run immediately after the fetchngs run completes.
Make pipeline import-able: See https://github.com/bentsherman/fetchngs2rnaseq for a more complete example of this and some notes. The main thing missing from this PR for importability is to make sure that the
SRA
workflow can be used independently of any params or publishing. It should be possible to pass a channel of samples + metadata directly into a downstream workflow, rather than saving to and re-loading from a samplesheet.