`NFCORE_RNASEQ:RNASEQ:CUSTOM_DUMPSOFTWAREVERSIONS (1)` terminated with an error exit status (1) #1103

Closed
kvn95ss opened this issue Oct 25, 2023 · 15 comments · Fixed by nf-core/modules#4556
Labels
bug Something isn't working

Comments

@kvn95ss

kvn95ss commented Oct 25, 2023

Description of the bug

I am running the pipeline on the Rackham server at UPPMAX, so I am using -profile uppmax in the command. I am resuming a run in which around 200 samples have already been processed. At the CUSTOM_DUMPSOFTWAREVERSIONS step, the run always terminates with the error below. Oddly, I have had a couple of earlier runs with the same output and work directories, and none of them failed.

Command used and terminal output

INFO:    Environment variable SINGULARITYENV_TMPDIR is set, but APPTAINERENV_TMPDIR is preferred
INFO:    Environment variable SINGULARITYENV_NXF_TASK_WORKDIR is set, but APPTAINERENV_NXF_TASK_WORKDIR is preferred
INFO:    Environment variable SINGULARITYENV_NXF_DEBUG is set, but APPTAINERENV_NXF_DEBUG is preferred
INFO:    Environment variable SINGULARITYENV_SNIC_TMP is set, but APPTAINERENV_SNIC_TMP is preferred
Traceback (most recent call last):
  File "/work/.command.sh", line 101, in <module>
    main()
  File "/work/.command.sh", line 61, in main
    versions_by_process = yaml.load(f, Loader=yaml.BaseLoader) | versions_this_module
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/yaml/__init__.py", line 81, in load
    return loader.get_single_data()
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/yaml/constructor.py", line 49, in get_single_data
    node = self.get_single_node()
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/yaml/composer.py", line 36, in get_single_node
    document = self.compose_document()
               ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/yaml/composer.py", line 55, in compose_document
    node = self.compose_node(None, None)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/yaml/composer.py", line 84, in compose_node
    node = self.compose_mapping_node(anchor)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/yaml/composer.py", line 127, in compose_mapping_node
    while not self.check_event(MappingEndEvent):
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/yaml/parser.py", line 98, in check_event
    self.current_event = self.state()
                         ^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/yaml/parser.py", line 438, in parse_block_mapping_key
    raise ParserError("while parsing a block mapping", self.marks[-1],
yaml.parser.ParserError: while parsing a block mapping
  in "collated_versions.yml", line 1, column 1
expected <block end>, but found '<scalar>'
  in "collated_versions.yml", line 1329, column 2

Relevant files

No response

System information

Running on Rackham server in UPPMAX.

@kvn95ss kvn95ss added the bug Something isn't working label Oct 25, 2023
@martinfthomsen

I get a similar error:
"expected <block end>, but found '['"

ERROR ~ Error executing process > 'NFCORE_RNASEQ:RNASEQ:CUSTOM_DUMPSOFTWAREVERSIONS (1)'

Caused by:
  Process `NFCORE_RNASEQ:RNASEQ:CUSTOM_DUMPSOFTWAREVERSIONS (1)` terminated with an error exit status (1)

Command executed [/<REDACTED>/./workflows/../modules/nf-core/custom/dumpsoftwareversions/templates/dumpsoftwareversions.py]:

  #!/usr/bin/env python
  
  
  """Provide functions to merge multiple versions.yml files."""
  
  
  import yaml
  import platform
  from textwrap import dedent
  
  
  def _make_versions_html(versions):
      """Generate a tabular HTML output of all versions for MultiQC."""
      html = [
          dedent(
              """\
              <style>
              #nf-core-versions tbody:nth-child(even) {
                  background-color: #f2f2f2;
              }
              </style>
              <table class="table" style="width:100%" id="nf-core-versions">
                  <thead>
                      <tr>
                          <th> Process Name </th>
                          <th> Software </th>
                          <th> Version  </th>
                      </tr>
                  </thead>
              """
          )
      ]
      for process, tmp_versions in sorted(versions.items()):
          html.append("<tbody>")
          for i, (tool, version) in enumerate(sorted(tmp_versions.items())):
              html.append(
                  dedent(
                      f"""\
                      <tr>
                          <td><samp>{process if (i == 0) else ''}</samp></td>
                          <td><samp>{tool}</samp></td>
                          <td><samp>{version}</samp></td>
                      </tr>
                      """
                  )
              )
          html.append("</tbody>")
      html.append("</table>")
      return "\n".join(html)
  
  
  def main():
      """Load all version files and generate merged output."""
      versions_this_module = {}
      versions_this_module["NFCORE_RNASEQ:RNASEQ:CUSTOM_DUMPSOFTWAREVERSIONS"] = {
          "python": platform.python_version(),
          "yaml": yaml.__version__,
      }
  
      with open("collated_versions.yml") as f:
          versions_by_process = yaml.load(f, Loader=yaml.BaseLoader) | versions_this_module
  
      # aggregate versions by the module name (derived from fully-qualified process name)
      versions_by_module = {}
      for process, process_versions in versions_by_process.items():
          module = process.split(":")[-1]
          try:
              if versions_by_module[module] != process_versions:
                  raise AssertionError(
                      "We assume that software versions are the same between all modules. "
                      "If you see this error-message it means you discovered an edge-case "
                      "and should open an issue in nf-core/tools. "
                  )
          except KeyError:
              versions_by_module[module] = process_versions
  
      versions_by_module["Workflow"] = {
          "Nextflow": "23.04.3",
          "nf-core/rnaseq": "3.13.2",
      }
  
      versions_mqc = {
          "id": "software_versions",
          "section_name": "nf-core/rnaseq Software Versions",
          "section_href": "https://github.com/nf-core/rnaseq",
          "plot_type": "html",
          "description": "are collected at run time from the software output.",
          "data": _make_versions_html(versions_by_module),
      }
  
      with open("software_versions.yml", "w") as f:
          yaml.dump(versions_by_module, f, default_flow_style=False)
      with open("software_versions_mqc.yml", "w") as f:
          yaml.dump(versions_mqc, f, default_flow_style=False)
  
      with open("versions.yml", "w") as f:
          yaml.dump(versions_this_module, f, default_flow_style=False)
  
  
  if __name__ == "__main__":
      main()

Command exit status:
  1

Command output:
  (empty)

Command error:
  INFO:    Converting SIF file to temporary sandbox...
  Traceback (most recent call last):
    File ".command.sh", line 101, in <module>
      main()
    File ".command.sh", line 61, in main
      versions_by_process = yaml.load(f, Loader=yaml.BaseLoader) | versions_this_module
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/usr/local/lib/python3.12/site-packages/yaml/__init__.py", line 81, in load
      return loader.get_single_data()
             ^^^^^^^^^^^^^^^^^^^^^^^^
    File "/usr/local/lib/python3.12/site-packages/yaml/constructor.py", line 49, in get_single_data
      node = self.get_single_node()
             ^^^^^^^^^^^^^^^^^^^^^^
    File "/usr/local/lib/python3.12/site-packages/yaml/composer.py", line 36, in get_single_node
      document = self.compose_document()
                 ^^^^^^^^^^^^^^^^^^^^^^^
    File "/usr/local/lib/python3.12/site-packages/yaml/composer.py", line 55, in compose_document
      node = self.compose_node(None, None)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/usr/local/lib/python3.12/site-packages/yaml/composer.py", line 84, in compose_node
      node = self.compose_mapping_node(anchor)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/usr/local/lib/python3.12/site-packages/yaml/composer.py", line 133, in compose_mapping_node
      item_value = self.compose_node(node, item_key)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/usr/local/lib/python3.12/site-packages/yaml/composer.py", line 84, in compose_node
      node = self.compose_mapping_node(anchor)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/usr/local/lib/python3.12/site-packages/yaml/composer.py", line 127, in compose_mapping_node
      while not self.check_event(MappingEndEvent):
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/usr/local/lib/python3.12/site-packages/yaml/parser.py", line 98, in check_event
      self.current_event = self.state()
                           ^^^^^^^^^^^^
    File "/usr/local/lib/python3.12/site-packages/yaml/parser.py", line 438, in parse_block_mapping_key
      raise ParserError("while parsing a block mapping", self.marks[-1],
  yaml.parser.ParserError: while parsing a block mapping
    in "collated_versions.yml", line 126, column 5
  expected <block end>, but found '['
    in "collated_versions.yml", line 126, column 21
  INFO:    Cleaning up image...

@fruce-ki

fruce-ki commented Dec 4, 2023

I also get an error in this task, with version 3.13.2 and the version before it.

I am not sure if it is the same issue as originally posted. I've traced mine back to umi-tools and matplotlib.

In the working folder for the task, the collated_versions.yml has unexpected formatting for umi-tools:

"NFCORE_RNASEQ:RNASEQ:BAM_DEDUP_STATS_SAMTOOLS_UMITOOLS_GENOME:UMITOOLS_DEDUP":
    umitools: Matplotlib created a temporary config/cache directory at /tmp/matplotlib-xzn3u8vt because the default path (/home/kimon/.config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
 1.1.4

I am not sure why this warning appears there of all places. What I do know is that the path in question definitely exists and has drwx permissions for everyone, but it is outside Nextflow's sandbox directory. Maybe the pipeline should set the environment variable suggested in the warning message to an appropriate local directory and then unset it, or anticipate the potential presence of the warning when collating software versions.

As far as I can tell, everything else in the pipeline has completed correctly.
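
The suggestion to set the variable named in the warning can be sketched in Python. This is only an illustration of the idea, not what the pipeline does: the helper name `use_writable_mpl_config_dir` is hypothetical, and the thread does not settle where such a setting would live. It relies on the fact, stated in the warning itself, that Matplotlib reads MPLCONFIGDIR.

```python
import os
import tempfile


def use_writable_mpl_config_dir() -> str:
    """Point MPLCONFIGDIR at a fresh writable directory and return its path.

    Hypothetical helper: the name and the choice of a temporary directory
    are assumptions made for this sketch.
    """
    config_dir = tempfile.mkdtemp(prefix="mplconfig-")
    os.environ["MPLCONFIGDIR"] = config_dir
    return config_dir


config_dir = use_writable_mpl_config_dir()
# Matplotlib has to be imported after this point to pick up the variable.
print(config_dir)
```

With a writable directory configured, the warning has no reason to be printed, so it cannot leak into the captured version string.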

@kvn95ss
Author

kvn95ss commented Dec 4, 2023

@fruce-ki I just checked my yml files, and I have the same error in some of them too.

Looking at the spacing in the first line, maybe it could be a parsing error?

For example, the original output might have looked like this:

UMI-tools version:     umitools: Matplotlib created a temporary config/cache directory at /tmp/matplotlib-xzn3u8vt because the default path (/home/kimon/.config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
UMI-tools version: 1.1.4

and since the sed expression used is 's/^.*UMI-tools version://; s/ *\$//', the warning line must have passed through it as well.

But that's just my guess. Adding tail -1 could ensure that only the last line is processed by sed, but that assumes no future warnings or errors will appear on the later lines.
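
The guess above can be checked with a small Python sketch. It emulates the sed expression line by line with `re.sub` and compares the result with and without a `tail -1` step. The function names and the shortened warning text are made up for illustration; the real module runs sed in the shell, not Python.

```python
import re


def sed_strip_prefix(output: str) -> str:
    """Emulate sed 's/^.*UMI-tools version://; s/ *$//', which runs once per line."""
    cleaned = []
    for line in output.splitlines():
        line = re.sub(r"^.*UMI-tools version:", "", line, count=1)
        line = re.sub(r" *$", "", line, count=1)
        cleaned.append(line)
    return "\n".join(cleaned)


def last_line(output: str) -> str:
    """Emulate `tail -1`: keep only the final line of the output."""
    lines = output.splitlines()
    return lines[-1] if lines else ""


# Hypothetical tool output: one warning line (shortened here), then the version line.
raw = (
    "Matplotlib created a temporary config/cache directory\n"
    "UMI-tools version: 1.1.4"
)

without_tail = sed_strip_prefix(raw)          # the warning line passes through unchanged
with_tail = sed_strip_prefix(last_line(raw))  # only the version line is processed

print(repr(without_tail))
print(repr(with_tail))
```

A line that lacks the prefix is left untouched by the substitution, so it ends up in the YAML value together with the version. Keeping only the last line avoids that, with the caveat already mentioned: it assumes the version is always printed last.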

@MatthiasZepper
Copy link
Member

MatthiasZepper commented Dec 4, 2023

Yes, it is a parsing error, and it is related to the dreaded home directory issue that has been popping up in large numbers in recent weeks.

Since Nextflow 23.10, the user’s home directory is no longer mounted automatically inside Singularity containers; the --no-home setting became the default. In terms of reproducibility this is certainly desirable, but it crashes several pipelines that silently misuse the home directory as a writable temporary cache. The most common offenders are Python libraries such as Matplotlib or Numba, tools like Qiime2, Pixelator or CheckM, and apparently also umi-tools.

Ultimately, the respective modules need to be fixed, but until those updates have propagated into new, validated pipeline releases, it will likely be necessary to set NXF_SINGULARITY_HOME_MOUNT=true to keep the old behaviour.

So either run the pipeline with Nextflow 23.04, or set that environment variable to mount the home directory into the Singularity containers.

@martinfthomsen

Thanks for the suggestion Matthias. Sadly, adding "NXF_SINGULARITY_HOME_MOUNT=true" did not resolve my issue.

@martinfthomsen

martinfthomsen commented Dec 6, 2023

I cannot seem to find this "collated_versions.yml". It is not in the work directory. Does anyone know where this file is located?

EDIT: nvm, I found the reference inside the .command.run file 🙂

@martinfthomsen

Update: I can see my issue stems from the following entry in the yaml file.

"NFCORE_RNASEQ:RNASEQ:FASTQ_FASTQC_UMITOOLS_TRIMGALORE:FASTQC":
    fastqc: [0.001s][warning][os,container] Duplicate cpuset controllers detected. Picking /sys/fs/cgroup/cpuset, skipping /ngc/tools/ngctools/singularity/3.9.6/var/sing
0.12.1

It seems that the tool creating the YAML file does not handle a log warning properly and adds it to the output (collated_versions.yml)...

Any suggestions how to fix/hack this issue?
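
For locating such entries, a rough Python sketch is shown below. It flags every line of a collated_versions.yml that is neither a quoted process name nor an indented `tool: version` pair with a single-token value. The function name, the two patterns and the shortened sample warning are assumptions made for this sketch; a version string that legitimately contains spaces would be reported as a false positive.

```python
import re

# A quoted process name followed by a colon, e.g. "PIPELINE:WORKFLOW:PROCESS":
PROCESS_HEADER = re.compile(r'^"[^"]+":\s*$')
# An indented `tool: version` pair whose value is a single token without spaces.
TOOL_ENTRY = re.compile(r"^\s+[\w.\-/]+:\s*\S+\s*$")


def find_suspect_lines(text: str) -> list[int]:
    """Return the 1-based numbers of lines that fit neither expected shape."""
    suspects = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # blank lines are harmless
        if PROCESS_HEADER.match(line) or TOOL_ENTRY.match(line):
            continue
        suspects.append(number)
    return suspects


# Sample modelled on the entry above, with the warning text shortened.
sample = (
    '"NFCORE_RNASEQ:RNASEQ:FASTQ_FASTQC_UMITOOLS_TRIMGALORE:FASTQC":\n'
    "    fastqc: [0.001s][warning][os,container] Duplicate cpuset controllers detected.\n"
    "0.12.1\n"
)
print(find_suspect_lines(sample))
```

Both the warning line and the orphaned version number are reported, which points straight at the module whose version command needs attention.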

@martinfthomsen

> I also get an error in this task, with version 3.13.2 and the version before it.
>
> I am not sure if it is the same issue as originally posted. I've traced mine back to umi-tools and matplotlib.
>
> In the working folder for the task, the collated_versions.yml has unexpected formatting for umi-tools:
>
> "NFCORE_RNASEQ:RNASEQ:BAM_DEDUP_STATS_SAMTOOLS_UMITOOLS_GENOME:UMITOOLS_DEDUP":
>     umitools: Matplotlib created a temporary config/cache directory at /tmp/matplotlib-xzn3u8vt because the default path (/home/kimon/.config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
>  1.1.4
>
> I am not sure why this warning appears there of all places. What I do know is that the path in question definitely exists and has drwx permissions for everyone, but it is outside Nextflow's sandbox directory. Maybe the pipeline should set the environment variable suggested in the warning message to an appropriate local directory and then unset it, or anticipate the potential presence of the warning when collating software versions.
>
> As far as I can tell, everything else in the pipeline has completed correctly.

The message after "umitools: " seems, as in my case, to be a warning or similar that is being interpreted as the version.
I think the culprit is whatever generates the collated_versions.yml file. It does not handle these warnings and other messages correctly and writes whatever it receives into the YAML file, rather than just the tool's version number...

In your case, however, you might be able to get rid of the warning message by following Matthias's advice.

@martinfthomsen

> @fruce-ki I just checked my yml files, and I have the same error in some of them too.
>
> Looking at the spacing in the first line, maybe it could be a parsing error?
>
> For example, the original output might have looked like this:
>
> UMI-tools version:     umitools: Matplotlib created a temporary config/cache directory at /tmp/matplotlib-xzn3u8vt because the default path (/home/kimon/.config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
> UMI-tools version: 1.1.4
>
> and since the sed expression used is 's/^.*UMI-tools version://; s/ *\$//', the warning line must have passed through it as well.
>
> But that's just my guess. Adding tail -1 could ensure that only the last line is processed by sed, but that assumes no future warnings or errors will appear on the later lines.

Thanks @kvn95ss,

I opened modules/nf-core/fastqc/main.nf and changed lines 40 and 52 from:

        fastqc: \$( fastqc --version | sed -e "s/FastQC v//g" )

to

        fastqc: \$( fastqc --version | tail -1 | sed -e "s/FastQC v//g" )

which solved the issue for me.

@mahesh-panchal
Member

Reopened, as the patched modules still need to be included in RNAseq.

@drpatelh drpatelh added this to the 3.13.3 milestone Jan 3, 2024
drpatelh added a commit to drpatelh/nf-core-rnaseq that referenced this issue Jan 3, 2024
drpatelh added a commit that referenced this issue Jan 3, 2024
@drpatelh
Member

drpatelh commented Jan 4, 2024

The problematic modules have now been patched in drpatelh@a59a6f3 which should fix this issue in the upcoming 3.14.0 release.

@drpatelh drpatelh closed this as completed Jan 4, 2024
@asp8200

asp8200 commented Apr 4, 2024

Hi guys! I think I got a similar issue over in Sarek v3.3.2 (singularity-run), that is, I got a versions.yml-file with the following content:

"NFCORE_SAREK:sarek:BAM_VARIANT_CALLING_SOMATIC_ALL:BAM_VARIANT_CALLING_CNVKIT:CNVKIT_BATCH":
    samtools: samtools: error while loading shared libraries: libhts.so.3: cannot open shared object file: No such file or directory
    cnvkit: 0.9.10

which ends up in collated_versions.yml and then causes CUSTOM_DUMPSOFTWAREVERSIONS to fail.

I saw your fixes for your seemingly similar issues, but changing the way the "version string" gets passed seems a bit of a "superficial" fix to me, since there is clearly something fundamentally wrong when a call to <some_tool> --version throws an error, right? At first glance, I would prefer the job to fail loudly (instead of silently as it does now). Any idea about what the underlying problem might be?

@maxulysse and I have a discussion going on this on Slack.
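
What "failing loudly" could look like is sketched below in Python. This is an assumption-laden illustration rather than anything the modules do today: the function name `get_version` and its `pattern` parameter are hypothetical. It raises both when the command exits with a non-zero status and when no version can be found in its output.

```python
import re
import subprocess
import sys


def get_version(command: list[str], pattern: str) -> str:
    """Run a version command and return the captured version, or raise."""
    # check=True raises CalledProcessError when the tool exits non-zero,
    # for example when a shared library cannot be loaded.
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    match = re.search(pattern, result.stdout + result.stderr)
    if match is None:
        raise ValueError(f"no version found in output of {command!r}")
    return match.group(1)


# Demonstration with the Python interpreter itself standing in for a tool.
print(get_version([sys.executable, "--version"], r"Python (\d+\.\d+\.\d+)"))
```

The trade-off raised in the thread still applies: a strict check like this stops the task as soon as the version command misbehaves, even if the analysis itself would have completed.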

@mahesh-panchal
Member

I suggested this: https://nfcore.slack.com/archives/C043UU89KKQ/p1707125822258109?thread_ts=1706698395.146229&cid=C043UU89KKQ

Maybe we need to start checking the version strings against a regex in the script. This would be another reason to use environment variables (I'm not 100% on this idea as it may kill an otherwise working analysis):

VERSION=\$( pigz --version 2>&1 | sed 's/pigz //g' ) && [[ \$VERSION =~ ^[0-9.]+\$ ]]

( I'm not sure how one would handle R or python scripts though ).
The custom dumpsoftware versions should handle malformed version numbers too.
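
The same regex check can be sketched on the Python side, for instance where the collected strings are handled. The function name `is_clean_version` is hypothetical, and the pattern simply mirrors the `^[0-9.]+$` test from the shell line above.

```python
import re

# Same character class as the shell test ^[0-9.]+$ (digits and dots only).
VERSION_PATTERN = re.compile(r"[0-9.]+")


def is_clean_version(value: str) -> bool:
    """Return True when the whole string consists only of digits and dots."""
    return VERSION_PATTERN.fullmatch(value) is not None


print(is_clean_version("0.9.10"))
print(is_clean_version("samtools: error while loading shared libraries"))
```

A pattern this strict also rejects legitimate versions that carry letters or suffixes, which is the risk noted above of stopping an otherwise working analysis.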

@asp8200

asp8200 commented Apr 4, 2024

> I suggested this: https://nfcore.slack.com/archives/C043UU89KKQ/p1707125822258109?thread_ts=1706698395.146229&cid=C043UU89KKQ
>
> Maybe we need to start checking the version strings against a regex in the script. This would be another reason to use environment variables (I'm not 100% on this idea as it may kill an otherwise working analysis):
>
> VERSION=\$( pigz --version 2>&1 | sed 's/pigz //g' ) && [[ \$VERSION =~ ^[0-9.]+\$ ]]
>
> ( I'm not sure how one would handle R or python scripts though ).
> The custom dumpsoftware versions should handle malformed version numbers too.

Your memory is strong, @mahesh-panchal 💪 Mine - not so much 😆 Your suggestion seems like a step in the right direction. Do we have a GitHub issue for fixing this? (I looked in the modules repo, since this seems like something that should be fixed across all modules.)

@mahesh-panchal
Member

There's no specific issue as far as I know. Only discussions as we're still up in the air whether to use environment variables or cmd.
https://nfcore.slack.com/archives/C043FMKUNLB/p1701869129258849?thread_ts=1701792739.273019&cid=C043FMKUNLB
main discussion: https://nfcore.slack.com/archives/C043UU89KKQ/p1702306529139299
PR: #1115

@samuelruizperez samuelruizperez removed this from the 3.14.0 milestone Dec 3, 2024
8 participants