Merge pull request #8 from Workflomics/execution_improvements
Fixed order of execution
vedran-kasalica authored Mar 19, 2024
2 parents f2f47a2 + 507c4ca commit 7239fb0
Showing 6 changed files with 104 additions and 32 deletions.
50 changes: 50 additions & 0 deletions CITATION.cff
@@ -0,0 +1,50 @@
cff-version: 1.2.0
title: 'Workflomics: Bioinformatics Workflow Generation and Benchmarking'
message: >-
If you use this software, please cite it using the
metadata from this file.
type: software
authors:
- given-names: Vedran
family-names: Kasalica
email: [email protected]
affiliation: Netherlands eScience Center, Netherlands
orcid: 'https://orcid.org/0000-0002-0097-1056'
- family-names: Ahmed
given-names: Nauman
email: [email protected]
affiliation: Netherlands eScience Center, Netherlands
orcid: 'https://orcid.org/0000-0003-3559-9941'
- family-names: Kok
given-names: Peter
email: [email protected]
affiliation: Netherlands eScience Center, Netherlands
orcid: 'https://orcid.org/0000-0002-6630-7326'
- family-names: Lamprecht
given-names: Anna-Lena
email: [email protected]
affiliation: University of Potsdam, Germany
orcid: 'https://orcid.org/0000-0003-1953-5606'
- family-names: Palmblad
given-names: Magnus
email: [email protected]
affiliation: Leiden University Medical Center, Netherlands
orcid: 'https://orcid.org/0000-0002-5865-8994'
identifiers:
- type: doi
value: 10.5281/zenodo.10047136
repository-code: 'https://github.com/workflomics/workflomics-frontend'
url: 'https://github.com/workflomics/workflomics-frontend'
abstract: >-
The Workflomics platform aims to address the challenge faced by life science researchers who work with increasingly large and complex datasets and struggle to create optimal workflows for their data analysis problems.
The platform facilitates a "Great Bake Off" of computational workflows in bioinformatics by integrating bioinformatics tools and metadata with technologies for automated workflow exploration and benchmarking. This enables a systematic and rigorous approach to the development of cutting-edge workflows, specifically in the field of proteomics, to increase scientific quality, robustness, reproducibility, FAIRness, and maintainability.
The platform currently focuses on the proteomics domain. We aim to extend the platform to additional domains, e.g., metabolomics, genomics.
keywords:
- bioinformatics
- workflow benchmarking
license: Apache-2.0
commit: bb02154d29104b2667006e12d37239a3d382b75d
version: 0.1.1
date-released: '2023-11-12'
51 changes: 33 additions & 18 deletions README.md
@@ -1,57 +1,72 @@
# Workflomics Benchmarker

Library used to execute workflows (in CWL) and benchmark them as part of the Workflomics ecosystem.
**Workflomics Benchmarker** is a versatile library designed for executing and benchmarking workflows encapsulated in Common Workflow Language (CWL) within the Workflomics ecosystem.

## Credits
Detailed user documentation is available on [readthedocs](https://workflomics.readthedocs.io/en/latest/workflomics-benchmarker/benchmarker.html).

## Badges

| Description | Badge |
|:------------|:------|
| **Packages and Releases** | ![Latest release](https://img.shields.io/github/release/workflomics/workflomics-benchmarker.svg) [![PyPI](https://img.shields.io/pypi/v/workflomics-benchmarker.svg)](https://pypi.python.org/pypi/workflomics-benchmarker/) |
| **License** | [![GitHub license](https://img.shields.io/github/license/workflomics/workflomics-benchmarker)](https://github.com/workflomics/workflomics-benchmarker/blob/main/LICENSE) |
<!-- | **DOI** | [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.7867899.svg)](https://doi.org/10.5281/zenodo.7867899) | -->

The `workflomics benchmarker` script was developed by [Nauman Ahmed](@nahmedraja) as part of the [containers](https://github.com/Workflomics/containers) repository, but has since been migrated to its own repository (see [PR #49](https://github.com/Workflomics/containers/pull/49)) so it can be published as a stand-alone package.

## Requirements

- Python 3.9+
- Poetry
- Docker or Singularity running
- Python 3.9 or higher
- Docker or Singularity
- Poetry (if you want to build the package from source)

## Installation

Install `workflomics-benchmarker` from PyPI using pip:

```bash
poetry install
pip install workflomics-benchmarker
```

Alternatively, you can clone the repository and install it with Poetry:

```bash
git clone https://github.com/workflomics/workflomics-benchmarker.git
cd workflomics-benchmarker
poetry install
```

## Usage

The command is used with Docker or Singularity service running. It will execute the workflow and benchmark it.
Ensure Docker or Singularity is running before executing workflows. Here are the commands for both services:

### Docker

```bash
workflomics benchmark tests/data/
```

which is equivalent to
Or directly with Python:

```bash
python src/benchmarker/workflomics.py benchmark tests/data/
```

The results will be stored in the `./tests/data` folder.

The results will be saved in the `./tests/data` directory.

### Singularity

Finally, you can run the test with Singularity. This will require you to have Singularity installed and running, and to use the `--singularity` flag.
To use Singularity, ensure it's installed and append the `--singularity` flag:

```bash
python src/benchmarker/workflomics.py benchmark tests/data/ --singularity
```

## Testing

To run the tests, you can use the following command:
Run the following command to execute tests:

```bash
poetry run pytest -s
```

The tests will execute a workflow and benchmark it (require Docker running). The results will be stored in the `./tests/data` folder.
This command runs a workflow and benchmarks it, assuming Docker is operational. Results are stored in the `./tests/data` directory.
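If you prefer to drive the benchmarker from a script or test harness rather than the shell, the documented CLI call can be wrapped with Python's `subprocess` module. This is a minimal sketch, assuming the package is installed on `PATH` and Docker is running; the directory path simply mirrors the README example.

```python
# Minimal sketch: invoke the documented CLI from Python.
# Assumes `workflomics-benchmarker` is installed and Docker is running.
import subprocess

result = subprocess.run(
    ["workflomics", "benchmark", "tests/data/"],
    capture_output=True,
    text=True,
    check=False,  # inspect the return code ourselves instead of raising
)

print("exit code:", result.returncode)
print(result.stdout)
```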
6 changes: 3 additions & 3 deletions pyproject.toml
@@ -2,9 +2,9 @@
name = "workflomics-benchmarker"
version = "0.1.0"
description = "Library used to execute workflows (in CWL) and benchmark them as part of the Workflomics ecosystem."
authors = ["Nauman Ahmed <n.ahmed@esciencecenter.nl>",
"Peter Kok <p.kok@esciencecenter.nl>",
"Vedran Kasalica <v.kaslica@esciencecenter.nl>"]
authors = ["Vedran Kasalica <v.kaslica@esciencecenter.nl>",
"Nauman Ahmed <n.ahmed@esciencecenter.nl>",
"Peter Kok <p.kok@esciencecenter.nl>"]
license = "Apache-2.0"
classifiers = [
"License :: OSI Approved :: Apache Software License",
16 changes: 7 additions & 9 deletions src/workflomics_benchmarker/cwltool_runtime_benchmark.py
@@ -34,8 +34,8 @@ class CWLToolRuntimeBenchmark(CWLToolWrapper):
"1001+": 0,
}
WARNINGS_DESIRABILITY_BINS = {
"0-1": 0,
"2-3": -0.25,
"0-0": 0,
"1-3": -0.25,
"4-5": -0.5,
"6-7": -0.75,
"8+": -1,
@@ -150,8 +150,6 @@ def run_workflow(self, workflow) -> None:
entry["memory"] = "N/A"
entry["warnings"] = "N/A"
entry["errors"] = "N/A"
print("success_steps")
print(success_steps)
for (
step
) in (
@@ -229,13 +227,13 @@ def aggregate_workflow_benchmark_value(self, benchmark_name) -> int | Literal["
The value of the benchmark.
"""
value: int = 0
for entry in self.workflow_benchmark_result["steps"]:
for index, entry in enumerate(self.workflow_benchmark_result["steps"], start=1):
match benchmark_name:
case "status":
if entry[benchmark_name] != "✗" and entry[benchmark_name] != "-":
value = "✓"
else:
return "✗"
return f"({index-1}/{len(self.workflow_benchmark_result['steps'])}) ✗"
case "time":
if entry[benchmark_name] != "N/A":
value = value + entry["time"]
@@ -272,10 +270,10 @@ def calc_desirability(self, benchmark_name, value):
case "status":
if value == "✓":
return 1
elif value == "✗":
return -1
else:
elif value == "-":
return 0
else:
return -1
case "errors":
if isinstance(value, list):
value = len(value)
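The hunk above changes how a failed workflow is reported: instead of a bare ✗, the aggregated status now records how many steps completed before the first failing or skipped step. A minimal, self-contained sketch of that aggregation logic follows; the function name and the plain list input are illustrative, not the library's API.

```python
def aggregate_status(step_statuses: list[str]) -> str:
    """Reduce per-step statuses to a single workflow status (illustrative only)."""
    for index, status in enumerate(step_statuses, start=1):
        if status in ("✗", "-"):
            # Report how many steps succeeded before the failing/skipped one.
            return f"({index - 1}/{len(step_statuses)}) ✗"
    return "✓"


print(aggregate_status(["✓", "✓", "✗", "-"]))  # (2/4) ✗
print(aggregate_status(["✓", "✓", "✓"]))       # ✓
```

Note that the `calc_desirability` change in the same file maps "✓" to 1, "-" to 0, and everything else to -1, so the new "(k/n) ✗" strings still count as failures.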
4 changes: 2 additions & 2 deletions src/workflomics_benchmarker/cwltool_wrapper.py
@@ -5,6 +5,7 @@
import sys

from workflomics_benchmarker.loggingwrapper import LoggingWrapper
from workflomics_benchmarker.utils import natural_keys
class CWLToolWrapper():
""" The class contains the common methods for the benchmarking and running CWL workflows."""

@@ -31,12 +32,11 @@ def __init__(self, args):

self.verbose = args.verbose if hasattr(args, 'verbose') else False

self.workflows = [str(file) for file in Path(args.workflows).glob('*.cwl')]
self.workflows = sorted([str(file) for file in Path(args.workflows).glob('*.cwl')], key=natural_keys)
self.version = self.check_cwltool()
self.input = self.update_input_yaml(self.input_yaml_path)



def check_cwltool(self):
"""Check if cwltool is installed and return the version"""
try:
9 changes: 9 additions & 0 deletions src/workflomics_benchmarker/utils.py
@@ -0,0 +1,9 @@

import re


def natural_keys(text):
    '''
    Split text into a list of strings and integers that can be used as a sort
    key for natural (human) ordering, e.g. "file2" sorts before "file10".
    '''
    return [int(c) if c.isdigit() else c.lower() for c in re.split(r'(\d+)', text)]
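Used as a sort key, `natural_keys` orders numbered files the way a human would, which is what the `sorted(..., key=natural_keys)` change in `cwltool_wrapper.py` relies on to fix the execution order. A small illustration (the file names below are made up):

```python
files = ["workflow_10.cwl", "workflow_2.cwl", "workflow_1.cwl"]

print(sorted(files))
# ['workflow_1.cwl', 'workflow_10.cwl', 'workflow_2.cwl']  (plain lexicographic order)

print(sorted(files, key=natural_keys))
# ['workflow_1.cwl', 'workflow_2.cwl', 'workflow_10.cwl']  (natural order)
```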
