Split linters in separate classes #17081

bernt-matthias · 2023-11-24T16:01:24Z

So far this is mainly a call for discussion. It would be nice to have this discussed before I continue with the other linters (to discuss: we could transform only stdio, a few more, or all linters within this PR).

Currently, a linter is a function prefixed by lint_ (most of them are contained in tool_util.linters, which are used by default). These linters are at the moment quite monolithic, e.g. there is a single linter for the input tag, and we have only a limited number of states (check, info, warn, error).

This has led to a lot of problems and frustration (see), since:

- It has never been defined what qualifies as an error or a warning, and contributors had their own, differing definitions.
- It has not been possible to disable single checks, only complete linters (which is not useful given their monolithic nature).

So far this change does the following:

- Create a new Linter class, which makes it easier to decide what a linter is in the loop over the modules.
  - The main method of this class is the classmethod lint.
  - We could have a fix method that may be used to fix some problems (discuss: maybe not in this PR?).
  - When I started, I added a code property (inspired by pep8/flake8 codes) which we could add to the LintMessages that are generated by the lint methods. But then I realised that skipping of linters simply works by the linter name (i.e. the class name), so maybe this is sufficient (discuss: the advantage of codes is that we definitely have short codes for the planemo CLI; alphanumeric codes might become longer, but this is closer to what we have at the moment). Alternatively, we could just name the classes with codes, e.g. S001.

The following already works: planemo lint tools/filters/cutWrapper.xml --skip StdIOAbsence

- Remove the attribute check for the regex and exit_code tags and the subtag check for stdio, since those are already covered by the xsd linter.
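To make the idea concrete, here is a minimal sketch of what a class-based linter with name-based skipping could look like. It is not the code in this PR: the `LintContext` stand-in, the dictionary used as a tool source, the `name` helper and `run_linters` are all assumptions made for illustration; only the `Linter` class, the `lint` classmethod and the `StdIOAbsence` name come from the description above.

```python
from abc import ABC, abstractmethod
from typing import Iterable, List, Type


class LintContext:
    """Hypothetical stand-in that only collects warning messages."""

    def __init__(self) -> None:
        self.messages: List[str] = []

    def warn(self, message: str, linter: str) -> None:
        self.messages.append(f"WARNING ({linter}): {message}")


class Linter(ABC):
    """Base class: one subclass per check, identified by its class name."""

    @classmethod
    @abstractmethod
    def lint(cls, tool_source: dict, lint_ctx: LintContext) -> None:
        ...

    @classmethod
    def name(cls) -> str:
        return cls.__name__


class StdIOAbsence(Linter):
    """Emits a single message, so skipping the class skips exactly one check."""

    @classmethod
    def lint(cls, tool_source: dict, lint_ctx: LintContext) -> None:
        # The tool source is simplified to a dict for this sketch.
        if not tool_source.get("stdio"):
            lint_ctx.warn("No stdio definition found.", linter=cls.name())


def run_linters(
    tool_source: dict,
    linters: Iterable[Type[Linter]],
    skip: Iterable[str] = (),
) -> LintContext:
    """Run every linter whose class name is not in the skip list."""
    skipped = set(skip)
    lint_ctx = LintContext()
    for linter in linters:
        if linter.name() in skipped:
            continue
        linter.lint(tool_source, lint_ctx)
    return lint_ctx
```

With this shape, `run_linters({}, [StdIOAbsence], skip=["StdIOAbsence"])` yields no messages, which mirrors what `--skip StdIOAbsence` does on the command line.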

How to test the changes?

(Select all options that apply)

I've included appropriate automated tests.
This is a refactoring of components with existing test coverage.
Instructions for manual testing are as follows:
1. [add testing steps and prerequisites here if you didn't write automated tests covering all your changes]

License

I agree to license these and all my past contributions to the core galaxy codebase under the MIT license.

mvdbeek · 2023-11-24T16:09:39Z

lib/galaxy/tool_util/linters/stdio.py

+                    try:
+                        re.compile(match)
+                    except Exception as e:
+                        lint_ctx.error(f"Match '{match}' is no valid regular expression: {str(e)}", node=child)


Don't we need to have a code per message, unless one linter only emits one possible message?

bernt-matthias · 2023-11-24

We can try to do this. With the current state I would only need to split StdIOAbsence, right? I hoped to get away with at most one message per linter (which would be fine for profile versions, as in StdIOAbsence).

bernt-matthias · 2023-11-24

Also, it's important to note that we currently filter by linter function/class names (here) and not by linter message codes. I would say that we do not need the codes. But we could also name the linter classes using a code.

mvdbeek · 2023-11-24

One message per linter sounds fine, but also like a lot of work.

bernt-matthias · 2023-11-24

> One message per linter sounds fine, but also like a lot of work.

The good thing is that we have good unit test coverage, which now pays off.

mvdbeek · 2023-11-24

If you augment the lint_ctx.function to instead pass codes, we can use those codes to determine what is classified as an error and what can be ignored. This seems like less work (we'd have to run all the relatively coarse linters, but I think that's OK?).
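A rough sketch of that suggestion, under the assumption that the lint context gets a reporting method which takes a code instead of a fixed level. `CodedLintContext`, `report`, `classify` and the codes themselves are hypothetical names, not existing Galaxy API; the point is only that the level (or ignoring a message) can be decided from the code after the coarse linter has run.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set


@dataclass(frozen=True)
class LintMessage:
    code: str  # hypothetical short code such as "S001"
    text: str


class CodedLintContext:
    """Hypothetical context whose reporting method takes a code, not a level."""

    def __init__(self) -> None:
        self.messages: List[LintMessage] = []

    def report(self, code: str, text: str) -> None:
        self.messages.append(LintMessage(code, text))


def classify(
    messages: Iterable[LintMessage],
    levels: Dict[str, str],
    ignore: Set[str],
    default: str = "warning",
) -> Dict[str, List[LintMessage]]:
    """Group messages by the level configured for their code, dropping ignored codes."""
    grouped: Dict[str, List[LintMessage]] = {}
    for message in messages:
        if message.code in ignore:
            continue
        grouped.setdefault(levels.get(message.code, default), []).append(message)
    return grouped
```

The linters stay coarse and always run in full; which codes count as errors and which are ignored becomes configuration passed to `classify`.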

bernt-matthias · 2023-11-24

Excellent idea, but I think we are fixed to the signature of the linter function, i.e. the first parameter is a ToolSource and the second a LintContext. The reason is that some linters live in the planemo sources. One option would be to move them here (not sure if this would add extra requirements), which might be good anyway.

But there might be another way: due to this hack we can filter the messages before adding them to the final message list (and even without this hack we could filter the list of messages after each linter application).

I'm a bit worried about how we can ensure that our codes are not duplicated.

> One message per linter sounds fine, but also like a lot of work.

The main question for me is what the "best" solution is.

> we'd have to run all relatively coarse linters, but I think that's ok ?

I also think this would be OK. Runtime should not be a concern.
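The two points in this comment can be sketched as follows: the linter functions keep their `(tool_source, lint_ctx)` signature, the messages added by each linter are filtered right after it has run, and a small helper reports codes that were assigned twice. Everything here is illustrative; the `code` keyword on `warn`, the codes, and the helper names are assumptions rather than existing API.

```python
from collections import Counter
from typing import Callable, Iterable, List, Set, Tuple

Message = Tuple[str, str]  # (code, text); the code is a hypothetical addition


class LintContext:
    """Hypothetical stand-in that records (code, text) pairs."""

    def __init__(self) -> None:
        self.messages: List[Message] = []

    def warn(self, text: str, code: str) -> None:
        self.messages.append((code, text))


# Linter functions keep the fixed (tool_source, lint_ctx) signature.
def lint_stdio(tool_source: dict, lint_ctx: LintContext) -> None:
    lint_ctx.warn("No stdio definition found.", code="S001")


def lint_help(tool_source: dict, lint_ctx: LintContext) -> None:
    lint_ctx.warn("No help section found.", code="H001")


def apply_linters(
    tool_source: dict,
    linters: Iterable[Callable[[dict, LintContext], None]],
    skip_codes: Set[str],
) -> LintContext:
    """Run each linter, then drop the newly added messages with a skipped code."""
    lint_ctx = LintContext()
    for linter in linters:
        start = len(lint_ctx.messages)
        linter(tool_source, lint_ctx)
        kept = [m for m in lint_ctx.messages[start:] if m[0] not in skip_codes]
        lint_ctx.messages[start:] = kept
    return lint_ctx


def duplicated_codes(codes: Iterable[str]) -> List[str]:
    """Return every code that appears more than once, e.g. for a unit test."""
    return sorted(code for code, count in Counter(codes).items() if count > 1)
```

A check like `duplicated_codes` could run in the unit tests over all declared codes, which would address the worry about accidental duplicates without any runtime cost.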

jmchilton · 2023-11-27

> The reason is that there are some linters that live in the planemo sources

I think when I first implemented the XSD validation, the external dependency on xmllint (https://github.com/galaxyproject/planemo/blob/master/planemo/xml/validation.py#L14C20-L14C27) was the issue. lxml seemed less mature at the time than xmllint. I guess we use lxml a lot more aggressively now, we have been generally pleased with the XSD validation it is doing, and we now have a Galaxy dependency on lxml. I think I would be fine dropping the xmllint path through planemo if it would simplify things and let the XSD validation move into Galaxy.

bernt-matthias · 2023-11-27

Thanks for the comment @jmchilton. I will try to move it here, but probably in a separate PR.

I'm still experimenting to see whether separate classes are the way to go. It seems difficult to avoid code duplication, but it will become clearer how we end up in the different linter messages.

bernt-matthias · 2023-12-07T16:38:15Z

Ping @mvdbeek and @jmchilton: I'm quite happy with the current state. I have moved the lxml-based xsd validation from planemo to here. I thought it would be a good idea to run the xsd linter with each of the unit tests for the linters (mainly to remove redundant linters), which uncovered a few minor problems (e.g. in the xsd).

For me, the TODOs would be:

- decide if we want to store the linter type with each message and if we want shorter (e.g. numeric) linter names
  - we could also have something like aliases, which would also allow us to continue to use current skip lists, e.g. "inputs"
- run the current and previous state of the linters on the IUC repo (maybe also devteam)
- adapt planemo
  - remove the lxml-based linter (should we keep the possibility to lint with xmllint?)
  - maybe allow specifying a skip list from a file
  - maybe add a new subcommand to list all available linters, plus docs
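For the alias and skip-file ideas in the list above, a small sketch of how they might fit together. The file format (one entry per line, `#` comments), the `ALIASES` table and the class names inside it are all made up for illustration; only the "inputs" skip entry and StdIOAbsence are taken from this discussion.

```python
from typing import Dict, Iterable, Set

# Hypothetical alias table: an old, coarse skip entry expands to the new,
# fine-grained linter class names. The class names here are placeholders.
ALIASES: Dict[str, Set[str]] = {
    "inputs": {"InputsExampleOne", "InputsExampleTwo"},
}


def parse_skip_list(lines: Iterable[str]) -> Set[str]:
    """Read one linter name or alias per line; '#' starts a comment."""
    names: Set[str] = set()
    for line in lines:
        entry = line.split("#", 1)[0].strip()
        if not entry:
            continue
        # Unknown entries are taken literally as linter names.
        names.update(ALIASES.get(entry, {entry}))
    return names
```

Since an open text file iterates over its lines, `parse_skip_list(open(path))` would work for a skip file, and existing skip lists that contain "inputs" would keep working through the alias.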

Regarding lxml I was also wondering if we should remove https://github.com/galaxyproject/galaxy/blob/e7a168000e627dc8579ad749174bd5eabbe15873/lib/galaxy/util/__init__.py#L68C9-L68C9

* fix all XML schema errors discovered here galaxyproject/galaxy#17081
* more linter fixes
* fix value
* add missing macros file for deprecated tool creates a slightly annoying Traceback in the CI setup job
* ampvis: fix URLs
* ampvis2: more URL fixes
* ampvis2 heatmap fix duplicated output label
* Apply suggestions from code review

Co-authored-by: Marius van den Beek <[email protected]>

---------

Co-authored-by: Marius van den Beek <[email protected]>

mvdbeek · 2024-01-16T14:28:42Z

lib/galaxy/tool_util/lint.py

    return lint_context


 def lint_xml_with(lint_context, tool_xml, extra_modules=None) -> LintContext:
    extra_modules = extra_modules or []
    tool_source = get_tool_source(xml_tree=tool_xml)
    return lint_tool_source_with(lint_context, tool_source, extra_modules=extra_modules)
+
+
+def list_linters(extra_modules: Optional[List[str]] = None) -> List[str]:


I think that's actually a rare case where you can use a Metaclass:

    In [1]: linters = {}
       ...:
       ...: class LinterMeta(type):
       ...:     def __new__(cls, clsname, bases, attrs):
       ...:         newclass = super(LinterMeta, cls).__new__(cls, clsname, bases, attrs)
       ...:         linters[clsname] = newclass
       ...:         return newclass
       ...:
       ...:
       ...: class MyLinter(metaclass=LinterMeta):
       ...:     pass
       ...:
       ...: linters
    Out[1]: {'MyLinter': __main__.MyLinter}

bernt-matthias · 2024-01-17

Thanks for pointing me to metaclasses. I never took the time to learn about them.

Still, I hope that 3a98a3a is sufficient?

using `__subclasses__`
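For comparison with the metaclass registry, this is roughly what a `__subclasses__`-based listing can look like. It is a sketch under assumptions, not the contents of 3a98a3a: the example classes and the `all_subclasses` helper are hypothetical, and the real `list_linters` takes an `extra_modules` argument that is left out here.

```python
from typing import List


class Linter:
    """Stand-in base class for this sketch."""


class StdIOAbsence(Linter):
    pass


class HelpMissing(Linter):  # hypothetical example linter
    pass


class HelpMissingStrict(HelpMissing):  # hypothetical indirect subclass
    pass


def all_subclasses(cls: type) -> List[type]:
    """Collect direct and indirect subclasses of cls."""
    found: List[type] = []
    for sub in cls.__subclasses__():
        found.append(sub)
        found.extend(all_subclasses(sub))
    return found


def list_linters() -> List[str]:
    """Return the names of all currently defined Linter subclasses."""
    return sorted(sub.__name__ for sub in all_subclasses(Linter))
```

One caveat of this approach is that `__subclasses__()` only knows about classes whose modules have already been imported, so the linter modules need to be loaded before listing.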

mvdbeek · 2024-02-14T17:36:08Z

Any chance you could resolve the conflicts?

mvdbeek · 2024-02-16T16:41:41Z

Thanks a lot @bernt-matthias, this is really important work!

github-actions · 2024-02-16T16:41:56Z

This PR was merged without a "kind/" label, please correct.

github-actions bot added area/testing area/tool-framework labels Nov 24, 2023

github-actions bot added this to the 23.2 milestone Nov 24, 2023

bernt-matthias marked this pull request as draft November 24, 2023 16:01

bernt-matthias mentioned this pull request Nov 24, 2023

Drop lint from preventing deployment galaxyproject/tools-iuc#5650

Closed

5 tasks

mvdbeek reviewed Nov 24, 2023

bernt-matthias force-pushed the topic/linter-overhaul branch 6 times, most recently from 1780d95 to 3305cef Compare November 26, 2023 13:57

bernt-matthias force-pushed the topic/linter-overhaul branch 15 times, most recently from bb17d76 to aa85c64 Compare December 7, 2023 16:24

bernt-matthias force-pushed the topic/linter-overhaul branch from aa85c64 to 26c2a89 Compare December 7, 2023 16:39

bernt-matthias marked this pull request as ready for review December 7, 2023 16:39

allow options element in tool

62adedb

used for instance [here](https://github.com/galaxyproject/tools-iuc/blob/0d019235fcfc835b99d5651b0bc4fd0da06707ac/data_managers/data_manager_manual/data_manager/data_manager_manual.xml#L2) and allowed by xsd

try source_path() instead of _source_path

555fe5f

bernt-matthias force-pushed the topic/linter-overhaul branch from 0fc0ec4 to 555fe5f Compare December 19, 2023 08:05

mvdbeek modified the milestones: 23.2, 24.0 Dec 19, 2023

add function to list available linters

ccced2f

bernt-matthias force-pushed the topic/linter-overhaul branch from 42dd707 to ccced2f Compare January 16, 2024 09:08

bernt-matthias added a commit to bernt-matthias/planemo that referenced this pull request Jan 16, 2024

remove lxml based schema linter

1fbbc0e

now in galaxy core galaxyproject/galaxy#17081

mvdbeek reviewed Jan 16, 2024

implement listing of linters

3a98a3a

using `__subclasses__`

bernt-matthias force-pushed the topic/linter-overhaul branch from 2021ffb to 3a98a3a Compare January 17, 2024 14:07

Merge branch 'dev' into topic/linter-overhaul

2428b9d

bernt-matthias commented Feb 15, 2024

lib/galaxy/tool_util/linters/xsd.py (outdated, resolved)

use types-lxml instead of lxml-stubs

46ff3f2

nsoranzo reviewed Feb 15, 2024

lib/galaxy/tool_util/linters/xsd.py (outdated, resolved)

bernt-matthias and others added 4 commits February 15, 2024 16:00

Revert "use types-lxml instead of lxml-stubs"

368f3e8

This reverts commit 46ff3f2.

Add type ignore

ed8eb6e

Co-authored-by: Nicola Soranzo <[email protected]>

new lines

a0b25b5

Merge branch 'topic/linter-overhaul' of https://github.com/bernt-matt…

101a8a9

…hias/galaxy into topic/linter-overhaul

mvdbeek merged commit 74a1b02 into galaxyproject:dev Feb 16, 2024
53 checks passed

nsoranzo added the kind/enhancement label Feb 16, 2024

bernt-matthias deleted the topic/linter-overhaul branch February 17, 2024 11:43

bernt-matthias added a commit to bernt-matthias/planemo that referenced this pull request May 2, 2024

remove lxml based schema linter

1713690

now in galaxy core galaxyproject/galaxy#17081

bernt-matthias added a commit to bernt-matthias/planemo that referenced this pull request May 7, 2024

remove lxml based schema linter

9e18735

now in galaxy core galaxyproject/galaxy#17081

jdavcs mentioned this pull request Jun 6, 2024

[24.0] Make sure that all Linter subclasses are imported for listing them #18339

Merged

4 tasks

jmchilton mentioned this pull request Aug 11, 2024

Dynamic Models for Tool Test Validation #18679

Merged

3 tasks
