Add codespell to CI #2075
Thanks, but I'm not sure this is worth the overhead of an extra CI check to maintain.
No problem, it's as you wish. It was one of the motivations behind PR #2073.
A few notes:
* One of the caveats is the necessity to maintain a list of false positives. There are not too many of them, but contributors might have to add a new false positive to the list from time to time.
* The location of the list of false positives is: .github/codespell_ignore_words.txt
* Codespell checks the current directory for a file named setup.cfg or .codespellrc. In the absence of setup.cfg, I have added a new hidden file .codespellrc. The benefit is that the same configuration is automatically used when launching codespell manually from the root directory or from the CI.
* The default dictionaries used by codespell are "clear" and "rare". While "rare" might catch a few additional typos, it raises a few additional false positives, most notably "complies" and "theses".
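For illustration, a .codespellrc along the lines described in these notes might look like the sketch below. This is not the file from the PR: `ignore-words` and `builtin` are existing codespell options, but the values shown, and whether the PR sets `builtin` at all, are assumptions.

```ini
# .codespellrc (hypothetical contents, placed in the repository root)
[codespell]
# File with known false positives, one word per line
ignore-words = .github/codespell_ignore_words.txt
# Built-in dictionaries to use; "clear,rare" is the codespell default.
# Restricting this to "clear" would avoid flagging "complies" and "theses",
# at the cost of missing the few extra typos that "rare" catches.
builtin = clear,rare
```

Because codespell reads this file from the current directory, running `codespell` from the repository root should pick up the same settings as the CI job.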
A little late, but instead of adding a whole new CI workflow, we could simply add the codespell pre-commit check to our existing pre-commit config. This keeps the version locked for reproducible results, and requires no additional maintenance aside from the rare case of adding a new ignore. Furthermore, this would not only check and auto-correct spelling on the CIs, but allow contributors to do so locally with no additional setup, either on-demand ( |
Indeed, codespell has been part of Supported hooks since pre-commit/pre-commit.com#237 in 2019.
On the other hand, contributors should not let codespell auto-correct typos. There are false positives: contributors need to review suggestions and silence codespell when needed.
I'd be generally against including a spellchecker like codespell in the CI (and pre-commit is run on the CI). It's likely to fail for valid words and the contributor might not know they need to add something to an ignore list, and may change their text to something worse just to get past the tool, or give up. But they can be useful with human review, so perhaps a |
You're right, that's a lot of false positives, but I just want to point out that a lot of typos have already been fixed by #2073.
I could add codespell to Do you have a suggestion for the location/name of the file containing the false positives, or at least the most common ones? I tend to name it after the Would I really need to pin the version? I see that other tools are not really pinned down: Lines 1 to 6 in 4109619
Or prefix with a dot so it's hidden by default?
Hm, maybe not. The pinning idea was so that new warnings don't suddenly appear in new versions, but as this isn't run by CI, perhaps pinning is not needed.
I definitely can understand not wanting to deter contributors if it runs in CI, given the chance of false positives. And if autofix is enabled, while pre-commit will abort the commit with a message to review the changes, which the contributor would then need to re-add and re-commit, I could certainly see the possibility of a contributor not carefully checking, especially if they weren't familiar with the workflow. Instead of adding it as a hard requirement which gets installed on all platforms, CIs and locally, (regardless if the user actually wants to perform spell checking), and in the same environment as the build (potential for dep conflicts), why not just add the pre-commit hook with |
I understand you are referring to the addition to More important to me is the question of consistent versions across developers. Is it really indispensable to have a single version? Can't some developers run a more recent version or use a more recent dictionary (see Updating the dictionaries), if they want to? On the other hand, we wouldn't want them to use a very outdated version. |
Perhaps this discussion should be moved to the new PR #2151. I have added codespell configuration files, and I'm open to suggestions like this one on how to manually run codespell.
The intention is to help catch typos early.