Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

license check: add license check skip id #28

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

acomodi
Copy link
Contributor

@acomodi acomodi commented May 21, 2021

Signed-off-by: Alessandro Comodi [email protected]

Fixes #27

@acomodi
Copy link
Contributor Author

acomodi commented May 21, 2021

cc @mithro

@mithro
Copy link
Contributor

mithro commented May 21, 2021

We should never need a skip like this.

If we want to exclude generated or vendor code, we should mark them as generated or vendored (what linguist calls third party) in the .gitattributes.

GitHub
Language Savant. If your repository's language is being reported incorrectly, send us a pull request! - github/linguist
GitHub
Language Savant. If your repository's language is being reported incorrectly, send us a pull request! - github/linguist

@acomodi acomodi force-pushed the allow-skipping-license-check branch from eb1db9f to a563073 Compare May 21, 2021 14:16
This commit also allows to skip the generated files checks by adding the
file to the .gitattributes file

Signed-off-by: Alessandro Comodi <[email protected]>
@acomodi acomodi force-pushed the allow-skipping-license-check branch from a563073 to e62d888 Compare May 21, 2021 14:39
Signed-off-by: Alessandro Comodi <[email protected]>
@acomodi
Copy link
Contributor Author

acomodi commented May 21, 2021

@mithro I have added linguist, so that we have a stronger file type detection. As reported in the override docs, one can now add a directory/file path to the .gitattributes to mark files as vendored or generated. These files will skip the license checking

@@ -463,9 +474,9 @@ def main(args):
fwarn(fpath, 'Skipping nonfile')
continue

ftype = detect_file_type(fpath)
ftype = ftypes.get(os.path.relpath(fpath, root_dir), None)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either we should remove the detect_file_type function or do something like;

Suggested change
ftype = ftypes.get(os.path.relpath(fpath, root_dir), None)
ftype = ftypes.get(os.path.relpath(fpath, root_dir), detect_file_type(fpath))

@@ -417,6 +419,15 @@ def main(args):
logging.debug('Starting search in: %s', root_dir)

errors = {}

json_data = subprocess.check_output("github-linguist --json", shell=True).decode('utf8')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The output of github-linguist doesn't seem to actually specify if a file is vendored or generated.

Looking at https://github.com/github/linguist/blob/master/bin/github-linguist#L61-L87

tansell@tansell-glapstation:~/github/SymbiFlow/actions$ github-linguist third_party/make-env/os.mk 
os.mk: 95 lines (85 sloc)
  type:      Text
  mime type: text/plain
  language:  Makefile
  appears to be a vendored file

However the generated attribute doesn't seem to be working?

tansell@tansell-glapstation:~/github/SymbiFlow/actions$ git check-attr --all checks/tests/license/test-missing-spdx-generated.v
checks/tests/license/test-missing-spdx-generated.v: linguist-generated: set
tansell@tansell-glapstation:~/github/SymbiFlow/actions$ github-linguist checks/tests/license/test-missing-spdx-generated.v
test-missing-spdx-generated.v: 5 lines (3 sloc)
  type:      Text
  mime type: text/plain
  language:  Verilog

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have dove a bit in the linguist code, and, for what I understand, the .gitattributes file is taken into account only when dealing with full repository statistics, therefore it does not have an effect when dealing with single files stats, hence the verilog file is not detected as generated.

Furthermore, when the full-repo statistics are generated, only programming and markup files are taken to report statistics, and data and nil types are left out, as written here.
It turns out that YAML is defined as a data type, and all yaml files are currently excluded by the license check.

I think we can still use linguist to detect the file types, but not to correctly override the generated files.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@acomodi - I think we probably want to extend linguist to support outputting the information we want. I started looking at that and ran into issues understanding how to use the Ruby rugged module.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Avoid checking License for generated sources
2 participants