From 373d4fee3b8faafa7f92e86fcb4f4c60db2395ec Mon Sep 17 00:00:00 2001
From: FrankMaverick
Date: Mon, 8 Jul 2024 17:38:49 +0200
Subject: [PATCH 01/11] chore: Remove __pycache__ from repository

---
 .../download_eu_vocabularies.cpython-38.pyc | Bin 734 -> 0 bytes
 tests/__pycache__/jsonschema.cpython-38.pyc | Bin 406 -> 0 bytes
 2 files changed, 0 insertions(+), 0 deletions(-)
 delete mode 100644 tests/__pycache__/download_eu_vocabularies.cpython-38.pyc
 delete mode 100644 tests/__pycache__/jsonschema.cpython-38.pyc

diff --git a/tests/__pycache__/download_eu_vocabularies.cpython-38.pyc b/tests/__pycache__/download_eu_vocabularies.cpython-38.pyc
deleted file mode 100644
index f494f791da83b3522ccf2777dd5406d169a569a5..0000000000000000000000000000000000000000
Binary files a/tests/__pycache__/download_eu_vocabularies.cpython-38.pyc and /dev/null differ
diff --git a/tests/__pycache__/jsonschema.cpython-38.pyc b/tests/__pycache__/jsonschema.cpython-38.pyc
deleted file mode 100644
index 37544300a24f56b66c41b8c4c3a465a8da55e018..0000000000000000000000000000000000000000
Binary files a/tests/__pycache__/jsonschema.cpython-38.pyc and /dev/null differ

Date: Wed, 10 Jul 2024 10:38:46 +0200
Subject: [PATCH 02/11] docs: improve readability and conciseness of README

---
 README.en.md | 44 +++++++++++++++++++++++++++++++-------------
 README.md    | 33 ++++++++++++++++++++++++---------
 2 files changed, 55 insertions(+), 22 deletions(-)

diff --git a/README.en.md b/README.en.md
index 
3a6b85c..6623c14 100644 --- a/README.en.md +++ b/README.en.md @@ -30,32 +30,50 @@ to the [dedicated section in the Operational Manual of the Catalog](https://teamdigitale.github.io/dati-semantic-guida-ndc-docs/docs/manuale-operativo/istruzioni-su-come-predisporre-il-repository-in-cui-pubblicare-le-risorse-semantiche.html). -## Automated Checks and Testing +## Automatic Checks and Tests -Below are described the procedures for automated checks and testing implemented, essential for ensuring the quality and integrity of the repository content. +This section describes the automatic check and test procedures +used to ensure the quality and integrity of the repository content. -### Automated Checks (Pre-commit) +### Automatic Checks (Pre-commit) -This repository implements automated checks using [pre-commit](https://pre-commit.com/). The specifications of the checks are defined in the file [`.pre-commit-config.yaml`](.pre-commit-config.yaml). +This repository uses [pre-commit](https://pre-commit.com/) +for automatic checks. The checks are specified +in the [`.pre-commit-config.yaml`](.pre-commit-config.yaml) file. -These checks can be executed using GitHub Actions. The `validate.yaml` file in `.github/workflows` automatically enables pre-commit checks after each push or pull request (PR). Additionally, these checks can be manually activated at any time. +These checks can be run via GitHub Actions. +The `validate.yaml` file in `.github/workflows` +automatically enables pre-commit checks after each push or pull request (PR). +You can also run them manually at any time. -To enable pre-commit checks in another repository, simply copy the [`.pre-commit-config.yaml`](.pre-commit-config.yaml) file and the [`.github/workflows/validate.yaml`](.github/workflows/validate.yaml) file. 
+To enable pre-commit checks in another repository, +copy the [`.pre-commit-config.yaml`](.pre-commit-config.yaml) file +and the [`.github/workflows/validate.yaml`](.github/workflows/validate.yaml) file. -### URL Testing +### URL Tests -In the `tests` directory, there is a script named `test_urls.py`, which verifies GitHub-related URLs present in the files of the `assets` subdirectories. +The `test_urls.py` script in the `tests` directory verifies +GitHub-related URLs in the `assets` subdirectory files. -This test can also be automated using GitHub Actions. The `test.yaml` file in `.github/workflows` automatically activates tests after each push or pull request. Similarly, these tests can be manually initiated at any time. +This test can also be automated using GitHub Actions. +The `test.yaml` file in `.github/workflows` +automatically runs the tests after each push or pull request. +You can also run them manually at any time. -To enable URL testing in another repository, simply copy the [`/tests/test_urls.py`](/tests/test_urls.py) file and the [`.github/workflows/test.yaml`](.github/workflows/test.yaml) file. +To enable URL tests in another repository, +copy the [`/tests/test_urls.py`](/tests/test_urls.py) file +and the [`.github/workflows/test.yaml`](.github/workflows/test.yaml) file. -### Local Checks and Testing +### Local Checks and Tests -Local checks and testing can be performed using Docker or simply Python. An integrated test environment to reproduce the CI pipeline is available through `docker-compose`, which executes a series of steps. +Checks and tests can also be run locally +using Docker or Python. Use `docker-compose` +to replicate the CI pipeline: ```bash docker-compose -f docker-compose-test.yml up ``` -Note: If you wish to transfer this environment to another repository, it's important to note that the Docker environment requires the Dockerfiles present in the tests directory (such as Dockerfile.precommit and Dockerfile.pytest). 
\ No newline at end of file +Note: To transfer this environment to another repository, +include the Dockerfiles in the `tests` directory +(such as `Dockerfile.precommit` and `Dockerfile.pytest`). \ No newline at end of file diff --git a/README.md b/README.md index cc42cdb..15ced15 100644 --- a/README.md +++ b/README.md @@ -52,31 +52,46 @@ fai riferimento alla ## Controlli Automatici e Test -Di seguito vengono descritte le procedure di controllo automatico e di test implementate, utili per garantire la qualità e l'integrità del contenuto del repository +Questa sezione descrive le procedure di controllo automatico e test, +utili per garantire la qualità e l'integrità del contenuto del repository. ### Controlli Automatici (Pre-commit) -Questo repository implementa i controlli automatici utilizzando [pre-commit](https://pre-commit.com/). Le specifiche delle verifiche sono definite nel file [`.pre-commit-config.yaml`](.pre-commit-config.yaml). +Questo repository implementa i controlli automatici utilizzando [pre-commit](https://pre-commit.com/). +Le specifiche delle verifiche sono definite nel file [`.pre-commit-config.yaml`](.pre-commit-config.yaml). -È possibile eseguire tali verifiche mediante GitHub Actions. Il file `validate.yaml` in `.github/workflows` abilita automaticamente i controlli pre-commit dopo ogni push o pull request (PR). Inoltre, è possibile attivare manualmente tali controlli in qualsiasi momento. +È possibile eseguire tali verifiche mediante GitHub Actions. +Il file `validate.yaml` in `.github/workflows` abilita automaticamente +i controlli pre-commit dopo ogni push o pull request (PR). +Inoltre, è possibile eseguirli manualmente in qualsiasi momento. -Per abilitare i controlli pre-commit in un altro repository, è sufficiente copiare il file [`.pre-commit-config.yaml`](.pre-commit-config.yaml) e il file [`.github/workflows/validate.yaml`](.github/workflows/validate.yaml). 
+Per abilitare i controlli pre-commit in un altro repository, +copiare il file [`.pre-commit-config.yaml`](.pre-commit-config.yaml) e il file [`.github/workflows/validate.yaml`](.github/workflows/validate.yaml). ### Test URL -Nella directory `tests` è presente uno script denominato `test_urls.py`, che consente di verificare gli URL relativi a GitHub presenti nei file delle sottodirectory `assets`. +Lo script `test_urls.py` nella directory `tests` consente di verificare +gli URL relativi a GitHub presenti nei file delle sottodirectory `assets`. -Anche questo test può essere automatizzato mediante GitHub Actions. Il file `test.yaml` in `.github/workflows` attiva automaticamente i test dopo ogni push o pull request. Allo stesso modo, è possibile avviare manualmente questi test in qualsiasi momento. +Anche questo test può essere automatizzato mediante GitHub Actions. +Il file `test.yaml` in `.github/workflows` attiva automaticamente i test +dopo ogni push o pull request. +Inoltre, è possibile eseguirli manualmente in qualsiasi momento. -Per abilitare i test URL in un altro repository, è sufficiente copiare il file [`/tests/test_urls.py`](/tests/test_urls.py) e il file [`.github/workflows/test.yaml`](.github/workflows/test.yaml). +Per abilitare i test URL in un altro repository, +copiare il file [`/tests/test_urls.py`](/tests/test_urls.py) e il file [`.github/workflows/test.yaml`](.github/workflows/test.yaml). ### Controlli e Test in Locale -È possibile eseguire i controlli e i test in locale utilizzando l'ambiente Docker o semplicemente Python. Un ambiente di test integrato per riprodurre la pipeline CI è disponibile tramite `docker-compose`, che esegue una serie di passaggi. +I controlli e i test possono essere eseguiti anche in locale +con Docker o Python. 
Usa `docker-compose` per replicare +la pipeline CI: ```bash docker-compose -f docker-compose-test.yml up ``` -Nota: Se si desidera trasferire questo ambiente su un altro repository, è importante considerare che l'ambiente Docker richiede i Dockerfile presenti nella directory `tests` (come `Dockerfile.precommit` e `Dockerfile.pytest`). +Nota: Per trasferire questo ambiente su un altro repository, +è necessario includere i Dockerfile +presenti nella directory `tests` (come `Dockerfile.precommit` e `Dockerfile.pytest`). From e056fc858d4dc99a8f8254685d2c0b0a02e60965 Mon Sep 17 00:00:00 2001 From: FrankMaverick Date: Wed, 10 Jul 2024 15:40:28 +0200 Subject: [PATCH 03/11] chore: remove hooks and scripts moved to another repository --- .pre-commit-hooks.yaml | 74 ------------ scripts/__init__.py | 0 scripts/check_filename_format.py | 61 ---------- scripts/check_filename_match_uri.py | 115 ------------------- scripts/check_filenames_match_directories.py | 83 ------------- scripts/check_repo_structure.py | 31 ----- scripts/check_supported_files.py | 98 ---------------- scripts/check_versioning_pattern.py | 77 ------------- 8 files changed, 539 deletions(-) delete mode 100644 .pre-commit-hooks.yaml delete mode 100644 scripts/__init__.py delete mode 100644 scripts/check_filename_format.py delete mode 100644 scripts/check_filename_match_uri.py delete mode 100644 scripts/check_filenames_match_directories.py delete mode 100644 scripts/check_repo_structure.py delete mode 100644 scripts/check_supported_files.py delete mode 100644 scripts/check_versioning_pattern.py diff --git a/.pre-commit-hooks.yaml b/.pre-commit-hooks.yaml deleted file mode 100644 index 9ba8874..0000000 --- a/.pre-commit-hooks.yaml +++ /dev/null @@ -1,74 +0,0 @@ -# -# Define running hooks. 
-# - -- id: check-repo-structure - name: Check Repository Structure - description: |- - Check whether the directory structure is correct - - assets/ontologies/* - - assets/controlled-vocabularies/* - - assets/schemas/* - entry: check_repo_structure - args: ["assets/controlled-vocabularies/", "assets/ontologies/", "assets/schemas/"] - language: python - files: ^assets/.* - pass_filenames: false - types: [file] - -- id: check-filename-format - name: Check Filename Format - description: |- - Check whether file and directory names follow the specified format (^[\\.a-z0-9 _-]{1,64}$) - entry: check_filename_format - args: ["assets/controlled-vocabularies/", "assets/ontologies/", "assets/schemas/"] - language: python - files: ^assets/.* - pass_filenames: false - types: [file] - -- id: check-filenames-match-uri - name: Check Filename match URI - description: |- - Checks whether the name of each TTL or oas3.yaml file matches the final part of its relative URI - entry: check_filename_match_uri - args: ["assets/controlled-vocabularies/", "assets/ontologies/", "assets/schemas/"] - language: python - files: ^assets/.* - pass_filenames: false - types: [file] - additional_dependencies: [rdflib] - -- id: check-filenames-match-directories - name: Check Filename match Directories - description: |- - Check if filenames match the containing directory names - entry: check_filenames_match_directories - args: ["assets/controlled-vocabularies/", "assets/ontologies/", "assets/schemas/"] - language: python - files: ^assets/.* - pass_filenames: false - types: [file] - -- id: check-supported-files - name: Check encoding and file suffix - entry: check_supported_files - description: |- - Checks the leaf directories of the specified root directories - to ensure that each leaf directory contains at least one .ttl file in UTF-8 format. 
- args: ["assets/controlled-vocabularies/", "assets/ontologies/", "assets/schemas/"] - language: python - files: ^assets/.* - pass_filenames: false - types: [file] - -- id: check-versioning-pattern - name: Check versioning pattern - entry: check_versioning_pattern - description: |- - Check if the versioning pattern is correct for leaf directories - args: ["assets/controlled-vocabularies/", "assets/ontologies/", "assets/schemas/"] - language: python - files: ^assets/.* - pass_filenames: false - types: [file] \ No newline at end of file diff --git a/scripts/__init__.py b/scripts/__init__.py deleted file mode 100644 index e69de29..0000000 diff --git a/scripts/check_filename_format.py b/scripts/check_filename_format.py deleted file mode 100644 index 8b798c4..0000000 --- a/scripts/check_filename_format.py +++ /dev/null @@ -1,61 +0,0 @@ -import os -import re -import sys - -def check_filename_format(root_dirs): - """ - Check whether file and directory names follow the specified format (pattern) - Args: - root_dirs (list): A list of root directories to be checked. - Returns: - bool: True if all file and directory names match the required format, False otherwise. 
- """ - - pattern = r'^[\\.a-z0-9 _-]{1,64}$' - extensions_to_check = ['.ttl', '.rdf', '.csv', '.yaml'] - - for root_dir in root_dirs: - for dirpath, dirnames, filenames in os.walk(root_dir): - for filename in filenames: - name, extension = os.path.splitext(filename) - if extension not in extensions_to_check: - continue - if not re.match(pattern, name): - print(f"Error: filename '{filename}' in directory '{dirpath}' does not match the required format.") - return False - for dirname in dirnames: - if not re.match(pattern, dirname): - print(f"Error: directory name '{dirname}' in directory '{dirpath}' does not match the required format.") - return False - - return True - -def check_directory_existence(root_dirs): - existing_dirs = [root_dir for root_dir in root_dirs if os.path.exists(root_dir)] - - if not existing_dirs: - print(f"{root_dirs} don't exist") - return False - - # Check if any directories don't exist - non_existent_dirs = [root_dir for root_dir in root_dirs if root_dir not in existing_dirs] - - for root_dir in non_existent_dirs: - print(f"WARNING: {root_dir} does not exist") - return True - -def main(): - root_dirs = sys.argv[1:] # Read dir args - - if not root_dirs: - print("No root directories provided.") - exit(1) - - if not check_directory_existence(root_dirs): - exit(1) - - if not check_filename_format(root_dirs): - exit(1) - -if __name__ == "__main__": - main() diff --git a/scripts/check_filename_match_uri.py b/scripts/check_filename_match_uri.py deleted file mode 100644 index b538929..0000000 --- a/scripts/check_filename_match_uri.py +++ /dev/null @@ -1,115 +0,0 @@ -import sys -import os -from pathlib import Path -from rdflib import Graph, RDF, RDFS, OWL, SKOS, Namespace - -def extract_main_uri(ttl_file,root_dir): - """ - Extracts the main URI relative to the specified TTL file. - - Args: - ttl_file (str): The path of the TTL file. - - Returns: - str: The main relative URI if found, otherwise None. 
- """ - g = Graph() - g.parse(ttl_file, format="ttl") - - # Define namespace prefixes - dcatapit = Namespace("http://dati.gov.it/onto/dcatapit#") - - main_uri = None - - for s, p, o in g: - if (s, RDF.type, OWL.Ontology) in g and "onto" in root_dir.lower(): - main_uri = s - break - elif p == RDF.type and o == dcatapit.Dataset: - main_uri = s - break - # elif (s, RDF.type, RDFS.Class) in g: - # main_uri = s - # break - elif (s, RDF.type, SKOS.ConceptScheme) in g: - main_uri = s - break - - return main_uri - -def check_filename_match_uri(root_dirs): - """ - Checks whether the name of each TTL or oas3.yaml file matches the final part of its relative URI. - - Args: - root_dirs (list): List of directories to search for TTL files. - - Returns: - list: List of tuples (file, uri) for TTL or oas3.yaml files that do not match the URI. - """ - mismatches = [] - - for root_dir in root_dirs: - for file_path in Path(root_dir).rglob("*.ttl"): - filename = file_path.stem # File name without extension - uri = extract_main_uri(str(file_path), root_dir) # Main relative URI of the file - - if uri: - # Extract the final part of the URI - uri_parts = str(uri).split("/") - last_uri_part = uri_parts[-1] - if last_uri_part == '': - last_uri_part = uri_parts[-2] - - # Check if the root directory contains "schema" - if "schema" in root_dir.lower(): - # Check if the file with .oas3.yaml extension exists - oas3_yaml_file = Path(file_path.parent, f"{last_uri_part}") - if not oas3_yaml_file.exists(): - mismatches.append((str(oas3_yaml_file), str(uri))) - else: - # Compare the file name with the last part of the URI - if filename != last_uri_part: - mismatches.append((str(file_path), str(uri))) - else: - print(f"Warning: No main relative URI found for file {file_path}") - - return mismatches - -def check_directory_existence(root_dirs): - existing_dirs = [root_dir for root_dir in root_dirs if os.path.exists(root_dir)] - - if not existing_dirs: - print(f"{root_dirs} don't exist") - return False - - 
# Check if any directories don't exist - non_existent_dirs = [root_dir for root_dir in root_dirs if root_dir not in existing_dirs] - - for root_dir in non_existent_dirs: - print(f"WARNING: {root_dir} does not exist") - return True - -def main(): - root_dirs = sys.argv[1:] - - if not root_dirs: - print("No root directories provided.") - exit(1) - - if not check_directory_existence(root_dirs): - exit(1) - - mismatches = check_filename_match_uri(root_dirs) - - if mismatches: - print("Error: The following files do not match their relative URI:") - for file_path, uri in mismatches: - print(f"- File: {file_path}, URI: {uri}") - exit(1) - else: - print("All files match their relative URI.") - -if __name__ == "__main__": - main() - diff --git a/scripts/check_filenames_match_directories.py b/scripts/check_filenames_match_directories.py deleted file mode 100644 index db251f1..0000000 --- a/scripts/check_filenames_match_directories.py +++ /dev/null @@ -1,83 +0,0 @@ -import os -import sys - - -# List of filenames to be excluded -EXCLUDED_FILENAMES = ["index", "datapackage", "context-short", "rules"] - -# List of extensions to be excluded -EXCLUDED_EXTENSIONS = [".oas3.yaml", ".md", ".shacl", ".frame.yamlld", ".ld.yaml"] - -def split_filename_extension(filename): - """ - Split filename into name and extension. - Args: - filename (str): The filename to split. - Returns: - tuple: A tuple containing the name and extension of the filename. - """ - parts = filename.split(".") - if len(parts) > 2: - # If there are more than 1 periods, consider the last one as part of the extension - # e.g. education-level.frame.yamlld -> education-level .frame.yamlld - name = ".".join(parts[:-2]) - extension = "." + ".".join(parts[-2:]) - else: - # Otherwise, consider only the extension as the last part - name, extension = os.path.splitext(filename) - - return name, extension - -def check_filenames_match_directories(root_dirs): - """ - Check if filenames match the containing directory names. 
- Args: - root_dirs (list): A list of root directories to be checked. - Returns: - bool: True if all filenames match their containing directory names, False otherwise. - """ - - for root_dir in root_dirs: - for dirpath, _, filenames in os.walk(root_dir): - for filename in filenames: - name, extension = split_filename_extension(filename) - parent_dir = os.path.basename(dirpath) - parent_dir_1 = os.path.basename(os.path.dirname(dirpath)) - parent_dir_2 = os.path.basename(os.path.dirname(os.path.dirname(dirpath))) - if name != parent_dir and name != parent_dir_1 and name != parent_dir_2: - if name not in EXCLUDED_FILENAMES and extension not in EXCLUDED_EXTENSIONS: - print(f"Error: Filename '{filename}' in '{dirpath}' dir does not match its containing directory name.") - return False - - return True - -def check_directory_existence(root_dirs): - existing_dirs = [root_dir for root_dir in root_dirs if os.path.exists(root_dir)] - - if not existing_dirs: - print(f"{root_dirs} don't exist") - return False - - # Check if any directories don't exist - non_existent_dirs = [root_dir for root_dir in root_dirs if root_dir not in existing_dirs] - - for root_dir in non_existent_dirs: - print(f"WARNING: {root_dir} does not exist") - return True - -def main(): - root_dirs = sys.argv[1:] # Read dir args - - if not root_dirs: - print("No root directories provided.") - exit(1) - - if not check_directory_existence(root_dirs): - exit(1) - - if not check_filenames_match_directories(root_dirs): - exit(1) - -if __name__ == "__main__": - main() - diff --git a/scripts/check_repo_structure.py b/scripts/check_repo_structure.py deleted file mode 100644 index 29ddc7a..0000000 --- a/scripts/check_repo_structure.py +++ /dev/null @@ -1,31 +0,0 @@ -import os -import sys - -def check_structure(required_dirs): - """ - Check whether the directory structure is correct. - Args: - required_dirs (list): A list of required directories to be checked. 
- Returns: - bool: True if all required directories exist, False otherwise. - """ - - for dir in required_dirs: - if not os.path.exists(dir): - print(f"Error: directory '{dir}' not exists.") - return False - - return True - -def main(): - required_dirs = sys.argv[1:] # Read dir args - - if not required_dirs: - print("No root directories provided.") - exit(1) - - if not check_structure(required_dirs): - exit(1) - -if __name__ == "__main__": - main() \ No newline at end of file diff --git a/scripts/check_supported_files.py b/scripts/check_supported_files.py deleted file mode 100644 index e302288..0000000 --- a/scripts/check_supported_files.py +++ /dev/null @@ -1,98 +0,0 @@ -import os -import sys -from pathlib import Path - -""" -This script checks the leaf directories of the specified root directories to ensure that each leaf directory contains at least one .ttl file in UTF-8 format. -""" - -def is_utf8(file_path): - """ - Check if a file is encoded in UTF-8 format. - Args: - file_path (str): The path to the file. - Returns: - bool: True if the file is encoded in UTF-8, False otherwise. - """ - try: - with open(file_path, 'r', encoding='utf-8') as file: - file.read() - return True - except UnicodeDecodeError: - return False - -def check_supported_files(root_dirs): - """ - Check if the leaf directories contain at least one .ttl file in UTF-8 format. - Args: - root_dirs (list): A list of root directories to be checked. - Returns: - bool: True if all leaf directories contain a .ttl file in UTF-8 format, False otherwise. - """ - def dfs(directory): - """ - Perform a depth-first search (DFS) to check leaf directories. - Args: - directory (Path): The directory to be checked. - Returns: - bool: True if all leaf directories contain a .ttl file in UTF-8 format, False otherwise. 
- """ - has_ttl_file = False - if not any(item.is_dir() for item in directory.iterdir()): - for item in directory.iterdir(): - if item.is_file() and item.suffix == '.ttl': - has_ttl_file = True - if not is_utf8(item): - print(f"Error: {item} is not encoded in UTF-8.") - return False - if not has_ttl_file and directory != root_dir: - print(f"Error: No .ttl files found in directory: {directory}") - return False - else: - for item in directory.iterdir(): - if item.is_dir(): - if not dfs(item): - return False - - return True - - for root_dir in root_dirs: - root_path = Path(root_dir) - if not root_path.exists() or not root_path.is_dir(): - print(f"Error: Invalid root directory: {root_dir}") - return False - - if not dfs(root_path): - return False - - return True - -def check_directory_existence(root_dirs): - existing_dirs = [root_dir for root_dir in root_dirs if os.path.exists(root_dir)] - - if not existing_dirs: - print(f"{root_dirs} don't exist") - return False - - # Check if any directories don't exist - non_existent_dirs = [root_dir for root_dir in root_dirs if root_dir not in existing_dirs] - - for root_dir in non_existent_dirs: - print(f"WARNING: {root_dir} does not exist") - return True - -def main(): - root_dirs = sys.argv[1:] - - if not root_dirs: - print("No root directories provided.") - exit(1) - - if not check_directory_existence(root_dirs): - exit(1) - - if not check_supported_files(root_dirs): - exit(1) - -if __name__ == "__main__": - main() diff --git a/scripts/check_versioning_pattern.py b/scripts/check_versioning_pattern.py deleted file mode 100644 index f0fc2df..0000000 --- a/scripts/check_versioning_pattern.py +++ /dev/null @@ -1,77 +0,0 @@ -#!/usr/bin/env python -import sys -import os -import re - -def check_versioning_pattern(root_dirs): - """ - Check if the versioning pattern is correct for leaf directories. 
- """ - version_pattern = r"(latest|v?\d+(\.\d+){0,2})$" # Regular expression pattern to match versioning format - dir_pattern = r"(latest|\b(?:\D*\d\D*)+\b)" # Regular expression pattern to match directory names - errors = False - checked_versions = {} # Dictionary to store checked versions - - for root_dir in root_dirs: - for dirpath, dirnames, _ in os.walk(root_dir): - - if not dirnames: # Check only leaf directories - versions = set() - for dirname in os.listdir(os.path.dirname(dirpath)): - - if re.match(dir_pattern, dirname): # Check if the directory name matches the pattern - versions.add(dirname) - - # Remove "latest" if present in the set - versions.discard("latest") - - superior_directory_path = os.path.dirname(dirpath) - - # Check if the versions have already been checked - if tuple(versions) in checked_versions: - continue # Skip if already checked - - # Verify that all strings in the set start with a number or "v" - if not (all(re.match(r"v\d", version) for version in versions) or all(version[0].isdigit() for version in versions)): - # If not all strings start with a number or "v", report an error - print(f"Error: Inconsistent versioning pattern found in {superior_directory_path}: {versions}") - errors = True - - # Verify that all strings in the set match the versioning pattern - if not (all(re.match(version_pattern, version) for version in versions)): - print(f"Error: Inconsistent versioning pattern found in {superior_directory_path}: {versions}") - errors = True - - checked_versions[tuple(versions)] = True # Mark versions as checked - - return not errors - -def check_directory_existence(root_dirs): - existing_dirs = [root_dir for root_dir in root_dirs if os.path.exists(root_dir)] - - if not existing_dirs: - print(f"{root_dirs} don't exist") - return False - - # Check if any directories don't exist - non_existent_dirs = [root_dir for root_dir in root_dirs if root_dir not in existing_dirs] - - for root_dir in non_existent_dirs: - print(f"WARNING: 
{root_dir} does not exist") - return True - -def main(): - root_dirs = sys.argv[1:] # Read args - - if not root_dirs: - print("No root directories provided.") - exit(1) - - if not check_directory_existence(root_dirs): - exit(1) - - if not check_versioning_pattern(root_dirs): - exit(1) - -if __name__ == "__main__": - main() From 20a74d0565b62e706a51b3ea695d3fa51e96d23e Mon Sep 17 00:00:00 2001 From: FrankMaverick Date: Wed, 10 Jul 2024 15:40:59 +0200 Subject: [PATCH 04/11] chore: remove unnecessary file --- .../education-level.oas3.yaml | 109 ------------------ 1 file changed, 109 deletions(-) delete mode 100644 other/vocabularies-schema/education-level.oas3.yaml diff --git a/other/vocabularies-schema/education-level.oas3.yaml b/other/vocabularies-schema/education-level.oas3.yaml deleted file mode 100644 index df9d257..0000000 --- a/other/vocabularies-schema/education-level.oas3.yaml +++ /dev/null @@ -1,109 +0,0 @@ -SchemaVocabulary: - x-count: 16 - x-version: 2018-06-05 - oneOf: - - enum: - - ACO - externalDocs: - url: https://w3id.org/italia/controlled-vocabulary/classifications-for-people/education-level/ACO - title: Diploma di Accademia di Belle Arti, Danza, Arte Drammatica, ISIA, ecc. 
- Conservatorio (vecchio ordinamento) - type: string - - enum: - - CDU - externalDocs: - url: https://w3id.org/italia/controlled-vocabulary/classifications-for-people/education-level/CDU - title: Diploma universitario di due/tre anni, Scuola diretta a fini speciali, - altro diploma terziario non universitario - type: string - - enum: - - FLA - externalDocs: - url: https://w3id.org/italia/controlled-vocabulary/classifications-for-people/education-level/FLA - title: Diploma accademico di Alta Formazione Artistica, Musicale e Coreutica di - I livello - type: string - - enum: - - IFP - externalDocs: - url: https://w3id.org/italia/controlled-vocabulary/classifications-for-people/education-level/IFP - title: Attestato IFP di qualifica professionale triennale (operatore)/Diploma - professionale IFP di tecnico (quarto anno) (dal 2005) - type: string - - enum: - - IFTS - externalDocs: - url: https://w3id.org/italia/controlled-vocabulary/classifications-for-people/education-level/IFTS - title: Certificato di specializzazione tecnica superiore IFTS (dal 2000) - type: string - - enum: - - ITS - externalDocs: - url: https://w3id.org/italia/controlled-vocabulary/classifications-for-people/education-level/ITS - title: Diploma di tecnico superiore ITS (corsi biennali) (dal 2013) - type: string - - enum: - - L - externalDocs: - url: https://w3id.org/italia/controlled-vocabulary/classifications-for-people/education-level/L - title: "Laurea di primo livello " - type: string - - enum: - - LD - externalDocs: - url: https://w3id.org/italia/controlled-vocabulary/classifications-for-people/education-level/LD - title: Laurea specialistica/magistrale a ciclo unico o diploma di laurea di 4-6 - anni - type: string - - enum: - - LS - externalDocs: - url: https://w3id.org/italia/controlled-vocabulary/classifications-for-people/education-level/LS - title: Laurea specialistica/magistrale biennale - type: string - - enum: - - LSE - externalDocs: - url: 
https://w3id.org/italia/controlled-vocabulary/classifications-for-people/education-level/LSE - title: Licenza media o avviamento professionale (conseguito non oltre l'anno - 1965) /Diploma di Istruzione secondaria di I grado - type: string - - enum: - - NED - externalDocs: - url: https://w3id.org/italia/controlled-vocabulary/classifications-for-people/education-level/NED - title: Nessun titolo di studio - type: string - - enum: - - PSE - externalDocs: - url: https://w3id.org/italia/controlled-vocabulary/classifications-for-people/education-level/PSE - title: Licenza elementare/ Attestato di valutazione finale - type: string - - enum: - - RDD - externalDocs: - url: https://w3id.org/italia/controlled-vocabulary/classifications-for-people/education-level/RDD - title: "Dottorato di ricerca/Diploma accademico di formazione alla ricerca " - type: string - - enum: - - SLA - externalDocs: - url: https://w3id.org/italia/controlled-vocabulary/classifications-for-people/education-level/SLA - title: Diploma accademico di Alta Formazione Artistica, Musicale e Coreutica di - II livello - type: string - - enum: - - USG - externalDocs: - url: https://w3id.org/italia/controlled-vocabulary/classifications-for-people/education-level/USG - title: Diploma di maturità / diploma di istruzione secondaria di II grado di 4-5 - anni (che permette l’iscrizione all’università) - type: string - - enum: - - USV - externalDocs: - url: https://w3id.org/italia/controlled-vocabulary/classifications-for-people/education-level/USV - title: Diploma di qualifica professionale di scuola secondaria di II grado di - 2-3 anni (che non permette l’iscrizione all’Università) - type: string From e8365d3b24c4f82d31e896ea10760cb7457bb638 Mon Sep 17 00:00:00 2001 From: FrankMaverick Date: Wed, 10 Jul 2024 15:52:40 +0200 Subject: [PATCH 05/11] chore: update pre-commit-config Added commented Python checks and semantic checks referencing dati-semantic-tools repo --- .pre-commit-config.yaml | 127 
+++++++++++++++++++++++++++++----------- 1 file changed, 94 insertions(+), 33 deletions(-) diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml index 54b0667..a0335fd 100644 --- a/.pre-commit-config.yaml +++ b/.pre-commit-config.yaml @@ -1,38 +1,99 @@ -repos: -- repo: https://github.com/teamdigitale/dati-semantic-cookiecutter - rev: 931e0529c8839f6fa8c1ae315839ba7c3060c5f2 - hooks: - - id: check-repo-structure - args: ["assets/controlled-vocabularies/", "assets/ontologies/", "assets/schemas/"] - - id: check-filename-format - args: ["assets/controlled-vocabularies/", "assets/ontologies/", "assets/schemas/"] - - id: check-filenames-match-uri - args: ["assets/controlled-vocabularies/", "assets/ontologies/", "assets/schemas/"] - - id: check-filenames-match-directories - args: ["assets/controlled-vocabularies/", "assets/ontologies/", "assets/schemas/"] - - id: check-supported-files - args: ["assets/controlled-vocabularies/", "assets/ontologies/", "assets/schemas/"] - - id: check-versioning-pattern - args: ["assets/controlled-vocabularies/", "assets/ontologies/", "assets/schemas/"] +# +# Run pre-commit hooks. 
You can run them without installing +# the hook with +# +# $ pre-commit run --all-files +# +# See https://pre-commit.com for more information +# See https://pre-commit.com/hooks.html for more hooks + +# +# Python code checks +# If you don't have Python code to check, +# you can leave these hooks commented out +# +# repos: +# - repo: https://github.com/pre-commit/pre-commit-hooks +# rev: v4.6.0 +# hooks: +# - id: trailing-whitespace +# - id: end-of-file-fixer +# - id: check-yaml +# args: [--allow-multiple-documents] +# - id: check-added-large-files +# args: +# - "--maxkb=4000" +# - repo: https://github.com/myint/autoflake +# rev: v2.3.1 +# hooks: +# - id: autoflake +# args: +# - --in-place +# - --remove-unused-variables +# - --remove-all-unused-imports +# - repo: https://github.com/psf/black +# rev: 24.4.2 +# hooks: +# - id: black +# - repo: https://github.com/pycqa/isort +# rev: 5.12.0 +# hooks: +# - id: isort +# name: isort (python) +# args: ["--profile", "black"] +# - id: isort +# name: isort (cython) +# types: [cython] +# - id: isort +# name: isort (pyi) +# types: [pyi] +# - repo: https://github.com/PyCQA/flake8 +# rev: 7.0.0 +# hooks: +# - id: flake8 +# - repo: https://github.com/PyCQA/bandit +# rev: 1.7.8 +# hooks: +# - id: bandit +# name: bandit +# args: ["-c", ".bandit.yaml"] +# description: 'Bandit is a tool for finding common security issues in Python code' +# entry: bandit +# language: python +# language_version: python3 +# types: [python] +# - repo: https://github.com/Lucas-C/pre-commit-hooks-safety +# rev: v1.3.3 +# hooks: +# - id: python-safety-dependencies-check # # Semantic checks. 
# -- repo: https://github.com/teamdigitale/json-semantic-playground - rev: 0b4ad4cc883a49878fdfd4539e694ae56b041e29 +- repo: https://github.com/teamdigitale/dati-semantic-tools + rev: c2074cd9c90dc1751f5535459afc3da6d21ab60d hooks: - - id: validate-csv - files: >- - ^assets\/controlled-vocabularies/.*\.csv - - id: validate-oas-schema - files: >- - ^assets\/schemas\/.*.oas3.yaml - - id: validate-turtle - files: >- - ^assets\/ontologies\/[^\/]+\/latest\/.*\.ttl - - id: validate-turtle - files: >- - ^assets\/controlled-vocabularies\/.*\.ttl - - id: validate-turtle - files: >- - ^assets\/schemas\/.*\.ttl + - id: validate-repo-structure + files: ^assets\/.* + - id: validate-filename-format + files: ^assets\/.* + - id: validate-filename-match-uri + files: ^assets\/.*\.ttl + - id: validate-filename-match-directory + files: ^assets\/.* + - id: validate-directory-versioning-pattern + files: ^assets\/.*\.ttl + - id: validate-mandatory-files-presence + files: ^assets\/.* + - id: validate-utf8-file-encoding + files: ^assets\/.* + - id: validate-turtle + files: ^assets/.*\.ttl$ + - id: validate-oas-schema + files: ^assets/.*\.schema.yaml + - id: validate-openapi-schema + files: ^assets/.*\.oas3.yaml + - id: validate-directory-versioning + files: ^assets/.*\.ttl + - id: validate-csv + files: ^assets/.*\.csv From 1d8a107980c668db286c61b71a7838b2b4395c8f Mon Sep 17 00:00:00 2001 From: FrankMaverick Date: Wed, 10 Jul 2024 16:13:47 +0200 Subject: [PATCH 06/11] chore: remove unnecessary file --- setup.py | 29 ----------------------------- 1 file changed, 29 deletions(-) delete mode 100644 setup.py diff --git a/setup.py b/setup.py deleted file mode 100644 index 29cb0d1..0000000 --- a/setup.py +++ /dev/null @@ -1,29 +0,0 @@ -from setuptools import find_packages -from setuptools import setup - -with open("requirements.txt") as f: - requirements = f.read().splitlines() - -setup( - name="dati_semantic_cookiecutter", - version="0.1.0", - description="Tools to check semantic assets", - 
classifiers=[ - "Programming Language :: Python :: 3.11", - "License :: OSI Approved :: MIT License", - "Operating System :: OS Independent", - ], - packages=find_packages('.'), - install_requires=requirements, - entry_points={ - "console_scripts": [ - "check_repo_structure = scripts.check_repo_structure:main", - "check_filename_format = scripts.check_filename_format:main", - "check_filename_match_uri = scripts.check_filename_match_uri:main", - "check_filenames_match_directories = scripts.check_filenames_match_directories:main", - "check_supported_files = scripts.check_supported_files:main", - "check_versioning_pattern = scripts.check_versioning_pattern:main", - "directory_existence_checker = scripts.directory_existence_checker:check_directory_existence" - ] - }, -) From 71fe113429e35f2b943c2c19f9440804f1bfec4a Mon Sep 17 00:00:00 2001 From: FrankMaverick Date: Wed, 10 Jul 2024 16:18:25 +0200 Subject: [PATCH 07/11] fix: small change --- .github/workflows/test.yaml | 5 ++--- .github/workflows/validate.yaml | 1 - .gitignore | 1 + tests/Dockerfile.pytest | 3 +-- tests/test_urls.py | 4 +--- 5 files changed, 5 insertions(+), 9 deletions(-) diff --git a/.github/workflows/test.yaml b/.github/workflows/test.yaml index 95d00cd..b085ba9 100644 --- a/.github/workflows/test.yaml +++ b/.github/workflows/test.yaml @@ -25,8 +25,7 @@ jobs: - name: Install dependencies run: | python -m pip install --upgrade pip - pip install requests - pip install pytest + pip install requests pytest - name: Run tests - run: pytest -s ./tests/ \ No newline at end of file + run: pytest -s ./tests/*.py \ No newline at end of file diff --git a/.github/workflows/validate.yaml b/.github/workflows/validate.yaml index c93ac48..9eb4805 100644 --- a/.github/workflows/validate.yaml +++ b/.github/workflows/validate.yaml @@ -25,6 +25,5 @@ jobs: - name: Run a script run: |- pip install pre-commit - pip install rdflib pre-commit autoupdate pre-commit run --all-files --verbose \ No newline at end of file diff 
--git a/.gitignore b/.gitignore index af969a4..bce02bb 100644 --- a/.gitignore +++ b/.gitignore @@ -1,6 +1,7 @@ # Insert here the files/folders/extensions you would like to exclude from git tracking *.py[co] .* +!.github/ __pycache__ *.log diff --git a/tests/Dockerfile.pytest b/tests/Dockerfile.pytest index b7ae4e5..0dcdddb 100644 --- a/tests/Dockerfile.pytest +++ b/tests/Dockerfile.pytest @@ -1,4 +1,3 @@ FROM python:3.11 -RUN pip install requests==2.31.0 -RUN pip install pytest==8.1.1 +RUN pip install requests==2.31.0 && pip install pytest==8.1.1 ENTRYPOINT ["pytest", "-s", "./tests/"] \ No newline at end of file diff --git a/tests/test_urls.py b/tests/test_urls.py index 6b4bb42..7c94388 100644 --- a/tests/test_urls.py +++ b/tests/test_urls.py @@ -89,10 +89,8 @@ def test_url(): print(f"WARNING: root directory '{root_dir}' does not exist.") for file_path, url, root_dir in get_urls(root_dirs): - print(f"Testing URL: {url}") ret = request_url(requests.head, url) - print(f"status_code: {ret}") # Check if the response status code is 200 or 301 (redirect) if ret.status_code not in [200, 301]: @@ -111,7 +109,7 @@ def test_url(): print("\nErrors found during URL test:") for error in errors: print(error) - assert False, "\n".join(errors) + assert False # Run test test_url() \ No newline at end of file From 495be3d06bf6e7ab6320be7fa459f107c1769bde Mon Sep 17 00:00:00 2001 From: FrankMaverick Date: Fri, 12 Jul 2024 17:26:06 +0200 Subject: [PATCH 08/11] docs: Add note about optional pre-commit validations --- README.en.md | 2 ++ README.md | 5 +++++ 2 files changed, 7 insertions(+) diff --git a/README.en.md b/README.en.md index 6623c14..1181285 100644 --- a/README.en.md +++ b/README.en.md @@ -50,6 +50,8 @@ To enable pre-commit checks in another repository, copy the [`.pre-commit-config.yaml`](.pre-commit-config.yaml) file and the [`.github/workflows/validate.yaml`](.github/workflows/validate.yaml) file. 
+Note: It is possible to comment on checks deemed unnecessary or inappropriate. For example, when using a solution with stable URIs, the validation check of the filename against the URIs (validate-filename-match-uri) may not be essential. + ### URL Tests The `test_urls.py` script in the `tests` directory verifies diff --git a/README.md b/README.md index 15ced15..11257f6 100644 --- a/README.md +++ b/README.md @@ -68,6 +68,11 @@ Inoltre, è possibile eseguirli manualmente in qualsiasi momento. Per abilitare i controlli pre-commit in un altro repository, copiare il file [`.pre-commit-config.yaml`](.pre-commit-config.yaml) e il file [`.github/workflows/validate.yaml`](.github/workflows/validate.yaml). +Nota: È possibile commentare i controlli ritenuti non necessari o inappropriati. +Ad esempio, se si utilizza una soluzione con URI stabili, +il controllo di validazione del nome del file +rispetto agli URI (`validate-filename-match-uri`) potrebbe non essere indispensabile. + ### Test URL Lo script `test_urls.py` nella directory `tests` consente di verificare From d86262b79402960f87c05deca9ebfab8bc9b3320 Mon Sep 17 00:00:00 2001 From: Clou-dia <98462345+Clou-dia@users.noreply.github.com> Date: Mon, 15 Jul 2024 10:46:29 +0200 Subject: [PATCH 09/11] Update README.md eliminato testo nota in controlli automatici, e rifrasato a inizio del paragrafo --- README.md | 5 ----- 1 file changed, 5 deletions(-) diff --git a/README.md b/README.md index 11257f6..15ced15 100644 --- a/README.md +++ b/README.md @@ -68,11 +68,6 @@ Inoltre, è possibile eseguirli manualmente in qualsiasi momento. Per abilitare i controlli pre-commit in un altro repository, copiare il file [`.pre-commit-config.yaml`](.pre-commit-config.yaml) e il file [`.github/workflows/validate.yaml`](.github/workflows/validate.yaml). -Nota: È possibile commentare i controlli ritenuti non necessari o inappropriati. 
-Ad esempio, se si utilizza una soluzione con URI stabili, -il controllo di validazione del nome del file -rispetto agli URI (`validate-filename-match-uri`) potrebbe non essere indispensabile. - ### Test URL Lo script `test_urls.py` nella directory `tests` consente di verificare From 0b0d043ef0d44a593aa67c4d8879c552ea278a87 Mon Sep 17 00:00:00 2001 From: Clou-dia <98462345+Clou-dia@users.noreply.github.com> Date: Mon, 15 Jul 2024 10:49:11 +0200 Subject: [PATCH 10/11] Update README.md --- README.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index 15ced15..76c9a62 100644 --- a/README.md +++ b/README.md @@ -21,7 +21,7 @@ Per la leggibilità: - Tutti i file JSON sono serializzati come YAML; - Tutti i file RDF sono serializzati come `text/turtle`; -### Asset semantici (schemi, vocabolari, ontologie) +### Risorse semantici (schemi, vocabolari, ontologie) Tutte le risorse semantiche da raccogliere / pubblicare sono in `assets/`; I file al di fuori di questa directory vengono ignorati dal catalogo @@ -55,6 +55,8 @@ fai riferimento alla Questa sezione descrive le procedure di controllo automatico e test, utili per garantire la qualità e l'integrità del contenuto del repository. +I controlli implementati possono essere disattivati se non applicabili al proprio caso d'uso. Ad esempio, se si è già implementata una propria soluzione per le URI stabili, il controllo sull'uguaglianza tra i nomi dei file e delle relative cartelle e tra il nome dei file e delle relative risorse negli URI non devono essere necessariamente superati, quindi possono essere commentati. + ### Controlli Automatici (Pre-commit) Questo repository implementa i controlli automatici utilizzando [pre-commit](https://pre-commit.com/). 
From e1db74efe8f26787ba2a2f1df5bb60aec80ee81c Mon Sep 17 00:00:00 2001
From: Clou-dia <98462345+Clou-dia@users.noreply.github.com>
Date: Mon, 15 Jul 2024 10:50:35 +0200
Subject: [PATCH 11/11] Update README.en.md

changed "automatic checks and test" as per the italian version
---
 README.en.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/README.en.md b/README.en.md
index 1181285..1fe0b98 100644
--- a/README.en.md
+++ b/README.en.md
@@ -35,6 +35,8 @@ to the
 This section describes the automatic check and test procedures
 used to ensure the quality and integrity of the repository content.
 
+The controls may be disabled if they are not applicable to your specific use case. For example, if you have already implemented a solution for stable URIs, the check on the equality between the names of files and related folders and the check between the names of files and related resources in URIs can be commented out.
+
 ### Automatic Checks (Pre-commit)
 
 This repository uses [pre-commit](https://pre-commit.com/)
@@ -50,8 +52,6 @@ To enable pre-commit checks in another repository,
 copy the [`.pre-commit-config.yaml`](.pre-commit-config.yaml) file
 and the [`.github/workflows/validate.yaml`](.github/workflows/validate.yaml) file.
 
-Note: It is possible to comment on checks deemed unnecessary or inappropriate. For example, when using a solution with stable URIs, the validation check of the filename against the URIs (validate-filename-match-uri) may not be essential.
-
 ### URL Tests
 
 The `test_urls.py` script in the `tests` directory verifies
@@ -78,4 +78,4 @@ docker-compose -f docker-compose-test.yml up
 
 Note: To transfer this environment to another repository,
 include the Dockerfiles in the `tests` directory
-(such as `Dockerfile.precommit` and `Dockerfile.pytest`).
\ No newline at end of file
+(such as `Dockerfile.precommit` and `Dockerfile.pytest`).
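
The README text in PATCH 11/11 says that checks can be commented out when they do not apply, for example when stable URIs are already handled elsewhere. The fragment below is a minimal sketch of what that could look like in `.pre-commit-config.yaml`, not the repository's actual file. It reuses the repository URL, revision, and hook ids that appear in PATCH 05/11. The indentation and the active top-level `repos:` key are assumptions, since neither can be confirmed from this series.

```yaml
# Sketch only: indentation and the top-level `repos:` key are assumptions.
repos:
  - repo: https://github.com/teamdigitale/dati-semantic-tools
    rev: c2074cd9c90dc1751f5535459afc3da6d21ab60d
    hooks:
      - id: validate-repo-structure
        files: ^assets\/.*
      # Disabled here on the assumption that stable URIs are managed
      # by a separate solution, as the README note describes:
      # - id: validate-filename-match-uri
      #   files: ^assets\/.*\.ttl
      # - id: validate-filename-match-directory
      #   files: ^assets\/.*
      - id: validate-turtle
        files: ^assets/.*\.ttl$
```

With the two entries commented out, running `pre-commit run --all-files` (the command quoted in the PATCH 05/11 comments) should execute only the hooks that remain active.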