Spell checked

artic-network · Jul 23, 2024 · 089c629
1 parent 6f42236
commit 089c629
Showing 6 changed files with 65 additions and 55 deletions.
diff --git a/README.md b/README.md
@@ -25,8 +25,8 @@ pip install primal-page
 
 Each version of a primerscheme has three parts; `{schemename}/{ampliconsize}/{version}`, which when combined these form the schemes unique identifier.
 
-For a scheme to be added to the repo it requires three essental files. 
-- `primer.bed`: Contains the primer infomation.   
+For a scheme to be added to the repo it requires three essential files. 
+- `primer.bed`: Contains the primer information.   
 - `reference.fasta`: Contains the reference genomes.
 - `info.json`: Contains key metadata for the scheme.
 
@@ -81,11 +81,11 @@ This is the main metadata file for each primerscheme.
     - `withdrawn`: Removed due to major issue
     - `deprecated`: Newer scheme is recommended
     - `autogenerated`: Scheme has been autogenerated using species-agnostic pipelines
-    - `draft`: Scheme has been inspected in silico
+    - `draft`: Scheme has been inspected _in silico_
     - `tested`: Scheme has been tested in the laboratory
     - `validated`: Scheme has been validated and/or published
 - `citations`: How the scheme should be cited if used. DOI links are recommended, however, tweets/blogs are all allowed
-- `authors`: The person or organisation who generated the scheme. It is recommended that only corresponding/primary authors are included, with all other contributors recognised in the `citations` field
+- `authors`: The person or organization who generated the scheme. It is recommended that only corresponding/primary authors are included, with all other contributors recognized in the `citations` field
 - `algorithmversion`: The algorithm and the version used to generate the scheme
 - `species`: A list of organisms targeted by this scheme. NCBI TaxIds are recommend
 - `license`: The name of the license the primerscheme is offered under
@@ -94,13 +94,13 @@ This is the main metadata file for each primerscheme.
 - `articbedversion`: The version of the primer.bed (See below)
 - `description`: A free text description to describe the primerscheme
 - `derivedfrom`: To show if this scheme has been based on another primerscheme. 
-- `collections`: A collection of vocab to provide filtering/grouping of schemes.
+- `collections`: A collection of vocabulary to provide filtering/grouping of schemes.
     - `ARTIC`: Developed with the ARTIC network
     - `MODJADJI`: Developed with MODJADJI
     - `QUICK-LAB`: Developed with QUICK-LAB
     - `COMMUNITY`: Developed by the COMMUNITY
     - `WASTE-WATER`: Scheme capable of recovering genomes from high Ct samples (~30) samples, like wastewater. Typically 400bp schemes
-    - `CLINAL-ISOLATES`: Scheme capable of recovering genomes from medium Ct samples (~25) samples.  Typically ~1000bp schemes
+    - `CLINICAL-ISOLATES`: Scheme capable of recovering genomes from medium Ct samples (~25) samples.  Typically ~1000bp schemes
     - `WHOLE-GENOME`: Scheme that can theoretically recover a full genome
     - `PANEL`: Scheme that can recover sections of a target genome
     - `MULTI-TARGET`: Scheme that contains more than one target
@@ -221,6 +221,7 @@ $ primal-page [OPTIONS] COMMAND [ARGS]...
 * `aliases`: Manage aliases
 * `build-index`: Build an index.json file from all schemes...
 * `create`: Create a new scheme in the required format
+* `dev`: Development commands
 * `download`: Download schemes from the index.json
 * `modify`: Modify an existing scheme's metadata...
 * `remove`: Remove a scheme's version from the repo,...
@@ -241,32 +242,32 @@ $ primal-page aliases [OPTIONS] COMMAND [ARGS]...
 
 **Commands**:
 
-* `add`: Add an alias:schemeid to the alias file
-* `remove`: Remove an alias from the alias file
+* `add`: Add an alias:schemename to the alias file
+* `remove`: Remove an alias from the alias file.
 
 ### `primal-page aliases add`
 
-Add an alias:schemeid to the alias file
+Add an alias:schemename to the alias file
 
 **Usage**:
 
 ```console
-$ primal-page aliases add [OPTIONS] ALIASES_FILE ALIAS SCHEMEID
+$ primal-page aliases add [OPTIONS] ALIASES_FILE ALIAS SCHEMENAME
 ```
 
 **Arguments**:
 
 * `ALIASES_FILE`: The path to the alias file to write to  [required]
 * `ALIAS`: The alias to add  [required]
-* `SCHEMEID`: The schemeid to add the alias refers to. In the form of 'schemename/ampliconsize/schemeversion'  [required]
+* `SCHEMENAME`: The schemename the alias refers to  [required]
 
 **Options**:
 
 * `--help`: Show this message and exit.
 
 ### `primal-page aliases remove`
 
-Remove an alias from the alias file
+Remove an alias from the alias file. Does nothing if the alias does not exist
 
 **Usage**:
 
@@ -326,15 +327,15 @@ $ primal-page create [OPTIONS] SCHEMEPATH
 * `--schemestatus [withdrawn|deprecated|autogenerated|draft|tested|validated]`: Scheme status  [default: draft]
 * `--citations TEXT`: Any associated citations. Please use DOI
 * `--primerbed PATH`: Manually specify the primer bed file, default is *primer.bed
-* `--reference PATH`: Manually specify the referance.fasta file, default is *.fasta
+* `--reference PATH`: Manually specify the reference.fasta file, default is *.fasta
 * `--output PATH`: Where to output the scheme  [default: primerschemes]
 * `--configpath PATH`: Where the config.json file is located
 * `--algorithmversion TEXT`: The version of primalscheme or other
 * `--description TEXT`: A description of the scheme
 * `--derivedfrom TEXT`: Which scheme has this scheme been derived from
 * `--primerclass [primerschemes]`: The primer class  [default: primerschemes]
-* `--collection [ARTIC|MODJADJI|QUICK-LAB|COMMUNITY|WASTE-WATER|CLINAL-ISOLATES|WHOLE-GENOME|PANEL|MULTI-TARGET]`: The collection
-* `--link-protocal TEXT`: Optional link to protocol
+* `--collection [ARTIC|MODJADJI|QUICK-LAB|COMMUNITY|WASTE-WATER|CLINICAL-ISOLATES|WHOLE-GENOME|PANEL|MULTI-TARGET]`: The collection
+* `--link-protocol TEXT`: Optional link to protocol
 * `--link-validation TEXT`: Optional link to validation data
 * `--links-homepage TEXT`: Optional link to homepage
 * `--link-vendor TEXT`: Optional link to vendors
@@ -425,6 +426,7 @@ $ primal-page modify [OPTIONS] COMMAND [ARGS]...
 * `change-license`: Replaces the license in the info.json file
 * `change-primerclass`: Change the primerclass field in the info.json
 * `change-status`: Change the status field in the info.json
+* `regenerate`: Validates the info.json and regenerate the...
 * `remove-author`: Remove an author from the authors list in...
 * `remove-citation`: Remove an citation form the authors list...
 * `remove-collection`: Remove an Collection from the Collection...
@@ -477,13 +479,13 @@ Add a Collection to the Collection list in the info.json file
 **Usage**:
 
 ```console
-$ primal-page modify add-collection [OPTIONS] SCHEMEINFO COLLECTION:{ARTIC|MODJADJI|QUICK-LAB|COMMUNITY|WASTE-WATER|CLINAL-ISOLATES|WHOLE-GENOME|PANEL|MULTI-TARGET}
+$ primal-page modify add-collection [OPTIONS] SCHEMEINFO COLLECTION:{ARTIC|MODJADJI|QUICK-LAB|COMMUNITY|WASTE-WATER|CLINICAL-ISOLATES|WHOLE-GENOME|PANEL|MULTI-TARGET}
 ```
 
 **Arguments**:
 
 * `SCHEMEINFO`: The path to info.json  [required]
-* `COLLECTION:{ARTIC|MODJADJI|QUICK-LAB|COMMUNITY|WASTE-WATER|CLINAL-ISOLATES|WHOLE-GENOME|PANEL|MULTI-TARGET}`: The Collection to add  [required]
+* `COLLECTION:{ARTIC|MODJADJI|QUICK-LAB|COMMUNITY|WASTE-WATER|CLINICAL-ISOLATES|WHOLE-GENOME|PANEL|MULTI-TARGET}`: The Collection to add  [required]
 
 **Options**:
 
@@ -502,7 +504,7 @@ $ primal-page modify add-link [OPTIONS] SCHEMEINFO LINKFIELD LINK
 **Arguments**:
 
 * `SCHEMEINFO`: The path to info.json  [required]
-* `LINKFIELD`: The link field to add to. protocals, validation, homepage, vendors, misc  [required]
+* `LINKFIELD`: The link field to add to. protocols, validation, homepage, vendors, misc  [required]
 * `LINK`: The link to add.  [required]
 
 **Options**:
@@ -522,7 +524,7 @@ $ primal-page modify change-contactinfo [OPTIONS] SCHEMEINFO CONTACTINFO
 **Arguments**:
 
 * `SCHEMEINFO`: The path to info.json  [required]
-* `CONTACTINFO`: The contact infomation for this scheme. Use 'None' to remove the contact info  [required]
+* `CONTACTINFO`: The contact information for this scheme. Use 'None' to remove the contact info  [required]
 
 **Options**:
 
@@ -623,6 +625,24 @@ $ primal-page modify change-status [OPTIONS] SCHEMEINFO [SCHEMESTATUS]:[withdraw
 
 * `--help`: Show this message and exit.
 
+### `primal-page modify regenerate`
+
+Validates the info.json and regenerate the README.md
+
+**Usage**:
+
+```console
+$ primal-page modify regenerate [OPTIONS] SCHEMEINFO
+```
+
+**Arguments**:
+
+* `SCHEMEINFO`: The path to info.json  [required]
+
+**Options**:
+
+* `--help`: Show this message and exit.
+
 ### `primal-page modify remove-author`
 
 Remove an author from the authors list in the info.json file
@@ -668,13 +688,13 @@ Remove an Collection from the Collection list in the info.json file
 **Usage**:
 
 ```console
-$ primal-page modify remove-collection [OPTIONS] SCHEMEINFO COLLECTION:{ARTIC|MODJADJI|QUICK-LAB|COMMUNITY|WASTE-WATER|CLINAL-ISOLATES|WHOLE-GENOME|PANEL|MULTI-TARGET}
+$ primal-page modify remove-collection [OPTIONS] SCHEMEINFO COLLECTION:{ARTIC|MODJADJI|QUICK-LAB|COMMUNITY|WASTE-WATER|CLINICAL-ISOLATES|WHOLE-GENOME|PANEL|MULTI-TARGET}
 ```
 
 **Arguments**:
 
 * `SCHEMEINFO`: The path to info.json  [required]
-* `COLLECTION:{ARTIC|MODJADJI|QUICK-LAB|COMMUNITY|WASTE-WATER|CLINAL-ISOLATES|WHOLE-GENOME|PANEL|MULTI-TARGET}`: The Collection to remove  [required]
+* `COLLECTION:{ARTIC|MODJADJI|QUICK-LAB|COMMUNITY|WASTE-WATER|CLINICAL-ISOLATES|WHOLE-GENOME|PANEL|MULTI-TARGET}`: The Collection to remove  [required]
 
 **Options**:
 
@@ -693,7 +713,7 @@ $ primal-page modify remove-link [OPTIONS] SCHEMEINFO LINKFIELD LINK
 **Arguments**:
 
 * `SCHEMEINFO`: The path to info.json  [required]
-* `LINKFIELD`: The link field to remove from. protocals, validation, homepage, vendors, misc  [required]
+* `LINKFIELD`: The link field to remove from. protocols, validation, homepage, vendors, misc  [required]
 * `LINK`: The link to remove.  [required]
 
 **Options**:
@@ -713,7 +733,7 @@ $ primal-page modify reorder-authors [OPTIONS] SCHEMEINFO [AUTHOR_INDEX]
 **Arguments**:
 
 * `SCHEMEINFO`: The path to info.json  [required]
-* `[AUTHOR_INDEX]`: The indexes in the new order, seperated by spaces. e.g. 1 0 2. Any indexes not provided will be appended to the end
+* `[AUTHOR_INDEX]`: The indexes in the new order, separated by spaces. e.g. 1 0 2. Any indexes not provided will be appended to the end
 
 **Options**:
 

diff --git a/primal_page/build_index.py b/primal_page/build_index.py
@@ -1,20 +1,12 @@
-import hashlib
 import json
 import pathlib
 import sys
 
 from primal_page.logging import log
+from primal_page.modify import hash_file
 from primal_page.schemas import PrimerClass
 
 
-def hashfile(fname):
-    hash_md5 = hashlib.md5()
-    with open(fname, "rb") as f:
-        for chunk in iter(lambda: f.read(4096), b""):
-            hash_md5.update(chunk)
-    return hash_md5.hexdigest()
-
-
 def create_rawlink(repo, scheme_name, length, version, file, pclass) -> str:
     return f"https://raw.githubusercontent.com/{repo}/main/{pclass}/{scheme_name}/{length}/{version}/{file}"
 
@@ -49,14 +41,14 @@ def parse_version(
     version_dict["primer_bed_url"] = create_rawlink(
         repo_url, scheme_name, length, version.name, primerbed.name, pclass
     )
-    version_dict["primer_bed_md5"] = hashfile(primerbed)
+    version_dict["primer_bed_md5"] = hash_file(primerbed)
 
     # Add the reference.fasta file
     reference = version_path / "reference.fasta"
     version_dict["reference_fasta_url"] = create_rawlink(
         repo_url, scheme_name, length, version.name, reference.name, pclass
     )
-    version_dict["reference_fasta_md5"] = hashfile(reference)
+    version_dict["reference_fasta_md5"] = hash_file(reference)
 
     # Add the info.json file url
     version_dict["info_json_url"] = create_rawlink(
@@ -185,7 +177,7 @@ def create_index(
     Args:
         server_url (str): The URL of the server.
         repo_url (str): The URL of the repository.
-        parent_dir (str, optional): The parent directory path containing the primerscheme dir. index.json will be writem to parent_dir/index.json Defaults to ".".
+        parent_dir (str, optional): The parent directory path containing the primerscheme dir. index.json will be written to parent_dir/index.json Defaults to ".".
         git_commit (str, optional): The git commit hash. Defaults to None.
         force (bool, optional): Force the creation of the index.json file. Allowing the change of hashes
 

diff --git a/primal_page/dev.py b/primal_page/dev.py
@@ -11,7 +11,7 @@
     regenerate_v3_bedfile,
 )
 from primal_page.logging import log
-from primal_page.modify import generate_files, hashfile, trim_file_whitespace
+from primal_page.modify import generate_files, hash_file, trim_file_whitespace
 from primal_page.schemas import (
     INFO_SCHEMA,
     BedfileVersion,
@@ -53,7 +53,7 @@ def regenerate(
 
     # Hash the reference.fasta file
     # If the hash is different, rewrite the file
-    ref_hash = hashfile(scheme_path / "reference.fasta")
+    ref_hash = hash_file(scheme_path / "reference.fasta")
     ref_str = "".join(
         x.format("fasta") for x in SeqIO.parse(scheme_path / "reference.fasta", "fasta")
     )
@@ -70,8 +70,8 @@ def regenerate(
     info_json["articbedversion"] = articbedversion.value
 
     # Regenerate the files hashes
-    info_json["primer_bed_md5"] = hashfile(scheme_path / "primer.bed")
-    info_json["reference_fasta_md5"] = hashfile(scheme_path / "reference.fasta")
+    info_json["primer_bed_md5"] = hash_file(scheme_path / "primer.bed")
+    info_json["reference_fasta_md5"] = hash_file(scheme_path / "reference.fasta")
 
     info = Info(**info_json)
     info.infoschema = INFO_SCHEMA
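
For readers following the `hashfile` to `hash_file` rename, here is a minimal sketch of the helper and of the "rewrite only if the hash differs" check that the `regenerate` comments describe. It assumes the renamed function keeps the body of the copy removed from `build_index.py`; `needs_rewrite` is a hypothetical helper introduced only for illustration and is not part of `primal_page`.

```python
import hashlib
import pathlib
import tempfile


def hash_file(fname: pathlib.Path) -> str:
    """Return the MD5 hex digest of a file, reading it in 4096-byte chunks."""
    hash_md5 = hashlib.md5()
    with open(fname, "rb") as f:
        # iter() with a sentinel stops once read() returns an empty bytes object
        for chunk in iter(lambda: f.read(4096), b""):
            hash_md5.update(chunk)
    return hash_md5.hexdigest()


def needs_rewrite(path: pathlib.Path, new_content: str) -> bool:
    """Hypothetical helper: True if writing new_content would change the file's hash."""
    new_hash = hashlib.md5(new_content.encode()).hexdigest()
    return hash_file(path) != new_hash


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        ref = pathlib.Path(tmp) / "reference.fasta"
        ref.write_text(">seq1\nACGT\n")
        print(hash_file(ref))
        print(needs_rewrite(ref, ">seq1\nACGT\n"))  # same content, so no rewrite
        print(needs_rewrite(ref, ">seq1\nACGA\n"))  # changed content
```

Reading in fixed-size chunks keeps memory use flat for large reference files; MD5 is used here as a change-detection checksum, matching the `*_md5` field names in the diff, not as a security measure.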

diff --git a/primal_page/main.py b/primal_page/main.py
@@ -25,7 +25,7 @@
 from primal_page.modify import app as modify_app
 from primal_page.modify import (
     generate_files,
-    hashfile,
+    hash_file,
     trim_file_whitespace,
 )
 from primal_page.schemas import (
@@ -114,9 +114,7 @@ def find_ref(
     if cli_reference is None:  # No reference specified
         # Search for a single *.fasta
         reference_list: list[pathlib.Path] = [
-            path
-            for path in found_files
-            if path.name == ("reference.fasta") or path.name == ("referance.fasta")
+            path for path in found_files if path.name == ("reference.fasta")
         ]
         if len(reference_list) == 1:
             return reference_list[0]
@@ -236,7 +234,7 @@ def create(
     reference: Annotated[
         Optional[pathlib.Path],
         typer.Option(
-            help="Manually specify the referance.fasta file, default is *.fasta",
+            help="Manually specify the reference.fasta file, default is *.fasta",
             readable=True,
         ),
     ] = None,
@@ -266,7 +264,7 @@ def create(
     collection: Annotated[
         Optional[list[Collection]], typer.Option(help="The collection")
     ] = None,
-    link_protocal: Annotated[
+    link_protocol: Annotated[
         list[str], typer.Option(help="Optional link to protocol")
     ] = [],
     link_validation: Annotated[
@@ -387,7 +385,7 @@ def create(
 
     # Create the links set
     links = Links(
-        protocols=link_protocal,
+        protocols=link_protocol,
         validation=link_validation,
         homepage=links_homepage,
         vendors=link_vendor,
@@ -426,7 +424,7 @@ def create(
     repo_dir.mkdir(parents=True)
 
     # If this fails it will deleted the half completed scheme
-    # Need to check the repo doesnt already exist
+    # Need to check the repo doesn't already exist
     try:
         # Copy files and trim whitespace
         if fix:
@@ -440,8 +438,8 @@ def create(
             SeqIO.write(records, ref_file, "fasta")
 
         # Update the hashes in the info.json
-        info.primer_bed_md5 = hashfile(repo_dir / "primer.bed")
-        info.reference_fasta_md5 = hashfile(repo_dir / "reference.fasta")
+        info.primer_bed_md5 = hash_file(repo_dir / "primer.bed")
+        info.reference_fasta_md5 = hash_file(repo_dir / "reference.fasta")
 
         working_dir = repo_dir / "work"
         working_dir.mkdir()

diff --git a/primal_page/modify.py b/primal_page/modify.py
@@ -37,7 +37,7 @@ def trim_file_whitespace(in_path: pathlib.Path, out_path: pathlib.Path):
         outfile.writelines(inlines)
 
 
-def hashfile(fname: pathlib.Path) -> str:
+def hash_file(fname: pathlib.Path) -> str:
     hash_md5 = hashlib.md5()
     with open(fname, "rb") as f:
         for chunk in iter(lambda: f.read(4096), b""):
@@ -249,7 +249,7 @@ def reorder_authors(
     author_index: Annotated[
         Optional[str],
         typer.Argument(
-            help="The indexes in the new order, seperated by spaces. e.g. 1 0 2. Any indexes not provided will be appended to the end"
+            help="The indexes in the new order, separated by spaces. e.g. 1 0 2. Any indexes not provided will be appended to the end"
         ),
     ] = None,
 ):
@@ -265,7 +265,7 @@ def reorder_authors(
 
         # Get the new order
         new_order_str: str = typer.prompt(
-            "Please provide the indexes in the new order, seperated by spaces. e.g. 1 0 2. Any indexes not provided will be appended to the end",
+            "Please provide the indexes in the new order, separated by spaces. e.g. 1 0 2. Any indexes not provided will be appended to the end",
             type=str,
         )
         new_order = [int(x) for x in new_order_str.split()]
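
The help text above says that any indexes not provided are appended to the end. A minimal sketch of that behaviour follows; `reorder` is a hypothetical stand-alone function, and the duplicate and range checks are assumptions, since the diff does not show how `reorder_authors` validates its input.

```python
def reorder(items: list[str], new_order_str: str) -> list[str]:
    """Reorder items by space-separated indexes; unlisted indexes keep their
    relative order and are appended to the end."""
    new_order = [int(x) for x in new_order_str.split()]

    # Assumed validation: reject duplicates and out-of-range indexes
    if len(set(new_order)) != len(new_order):
        raise ValueError("Duplicate indexes provided")
    for i in new_order:
        if not 0 <= i < len(items):
            raise ValueError(f"Index {i} is out of range")

    # Indexes the user did not mention, in their original order
    remaining = [i for i in range(len(items)) if i not in new_order]
    return [items[i] for i in new_order + remaining]


if __name__ == "__main__":
    authors = ["alice", "bob", "carol"]
    print(reorder(authors, "1 0 2"))  # ['bob', 'alice', 'carol']
    print(reorder(authors, "2"))      # ['carol', 'alice', 'bob']
```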
@@ -520,7 +520,7 @@ def change_contactinfo(
     contactinfo: Annotated[
         Optional[str],
         typer.Argument(
-            help="The contact infomation for this scheme. Use 'None' to remove the contact info",
+            help="The contact information for this scheme. Use 'None' to remove the contact info",
         ),
     ],
 ):