pre-commit run
kamo-naoyuki committed Mar 15, 2023
1 parent d203093 commit 9d40b6b
Showing 1,213 changed files with 2,726 additions and 3,168 deletions.
2 changes: 1 addition & 1 deletion .github/ISSUE_TEMPLATE/bug_report.md
@@ -17,7 +17,7 @@ A clear and concise description of what the bug is.
- Git hash [e.g. b88e89fc7246fed4c2842b55baba884fe1b4ecc2]
- Commit date [e.g. Tue Sep 1 09:32:54 2020 -0400]
- pytorch version [e.g. pytorch 1.4.0]

You can obtain them with the following command:
```
cd <espnet-root>/tools
```
2 changes: 1 addition & 1 deletion .github/ISSUE_TEMPLATE/installation-issue-template.md
@@ -24,7 +24,7 @@ cd <espnet-root>/tools
- Git hash [e.g. b88e89fc7246fed4c2842b55baba884fe1b4ecc2]
- Commit date [e.g. Tue Sep 1 09:32:54 2020 -0400]
- pytorch version [e.g. pytorch 1.4.0]

You can obtain them with the following command:
```
cd <espnet-root>/tools
```
6 changes: 3 additions & 3 deletions .github/workflows/docker.yml
@@ -23,19 +23,19 @@ jobs:
uses: docker/setup-buildx-action@v1

- name: Login to DockerHub
uses: docker/login-action@v1
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}

- name: Build and push CPU container
run: |
cd docker
docker build --build-arg FROM_TAG=runtime-latest \
-f prebuilt/devel.dockerfile \
--target devel \
-t espnet/espnet:cpu-latest .
docker push espnet/espnet:cpu-latest
- name: Build and push GPU container
run: |
6 changes: 5 additions & 1 deletion .pre-commit-config.yaml
@@ -5,6 +5,10 @@ repos:
rev: v3.2.0
hooks:
- id: trailing-whitespace
exclude: ^(egs2/TEMPLATE/asr1/steps|egs2/TEMPLATE/asr1/steps|egs2/TEMPLATE/tts1/sid)
- id: end-of-file-fixer
# - id: check-yaml
exclude: ^(egs2/TEMPLATE/asr1/steps|egs2/TEMPLATE/asr1/steps|egs2/TEMPLATE/tts1/sid)
- id: check-yaml
exclude: ^(egs2/TEMPLATE/asr1/steps|egs2/TEMPLATE/asr1/steps|egs2/TEMPLATE/tts1/sid)
- id: check-added-large-files
exclude: ^(egs2/TEMPLATE/asr1/steps|egs2/TEMPLATE/asr1/steps|egs2/TEMPLATE/tts1/sid)
2 changes: 1 addition & 1 deletion doc/.gitignore
@@ -1,4 +1,4 @@
_gen/
_build/
build/
notebook/
1 change: 0 additions & 1 deletion doc/apis/utils_sh.rst
@@ -6,4 +6,3 @@ bash utility tools
ESPnet provides several command-line bash tools under ``utils/``

.. include:: ../_gen/utils_sh.rst

4 changes: 2 additions & 2 deletions doc/docker.md
@@ -53,7 +53,7 @@ $ cd docker
$ ./run.sh --docker-gpu -1 --docker-egs an4/asr1 --ngpu 0
```

The script will build a Docker image if you are using a `user` different from the `root` user. To use containers with `root` access
add the flag `--is-root` to the command line.


@@ -104,4 +104,4 @@ Pytorch 1.3.1, No warp-ctc:
Pytorch 1.0.1, warp-ctc:

- [`cuda10.0-cudnn7` (*docker/prebuilt/gpu/10.0/cudnn7/Dockerfile*)](https://github.com/espnet/espnet/tree/master/docker/prebuilt/devel/gpu/10.0/cudnn7/Dockerfile)
- [`cpu-u18` (*docker/prebuilt/devel/Dockerfile*)](https://github.com/espnet/espnet/tree/master/docker/prebuilt/devel/Dockerfile)
44 changes: 22 additions & 22 deletions doc/espnet2_task.md
@@ -1,12 +1,12 @@
# Task class and data input system for training
## Task class

In ESPnet1, we have too many duplicated Python modules.
One of the main purposes of ESPnet2 is to provide a common interface and
enable us to focus more on the unique parts of each task.

The `Task` class is a common system for building the training tools of each task
(ASR, TTS, LM, etc.), inspired by the Fairseq `Task` idea.
To build your own task, all you have to do is inherit the `AbsTask` class:

@@ -57,7 +57,7 @@ if __name__ == "__main__":
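As a rough, non-authoritative sketch of what such a class can look like (the method names are taken from the sections below; the import path and signatures are assumptions, and the real `AbsTask` defines more abstract methods than shown here):

```python
from espnet2.tasks.abs_task import AbsTask  # assumed import path


class NewTask(AbsTask):
    @classmethod
    def required_data_names(cls, train=True, inference=False):
        # Keys that must be present in every mini-batch for this task.
        ...

    @classmethod
    def optional_data_names(cls, train=True, inference=False):
        # Keys that may be present but are not mandatory.
        ...

    @classmethod
    def build_collate_fn(cls, args, train):
        # Return a callable that turns a list of (id, dict) samples into
        # (list_of_ids, dict_of_batched_tensors); see the collate_fn section below.
        ...

    @classmethod
    def build_model(cls, args):
        # Return a torch.nn.Module; it is later called as model(**batch).
        ...


if __name__ == "__main__":
    # Entry point, so the task can be launched as `python -m new_task ...`.
    NewTask.main()
```

The stub bodies only mark where the task-specific logic goes; the rest of this document describes what each of these methods is expected to return.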
ESPnet2 also provides a command line interface to describe the training corpus.
In contrast, unlike `fairseq` or training systems such as `pytorch-lightning`,
our `Task` class doesn't have an interface for building the dataset explicitly.
This is because we aim only at speech/text-related tasks,
so we don't need such a general system so far.

The following is an example of the command line arguments:
@@ -82,15 +82,15 @@ for batch in iterator:

Here, `model` is the same as the model built by `Task.build_model()`.

You can flexibly construct this mini-batch object
using `--*_data_path_and_name_and_type`.
`--*_data_path_and_name_and_type` can be repeated as needed, and
each `--*_data_path_and_name_and_type` corresponds to an element in the mini-batch.
Also, keep in mind that **there is no distinction between input and target data**.


The argument of `--train_data_path_and_name_and_type`
should be given as three values separated by commas,
like `<file-path>,<key-name>,<file-format>`.

- `key-name` specifies the key of the dict (see the example below)
@@ -106,8 +106,8 @@ python -m espnet2.bin.asr_train --help
```
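To make the mapping concrete, here is a hypothetical pair of such arguments and the kind of mini-batch dict they produce (the file names, key names, tensor shapes, and formats are chosen for illustration only):

```python
import torch

# Hypothetical training invocation:
#   --train_data_path_and_name_and_type=dump/train/wav.scp,speech,sound
#   --train_data_path_and_name_and_type=dump/train/text_int,text,text_int
# Each argument contributes one key to the mini-batch dict.
batch = {
    "speech": torch.randn(2, 16000),               # from wav.scp under key "speech"
    "text": torch.tensor([[5, 9, 2], [7, 3, 1]]),  # from text_int under key "text"
}

# The model built by Task.build_model() is then called as:
# model(**batch)
```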

Almost all formats are referred to as `scp` files, following Kaldi-ASR.
An `scp` file is just a text file with two columns per line:
the first column is the sample id and the second is some value,
e.g. a file path, a transcription, or a sequence of numbers.
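For illustration, here is what such a file can look like together with a minimal reader (the file name and its contents are made up, and this reader is only a sketch, not ESPnet's own loader):

```python
# wav.scp (hypothetical contents):
#   utt_0001 /data/wav/utt_0001.wav
#   utt_0002 /data/wav/utt_0002.wav

def read_scp(path):
    """Read a Kaldi-style scp file into a {sample_id: value} dict."""
    table = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            sample_id, value = line.rstrip("\n").split(maxsplit=1)
            table[sample_id] = value
    return table
```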


@@ -139,11 +139,11 @@ e.g. file path, transcription, a sequence of numbers.
### `required_data_names()` and `optional_data_names()`
Though an arbitrary dictionary can be created by this system,
each task assumes that a specific key is given for a specific purpose.
e.g. the ASR task requires the `speech` and `text` keys, and
their values are used as input data and target data respectively.
See again the methods of the `Task` class:
`required_data_names()` and `optional_data_names()`.
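For instance, an ASR-like task could express this roughly as follows (only the `speech`/`text` key names come from the text above; the exact signatures are an assumption):

```python
class ASRLikeTask:  # sketch only; in espnet2 this would be a Task class inheriting AbsTask
    @classmethod
    def required_data_names(cls, train=True, inference=False):
        # "speech" is the input and "text" is the target during training;
        # at inference time only "speech" would be required.
        if not inference:
            return ("speech", "text")
        return ("speech",)

    @classmethod
    def optional_data_names(cls, train=True, inference=False):
        # No extra keys are accepted unless --allow_variable_data_keys true is given.
        return ()
```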
@@ -176,7 +176,7 @@ python -m new_task \
--train_data_path_and_name_and_type=filepath,unknown,sometype
```

The intention of this system is just an assertion check, so if you feel it is unnecessary,
you can turn off this checking with `--allow_variable_data_keys true`.

@@ -197,7 +197,7 @@
```python
class NewTask(AbsTask):
    ...
```

`collate_fn` is an argument of `torch.utils.data.DataLoader` and
it can modify the data received from the data loader, e.g.:

@@ -212,8 +212,8 @@
```python
for modified_data in data_loader:
    ...
```

The type of this argument is determined by the input `dataset` class, and
our dataset is always `espnet2.train.dataset.ESPnetDataset`,
whose return value is a tuple of a sample id and a dict of tensors:

@@ -230,16 +230,16 @@
```python
data = [
    ...
]
```

In espnet2, the return type of collate_fn is supposed to be a tuple of a list and a dict of tensors,
so the collate_fn for `Task` must transform the data into that type.

```python
for ids, batch in data_loader:
    model(**batch)
```
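As a rough illustration of that contract (a minimal sketch, not espnet2's bundled collate_fn), such a function could pad each key along its length axis and return the `(ids, batch)` pair:

```python
import torch
from torch.nn.utils.rnn import pad_sequence


def collate_fn(data):
    """Turn [(sample_id, {key: tensor}), ...] into (list_of_ids, {key: padded_tensor})."""
    ids = [sample_id for sample_id, _ in data]
    keys = data[0][1].keys()
    batch = {
        # Pad along the first (length) axis so samples can be stacked.
        key: pad_sequence([d[key] for _, d in data], batch_first=True)
        for key in keys
    }
    return ids, batch


# Two variable-length "speech" samples of shape (Length, Dim):
data = [
    ("utt_0001", {"speech": torch.randn(120, 80)}),
    ("utt_0002", {"speech": torch.randn(95, 80)}),
]
ids, batch = collate_fn(data)
print(ids, batch["speech"].shape)  # ['utt_0001', 'utt_0002'] torch.Size([2, 120, 80])
```

A real implementation would typically also return per-sample lengths so the model can ignore the padded region.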

We provide a common collate_fn, and this function can support many cases,
so you might not need to customize it.
This collate_fn is aware of variable-length sequence features for seq2seq tasks:

- The first axis of the sequence tensor from the dataset must be the length axis: e.g. (Length, Dim), (Length, Dim, Dim2), or (Length, ...)