pre-commit run
kamo-naoyuki committed Mar 15, 2023
1 parent d203093 commit 9d40b6b
Showing 1,213 changed files with 2,726 additions and 3,168 deletions.
2 changes: 1 addition & 1 deletion .github/ISSUE_TEMPLATE/bug_report.md
@@ -17,7 +17,7 @@ A clear and concise description of what the bug is.
- Git hash [e.g. b88e89fc7246fed4c2842b55baba884fe1b4ecc2]
- Commit date [e.g. Tue Sep 1 09:32:54 2020 -0400]
- pytorch version [e.g. pytorch 1.4.0]

You can obtain them with the following command:
```
cd <espnet-root>/tools
```
2 changes: 1 addition & 1 deletion .github/ISSUE_TEMPLATE/installation-issue-template.md
@@ -24,7 +24,7 @@ cd <espnet-root>/tools
- Git hash [e.g. b88e89fc7246fed4c2842b55baba884fe1b4ecc2]
- Commit date [e.g. Tue Sep 1 09:32:54 2020 -0400]
- pytorch version [e.g. pytorch 1.4.0]

You can obtain them with the following command:
```
cd <espnet-root>/tools
```
6 changes: 3 additions & 3 deletions .github/workflows/docker.yml
@@ -23,19 +23,19 @@ jobs:
uses: docker/setup-buildx-action@v1

- name: Login to DockerHub
uses: docker/login-action@v1
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}

- name: Build and push CPU container
run: |
cd docker
docker build --build-arg FROM_TAG=runtime-latest \
-f prebuilt/devel.dockerfile \
--target devel \
-t espnet/espnet:cpu-latest .
docker push espnet/espnet:cpu-latest
- name: Build and push GPU container
run: |
6 changes: 5 additions & 1 deletion .pre-commit-config.yaml
@@ -5,6 +5,10 @@ repos:
rev: v3.2.0
hooks:
- id: trailing-whitespace
exclude: ^(egs2/TEMPLATE/asr1/steps|egs2/TEMPLATE/asr1/steps|egs2/TEMPLATE/tts1/sid)
- id: end-of-file-fixer
# - id: check-yaml
exclude: ^(egs2/TEMPLATE/asr1/steps|egs2/TEMPLATE/asr1/steps|egs2/TEMPLATE/tts1/sid)
- id: check-yaml
exclude: ^(egs2/TEMPLATE/asr1/steps|egs2/TEMPLATE/asr1/steps|egs2/TEMPLATE/tts1/sid)
- id: check-added-large-files
exclude: ^(egs2/TEMPLATE/asr1/steps|egs2/TEMPLATE/asr1/steps|egs2/TEMPLATE/tts1/sid)
2 changes: 1 addition & 1 deletion doc/.gitignore
@@ -1,4 +1,4 @@
_gen/
_build/
build/
notebook/
1 change: 0 additions & 1 deletion doc/apis/utils_sh.rst
@@ -6,4 +6,3 @@ bash utility tools
ESPnet provides several command-line bash tools under ``utils/``

.. include:: ../_gen/utils_sh.rst

4 changes: 2 additions & 2 deletions doc/docker.md
@@ -53,7 +53,7 @@ $ cd docker
$ ./run.sh --docker-gpu -1 --docker-egs an4/asr1 --ngpu 0
```

The script will build a Docker image if you are using a `user` different from the `root` user. To use containers with `root` access
add the flag `--is-root` to the command line.


@@ -104,4 +104,4 @@ Pytorch 1.3.1, No warp-ctc:
Pytorch 1.0.1, warp-ctc:

- [`cuda10.0-cudnn7` (*docker/prebuilt/gpu/10.0/cudnn7/Dockerfile*)](https://github.com/espnet/espnet/tree/master/docker/prebuilt/devel/gpu/10.0/cudnn7/Dockerfile)
- [`cpu-u18` (*docker/prebuilt/devel/Dockerfile*)](https://github.com/espnet/espnet/tree/master/docker/prebuilt/devel/Dockerfile)
44 changes: 22 additions & 22 deletions doc/espnet2_task.md
@@ -1,12 +1,12 @@
# Task class and data input system for training
## Task class

In ESPnet1, we have too many duplicated Python modules.
One of the main purposes of ESPnet2 is to provide a common interface and
enable us to focus more on the unique parts of each task.

The `Task` class is a common system for building the training tools of each task
(ASR, TTS, LM, etc.), inspired by the Fairseq `Task` idea.
To build your own task, all you have to do is inherit the `AbsTask` class:

@@ -57,7 +57,7 @@ if __name__ == "__main__":
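As a rough, non-authoritative sketch of what such a class can look like (the method names are taken from the sections below; the import path and signatures are assumptions, and the real `AbsTask` defines more abstract methods than shown here):

```python
from espnet2.tasks.abs_task import AbsTask  # assumed import path


class NewTask(AbsTask):
    @classmethod
    def required_data_names(cls, train=True, inference=False):
        # Keys that must be present in every mini-batch for this task.
        ...

    @classmethod
    def optional_data_names(cls, train=True, inference=False):
        # Keys that may be present but are not mandatory.
        ...

    @classmethod
    def build_collate_fn(cls, args, train):
        # Return a callable that turns a list of (id, dict) samples into
        # (list_of_ids, dict_of_batched_tensors); see the collate_fn section below.
        ...

    @classmethod
    def build_model(cls, args):
        # Return a torch.nn.Module; it is later called as model(**batch).
        ...


if __name__ == "__main__":
    # Entry point, so the task can be launched as `python -m new_task ...`.
    NewTask.main()
```

The stub bodies only mark where the task-specific logic goes; the rest of this document describes what each of these methods is expected to return.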
ESPnet2 also provides a command line interface to describe the training corpus.
In contrast, unlike `fairseq` or training systems such as `pytorch-lightning`,
our `Task` class doesn't have an interface for building the dataset explicitly.
This is because we aim only at speech/text-related tasks,
so we don't need such a general system so far.

The following is an example of the command line arguments:
@@ -82,15 +82,15 @@ for batch in iterator:

Here, `model` is the same as the model built by `Task.build_model()`.

You can flexibly construct this mini-batch object
using `--*_data_path_and_name_and_type`.
`--*_data_path_and_name_and_type` can be repeated as needed, and
each `--*_data_path_and_name_and_type` corresponds to an element in the mini-batch.
Also, keep in mind that **there is no distinction between input and target data**.


The argument of `--train_data_path_and_name_and_type`
should be given as three values separated by commas,
like `<file-path>,<key-name>,<file-format>`.

- `key-name` specifies the key of the dict (see the example below)
@@ -106,8 +106,8 @@ python -m espnet2.bin.asr_train --help
```
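To make the mapping concrete, here is a hypothetical pair of such arguments and the kind of mini-batch dict they produce (the file names, key names, tensor shapes, and formats are chosen for illustration only):

```python
import torch

# Hypothetical training invocation:
#   --train_data_path_and_name_and_type=dump/train/wav.scp,speech,sound
#   --train_data_path_and_name_and_type=dump/train/text_int,text,text_int
# Each argument contributes one key to the mini-batch dict.
batch = {
    "speech": torch.randn(2, 16000),               # from wav.scp under key "speech"
    "text": torch.tensor([[5, 9, 2], [7, 3, 1]]),  # from text_int under key "text"
}

# The model built by Task.build_model() is then called as:
# model(**batch)
```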

Almost all formats are referred to as `scp` files, following Kaldi-ASR.
An `scp` file is just a text file with two columns per line:
the first column is the sample id and the second is some value,
e.g. a file path, a transcription, or a sequence of numbers.
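For illustration, here is what such a file can look like together with a minimal reader (the file name and its contents are made up, and this reader is only a sketch, not ESPnet's own loader):

```python
# wav.scp (hypothetical contents):
#   utt_0001 /data/wav/utt_0001.wav
#   utt_0002 /data/wav/utt_0002.wav

def read_scp(path):
    """Read a Kaldi-style scp file into a {sample_id: value} dict."""
    table = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            sample_id, value = line.rstrip("\n").split(maxsplit=1)
            table[sample_id] = value
    return table
```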


@@ -139,11 +139,11 @@ e.g. file path, transcription, a sequence of numbers.
### `required_data_names()` and `optional_data_names()`
Though an arbitrary dictionary can be created by this system,
each task assumes that a specific key is given for a specific purpose.
e.g. the ASR task requires the `speech` and `text` keys, and
their values are used as input data and target data respectively.
See again the methods of the `Task` class:
`required_data_names()` and `optional_data_names()`.
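For instance, an ASR-like task could express this roughly as follows (only the `speech`/`text` key names come from the text above; the exact signatures are an assumption):

```python
class ASRLikeTask:  # sketch only; in espnet2 this would be a Task class inheriting AbsTask
    @classmethod
    def required_data_names(cls, train=True, inference=False):
        # "speech" is the input and "text" is the target during training;
        # at inference time only "speech" would be required.
        if not inference:
            return ("speech", "text")
        return ("speech",)

    @classmethod
    def optional_data_names(cls, train=True, inference=False):
        # No extra keys are accepted unless --allow_variable_data_keys true is given.
        return ()
```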
@@ -176,7 +176,7 @@ python -m new_task \
--train_data_path_and_name_and_type=filepath,unknown,sometype
```

The intention of this system is just an assertion check, so if you feel it is unnecessary,
you can turn off this checking with `--allow_variable_data_keys true`.

@@ -197,7 +197,7 @@
```python
class NewTask(AbsTask):
    ...
```

`collate_fn` is an argument of `torch.utils.data.DataLoader` and
it can modify the data received from the data loader, e.g.:

@@ -212,8 +212,8 @@
```python
for modified_data in data_loader:
    ...
```

The type of this argument is determined by the input `dataset` class, and
our dataset is always `espnet2.train.dataset.ESPnetDataset`,
whose return value is a tuple of a sample id and a dict of tensors:

@@ -230,16 +230,16 @@
```python
data = [
    ...
]
```

In espnet2, the return type of collate_fn is supposed to be a tuple of a list and a dict of tensors,
so the collate_fn for `Task` must transform the data into that type.

```python
for ids, batch in data_loader:
    model(**batch)
```
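As a rough illustration of that contract (a minimal sketch, not espnet2's bundled collate_fn), such a function could pad each key along its length axis and return the `(ids, batch)` pair:

```python
import torch
from torch.nn.utils.rnn import pad_sequence


def collate_fn(data):
    """Turn [(sample_id, {key: tensor}), ...] into (list_of_ids, {key: padded_tensor})."""
    ids = [sample_id for sample_id, _ in data]
    keys = data[0][1].keys()
    batch = {
        # Pad along the first (length) axis so samples can be stacked.
        key: pad_sequence([d[key] for _, d in data], batch_first=True)
        for key in keys
    }
    return ids, batch


# Two variable-length "speech" samples of shape (Length, Dim):
data = [
    ("utt_0001", {"speech": torch.randn(120, 80)}),
    ("utt_0002", {"speech": torch.randn(95, 80)}),
]
ids, batch = collate_fn(data)
print(ids, batch["speech"].shape)  # ['utt_0001', 'utt_0002'] torch.Size([2, 120, 80])
```

A real implementation would typically also return per-sample lengths so the model can ignore the padded region.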

We provide a common collate_fn, and this function can support many cases,
so you might not need to customize it.
This collate_fn is aware of variable-length sequence features for seq2seq tasks:

- The first axis of the sequence tensor from the dataset must be the length axis: e.g. (Length, Dim), (Length, Dim, Dim2), or (Length, ...)