[doc] fix broken links and mddocs rendering issues
Change-Id: I08dfa11d0698d5a666e121ce49e88d19647397cb
frankfliu committed Jul 15, 2020
1 parent a35702d commit f145b2d
Showing 37 changed files with 221 additions and 84 deletions.
1 change: 1 addition & 0 deletions 3rdparty/aws-ai/README.md
@@ -11,6 +11,7 @@ this module in your project, [ModelZoo](https://javadoc.io/doc/ai.djl/api/latest
can load models from your s3 bucket.

The following pseudocode demonstrates how to load a model from S3:

```java
Criteria<Image, Classifications> criteria =
Criteria.builder()
2 changes: 2 additions & 0 deletions 3rdparty/hadoop/README.md
@@ -9,6 +9,7 @@ HDFS is widely used in Spark applications. We introduce HDFS integration for DJL
With this module, you can load models directly from an HDFS URL.

The following pseudocode demonstrates how to load a model from an HDFS URL:

```java
Criteria<Image, Classifications> criteria =
Criteria.builder()
@@ -40,6 +41,7 @@ You can also build the latest javadocs locally using the following command:
```sh
./gradlew javadoc
```

The javadocs output is built in the build/doc/javadoc folder.


2 changes: 1 addition & 1 deletion CONTRIBUTING.md
@@ -31,7 +31,7 @@ To send us a pull request, please:

1. Fork the repository.
2. Modify the source; please focus on the specific change you are contributing. If you also reformat all the code, it will be hard for us to focus on your change.
3. Look at the [contributor documentation](https://github.com/awslabs/djl/tree/master/docs/development), especially the docs in bold, for help setting up your development environment and information about various conventions.
3. Look at the [contributor documentation](https://github.com/awslabs/djl/tree/master/docs/development/README.md), especially the docs in bold, for help setting up your development environment and information about various conventions.
4. Ensure local tests pass.
5. Commit to your fork using clear commit messages.
6. Send us a pull request, answering any default questions in the pull request interface.
1 change: 1 addition & 0 deletions README.md
@@ -100,6 +100,7 @@ Once you have checked out the code locally, you can build it as follows using Gr
```

To increase build speed, you can use the following command to skip unit tests:

```sh
./gradlew build -x test
```
1 change: 1 addition & 0 deletions android/README.md
@@ -12,6 +12,7 @@ You need to have Android SDK and Android NDK installed on your machine.
The minimum API level for DJL Android is 26.

In Gradle, you can include the snapshot repository and add the four modules to your dependencies:

```
dependencies {
implementation "ai.djl:api:0.6.0"
1 change: 1 addition & 0 deletions docs/development/add_model_to_model-zoo.md
@@ -54,6 +54,7 @@ The official DJL ML repository is located on an S3 bucket managed by the AWS DJL
For non-team members, work with a team member in your pull request to add the necessary files.

For AWS team members, run the following command to upload your model to the S3 bucket:

```shell
$ ./gradlew syncS3
```
1 change: 1 addition & 0 deletions docs/development/configure_logging.md
@@ -38,6 +38,7 @@ For Gradle:
```

Then you can use system properties to configure slf4j-simple log level:

```
-Dorg.slf4j.simpleLogger.defaultLogLevel=debug
```
12 changes: 11 additions & 1 deletion docs/development/development_guideline.md
@@ -18,6 +18,7 @@ When writing code for DJL, we usually try to follow standard Java coding convent
Alongside these conventions, we have a number of checks that are run including PMD, SpotBugs, and Checkstyle. These can all be verified by running the gradle `build` target. Instructions for fixing any problems will be given by the relevant tool.

We also follow the [AOSP Java Code Style](https://source.android.com/setup/contribute/code-style). See [here](https://github.com/google/google-java-format) for plugins that can help set up your IDE to use this style. The formatting is checked strictly; failing the formatting check will look like:

```
> Task :api:verifyJava FAILED
@@ -31,6 +32,7 @@ Execution failed for task ':api:verifyJava'.
> File not formatted: /Volumes/Unix/projects/Joule/api/src/main/java/ai/djl/nn/convolutional/Conv2d.java
See https://github.com/awslabs/djl/blob/master/docs/development/development_guideline.md#coding-conventions for formatting instructions
```

If you do fail the format check, the easiest way to resolve it is to run the gradle `formatJava` target to reformat your code. It may be helpful to just run the formatter before you build the project rather than waiting for the formatting verification to fail.

## Commit message conventions
@@ -71,6 +73,7 @@ For larger topics which do not have a corresponding javadoc section, they should
## Build

This project uses a gradle wrapper, so you don't have to install gradle on your machine. You can just call the gradle wrapper using the following command:

```
./gradlew
```
@@ -85,6 +88,7 @@ There are several gradle build targets you can use. The following are the most c
You can also run this from a subfolder to build for only the module within that folder.

Run the following command to list all available tasks:

```sh
./gradlew tasks --all
```
@@ -95,21 +99,27 @@ Sometimes you may need to run individual tests or examples.
If you are developing with an IDE, you can run a test by selecting the test and clicking the "Run" button.

From the command line, you can run the following command to run a test:

```
./gradlew :<module>:run -Dmain=<class_name> --args ""
```

For example, if you would like to run the complete integration test, you can use the following command:

```
./gradlew :integration:run -Dmain=ai.djl.integration.IntegrationTest
```

To run an individual integration test from the command line, use the following:

```
./gradlew :integration:run --args="-c <class_name> -m <method_name>"
```

## Logging

To get a better understanding of your problems when developing, you can enable logging by adding the following parameter to your test command:

```
-Dai.djl.logging.level=debug
```
@@ -149,7 +159,7 @@ You can create your own NDArray renderer as follows:
Please make sure to:

- Check the "On-demand" option, which causes IntelliJ to only render the NDArray when you click on the variable.
- Change the "Use following expression" field to something like [toDebugString(100, 10, 10, 20)](https://javadoc.io/static/ai.djl/api/0.6.0/ai/djl/ndarray/NDArray.html#toDebugString-ai.djl.ndarray.NDArray-int-int-int-int-)
- Change the "Use following expression" field to something like [toDebugString(100, 10, 10, 20)](https://javadoc.io/static/ai.djl/api/0.6.0/ai/djl/ndarray/NDArray.html#toDebugString-int-int-int-int-)
if you want to adjust the range of NDArray's debug output.

## Common Problems
16 changes: 16 additions & 0 deletions docs/development/how_to_use_dataset.md
@@ -22,6 +22,7 @@ the RandomAccessDataset.

The following code illustrates an implementation of ArrayDataset.
The ArrayDataset is recommended only if your dataset is small enough to fit in memory.

```
// given you have data1, data2 and label1, label2, label3
ArrayDataset dataset = new ArrayDataset.Builder()
@@ -31,6 +32,7 @@ ArrayDataset dataset = new ArrayDataset.Builder()
.build();
```

When you get the `Batch` from `trainer.iterateDataset(dataset)`,
you can use `batch.getData()` to get an NDList of size 2. You can then use `NDList.get(0)` to get your first data and `NDList.get(1)` to get your second data.
Similarly, you can use `batch.getLabels()` to get an NDList of size 3.
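As an aside, the shape of that access pattern can be sketched with plain Java collections — no DJL required. `SimpleBatch`, and the use of `float[]` as a stand-in for NDArray, are illustrative assumptions, not DJL API:

```java
import java.util.List;

public class SimpleBatch {
    // A stand-in for DJL's NDList: an ordered list of arrays.
    private final List<float[]> data;
    private final List<float[]> labels;

    public SimpleBatch(List<float[]> data, List<float[]> labels) {
        this.data = data;
        this.labels = labels;
    }

    public List<float[]> getData() { return data; }
    public List<float[]> getLabels() { return labels; }

    public static void main(String[] args) {
        // Two data arrays and three label arrays, as in the example above.
        SimpleBatch batch = new SimpleBatch(
                List.of(new float[]{1f, 2f}, new float[]{3f, 4f}),
                List.of(new float[]{0f}, new float[]{1f}, new float[]{2f}));
        System.out.println(batch.getData().size());    // 2
        System.out.println(batch.getLabels().size());  // 3
    }
}
```

Indexing into the data and label lists mirrors `NDList.get(0)` and `NDList.get(1)` above.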
@@ -43,6 +45,7 @@ The ImageFolder dataset is recommended only if you want to iterate through your

### Step 1: Prepare Images
Arrange your image folder structure as follows:

```
dataset_root/shoes/Aerobic Shoes1.png
dataset_root/shoes/Aerobic Shoes2.png
@@ -61,11 +64,13 @@ dataset_root/pumps/Red Pumps
dataset_root/pumps/Pink Pumps
...
```

The dataset will take the folder names, e.g. `boots`, `pumps`, `shoes`, in sorted order as your labels.
**Note:** Nested folder structures are not currently supported.
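The sorted-folder-name labeling rule can be sketched with the standard library alone. `FolderLabels` is a hypothetical helper for illustration, not part of the ImageFolder implementation:

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class FolderLabels {
    // Derive class labels from the immediate subfolder names, in sorted order,
    // mirroring how an image-folder dataset assigns label indices.
    public static List<String> labelsOf(File root) {
        File[] dirs = root.listFiles(File::isDirectory);
        if (dirs == null) {
            return List.of(); // not a directory, or I/O error
        }
        return Arrays.stream(dirs)
                .map(File::getName)
                .sorted()
                .collect(Collectors.toList());
    }
}
```

With the layout above, the labels come out as `boots`, `pumps`, `shoes`; the index of each name is the numeric label of that class.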

### Step 2: Use the Dataset
Add the following code snippet to your project to use the ImageFolder dataset.

```
ImageFolder dataset =
new ImageFolder.Builder()
@@ -104,23 +109,27 @@ The CSV file has the following format.
| sample.url.bad.com | 1 |

We'll also use the 3rd party [Apache Commons](https://commons.apache.org/) library to read the CSV file. To use the library, include the following dependency:

```
api group: 'org.apache.commons', name: 'commons-csv', version: '1.7'
```

### Step 2: Implementation
In order to extend the dataset, the following dependencies are required:

```
api "ai.djl:api:0.6.0"
api "ai.djl:basicdataset:0.6.0"
```

There are three parts we need to implement for CSVDataset.

1. Constructor and Builder
First, we need a private field that holds the CSVRecord list from the CSV file.
We create a constructor and pass the CSVRecord list from the builder to the class field.
For the builder, we have all we need in `BaseBuilder`, so we only need to include the two minimal methods as shown.
In the *build()* method, we take advantage of CSVParser to read each record of the CSV file and put them in the CSVRecord list.

```
public class CSVDataset extends RandomAccessDataset {
@@ -158,10 +167,12 @@ public class CSVDataset extends RandomAccessDataset {
}
```

2. Getter
The getter returns a Record object which contains encoded inputs and labels.
Here, we use simple encoding to transform the URL String into an int array and create an NDArray on top of it.
We use an NDList here because you might have multiple inputs and labels in different tasks.

```
@Override
public Record get(NDManager manager, long index) {
@@ -172,17 +183,21 @@
return new Record(new NDList(datum), new NDList(label));
}
```

3. Size
Returns the size of the dataset.
Here, we can directly use the size of the `List<CSVRecord>`.

```
@Override
public long size() {
return csvRecords.size();
}
```
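The "simple encoding" used in the getter above is not spelled out in this excerpt. One plausible stdlib-only sketch — the fixed length and zero padding are assumptions, not the documented scheme — looks like this:

```java
public class UrlEncoder {
    // Encode a URL string as a fixed-length int array of character codes,
    // zero-padded (or truncated) to `length`, so every sample has the same shape.
    public static int[] encode(String url, int length) {
        int[] encoded = new int[length]; // int[] defaults to 0, which serves as padding
        for (int i = 0; i < Math.min(url.length(), length); i++) {
            encoded[i] = url.charAt(i);
        }
        return encoded;
    }
}
```

An array like this is what `manager.create(...)` in the getter would wrap into an NDArray; padding to one length is what lets samples be stacked into a batch.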

Done!
Now, you can use the CSVDataset with the following code snippet:

```
CSVDataset dataset = new CSVDataset.Builder().setSampling(batchSize, false).build();
for (Batch batch : dataset.getData(model.getNDManager())) {
@@ -194,5 +209,6 @@
batch.close();
}
```

Full example code can be found in [CSVDataset.java](https://github.com/awslabs/djl/blob/master/docs/development/CSVDataset.java).

16 changes: 11 additions & 5 deletions docs/development/memory_management.md
@@ -4,12 +4,13 @@ Memory is one of the biggest challenges in deep learning. There are sever
First, the GC (Garbage Collector) has no control over native memory.
Second, closing every AutoCloseable manually makes the code verbose and impractical.
Third, the JVM lacks support for releasing a group of native resources.
As a result, we create the [NDManager](https://github.com/awslabs/djl/blob/master/api/src/main/java/ai/djl/ndarray/NDManager.java) to help us release the native memory.
As a result, we create the [NDManager](https://javadoc.io/doc/ai.djl/api/latest/ai/djl/ndarray/NDManager.html)
to help us release the native memory.

We designed the NDManager as a tree structure. It provides fine-grained control of native resources and manages resource scopes more effectively.
The NDManager can form any kind of tree. However, using the Predictor/Trainer classes automatically creates a certain kind of tree.
The structure of the NDManager for the classic inference case is like ![structure of the NDManager](https://github.com/awslabs/djl/blob/master/docs/development/img/ndmanager_structure_for_inference.png).
The structure of the NDManager for the classic training case is like ![structure of the NDManager](https://github.com/awslabs/djl/blob/master/docs/development/img/ndmanager_structure_for_training.png).
The structure of the NDManager for the classic inference case is like ![structure of the NDManager](https://raw.githubusercontent.com/awslabs/djl/master/docs/development/img/ndmanager_structure_for_inference.png).
The structure of the NDManager for the classic training case is like ![structure of the NDManager](https://github.com/awslabs/djl/blob/master/docs/development/img/ndmanager_structure_for_training.png?raw=true).
The topmost is the system NDManager. The model, one level below, contains the weights and biases of the neural network.
The bottommost NDManager takes care of the intermediate NDArrays we would like to close as soon as the program exits the scope of the functions that use them.
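The tree-scoped release idea can be illustrated with plain Java. This `ScopedManager` is a conceptual sketch of the pattern, not DJL's NDManager:

```java
import java.util.ArrayList;
import java.util.List;

public class ScopedManager implements AutoCloseable {
    private final List<AutoCloseable> resources = new ArrayList<>();

    // Child managers attach to the parent, forming a tree.
    public ScopedManager newSubManager() {
        ScopedManager child = new ScopedManager();
        resources.add(child);
        return child;
    }

    public void attach(AutoCloseable resource) {
        resources.add(resource);
    }

    // Closing a manager releases everything registered under it,
    // including whole subtrees of child managers.
    @Override
    public void close() {
        for (AutoCloseable r : resources) {
            try {
                r.close();
            } catch (Exception ignored) {
                // best-effort release
            }
        }
        resources.clear();
    }
}
```

Closing a scope near the bottom of the tree frees intermediate results early, while the model-level scope keeps parameters alive for the whole session — the same trade-off the paragraph above describes.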

@@ -29,10 +30,15 @@ Here is the reference code [processInput](https://github.com/awslabs/djl/blob/72
Note that if you don't specify an NDManager in an NDArray operation, it uses the NDManager from the input NDArray.

## Training
The intermediate NDArrays involving in training case are usually 1. a batch of the dataset 2. custom operation you write.
The intermediate NDArrays involved in the training case are usually

1. a batch of the dataset
2. custom operation you write

In general, all the parameters in the model should be associated with Model level NDManager.
All of the input and output NDArrays should be associated with one NDManager that is one level below the model NDManager.
Please check if you call [batch.close()](https://github.com/awslabs/djl/blob/master/examples/src/main/java/ai/djl/examples/training/util/TrainingUtils.java#L35-L39) to release one batch of the dataset at the end of each batch.
Please check if you call [batch.close()](https://github.com/awslabs/djl/blob/468ce0d686758c46b3a62f6c18a084e80846bd8d/api/src/main/java/ai/djl/training/EasyTrain.java#L41)
to release one batch of the dataset at the end of each batch.
If you still see memory grow as training proceeds, it is most likely that intermediate NDArrays are attached to the Model (Block) parameter level.
As a result, those NDArrays would not be closed until training is finished.
On the other hand, if you implement your own block, be careful not to attach its parameters to a batch-level NDManager.
48 changes: 28 additions & 20 deletions docs/development/release_process.md
@@ -14,7 +14,8 @@ Edit [README Release Notes section](../../README.md#release-notes) to add link t
Many of the documents still point to the previously released version. We should update them to the
version we are going to release. For example, if the current version is 0.5.0-SNAPSHOT but many documents
are still using 0.4.0 (X.X.X), update the build version with the following command:
```shell script

```shell
cd djl
# replace X.X.X with the previous released version number
./gradlew -PpreviousVersion=X.X.X iFV
@@ -29,28 +30,32 @@ If nothing changes between previous and current version, you don't need to do th
#### MXNet

Run the following command to trigger mxnet-native publishing job:
```shell script

```shell
curl -XPOST -u "USERNAME:PERSONAL_TOKEN" -H "Accept: application/vnd.github.everest-preview+json" -H "Content-Type: application/json" https://api.github.com/repos/awslabs/djl/dispatches --data '{"event_type": "mxnet-staging-pub"}'
```

#### PyTorch

Run the following command to trigger pytorch-native publishing job:
```shell script

```shell
curl -XPOST -u "USERNAME:PERSONAL_TOKEN" -H "Accept: application/vnd.github.everest-preview+json" -H "Content-Type: application/json" https://api.github.com/repos/awslabs/djl/dispatches --data '{"event_type": "pytorch-staging-pub"}'
```

### Step 1.3: Publish DJL library to sonatype staging server

Run the following command to trigger DJL publishing job:
```shell script

```shell
curl -XPOST -u "USERNAME:PERSONAL_TOKEN" -H "Accept: application/vnd.github.everest-preview+json" -H "Content-Type: application/json" https://api.github.com/repos/awslabs/djl/dispatches --data '{"event_type": "release-build"}'
```

### Step 1.4: Remove -SNAPSHOT in examples and jupyter notebooks

Run the following command with correct version value:
```shell script

```shell
cd djl
git clean -xdff
./gradlew release
@@ -64,15 +69,16 @@ git push origin vX.X.X
Log in to https://oss.sonatype.org/ and find the staging repo name.

Run the following command to point maven repository to staging server:
```shell script

```shell
cd djl
git checkout vX.X.X
./gradlew -PstagingRepo=aidjl-XXXX staging
```

### Step 2.1: Validate that the examples project is working fine

```shell script
```shell
cd examples
# By default it uses mxnet-engine
# Please switch to pytorch, tensorflow engine to make sure all the engines pass the test
@@ -84,7 +90,8 @@ mvn exec:java -Dexec.mainClass="ai.djl.examples.inference.ObjectDetection"
### Step 2.2: Validate jupyter notebooks

Make sure the Jupyter notebooks run properly and all javadoc links are accessible.
```shell script

```shell
cd jupyter
jupyter notebook
```
@@ -122,7 +129,7 @@ will be published sonatype staging server.

### Step 6.1: Upgrade version for next release

```shell script
```shell
cd djl
./gradlew -PtargetVersion=X.X.X iBV
```
@@ -132,7 +139,8 @@ Create a PR to get it reviewed and merged into GitHub.
### Step 6.2: Publish new snapshot to sonatype

Manually trigger a nightly build with the following command:
```shell script

```shell
curl -XPOST -u "USERNAME:PERSONAL_TOKEN" -H "Accept: application/vnd.github.everest-preview+json" -H "Content-Type: application/json" https://api.github.com/repos/awslabs/djl/dispatches --data '{"event_type": "nightly-build"}'
```

@@ -141,16 +149,16 @@
After verifying the packages are available in Maven Central, click the following links to trigger javadoc.io to fetch the latest DJL libraries.
Verify that the following links work, and update the javadoc links on the website accordingly.

* [api](https://javadoc.io/doc/ai.djl/api/0.6.0/index.html)
* [basicdataset](https://javadoc.io/doc/ai.djl/basicdataset/0.6.0/index.html)
* [model-zoo](https://javadoc.io/doc/ai.djl/model-zoo/0.6.0/index.html)
* [mxnet-model-zoo](https://javadoc.io/doc/ai.djl.mxnet/mxnet-model-zoo/0.6.0/index.html)
* [mxnet-engine](https://javadoc.io/doc/ai.djl.mxnet/mxnet-engine/0.6.0/index.html)
* [pytorch-model-zoo](https://javadoc.io/doc/ai.djl.pytorch/pytorch-model-zoo/0.6.0/index.html)
* [pytorch-engine](https://javadoc.io/doc/ai.djl.pytorch/pytorch-engine/0.6.0/index.html)
* [tensorflow-model-zoo](https://javadoc.io/doc/ai.djl.tensorflow/tensorflow-model-zoo/0.6.0/index.html)
* [tensorflow-engine](https://javadoc.io/doc/ai.djl.tensorflow/tensorflow-engine/0.6.0/index.html)
* [fasttext-engine](https://javadoc.io/doc/ai.djl.fasttext/fasttext-engine/0.6.0/index.html)
* [api](https://javadoc.io/doc/ai.djl/api/latest/index.html)
* [basicdataset](https://javadoc.io/doc/ai.djl/basicdataset/latest/index.html)
* [model-zoo](https://javadoc.io/doc/ai.djl/model-zoo/latest/index.html)
* [mxnet-model-zoo](https://javadoc.io/doc/ai.djl.mxnet/mxnet-model-zoo/latest/index.html)
* [mxnet-engine](https://javadoc.io/doc/ai.djl.mxnet/mxnet-engine/latest/index.html)
* [pytorch-model-zoo](https://javadoc.io/doc/ai.djl.pytorch/pytorch-model-zoo/latest/index.html)
* [pytorch-engine](https://javadoc.io/doc/ai.djl.pytorch/pytorch-engine/latest/index.html)
* [tensorflow-model-zoo](https://javadoc.io/doc/ai.djl.tensorflow/tensorflow-model-zoo/latest/index.html)
* [tensorflow-engine](https://javadoc.io/doc/ai.djl.tensorflow/tensorflow-engine/latest/index.html)
* [fasttext-engine](https://javadoc.io/doc/ai.djl.fasttext/fasttext-engine/latest/index.html)

### Step 6.4: Check broken links
