
Commit

Merge pull request #482 from rsepassi/push
v1.4
lukaszkaiser authored Dec 21, 2017
2 parents 5c80095 + bac1321 commit 758991d
Showing 96 changed files with 8,664 additions and 2,637 deletions.
4 changes: 2 additions & 2 deletions .travis.yml
@@ -14,9 +14,9 @@ env:
- T2T_DATA_DIR=/tmp/t2t-data
- T2T_TRAIN_DIR=/tmp/t2t-train
script:
- pytest --ignore=tensor2tensor/utils/registry_test.py --ignore=tensor2tensor/utils/trainer_utils_test.py --ignore=tensor2tensor/problems_test.py --ignore=tensor2tensor/tpu/tpu_trainer_lib_test.py
- pytest --ignore=tensor2tensor/utils/registry_test.py --ignore=tensor2tensor/problems_test.py --ignore=tensor2tensor/tpu/tpu_trainer_lib_test.py
- pytest tensor2tensor/utils/registry_test.py
- pytest tensor2tensor/utils/trainer_utils_test.py
- pytest tensor2tensor/tpu/tpu_trainer_lib_test.py
- t2t-datagen 2>&1 | grep translate && echo passed
- python -c "from tensor2tensor.models import transformer; print(transformer.Transformer.__name__)"
- t2t-trainer --registry_help
62 changes: 54 additions & 8 deletions README.md
@@ -1,4 +1,4 @@
# T2T: Tensor2Tensor Transformers
# Tensor2Tensor

[![PyPI
version](https://badge.fury.io/py/tensor2tensor.svg)](https://badge.fury.io/py/tensor2tensor)
@@ -10,11 +10,18 @@ welcome](https://img.shields.io/badge/contributions-welcome-brightgreen.svg)](CO
[![License](https://img.shields.io/badge/License-Apache%202.0-brightgreen.svg)](https://opensource.org/licenses/Apache-2.0)
[![Travis](https://img.shields.io/travis/tensorflow/tensor2tensor.svg)](https://travis-ci.org/tensorflow/tensor2tensor)

[T2T](https://github.com/tensorflow/tensor2tensor) is a modular and extensible
library and binaries for supervised learning with TensorFlow and with support
for sequence tasks. It is actively used and maintained by researchers and
engineers within the Google Brain team. You can read more about Tensor2Tensor in
the recent [Google Research Blog post introducing
[Tensor2Tensor](https://github.com/tensorflow/tensor2tensor), or
[T2T](https://github.com/tensorflow/tensor2tensor) for short, is a library
of deep learning models and datasets. It has binaries to train the models and
to download and prepare the data for you. T2T is modular and extensible and can
be used in [notebooks](https://goo.gl/wkHexj) for prototyping your own models
or running existing ones on your data. It is actively used and maintained by
researchers and engineers within
the [Google Brain team](https://research.google.com/teams/brain/) and was used
to develop state-of-the-art models for translation (see
[Attention Is All You Need](https://arxiv.org/abs/1706.03762)), summarization,
image generation and other tasks. You can read
more about T2T in the [Google Research Blog post introducing
it](https://research.googleblog.com/2017/06/accelerating-deep-learning-research.html).

We're eager to collaborate with you on extending T2T, so please feel
@@ -29,8 +36,14 @@ You can chat with us and other users on
[Google Group](https://groups.google.com/forum/#!forum/tensor2tensor) to keep up
with T2T announcements.

Here is a one-command version that installs tensor2tensor, downloads the data,
### Quick Start

[This iPython notebook](https://goo.gl/wkHexj) explains T2T and runs in your
browser using a free VM from Google, no installation needed.

Alternatively, here is a one-command version that installs T2T, downloads data,
trains an English-German translation model, and evaluates it:

```
pip install tensor2tensor && t2t-trainer \
--generate_data \
@@ -53,11 +66,17 @@ t2t-decoder \
--decode_interactive
```

See the [Walkthrough](#walkthrough) below for more details on each step.
See the [Walkthrough](#walkthrough) below for more details on each step
and [Suggested Models](#suggested-models) for well-performing models
on common tasks.

### Contents

* [Walkthrough](#walkthrough)
* [Suggested Models](#suggested-models)
* [Translation](#translation)
* [Summarization](#summarization)
* [Image Classification](#image-classification)
* [Installation](#installation)
* [Features](#features)
* [T2T Overview](#t2t-overview)
@@ -132,6 +151,33 @@ cat $DECODE_FILE.$MODEL.$HPARAMS.beam$BEAM_SIZE.alpha$ALPHA.decodes

---

## Suggested Models

Here are some combinations of models, hparams and problems that we have found
to work well, so we suggest using them if you're interested in one of these problems.

### Translation

For translation, especially English-German and English-French, we suggest using
the Transformer model in its base or big configuration, e.g.
for `--problems=translate_ende_wmt32k` use `--model=transformer` and
`--hparams_set=transformer_base`. When trained on 8 GPUs for 300K steps
this should reach a BLEU score of about 28.
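
As a minimal sketch of how these flags fit together (assuming data has already
been generated with `t2t-datagen`, and using hypothetical `$DATA_DIR` and
`$TRAIN_DIR` paths that you should replace with your own):

```
# Hypothetical directories -- substitute your own.
DATA_DIR=$HOME/t2t_data
TRAIN_DIR=$HOME/t2t_train/ende

t2t-trainer \
  --data_dir=$DATA_DIR \
  --output_dir=$TRAIN_DIR \
  --problems=translate_ende_wmt32k \
  --model=transformer \
  --hparams_set=transformer_base
```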

### Summarization

For summarization we suggest using the Transformer model in prepend mode, i.e.
for `--problems=summarize_cnn_dailymail32k` use `--model=transformer` and
`--hparams_set=transformer_prepend`.
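
The same sketch with the summarization problem and hparams swapped in
(directory variables as above, purely illustrative):

```
t2t-trainer \
  --data_dir=$DATA_DIR \
  --output_dir=$TRAIN_DIR \
  --problems=summarize_cnn_dailymail32k \
  --model=transformer \
  --hparams_set=transformer_prepend
```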

### Image Classification

For image classification we suggest using ResNet or Xception, i.e.
for `--problems=image_imagenet` use `--model=resnet50` with
`--hparams_set=resnet_base`, or `--model=xception` with
`--hparams_set=xception_base`.
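
Again as an illustrative sketch (the comment shows the Xception alternative;
directory variables are hypothetical as above):

```
# For Xception, use --model=xception --hparams_set=xception_base instead.
t2t-trainer \
  --data_dir=$DATA_DIR \
  --output_dir=$TRAIN_DIR \
  --problems=image_imagenet \
  --model=resnet50 \
  --hparams_set=resnet_base
```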


## Installation

```
48 changes: 28 additions & 20 deletions docs/cloud_tpu.md
@@ -3,15 +3,19 @@
Tensor2Tensor supports running on Google Cloud Platform's TPUs, chips specialized
for ML training.

Not all models are supported but we've tested so far with Transformer (sequence
model) as well as Xception (image model).
Models and hparams that are known to work on TPU:
* `transformer` with `transformer_tpu`
* `xception` with `xception_base`
* `resnet50` with `resnet_base`

To run on TPUs, you need to be part of the alpha program; if you're not, these
commands won't work for you currently, but access will expand soon, so get
excited for your future ML supercomputers in the cloud.

## Tutorial: Transformer En-De translation on TPU

Update `gcloud`: `gcloud components update`

Set your default zone to a TPU-enabled zone. TPU machines are only available in
certain zones for now.
```
@@ -40,29 +44,32 @@ gcloud alpha compute tpus create \
To see all TPU instances running: `gcloud alpha compute tpus list`. The
`TPU_IP` should be unique amongst the list and follow the format `10.240.i.2`.

Generate data to GCS
If you already have the data locally, use `gsutil cp` to cp to GCS.
SSH in with port forwarding for TensorBoard
```
DATA_DIR=gs://my-bucket/t2t/data/
t2t-datagen --problem=translate_ende_wmt8k --data_dir=$DATA_DIR
gcloud compute ssh $USER-vm -- -L 6006:localhost:6006
```

SSH in with port forwarding for TensorBoard
Now that you're on the cloud instance, install T2T:
```
gcloud compute ssh $USER-vm -L 6006:localhost:6006
pip install tensor2tensor --user
# If your python bin dir isn't already in your path
export PATH=$HOME/.local/bin:$PATH
```

Now that you're on the cloud instance, install T2T:
Generate data to GCS
If you already have the data, use `gsutil cp` to copy to GCS.
```
pip install tensor2tensor
GCS_BUCKET=gs://my-bucket
DATA_DIR=$GCS_BUCKET/t2t/data/
t2t-datagen --problem=translate_ende_wmt8k --data_dir=$DATA_DIR
```

Setup some vars used below. `TPU_IP` and `DATA_DIR` should be the same as what
was used above. Note that the `DATA_DIR` and `OUT_DIR` must be GCS buckets.
```
TPU_IP=<IP of TPU machine>
DATA_DIR=gs://my-bucket/t2t/data/
OUT_DIR=gs://my-bucket/t2t/training/
DATA_DIR=$GCS_BUCKET/t2t/data/
OUT_DIR=$GCS_BUCKET/t2t/training/
TPU_MASTER=grpc://$TPU_IP:8470
```

@@ -73,25 +80,26 @@ tensorboard --logdir=$OUT_DIR > /tmp/tensorboard_logs.txt 2>&1 &

Train and evaluate.
```
t2t-tpu-trainer \
--master=$TPU_MASTER \
--data_dir=$DATA_DIR \
--output_dir=$OUT_DIR \
--problems=translate_ende_wmt8k \
t2t-trainer \
--model=transformer \
--hparams_set=transformer_tiny_tpu \
--hparams_set=transformer_tpu \
--problems=translate_ende_wmt8k \
--train_steps=10 \
--eval_steps=10 \
--local_eval_frequency=10 \
--iterations_per_loop=10
--iterations_per_loop=10 \
--master=$TPU_MASTER \
--use_tpu=True \
--data_dir=$DATA_DIR \
--output_dir=$OUT_DIR
```

The above command will train for 10 steps, then evaluate for 10 steps. You can
(and should) increase the number of total training steps with the
`--train_steps` flag. Evaluation will happen every `--local_eval_frequency`
steps, each time for `--eval_steps`. When you increase the number of training
steps, also increase `--iterations_per_loop`, which controls how frequently the
TPU machine returns control to the Python code (1000 seems like a fine number).
TPU machine returns control to the host machine (1000 seems like a fine number).
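
For example, a longer run might scale these flags together roughly as follows;
the step counts below are only an illustration, not a tuned recommendation:

```
t2t-trainer \
  --model=transformer \
  --hparams_set=transformer_tpu \
  --problems=translate_ende_wmt8k \
  --train_steps=250000 \
  --eval_steps=10 \
  --local_eval_frequency=1000 \
  --iterations_per_loop=1000 \
  --master=$TPU_MASTER \
  --use_tpu=True \
  --data_dir=$DATA_DIR \
  --output_dir=$OUT_DIR
```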

Back on your local machine, open your browser and navigate to `localhost:6006`
for TensorBoard.