
Commit

Merge pull request #482 from rsepassi/push
v1.4
lukaszkaiser authored Dec 21, 2017
2 parents 5c80095 + bac1321 commit 758991d
Showing 96 changed files with 8,664 additions and 2,637 deletions.
4 changes: 2 additions & 2 deletions .travis.yml
@@ -14,9 +14,9 @@ env:
- T2T_DATA_DIR=/tmp/t2t-data
- T2T_TRAIN_DIR=/tmp/t2t-train
script:
- pytest --ignore=tensor2tensor/utils/registry_test.py --ignore=tensor2tensor/utils/trainer_utils_test.py --ignore=tensor2tensor/problems_test.py --ignore=tensor2tensor/tpu/tpu_trainer_lib_test.py
- pytest --ignore=tensor2tensor/utils/registry_test.py --ignore=tensor2tensor/problems_test.py --ignore=tensor2tensor/tpu/tpu_trainer_lib_test.py
- pytest tensor2tensor/utils/registry_test.py
- pytest tensor2tensor/utils/trainer_utils_test.py
- pytest tensor2tensor/tpu/tpu_trainer_lib_test.py
- t2t-datagen 2>&1 | grep translate && echo passed
- python -c "from tensor2tensor.models import transformer; print(transformer.Transformer.__name__)"
- t2t-trainer --registry_help
62 changes: 54 additions & 8 deletions README.md
@@ -1,4 +1,4 @@
# T2T: Tensor2Tensor Transformers
# Tensor2Tensor

[![PyPI
version](https://badge.fury.io/py/tensor2tensor.svg)](https://badge.fury.io/py/tensor2tensor)
@@ -10,11 +10,18 @@ welcome](https://img.shields.io/badge/contributions-welcome-brightgreen.svg)](CO
[![License](https://img.shields.io/badge/License-Apache%202.0-brightgreen.svg)](https://opensource.org/licenses/Apache-2.0)
[![Travis](https://img.shields.io/travis/tensorflow/tensor2tensor.svg)](https://travis-ci.org/tensorflow/tensor2tensor)

[T2T](https://github.com/tensorflow/tensor2tensor) is a modular and extensible
library and binaries for supervised learning with TensorFlow and with support
for sequence tasks. It is actively used and maintained by researchers and
engineers within the Google Brain team. You can read more about Tensor2Tensor in
the recent [Google Research Blog post introducing
[Tensor2Tensor](https://github.com/tensorflow/tensor2tensor), or
[T2T](https://github.com/tensorflow/tensor2tensor) for short, is a library
of deep learning models and datasets. It has binaries to train the models and
to download and prepare the data for you. T2T is modular and extensible and can
be used in [notebooks](https://goo.gl/wkHexj) for prototyping your own models
or running existing ones on your data. It is actively used and maintained by
researchers and engineers within
the [Google Brain team](https://research.google.com/teams/brain/) and was used
to develop state-of-the-art models for translation (see
[Attention Is All You Need](https://arxiv.org/abs/1706.03762)), summarization,
image generation and other tasks. You can read
more about T2T in the [Google Research Blog post introducing
it](https://research.googleblog.com/2017/06/accelerating-deep-learning-research.html).

We're eager to collaborate with you on extending T2T, so please feel
@@ -29,8 +36,14 @@ You can chat with us and other users on
[Google Group](https://groups.google.com/forum/#!forum/tensor2tensor) to keep up
with T2T announcements.

Here is a one-command version that installs tensor2tensor, downloads the data,
### Quick Start

[This iPython notebook](https://goo.gl/wkHexj) explains T2T and runs in your
browser using a free VM from Google, no installation needed.

Alternatively, here is a one-command version that installs T2T, downloads data,
trains an English-German translation model, and evaluates it:

```
pip install tensor2tensor && t2t-trainer \
--generate_data \
@@ -53,11 +66,17 @@ t2t-decoder \
--decode_interactive
```

See the [Walkthrough](#walkthrough) below for more details on each step.
See the [Walkthrough](#walkthrough) below for more details on each step
and [Suggested Models](#suggested-models) for well-performing models
on common tasks.

### Contents

* [Walkthrough](#walkthrough)
* [Suggested Models](#suggested-models)
* [Translation](#translation)
* [Summarization](#summarization)
* [Image Classification](#image-classification)
* [Installation](#installation)
* [Features](#features)
* [T2T Overview](#t2t-overview)
@@ -132,6 +151,33 @@ cat $DECODE_FILE.$MODEL.$HPARAMS.beam$BEAM_SIZE.alpha$ALPHA.decodes

---

## Suggested Models

Here are some combinations of models, hparams and problems that we have found
to work well, so we suggest using them if you're interested in one of these problems.

### Translation

For translation, especially English-German and English-French, we suggest using
the Transformer model in its base or big configuration, e.g.
for `--problems=translate_ende_wmt32k` use `--model=transformer` and
`--hparams_set=transformer_base`. When trained on 8 GPUs for 300K steps
this should reach a BLEU score of about 28.
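
As a minimal sketch of how these flags fit together (assuming data has already
been generated with `t2t-datagen`, and using hypothetical `$DATA_DIR` and
`$TRAIN_DIR` paths that you should replace with your own):

```
# Hypothetical directories -- substitute your own.
DATA_DIR=$HOME/t2t_data
TRAIN_DIR=$HOME/t2t_train/ende

t2t-trainer \
  --data_dir=$DATA_DIR \
  --output_dir=$TRAIN_DIR \
  --problems=translate_ende_wmt32k \
  --model=transformer \
  --hparams_set=transformer_base
```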

### Summarization

For summarization we suggest using the Transformer model in prepend mode, i.e.
for `--problems=summarize_cnn_dailymail32k` use `--model=transformer` and
`--hparams_set=transformer_prepend`.
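
The same sketch with the summarization problem and hparams swapped in
(directory variables as above, purely illustrative):

```
t2t-trainer \
  --data_dir=$DATA_DIR \
  --output_dir=$TRAIN_DIR \
  --problems=summarize_cnn_dailymail32k \
  --model=transformer \
  --hparams_set=transformer_prepend
```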

### Image Classification

For image classification we suggest using ResNet or Xception, i.e.
for `--problems=image_imagenet` use `--model=resnet50` with
`--hparams_set=resnet_base`, or `--model=xception` with
`--hparams_set=xception_base`.
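
Again as an illustrative sketch (the comment shows the Xception alternative;
directory variables are hypothetical as above):

```
# For Xception, use --model=xception --hparams_set=xception_base instead.
t2t-trainer \
  --data_dir=$DATA_DIR \
  --output_dir=$TRAIN_DIR \
  --problems=image_imagenet \
  --model=resnet50 \
  --hparams_set=resnet_base
```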


## Installation

```
48 changes: 28 additions & 20 deletions docs/cloud_tpu.md
@@ -3,15 +3,19 @@
Tensor2Tensor supports running on Google Cloud Platform's TPUs, chips specialized
for ML training.

Not all models are supported but we've tested so far with Transformer (sequence
model) as well as Xception (image model).
Models and hparams that are known to work on TPU:
* `transformer` with `transformer_tpu`
* `xception` with `xception_base`
* `resnet50` with `resnet_base`

To run on TPUs, you need to be part of the alpha program; if you're not, these
commands won't work for you currently, but access will expand soon, so get
excited for your future ML supercomputers in the cloud.

## Tutorial: Transformer En-De translation on TPU

Update `gcloud`: `gcloud components update`

Set your default zone to a TPU-enabled zone. TPU machines are only available in
certain zones for now.
```
@@ -40,29 +44,32 @@ gcloud alpha compute tpus create \
To see all TPU instances running: `gcloud alpha compute tpus list`. The
`TPU_IP` should be unique amongst the list and follow the format `10.240.i.2`.

Generate data to GCS
If you already have the data locally, use `gsutil cp` to cp to GCS.
SSH in with port forwarding for TensorBoard
```
DATA_DIR=gs://my-bucket/t2t/data/
t2t-datagen --problem=translate_ende_wmt8k --data_dir=$DATA_DIR
gcloud compute ssh $USER-vm -- -L 6006:localhost:6006
```

SSH in with port forwarding for TensorBoard
Now that you're on the cloud instance, install T2T:
```
gcloud compute ssh $USER-vm -L 6006:localhost:6006
pip install tensor2tensor --user
# If your python bin dir isn't already in your path
export PATH=$HOME/.local/bin:$PATH
```

Now that you're on the cloud instance, install T2T:
Generate data to GCS
If you already have the data, use `gsutil cp` to copy to GCS.
```
pip install tensor2tensor
GCS_BUCKET=gs://my-bucket
DATA_DIR=$GCS_BUCKET/t2t/data/
t2t-datagen --problem=translate_ende_wmt8k --data_dir=$DATA_DIR
```

Setup some vars used below. `TPU_IP` and `DATA_DIR` should be the same as what
was used above. Note that the `DATA_DIR` and `OUT_DIR` must be GCS buckets.
```
TPU_IP=<IP of TPU machine>
DATA_DIR=gs://my-bucket/t2t/data/
OUT_DIR=gs://my-bucket/t2t/training/
DATA_DIR=$GCS_BUCKET/t2t/data/
OUT_DIR=$GCS_BUCKET/t2t/training/
TPU_MASTER=grpc://$TPU_IP:8470
```

@@ -73,25 +80,26 @@ tensorboard --logdir=$OUT_DIR > /tmp/tensorboard_logs.txt 2>&1 &

Train and evaluate.
```
t2t-tpu-trainer \
--master=$TPU_MASTER \
--data_dir=$DATA_DIR \
--output_dir=$OUT_DIR \
--problems=translate_ende_wmt8k \
t2t-trainer \
--model=transformer \
--hparams_set=transformer_tiny_tpu \
--hparams_set=transformer_tpu \
--problems=translate_ende_wmt8k \
--train_steps=10 \
--eval_steps=10 \
--local_eval_frequency=10 \
--iterations_per_loop=10
--iterations_per_loop=10 \
--master=$TPU_MASTER \
--use_tpu=True \
--data_dir=$DATA_DIR \
--output_dir=$OUT_DIR
```

The above command will train for 10 steps, then evaluate for 10 steps. You can
(and should) increase the number of total training steps with the
`--train_steps` flag. Evaluation will happen every `--local_eval_frequency`
steps, each time for `--eval_steps`. When you increase the number of training
steps, also increase `--iterations_per_loop`, which controls how frequently the
TPU machine returns control to the Python code (1000 seems like a fine number).
TPU machine returns control to the host machine (1000 seems like a fine number).
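
For example, a longer run might scale these flags together roughly as follows;
the step counts below are only an illustration, not a tuned recommendation:

```
t2t-trainer \
  --model=transformer \
  --hparams_set=transformer_tpu \
  --problems=translate_ende_wmt8k \
  --train_steps=250000 \
  --eval_steps=10 \
  --local_eval_frequency=1000 \
  --iterations_per_loop=1000 \
  --master=$TPU_MASTER \
  --use_tpu=True \
  --data_dir=$DATA_DIR \
  --output_dir=$OUT_DIR
```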

Back on your local machine, open your browser and navigate to `localhost:6006`
for TensorBoard.