Neural Commit Suggester

A machine translation system that translates git diffs into commit messages.

This is an evolution of the implementation presented at FOSDEM 19 in the talk "Neural commit message suggester: Proposing git commit messages with neural networks". That version was based on Google Neural Machine Translation and achieved BLEU 37.6 on the Jiang et al. 2017 dataset.

It has since been ported to Sockeye 2, is now based on the Transformer, and scores over BLEU 40.

Installation

Install the Sockeye framework with:

pip install sockeye -r requirements.gpu-cu100.txt

(See https://awslabs.github.io/sockeye/setup.html for instructions on where to fetch the GPU requirements file.)

Train the MT

Data has already been tokenized.

Execute ./train-suggester-optimal.sh sockeye-commit-suggester train.26208 valid.3000.

It will search for train.26208.msg, train.26208.diff, valid.3000.msg and valid.3000.diff.
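Since training reads each split as a pair of files, it can help to verify beforehand that both files exist and line up. The sketch below is only an illustration and is not part of this repository: the helper names count_lines and check_parallel are hypothetical, and it assumes one example per line, with line N of the .diff file corresponding to line N of the .msg file.

```python
from pathlib import Path


def count_lines(path: Path) -> int:
    """Count the lines of a text file without loading it all into memory."""
    with path.open(encoding="utf-8", errors="replace") as handle:
        return sum(1 for _ in handle)


def check_parallel(prefix: str, directory: str = ".") -> int:
    """Check that <prefix>.diff and <prefix>.msg exist and have equal line counts.

    Hypothetical helper: assumes one example per line in both files.
    Returns the number of sentence pairs.
    """
    base = Path(directory)
    diff_path = base / f"{prefix}.diff"
    msg_path = base / f"{prefix}.msg"
    for path in (diff_path, msg_path):
        if not path.is_file():
            raise FileNotFoundError(path)
    n_diff = count_lines(diff_path)
    n_msg = count_lines(msg_path)
    if n_diff != n_msg:
        raise ValueError(
            f"{diff_path} has {n_diff} lines but {msg_path} has {n_msg}"
        )
    return n_diff
```

For example, check_parallel("train.26208") and check_parallel("valid.3000") could be run in the data directory before starting the training script.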

If you have a GPU, comment out the --use-cpu flag.
Training takes approximately 15GB of GPU RAM on a Tesla P100. Training on multiple GPUs in parallel is supported and lasts approximately 1 hour on 3 GPUs.

Predict a commit message

To predict commit messages, pass your model directory with the --models parameter and a file of diffs with the --input parameter, one diff per line (the exact format is described below). The predicted commit messages are written to the file given by the --output parameter.

sockeye-translate --models sockeye-commit-suggester --use-cpu --input valid.3000.diff --output valid.raw.out

Each diff is converted by replacing newlines with <nl> and the +++/--- markers in front of file names with ppp/mmm. Everything must be whitespace tokenized. Hunk header lines (the ones carrying line references, typically beginning and ending with @@) are stripped.

So, the following:

--- a/kubernetes/ansible/ansible_config/tasks/docker.yml
+++ b/kubernetes/ansible/ansible_config/tasks/docker.yml
@@ -1,5 +1,8 @@
 - name: Create docker default nexus auth
   template:
     src: ../../ansible/roles/docker/files/docker-config_staging.json.j2
-    dest: ../../ansible/roles/docker/files/docker-config_staging.json
+    dest: "{{item}}"
     force: true
+  with_items:
+    - ../../ansible/roles/jenkins/files/docker-config.json
+    - ../../ansible/roles/docker/files/docker-config_staging.json

is flattened into a single line like:

mmm a / kubernetes / ansible / ansible_config / tasks / docker . yml <nl> ppp b / kubernetes / ansible / ansible_config / tasks / docker . yml <nl>  - name :  Create docker default nexus auth <nl>    template :  <nl>      src :   .  .  /  .  .  / ansible / roles / docker / files / docker-config_staging . json . j2 <nl> -    dest :   .  .  /  .  .  / ansible / roles / docker / files / docker-config_staging . json <nl> +    dest :  "{{item}}" <nl>      force :  true <nl> +  with_items :  <nl> +    -  .  .  /  .  .  / ansible / roles / jenkins / files / docker-config . json <nl> +    -  .  .  /  .  .  / ansible / roles / docker / files / docker-config_staging . json <nl>
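The conversion described above can be approximated in a few lines of Python. This is a rough sketch, not the dataset's actual conversion script: the function name flatten_diff is hypothetical, and the rule of padding /, . and : with spaces is only inferred from the example, so the output will not match the real tokenizer in every detail (for instance, runs of whitespace are collapsed here).

```python
import re


def flatten_diff(diff_text: str) -> str:
    """Flatten a unified diff into one whitespace-tokenized line.

    Hypothetical sketch of the format described above:
    - lines starting with @@ (hunk headers) are dropped
    - leading '+++ ' / '--- ' file markers become 'ppp' / 'mmm'
    - each newline becomes a '<nl>' token
    """
    tokens = []
    for line in diff_text.splitlines():
        if line.startswith("@@"):
            continue  # strip hunk headers
        if line.startswith("+++ "):
            line = "ppp " + line[4:]
        elif line.startswith("--- "):
            line = "mmm " + line[4:]
        # Assumption: separate '/', '.' and ':' from surrounding text,
        # as the example suggests; the real tokenizer may differ.
        line = re.sub(r"([/.:])", r" \1 ", line)
        tokens.extend(line.split())
        tokens.append("<nl>")
    return " ".join(tokens)
```

Writing one flattened diff per line to a file gives something in the shape expected by the --input parameter, although a model trained on the original data may be sensitive to tokenization differences.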

The original dataset and conversion script were provided by Siyuan Jiang for CommitGen.
An improved version is available at https://github.com/aijanai/commit-suggester-dataset-builder.
