This repo contains the code for the paper *Dependency Parsing as MRC-based Span-Span Prediction*. If you find this work useful, please cite:
@article{gan2021dependency,
  title={Dependency Parsing as MRC-based Span-Span Prediction},
  author={Gan, Leilei and Meng, Yuxian and Kuang, Kun and Sun, Xiaofei and Fan, Chun and Wu, Fei and Li, Jiwei},
  journal={arXiv preprint arXiv:2105.07654},
  year={2021}
}
Table 1: Results for different models on PTB and CTB.
| Model | PTB UAS | PTB LAS | CTB UAS | CTB LAS |
| --- | --- | --- | --- | --- |
| StackPTR | 95.87 | 94.19 | 90.59 | 89.29 |
| GNN | 95.87 | 94.15 | 90.78 | 89.50 |
| **+Pretrained Models** |  |  |  |  |
| *with additional labelled constituency parsing data* |  |  |  |  |
| HPSG♭ | 97.20 | 95.72 | - | - |
| HPSG+LA♭ | 97.42 | 96.26 | 94.56 | 89.28 |
| *without additional labelled constituency parsing data* |  |  |  |  |
| Biaffine | 96.87 | 95.34 | 92.45 | 90.48 |
| CVT | 96.60 | 95.00 | - | - |
| MP2O | 96.91 | 95.34 | 92.55 | 91.69 |
| Ours-Proj | 97.24 (+0.33) | 95.49 (+0.15) | 92.68 (+0.13) | 90.91 (-0.78) |
| Ours-Nproj | 97.14 (+0.23) | 95.39 (+0.06) | 92.58 (+0.03) | 90.83 (-0.86) |
Table 2: LAS for different models on UD. We use ISO 639-1 codes to represent languages from UD.
| Model | bg | ca | cs | de | en | es | fr | it | nl | no | ro | ru | Avg. |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| projective% | 99.8 | 99.6 | 99.2 | 97.7 | 99.6 | 99.6 | 99.7 | 99.8 | 99.4 | 99.3 | 99.4 | 99.2 | 99.4 |
| GNN | 90.33 | 92.39 | 90.95 | 79.73 | 88.43 | 91.56 | 87.23 | 92.44 | 88.57 | 89.38 | 85.26 | 91.20 | 89.37 |
| **+Pretrained Models** |  |  |  |  |  |  |  |  |  |  |  |  |  |
| MP2O | 91.30 | 93.60 | 92.09 | 82.00 | 90.75 | 92.62 | 89.32 | 93.66 | 91.21 | 91.74 | 86.40 | 92.61 | 91.02 |
| Biaffine | 93.04 | 94.15 | 93.57 | 84.84 | 91.93 | 92.64 | 91.64 | 94.07 | 92.78 | 94.17 | 88.66 | 94.91 | 92.15 |
| Ours-Proj | 93.61 (+0.57) | 94.04 (-0.11) | 93.10 (-0.47) | 84.97 (+0.13) | 91.92 (-0.01) | 92.32 (-0.32) | 91.69 (+0.05) | 94.86 (+0.79) | 92.51 (-0.27) | 94.07 (-0.10) | 88.76 (+0.10) | 94.66 (-0.25) | 92.21 (+0.06) |
| Ours-NProj | 93.76 (+0.72) | 94.38 (+0.23) | 93.72 (+0.15) | 85.23 (+0.39) | 91.95 (+0.02) | 92.62 (-0.02) | 91.76 (+0.12) | 94.79 (+0.72) | 92.97 (+0.19) | 94.50 (+0.33) | 88.67 (+0.01) | 95.00 (+0.09) | 92.45 (+0.30) |
- python>=3.6
- `pip install -r requirements.txt`
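A minimal setup sketch, assuming a fresh conda environment (the environment name and the Python minor version are our own choices; any Python >= 3.6 should work):

```bash
# create and activate an isolated environment (name is arbitrary)
conda create -n mrc-dp python=3.8 -y
conda activate mrc-dp

# install the project's pinned dependencies
pip install -r requirements.txt
```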
We build our project on pytorch-lightning. If you want to know more about the arguments used in our training scripts, please refer to the pytorch-lightning documentation.
We follow this repo for PTB/CTB data preprocessing.
We follow Ma et al. (2018) to preprocess the UD data.
The preprocessed data for PTB/CTB/UD can be downloaded here
Note: some languages (e.g., Czech) in UD have more than one dataset. For these languages, we select and merge datasets using the same strategy as Ma et al. (2018), and put them under the directory `ud2.2/merge_dataset`.
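If you prefer to rebuild the merged files yourself instead of downloading them, one simple way is to concatenate the selected treebanks' CoNLL-U splits. The treebank choice, directory layout, and file names below are illustrative (they follow UD v2.2 naming); the actual selection should follow Ma et al. (2018):

```bash
# merge the training splits of two Czech treebanks into one file
# (treebank choice, directory layout, and output name are assumptions)
mkdir -p ud2.2/merge_dataset/cs
cat ud2.2/UD_Czech-PDT/cs_pdt-ud-train.conllu \
    ud2.2/UD_Czech-CAC/cs_cac-ud-train.conllu \
    > ud2.2/merge_dataset/cs/cs-ud-train.conllu
```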
For PTB, we use RoBERTa-Large.
For CTB, we use RoBERTa-wwm-ext-large.
For UD, we use XLM-RoBERTa-large.
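The training scripts expect a local copy of the corresponding pretrained encoder (presumably the `BERT_DIR` path mentioned below). One way to fetch the checkpoints is to clone them from the Hugging Face hub; the hub identifiers below are the standard ones for these models, but double-check that they match the format your scripts expect:

```bash
# clone the pretrained encoders from the Hugging Face hub
# (requires git-lfs for the weight files; hub ids are assumptions on our part)
git lfs install
git clone https://huggingface.co/roberta-large                      # PTB
git clone https://huggingface.co/hfl/chinese-roberta-wwm-ext-large  # CTB
git clone https://huggingface.co/xlm-roberta-large                  # UD
```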
- proposal model: `scripts/s2s/*/proposal.sh`
- s2s model: `scripts/s2t/*/s2s.sh`
Note that you should change `MODEL_DIR`, `BERT_DIR`, and `OUTPUT_DIR` to your own paths (see the sketch below).
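A sketch of what this looks like in practice; all concrete paths are placeholders of ours, and whether the variables are edited inside each script or exported in the shell depends on how the script is written:

```bash
# inside (or before) the chosen script, point the variables at your own locations
MODEL_DIR=/path/to/save/your/models        # placeholder
BERT_DIR=/path/to/local/roberta-large      # placeholder: local pretrained encoder
OUTPUT_DIR=/path/to/your/outputs           # placeholder

# then launch the two stages from the repo root
# (the `ptb` subdirectory stands in for the `*` in the script paths above)
bash scripts/s2s/ptb/proposal.sh
bash scripts/s2t/ptb/s2s.sh
```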
Choose the best span-proposal model and s2s model according to top-k accuracy and UAS, respectively, and run
parser/s2s_evaluate_dp.py \
--proposal_hparams <your best proposal model hparams file> \
--proposal_ckpt <your best proposal model ckpt> \
--s2s_hparams <your best s2s query model hparams file> \
--s2s_ckpt <your best s2s query model ckpt> \
--topk <number of top-k spans to use for evaluation>
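A concrete invocation might look like the following; every path is a placeholder of ours, and the `--topk` value is arbitrary:

```bash
# placeholder paths: use the checkpoints/hparams files selected in the step above
python parser/s2s_evaluate_dp.py \
    --proposal_hparams /path/to/proposal/hparams.yaml \
    --proposal_ckpt /path/to/proposal/best.ckpt \
    --s2s_hparams /path/to/s2s/hparams.yaml \
    --s2s_ckpt /path/to/s2s/best.ckpt \
    --topk 4
```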
We re-implement *Deep Biaffine Attention for Neural Dependency Parsing* (Dozat and Manning, 2016) as our baseline. The scripts to reproduce this baseline are in `biaf_README.md`.
If you have any issues or questions about this repo, feel free to contact [email protected].