conll03.sh can't reproduce the f1 score in paper #112

Open
Senwang98 opened this issue Feb 21, 2022 · 0 comments
Senwang98 commented Feb 21, 2022

@xiaoya-li
Hi, I used conll03.sh with bert-large-cased to reproduce the F1 score from your ACL 2020 paper, but failed.
Here is my config:

#!/usr/bin/env bash
# -*- coding: utf-8 -*-


TIME=0901
FILE=conll03_cased_large
REPO_PATH=/opt/tiger/ws/mrc-for-flat-nested-ner
export PYTHONPATH="$PYTHONPATH:$REPO_PATH"

DATA_DIR=/opt/tiger/ws/ner_dataset/conll03
BERT_DIR=/opt/tiger/ws/bert-large-cased
OUTPUT_BASE=${REPO_PATH}/outputs

BATCH=10
GRAD_ACC=4
BERT_DROPOUT=0.1
MRC_DROPOUT=0.3
LR=3e-5
LR_MINI=3e-7
LR_SCHEDULER=polydecay
SPAN_WEIGHT=0.1
WARMUP=0
MAX_LEN=200
MAX_NORM=1.0
MAX_EPOCH=20
INTER_HIDDEN=2048
WEIGHT_DECAY=0.01
OPTIM=torch.adam
VAL_CHECK=0.2
PREC=16
SPAN_CAND=pred_and_gold


OUTPUT_DIR=${OUTPUT_BASE}/mrc_ner/${FILE}_cased_large_lr${LR}_drop${MRC_DROPOUT}_norm${MAX_NORM}_weight${SPAN_WEIGHT}_warmup${WARMUP}_maxlen${MAX_LEN}
mkdir -p ${OUTPUT_DIR}


CUDA_VISIBLE_DEVICES=0,1 python ${REPO_PATH}/train/mrc_ner_trainer.py \
--data_dir ${DATA_DIR} \
--bert_config_dir ${BERT_DIR} \
--max_length ${MAX_LEN} \
--batch_size ${BATCH} \
--gpus="2" \
--precision=${PREC} \
--progress_bar_refresh_rate 1 \
--lr ${LR} \
--val_check_interval ${VAL_CHECK} \
--accumulate_grad_batches ${GRAD_ACC} \
--default_root_dir ${OUTPUT_DIR} \
--mrc_dropout ${MRC_DROPOUT} \
--bert_dropout ${BERT_DROPOUT} \
--max_epochs ${MAX_EPOCH} \
--span_loss_candidates ${SPAN_CAND} \
--weight_span ${SPAN_WEIGHT} \
--warmup_steps ${WARMUP} \
--distributed_backend=ddp \
--gradient_clip_val ${MAX_NORM} \
--weight_decay ${WEIGHT_DECAY} \
--optimizer ${OPTIM} \
--lr_scheduler ${LR_SCHEDULER} \
--classifier_intermediate_hidden_size ${INTER_HIDDEN} \
--flat \
--lr_mini ${LR_MINI}
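
For reference, here is my own back-of-the-envelope check of the effective batch size (assuming batch_size is per GPU under DDP, which is the usual PyTorch Lightning behavior; this is my arithmetic, not trainer output):

# Effective batch size = per-GPU batch * grad accumulation steps * number of GPUs
BATCH=10; GRAD_ACC=4; NUM_GPUS=2
echo "effective batch size: $(( BATCH * GRAD_ACC * NUM_GPUS ))"   # prints 80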

And here is the final test F1 result:

Testing: 100%|██████████| 1380/1382 [01:06<00:00, 21.21it/s]

TEST INFO -> test_f1 is: 0.8897400498390198 precision: 0.8865866661071777, recall: 0.8929159641265869
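
(A quick self-check, my own arithmetic rather than trainer output: the logged F1 is consistent with the logged precision and recall via F1 = 2PR/(P+R), so the metric itself seems to be computed correctly.)

# Verify F1 = 2*P*R/(P+R) from the logged precision and recall
awk 'BEGIN { p=0.8865866661071777; r=0.8929159641265869; printf "%.6f\n", 2*p*r/(p+r) }'   # prints 0.889740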

Do you have any suggestions?
