nlp

RE

[std::regex/Boost.Regex-c++]
[hyperscan-c++/python], suited to matching a large number of regular expressions at once; x86 only
[QRegExp-c++]
[re-python]
[PCRE/PCRE++-perl/c++]
[google/re2-c++/go/python], suited to matching a large number of regular expressions at once
comparison
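
As a rough illustration of what "a large number of regular expressions" means in practice, the sketch below scans text against several patterns in one pass using only the standard-library `re` module from the list above. The pattern names and expressions are hypothetical, chosen for illustration only. `re` is a backtracking engine, so this approach is a small-scale stand-in; for large pattern sets, the dedicated engines listed above are the intended tools.

```python
import re

# Hypothetical patterns, for illustration only (not taken from any listed library).
PATTERNS = {
    "email": r"[\w.+-]+@[\w-]+\.[\w.]+",
    "date": r"\d{4}-\d{2}-\d{2}",
    "number": r"\d+",
}

# Combine every pattern into a single alternation of named groups,
# so the text is scanned once instead of once per pattern.
COMBINED = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in PATTERNS.items())
)


def scan(text):
    """Return (pattern_name, matched_text) pairs in order of appearance."""
    return [(m.lastgroup, m.group()) for m in COMBINED.finditer(text)]


if __name__ == "__main__":
    for name, value in scan("mail a@b.com on 2020-01-02, room 42"):
        print(name, value)
```

Earlier alternatives win when several patterns could match at the same position, so more specific patterns (here `date`) should be placed before more general ones (here `number`).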

LAC

Chinese Lexical Analysis with Deep Bi-GRU-CRF Network -baidu, arxiv2018

pretrain models

[thulac]
[baidu/lac]
HIT-SCIR/ltp
spacy
stanza
[hanlp]

Machine Reading Comprehension

Improving Machine Reading Comprehension with Single-choice Decision and Transfer Learning -tencent, arxiv2020
DUMA: Reading Comprehension with Transposition Thinking -huawei, arxiv2020
DCMN+: Dual co-matching network for multi-choice reading comprehension -cloudwalk, AAAI2020
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations -google, ICLR2020
Dual co-matching network for multi-choice reading comprehension -cloudwalk, arxiv2019
Option comparison network for multiple-choice reading comprehension -tencent, arxiv2019
Neural Machine Reading Comprehension: Methods and Trends -S Liu, AppliedSciences2019
Applying deep learning to answer selection: A study and an open task -IBM, ASRU2015

database

DREAM
RACE
[SQuAD2.0]
[ARC]
[CoQA]

NER

A survey on deep learning for named entity recognition -TKDE2020

database

Ontonotes release 4.0/5.0
MSRA, Word segmentation and named entity recognition
Weibo NER, recognition for Chinese social media with jointly trained embeddings
People's Daily (人民日报)
BosonNLP_NER_6C, bosonnlp
CCKS2017/2018/2019/2020 electronic medical record entity annotation
WikiANN/PAN-X
XGLUE
CLUENER2020

pretrain models

baidu/ERNIE
baidu/lac
HIT-SCIR/ltp
spacy
stanza
Tencent UER
CLUEPretrainedModels
Chinese-BERT-wwm
google

Dependency Parsing

Efficient Second-Order TreeCRF for Neural Dependency Parsing -SoochowUniversity, ACL2020, code
Deep Biaffine Attention for Neural Dependency Parsing -Stanford, ICLR2017

database

pretrain models

baidu/DDParser
HIT-SCIR/ltp
spacy
stanza

Post Editing

A survey on non-autoregressive generation for neural machine translation and beyond -msra, PAMI2023, linker
Directed Acyclic Transformer Pre-training for High-quality Non-autoregressive Text Generation -tsinghua, ACL2023, code
Directed Acyclic Transformer for Non-Autoregressive Machine Translation -bytedance, ICML2022
Hierarchical Context Tagging for Utterance Rewriting -tencent, AAAI2022
Text generation with text-editing models -NAACL2022
EdiT5: Semi-Autoregressive Text-Editing with T5 Warm-Start -google, EMNLP2022, code
LEWIS: Levenshtein Editing for Unsupervised Text Style Transfer -ACL2021
LayoutReader: Pre-training of Text and Layout for Reading Order Detection -EMNLP2021, code&dataset
Softcorrect: Error correction with soft detection for automatic speech recognition -microsoft, AAAI2023
FastCorrect2: Fast Error Correction on Multiple Candidates for Automatic Speech Recognition -microsoft, EMNLP2021, code
FastCorrect: Fast Error Correction with Edit Alignment for Automatic Speech Recognition -microsoft, NeurIPS2021, code
FELIX: Flexible Text Editing Through Tagging and Insertion -google, EMNLP2020, code
Seq2Edits: Sequence transduction using span-level edit operations -google, EMNLP2020
Spelling Error Correction with Soft-Masked BERT -bytedance, arxiv2020
SpellGCN: Incorporating Phonological and Visual Similarities into Language Models for Chinese Spelling Check -alibaba, ACL2020
Encode, Tag, Realize: High-Precision Text Editing -google, EMNLP2019
Levenshtein Transformer -facebook, NeurIPS2019
Unified Language Model Pre-training for Natural Language Understanding and Generation -NeurIPS2019
A spelling correction model for end-to-end speech recognition -google, ICASSP 2019
Automatic Spelling Correction with Transformer for CTC-based End-to-End Speech Recognition -alibaba, arxiv2019

dict

CLUECorpus2020, Google's original Chinese vocabulary

pretrain

CLUECorpus2020
brightmart
People's Daily, 1998 edition
People's Daily, 2014 edition

CLUE

CBLUE (Chinese medical information processing challenge benchmark), database

QA

english

CLUE benchmark, google, Natural Questions: a Benchmark for Question Answering Research

chinese

CMRC (HIT and iFLYTEK), DRCD

cls

CLUE benchmark, THUCTC (open-source text classification dataset from Tsinghua University)

labeling tools

YEDDA
