Pathogen identification NER

License Python 3.7 scikit-learn 0.23.2 Solvve

Description

Named entity recognition with spaCy NER and BERT. The workflow has three steps:

  1. Exploratory data analysis (EDA)
  2. Data preprocessing
  3. Modeling with spaCy NER and BERT NER
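
The preprocessing step is not spelled out here. As a rough sketch of what it might involve, the snippet below converts character-offset annotations into token-level BIO tags, a common input format for NER models. The whitespace tokenizer, the `to_bio` helper, and the `Disease` label are assumptions made for illustration, not this repository's actual code.

```python
import re
from typing import List, Tuple

# (start_char, end_char, label); end is exclusive. Hypothetical annotation shape.
Span = Tuple[int, int, str]


def tokenize(text: str) -> List[Tuple[str, int, int]]:
    """Split on whitespace, keeping the character offsets of each token."""
    return [(m.group(), m.start(), m.end()) for m in re.finditer(r"\S+", text)]


def to_bio(text: str, spans: List[Span]) -> List[Tuple[str, str]]:
    """Assign a B-/I-/O tag to every token based on overlap with entity spans."""
    tagged = []
    for token, start, end in tokenize(text):
        tag = "O"
        for span_start, span_end, label in spans:
            if start < span_end and end > span_start:  # token overlaps the span
                # The token covering the span's first character gets B-, the rest I-.
                prefix = "B-" if start <= span_start else "I-"
                tag = prefix + label
                break
        tagged.append((token, tag))
    return tagged


# Invented example sentence; the span covers "breast cancer".
sentence = "Mutations in BRCA1 cause breast cancer."
bio = to_bio(sentence, [(25, 38, "Disease")])
```

Whitespace tokenization keeps trailing punctuation attached to the token (`cancer.` above), so a real pipeline would more likely use the tokenizer of the model being trained.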

Dataset

  1. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3951655/

The NCBI disease corpus is a collection of 793 PubMed abstracts, fully annotated at the mention and concept level, that serves as a research resource for the biomedical natural language processing community.
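
To give a feel for working with the corpus, here is a minimal parsing sketch. It assumes a PubTator-style layout: a `PMID|t|title` line, a `PMID|a|abstract` line, then tab-separated annotation lines (`PMID`, start, end, mention, type, concept ID), with documents separated by blank lines and offsets counted over the title and abstract joined by one character. Check that assumption against the files you download. The `parse_pubtator` helper and the sample record are invented for illustration.

```python
import re
from typing import Dict, List


def parse_pubtator(raw: str) -> List[Dict]:
    """Parse PubTator-style text into one dict per document (assumed layout)."""
    docs = []
    for block in re.split(r"\n\s*\n", raw.strip()):
        doc = {"pmid": None, "title": "", "abstract": "", "entities": []}
        for line in block.splitlines():
            if not line.strip():
                continue
            parts = line.split("|", 2)
            if len(parts) == 3 and parts[1] in ("t", "a"):
                # Title or abstract line: PMID|t|... or PMID|a|...
                doc["pmid"] = parts[0]
                doc["title" if parts[1] == "t" else "abstract"] = parts[2]
                continue
            fields = line.split("\t")
            if len(fields) >= 5:
                # Annotation line: PMID, start, end, mention, type[, concept ID]
                doc["entities"].append(
                    (int(fields[1]), int(fields[2]), fields[3], fields[4])
                )
        docs.append(doc)
    return docs


# Made-up record, not taken from the corpus.
SAMPLE = (
    "0000001|t|A made-up title about cystic fibrosis\n"
    "0000001|a|This invented abstract mentions asthma.\n"
    "0000001\t22\t37\tcystic fibrosis\tSpecificDisease\tD003550\n"
    "0000001\t70\t76\tasthma\tSpecificDisease\tD001249\n"
)

docs = parse_pubtator(SAMPLE)
full_text = docs[0]["title"] + " " + docs[0]["abstract"]
```

Slicing `full_text[start:end]` for each entity and comparing it with the stored mention is a cheap sanity check that the offsets line up before any model training.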