The goal is to add entity recognition to the Logstash pipeline so that data is enriched before it is stored in the index.
We used OpenNLP to enrich the data sent to the ingestion API.
The model was trained manually during the hackathon.
It was then used to enrich Europa Data arriving at the ingestion API.
The following fields were detected in web pages and documents:
- DGs
- Titles
- Keywords
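
The enrichment step can be pictured roughly as follows. This is a minimal sketch, not the hackathon code: the field names (`dgs`, `titles`, `keywords`, `content`), the helper functions, and the sample document are all assumptions made for illustration, and a simple vocabulary lookup stands in for the trained OpenNLP model.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

# An entity finder returns (entity_type, text) pairs for a piece of text.
# In the project this role was played by the trained OpenNLP model; the
# vocabulary lookup below is only a hypothetical stand-in.
EntityFinder = Callable[[str], Iterable[Tuple[str, str]]]

# Assumed mapping from entity type to the field added to the document.
FIELD_FOR_TYPE = {"dg": "dgs", "title": "titles", "keyword": "keywords"}


def make_lookup_finder(vocabulary: Dict[str, List[str]]) -> EntityFinder:
    """Build a toy finder that reports vocabulary terms found in the text."""
    def find(text: str) -> List[Tuple[str, str]]:
        lowered = text.lower()
        return [
            (entity_type, term)
            for entity_type, terms in vocabulary.items()
            for term in terms
            if term.lower() in lowered
        ]
    return find


def enrich(document: dict, find_entities: EntityFinder,
           text_field: str = "content") -> dict:
    """Return a copy of the document with one list field per entity type."""
    found = defaultdict(list)
    for entity_type, value in find_entities(document.get(text_field, "")):
        field = FIELD_FOR_TYPE.get(entity_type)
        # Skip unknown entity types and avoid duplicate values.
        if field and value not in found[field]:
            found[field].append(value)
    enriched = dict(document)  # leave the incoming document untouched
    for field in FIELD_FOR_TYPE.values():
        enriched[field] = found.get(field, [])
    return enriched


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    finder = make_lookup_finder({"dg": ["DG CONNECT"], "keyword": ["open data"]})
    doc = {"content": "DG CONNECT publishes open data."}
    print(enrich(doc, finder))
```

Keeping the detected values in separate list fields is one way to make them easy to filter and aggregate once the document is indexed.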
A Kibana dashboard was created to view the data graphically.