Named Entity Recognition (Rasa-like syntax) annotator for the Vim editor.
This Vim plugin helps you annotate named entities with a simple inline mark-up syntax that follows the format used in RASA YAML files.
The goal is to show how quickly a text file of intent and entity examples for a training set can be annotated in a text editor (specifically Vim).
What is the named entity `[entity_value](entity_label)` syntax format?
[entity_value](entity_label)
 ^             ^
 |             |
 |             entity name (label)
 |
 value (sequence of characters/words) for entity referenced with `entity_label`
Where:
- `entity_value`
  - is any sequence of characters or words,
  - delimited by the characters `[` and `]`
- `entity_label`
  - is the entity name (label): a string in the style of a programming-language variable name, for example made of alphabetic letters and the character `_`
  - is delimited by the characters `(` and `)`
For example, given the sentence
my name is Giorgio Robino and I live in Genova, corso Magenta 35/4
You want to annotate three entities (entity_label = entity_value):
- person = Giorgio Robino
- city = Genova
- address = corso Magenta 35/4
Using the syntax described above, the annotated sentence is:
my name is [Giorgio Robino](person) and I live in [Genova](city), [corso Magenta 35/4](address)
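To make the syntax concrete, here is a minimal Python sketch (not part of the plugin) that reads an annotated sentence back into plain text plus a list of entities. The function name `extract_entities` and the exact regular expression are assumptions made for illustration; in particular, allowing digits in labels is an assumption, since the description above only mentions letters and `_`.

```python
import re

# Assumed pattern: the value is any run of characters other than "]",
# the label is a variable-style name (letters, digits, underscore).
ENTITY_RE = re.compile(r"\[([^\]]+)\]\(([A-Za-z_][A-Za-z0-9_]*)\)")


def extract_entities(annotated):
    """Return (plain_text, entities) for a sentence using [value](label) tags.

    Each entity is a tuple (label, value, start, end), where start and end
    are character offsets into the returned plain text.
    """
    entities = []
    parts = []
    pos = 0      # current position in the annotated string
    out_len = 0  # length of the plain text built so far
    for match in ENTITY_RE.finditer(annotated):
        before = annotated[pos:match.start()]
        parts.append(before)
        out_len += len(before)
        value, label = match.group(1), match.group(2)
        entities.append((label, value, out_len, out_len + len(value)))
        parts.append(value)
        out_len += len(value)
        pos = match.end()
    parts.append(annotated[pos:])
    return "".join(parts), entities


sentence = ("my name is [Giorgio Robino](person) and I live in "
            "[Genova](city), [corso Magenta 35/4](address)")
plain, found = extract_entities(sentence)
print(plain)
for label, value, start, end in found:
    print(f"{label} = {value} ({start}:{end})")
```

The offsets are computed against the plain text, which is the form most training pipelines expect; that choice is part of the sketch, not something the plugin prescribes.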
What does the plugin do?
With the plugin command `:NeraSet`, you can map up to 12 function keys (`<F1>`, ..., `<F12>`) to a substitution/decoration "macro" that adds an entity label
- to visually selected text,
- or to the current word and a configurable number of adjacent words, once you have placed the cursor at the start of the word (entity) you want to tag.
In Vim command-line mode (`:`), these commands are available:
command | description |
---|---|
:NeraSet functionKey label [contiguousWords] | Maps the specified functionKey to a substitution macro with the argument label and the optional argument contiguousWords. Valid functionKey values are the numbers 1 ... 12, the strings F1 ... F12, or the key codes <F1> ... <F12> entered by pressing the function key itself. label is the entity name (a single word in camelCase or snake_case). contiguousWords is the number of contiguous words to select (optional, default value 1). |
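The accepted forms of the functionKey argument can be illustrated with a small Python sketch. It is not the plugin's code: `normalize_function_key` is a hypothetical helper, and treating the `F` prefix as case-insensitive is an assumption based on the `:NeraSet f1 person_name` example in this document.

```python
import re

# Accepts "<F1>" ... "<F12>", "F1" ... "F12" (any case), or "1" ... "12".
KEY_RE = re.compile(r"<[Ff](\d{1,2})>|[Ff]?(\d{1,2})")


def normalize_function_key(arg):
    """Return the canonical '<Fn>' spelling, or raise ValueError."""
    match = KEY_RE.fullmatch(arg.strip())
    if match is None:
        raise ValueError(f"not a function key: {arg!r}")
    number = int(match.group(1) or match.group(2))
    if not 1 <= number <= 12:
        raise ValueError(f"function key out of range 1..12: {arg!r}")
    return f"<F{number}>"


for spelling in ("1", "f1", "F12", "<F3>"):
    print(spelling, "->", normalize_function_key(spelling))
```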
Utilities:
command | description |
---|---|
:NeraMapping | shows function keys mapping |
:NeraLoad command_script_file | load and execute a script file containing Nera commands or any other vim : commands |
:NeraLabels label ... [label] | Set a list of labels, to be used afterward with NeraSet |
:NeraLabelsClear | Clear the list of preset labels, to be used afterward with NeraSet |
Given the sentence (line):
my name is Giorgio Robino and I live in Genova, corso Magenta 35/4
To assign to function key `<F1>` a substitution that works both in visual mode and on a single word:

- assign a new "macro" substitution to `<F1>`:

      :NeraSet f1 person_name

- put the cursor at the beginning of the word you want to annotate:

      my name is Giorgio Robino and I live in Genova, corso Magenta 35/4
                 ^
                 |
                 set the vim cursor here

- press `<F1>`. The line is updated with the entity annotation syntax:

      my name is [Giorgio](person_name) Robino and I live in Genova, corso Magenta 35/4
The example above may not be exactly what you want, because a full person name usually consists of two consecutive words (Giorgio Robino).
You may therefore want to set a function key (the same one or another) to automatically annotate the current word and the following one.
In this case, set the mapping with the argument contiguousWords set to 2:
- assign a new "macro" substitution to `<F2>`:

      :NeraSet F2 person_name 2

- again, put the cursor at the beginning of the word you want to annotate:

      my name is Giorgio Robino and I live in Genova, corso Magenta 35/4
                 ^
                 |
                 set the vim cursor here

- press `<F2>`. The line is updated, in this case to:

      my name is [Giorgio Robino](person_name) and I live in Genova, corso Magenta 35/4
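The effect of the contiguousWords argument can be simulated outside Vim. The Python sketch below is only an approximation written for illustration: `annotate_words` is a hypothetical function, and it assumes words are separated by whitespace, whereas Vim's own word motions also stop at punctuation (so a word such as `Genova,` may behave differently in the real plugin).

```python
def annotate_words(line, cursor, label, contiguous_words=1):
    """Wrap `contiguous_words` words starting at index `cursor` in
    [value](label) notation and return the new line.

    Assumption: a "word" is a run of non-whitespace characters.
    """
    end = cursor
    for _ in range(contiguous_words):
        # Skip the whitespace that separates this word from the previous one.
        while end < len(line) and line[end].isspace():
            end += 1
        # Consume the word itself.
        while end < len(line) and not line[end].isspace():
            end += 1
    value = line[cursor:end]
    if not value.strip():
        raise ValueError("no word at the cursor position")
    return f"{line[:cursor]}[{value}]({label}){line[end:]}"


line = "my name is Giorgio Robino and I live in Genova, corso Magenta 35/4"
cursor = line.index("Giorgio")  # cursor at the start of the word to tag
print(annotate_words(line, cursor, "person_name"))     # one word
print(annotate_words(line, cursor, "person_name", 2))  # two contiguous words
```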
Even if you do not specify the contiguousWords argument, you can still use visual selection mode:
- assign a new "macro" substitution to `<F3>`:

      :NeraSet <F3> address

- enter Vim visual mode (by pressing `v`) and select the span you want to annotate:

      my name is Giorgio Robino and I live in Genova, corso Magenta 35/4
                                              ^                        ^
                                              |                        |
                                              start visual selection   end visual selection

- press `esc` and `<F3>`. The line is updated, in this case to:

      my name is [Giorgio Robino](person_name) and I live in [Genova, corso Magenta 35/4](address)
You can prepare a precise (short) list of the labels you will use afterward to annotate with `NeraSet`:
:NeraLabels name surname address city age gender
This list acts as the reference list used to validate the label argument of `NeraSet`.
For example, `NeraSet <F4> job` generates a warning message, because you are setting a label that was not previously declared:
warning: label 'job' is not one of the configured labels: name surname address city age gender
functionKey: <F4>, label: job, contiguous words: 1
Press ENTER or type command to continue
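The validation step can be sketched in a few lines of Python. This is an illustration, not the plugin's implementation: `check_label` is a hypothetical helper, and skipping validation when no labels are declared (for example after `NeraLabelsClear`) is an assumption.

```python
def check_label(label, declared_labels):
    """Return a warning string if `label` is not declared, else None.

    Assumption: an empty list of declared labels disables the check.
    """
    if declared_labels and label not in declared_labels:
        configured = " ".join(declared_labels)
        return (f"warning: label '{label}' is not one of "
                f"the configured labels: {configured}")
    return None


declared = ["name", "surname", "address", "city", "age", "gender"]
print(check_label("job", declared))   # a warning message
print(check_label("city", declared))  # None: the label is declared
```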
`NeraLoad` executes all the Nera commands previously saved in the specified script file.

- create your script file `examples/my_project_configs.vim` containing Nera commands or other Vim commands, for example:

      "
      " my_project_configs.vim
      "
      " F1 - F4
      NeraSet <F1> name 1
      NeraSet <F2> address 1
      NeraSet <F3> company 1
      NeraSet <F4> location 1
      " F5 - F8
      NeraSet <F5> email
      NeraSet <F6> name 2
      NeraSet <F7> name 3
      NeraSet <F8> address 1
      " F9 - F12
      NeraSet <F9> gender
      NeraSet <F10> address 3
      NeraSet <F12> company 2

- afterward, run the script from command-line mode:

      :NeraLoad examples/my_project_configs.vim
Suppose you run the commands:
:NeraSet <F1> name 1
:NeraSet <F2> address 1
:NeraSet <F3> company 1
:NeraSet <F4> location 1
:NeraSet <F5> email
Afterward, you want to show the key mappings:
:NeraMapping
<F1> c1w[<C-R><C-O>"](name)<Esc>
<F2> c1w[<C-R><C-O>"](address)<Esc>
<F3> c1w[<C-R><C-O>"](company)<Esc>
<F4> c1w[<C-R><C-O>"](location)<Esc>
<F5> c1w[<C-R><C-O>"](email)<Esc>
<F6>
<F7>
<F8>
<F9>
<F10>
<F11>
<F12>
Press ENTER or type command to continue
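Judging from the listing above, each right-hand side follows one pattern: `c{N}w` changes N words (the removed text goes to the unnamed register), `[` is typed, `<C-R><C-O>"` re-inserts the unnamed register literally, and `](label)<Esc>` closes the tag. The Python sketch below rebuilds that string; `mapping_rhs` is a hypothetical helper, and the pattern is inferred from the `:NeraMapping` output rather than from the plugin source.

```python
def mapping_rhs(label, contiguous_words=1):
    """Build the normal-mode key sequence shown by :NeraMapping
    for a given label and number of contiguous words."""
    return f'c{contiguous_words}w[<C-R><C-O>"]({label})<Esc>'


# Reproduce the listing for the five commands run above.
settings = [("<F1>", "name", 1), ("<F2>", "address", 1),
            ("<F3>", "company", 1), ("<F4>", "location", 1),
            ("<F5>", "email", 1)]  # <F5> was set without a count: default 1
for key, label, count in settings:
    print(key, mapping_rhs(label, count))
```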
- Command argument auto-completion
  When using the `NeraSet` command, you can use argument auto-completion (function key, labels, etc.). When using the `NeraLoad` command, you can use file name auto-completion.
- Undo labeling
  If you are unhappy with your `NeraSet` labeling, just undo as usual in Vim by pressing `u` in normal mode!
- Visual mode is always on
  Any time you assign a key with `NeraSet`, you set the word mode for a specified number of contiguous words, but you also enable the visual mode! You can either
  - set the cursor at the start of a word and then press the function key, or
  - select a span of words in visual mode and then press the function key
Using vim-plug, add this line to your `.vimrc` file:
Plug 'solyarisoftware/nera.vim'
Some example files are available in the examples directory of this repo. Here is a live demo of the plugin's commands being used to annotate entities:
This project is a work-in-progress proof-of-concept.
I'm not a Vim script expert, so any coding contribution is welcome.
For bugs, proposals, and suggestions, please open an issue on GitHub. You can also contact me via email ([email protected]).
I'm especially interested in any markup-based entity syntax formats that differ from RASA's. Please let me know. Do not hesitate to open a 'change request' issue.
IF YOU LIKE THE PROJECT, PLEASE ⭐️STAR THIS REPOSITORY TO SHOW YOUR SUPPORT!
- add a help / online tutorial command
- improve arguments validation
- extend the syntax to support not only RASA-like annotation but also the annotation syntaxes of other systems
- v. 0.5.0
  - new commands `NeraLabels` and `NeraLabelsClear`
  - `NeraSet` arguments auto-completion
- v. 0.4.1
  - `NeraLoad`: new command to load a script of commands
  - `NeraMapping` now has a cleaner list of key mappings
  - `NeraSet` now accepts the function key argument by just pressing the corresponding function key!
- I made another, possibly complementary, plugin: Highlight.vim, which colorizes text patterns with random or specified background colors. A possible use is to highlight entity names and entity labels, as shown here: highlight entities having RASA-YAML entity annotation syntax
MIT (c) Giorgio Robino