v.0.1.0: Mixcase tasks and more 🥳
This release of TextMachina includes:
- Allow passing parameters to the extractors outside of the prompt templates. Templates must now be used only to define placeholders.
- Add `MixCaseDatasetGenerator` to generate datasets for mixcase tasks (detection tagging). Other datasets, such as mixcase classification, can be built outside of TextMachina from the datasets this generator produces.
- Add `sentence_gap` and `word_gap` extractors for mixcase tasks.
- Refactor interactive exploration. There is now one class per task, and each class must build its own panels.
- Added exploration for mixcase datasets.
- Added a `TokenClassificationMetric` to evaluate HF models on mixcase and boundary tasks.
- Better structured and documented examples. There is now `examples/learning`, which illustrates how to use providers, tasks, and extractors, and `examples/use_cases`, which contains additional config files.
- Minor quality-of-life changes: require `task_type` in the CLI to prevent potential confusion, disable `random_sample_human` on boundary detection tasks, etc.
- Document all the new code and improve existing documentation.
- Extend the README to cover mixcase tasks and include figures that visualize each type of task.
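
For readers new to tagging-style evaluation, here is a minimal sketch of what a token-level metric for a mixcase (detection tagging) task could compute. It is not the `TokenClassificationMetric` implementation from this release: the function name `token_prf`, the label names `"human"` and `"generated"`, and the returned dictionary are all assumptions made for illustration.

```python
from typing import Dict, Sequence


def token_prf(
    gold: Sequence[str],
    pred: Sequence[str],
    positive: str = "generated",  # hypothetical label name, not taken from TextMachina
) -> Dict[str, float]:
    """Token-level precision, recall and F1 for one positive label."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same number of tokens")

    # Count true positives, false positives and false negatives per token.
    tp = sum(g == positive and p == positive for g, p in zip(gold, pred))
    fp = sum(g != positive and p == positive for g, p in zip(gold, pred))
    fn = sum(g == positive and p != positive for g, p in zip(gold, pred))

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    gold = ["human", "generated", "generated", "human"]
    pred = ["human", "generated", "human", "generated"]
    print(token_prf(gold, pred))
```

In practice, the metric shipped with the release should be preferred; this sketch only shows the kind of per-token comparison such a metric is built on.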