v.0.1.0: Mixcase tasks and more 🥳

@jogonba2 released this 19 Jan 16:40
· 74 commits to main since this release
48c68a5

This release of TextMachina includes:

  • Allow passing parameters to the extractors outside of the prompt templates. The templates must be used only to define placeholders.
  • Add MixCaseDatasetGenerator to generate datasets for mixcase tasks (detection tagging). Other datasets, such as mixcase classification, can be built outside TextMachina from the datasets this generator produces.
  • Add sentence_gap and word_gap extractors for mixcase tasks.
  • Refactor interactive exploration. There is now one class per task, and each class must build its own panels.
  • Add exploration for mixcase datasets.
  • Add a TokenClassificationMetric to evaluate HF models on mixcase and boundary tasks.
  • Restructure and better document the examples. examples/learning illustrates how to use providers, tasks, and extractors, and examples/use_cases provides additional config files.
  • Minor quality-of-life changes: require task_type in the CLI to prevent confusion, disable random_sample_human on boundary detection tasks, etc.
  • Document all the new code and improve existing documentation.
  • Extend the README to cover mixcase tasks and include figures that visualize each type of task.
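
To give a rough intuition for the word_gap idea mentioned above, here is a minimal sketch of gap masking. It assumes a word-gap extractor selects spans of words in a human text so that a model can regenerate them, which is an interpretation and not confirmed by these notes. The helpers `sample_word_gaps` and `mask_gaps`, their parameters, and the `____` placeholder are hypothetical and are not part of TextMachina's API.

```python
import random


def sample_word_gaps(text, gap_size=3, num_gaps=2, seed=0):
    """Pick non-overlapping word spans to mask (hypothetical helper)."""
    words = text.split()
    rng = random.Random(seed)
    # Candidate starts are multiples of gap_size, so chosen gaps never overlap.
    starts = list(range(0, len(words) - gap_size + 1, gap_size))
    chosen = sorted(rng.sample(starts, min(num_gaps, len(starts))))
    # Each gap is a half-open (start, end) range over word indices.
    return words, [(start, start + gap_size) for start in chosen]


def mask_gaps(words, gaps, placeholder="____"):
    """Replace every selected span with a single placeholder token."""
    out, cursor = [], 0
    for start, end in gaps:
        out.extend(words[cursor:start])  # keep the human-written words
        out.append(placeholder)          # mark the span to be regenerated
        cursor = end
    out.extend(words[cursor:])
    return " ".join(out)


if __name__ == "__main__":
    words, gaps = sample_word_gaps(
        "the quick brown fox jumps over the lazy dog", gap_size=3, num_gaps=2
    )
    print(gaps)
    print(mask_gaps(words, gaps))
```

The gap boundaries returned alongside the words are what a tagging-style dataset would need in order to label which spans are human-written and which are generated; how TextMachina actually stores them is not specified here.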