jordimas/MLSUM-Catalan


MLSUM-Catalan

A Catalan summarization corpus based on the concepts of MLSUM (https://github.com/recitalAI/MLSUM).

The original content is from Vilaweb, licensed under Creative Commons Attribution-NonCommercial-NoDerivs, which allows sharing.

Files:

  • Source URLs: urls/train.ca.txt.urls
  • Text and summaries: processed/ca_train.txt (2678 entries)

The text and summaries are in the same format as the MLSUM corpus (tab-separated).
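
As a rough illustration of reading a tab-separated file like this, here is a minimal Python sketch. The function name `read_entries` is hypothetical, UTF-8 encoding is an assumption, and the column order is not stated here, so the sketch returns the raw fields of each line without labeling them as text or summary.

```python
from pathlib import Path


def read_entries(path):
    """Yield each non-empty line of a tab-separated file as a list of fields.

    Assumption: the file is UTF-8 encoded, with one entry per line.
    """
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.rstrip("\r\n")
            if line:
                # The column order is not assumed; fields are returned as-is.
                yield line.split("\t")


if __name__ == "__main__":
    corpus = Path("processed/ca_train.txt")
    if corpus.exists():
        entries = list(read_entries(corpus))
        print(f"{len(entries)} entries")
        if entries:
            print(f"fields in first entry: {len(entries[0])}")
```

Checking the number of fields in the first entry against the MLSUM files is a quick way to confirm which column holds the text and which the summary before using the data.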
