List of Accepted Languages

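| Country | Total languages | Official, indigenous | Official, non-indigenous | Sign languages | Creoles |
| --- | --- | --- | --- | --- | --- |
| Brunei | 9 | 0 | 2 | 0 | 0 |
| Cambodia | 21 | 1 | 0 | 1 | 0 |
| East Timor | 22 | 1 | 1 | 0 | 0 |
| Indonesia | 719 | 1 | 0 | 1 | 0 |
| Laos | 73 | 1 | 0 | 1 | 0 |
| Malaysia | 114 | 1 | 1 | 3 | 2 |
| Myanmar | 116 | 1 | 0 | 1 | 0 |
| Philippines | 178 | 1 | 1 | 1 | 0 |
| Singapore | 6 | 0 | 4 | 1 | 0 |
| Thailand | 51 | 1 | 0 | 3 | 0 |
| Vietnam | 94 | 1 | 0 | 3 | 0 |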

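We seek datasets of languages spoken in Southeast Asia (SEA). There are 1304 unique languages spoken in this region. Because some languages are spoken in more than one country, this figure is lower than the sum of the per-country totals above. Please refer to this spreadsheet to check the full list.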

FAQ

Are non-indigenous but major languages in SEA also accepted?

Yes! We accept non-indigenous languages spoken in SEA regions, e.g., English, Portuguese, Mandarin Chinese, Tamil, Cantonese, etc. Since this initiative is mainly meant to represent SEA, please make sure that those datasets are collected either 1) from speakers in SEA or 2) within SEA regions.

What about creoles?

A creole is a language that comes from a simplified version of another language or from a mix of two or more languages. In Singapore, for example, people speak Singlish, a creole that is mostly based on English. Any creoles that are based on SEA official and/or indigenous languages are welcome!

I want to submit a SEA language/dialect/creole that doesn't exist in the existing options.

Please contact the moderators on Discord so we can help add it.