For broad categorization purposes, @jgbourns has suggested indicating the apparent dialect of each document, based on who the author(s) are and where they come from. This is a more basic treatment than we ultimately hope to do, but I want to gather thoughts on it now. What can we use such information for? Does including it in document metadata bring us any closer to a more local consideration of the authors and the spaces they occupied, one that goes deeper than simply "Oklahoma" or "North Carolina"?
From a sociolinguistic perspective, this is a heavy topic. Since we are working with Standard language forms and intend this to be an educational resource, I think we are treading on thin ice. That being said, I believe the finest distinction we could make reliably is NC vs. OK, broadly, and that is as far as we ought to go given our data and positioning.
After discussion, we settled on at least adding an optional textual description of the location where the document's forms were recorded. This will help us distinguish between documents from different speech communities without necessarily prescribing a dialect (or dialect group).
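To make the proposal concrete, here is a minimal sketch of how such an optional field might look. The names `DocumentMetadata`, `recorded_location`, and `group_by_location` are hypothetical and chosen only for illustration; they are not taken from the project's actual schema.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class DocumentMetadata:
    # Hypothetical record; field names are illustrative, not the project's schema.
    title: str
    # Free-text description of where the document's forms were recorded.
    # Optional, so documents with no known location remain valid.
    recorded_location: Optional[str] = None


def group_by_location(docs: List[DocumentMetadata]) -> Dict[Optional[str], List[str]]:
    """Group document titles by recorded location, without assigning any dialect label."""
    groups: Dict[Optional[str], List[str]] = defaultdict(list)
    for doc in docs:
        # Documents with no recorded location are grouped under the key None.
        groups[doc.recorded_location].append(doc.title)
    return dict(groups)
```

Keeping the field as free text rather than a fixed list of dialect names means a document only records where its forms were collected, and any grouping stays descriptive.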