Annotation and Analysis

Taylor Snead edited this page Jul 30, 2021 · 4 revisions

Overview

This page provides a broad description of DAILP's data annotation and analysis processes. Currently, our analysis and annotation work centers on two types of data: manuscript images and audio recordings.

Manuscript Annotation and Analysis

DAILP's manuscript annotation and analysis process enriches a manuscript and connects the original writing to a broad English translation. We create an interlinear glossed text, containing:

  1. Transcription of the source syllabary layer
  2. Transliteration into a pedagogical orthography called Simple Phonetics
  3. More complex sound layer (also called the phonemic layer)
  4. Word parts layer (also called the morphemic layer)
  5. Grammar layer (also called the morpheme gloss layer)
  6. Word-by-word English translation
  7. Additional commentary for each word

This analysis and annotation is the source of the data used on the DAILP website and exposed through DAILP's API.
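As a rough illustration of how the seven layers listed above could sit together for a single word, here is a minimal Python sketch. It is not DAILP's actual data model: the class name `AnnotatedWord`, its field names, and the helper `interlinear_lines` are all hypothetical. It also assumes that every word part has exactly one grammar gloss, which is a common interlinear glossing convention but is not stated on this page.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AnnotatedWord:
    """One word of an interlinear glossed text (hypothetical field names)."""

    source: str                        # 1. transcription of the syllabary
    simple_phonetics: str              # 2. Simple Phonetics transliteration
    phonemic: str                      # 3. phonemic layer
    morphemes: List[str]               # 4. word parts (morphemic layer)
    glosses: List[str]                 # 5. grammar layer (morpheme glosses)
    translation: str                   # 6. word-by-word English translation
    commentary: Optional[str] = None   # 7. additional commentary, if any

    def __post_init__(self) -> None:
        # Assumption: word parts and glosses align one-to-one.
        if len(self.morphemes) != len(self.glosses):
            raise ValueError("each word part needs exactly one gloss")


def interlinear_lines(word: AnnotatedWord) -> List[str]:
    """Return the layers of one word as display lines, top to bottom."""
    lines = [
        word.source,
        word.simple_phonetics,
        word.phonemic,
        "-".join(word.morphemes),
        "-".join(word.glosses),
        word.translation,
    ]
    if word.commentary:
        lines.append(word.commentary)
    return lines


# Placeholder values only; no real manuscript data is shown here.
example = AnnotatedWord(
    source="<syllabary>",
    simple_phonetics="<simple phonetics>",
    phonemic="<phonemic>",
    morphemes=["<part 1>", "<part 2>"],
    glosses=["<gloss 1>", "<gloss 2>"],
    translation="<English>",
)
print("\n".join(interlinear_lines(example)))
```

Keeping the word parts and glosses as parallel lists makes the one-to-one assumption easy to check when a word is created, rather than when it is displayed.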

Read More ➡️

Audio Recordings

DAILP's audio recording analysis and annotation process involves:

  1. Recording of manuscript readings
  2. Time-marked annotation of words in the recordings
  3. Archiving of audio files

Currently, the process ends at archiving; however, archived audio segments will soon be displayed alongside stories, story segments, and words on the DAILP website.
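To make the time-marked annotation step more concrete, the following Python sketch shows one way word-level time marks could be represented and sanity-checked before archiving. It is an illustration under stated assumptions, not DAILP's tooling: `WordSegment`, its fields, and `validate_segments` are hypothetical names, times are assumed to be in seconds, and word segments are assumed not to overlap.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WordSegment:
    """Time marks for one word in a recording (hypothetical structure)."""

    word_index: int        # position of the word in the manuscript reading
    start_seconds: float   # where the word begins in the audio
    end_seconds: float     # where the word ends in the audio


def validate_segments(segments: List[WordSegment], recording_seconds: float) -> None:
    """Raise ValueError if any time mark is inconsistent.

    Assumptions: segments must have positive length, stay inside the
    recording, and not overlap one another.
    """
    previous_end = 0.0
    for seg in sorted(segments, key=lambda s: s.start_seconds):
        if seg.start_seconds < 0:
            raise ValueError(f"word {seg.word_index} starts before the recording")
        if seg.end_seconds <= seg.start_seconds:
            raise ValueError(f"word {seg.word_index} has no duration")
        if seg.start_seconds < previous_end:
            raise ValueError(f"word {seg.word_index} overlaps the previous word")
        if seg.end_seconds > recording_seconds:
            raise ValueError(f"word {seg.word_index} runs past the recording")
        previous_end = seg.end_seconds


# Placeholder time marks for a hypothetical 5-second recording.
segments = [
    WordSegment(word_index=0, start_seconds=0.2, end_seconds=1.1),
    WordSegment(word_index=1, start_seconds=1.3, end_seconds=2.4),
]
validate_segments(segments, recording_seconds=5.0)
```

A check like this would let each archived audio segment be matched to a word without ambiguity once segments are displayed alongside words on the website.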

Read More ➡️