
2020

Dragonfly: Advances in Non-Speaker Annotation for Low Resource Languages


Abstract

Dragonfly is an open source software tool that supports annotation of text in a low resource language by non-speakers of the language. Using semantic and contextual information, non-speakers of a language familiar with the Latin script can produce high quality named entity annotations to support construction of a name tagger. We describe a procedure for annotating low resource languages using Dragonfly that others can use, which we developed based on our experience annotating data in more than ten languages. We also present performance comparisons between models trained on native speaker and non-speaker annotations.

Citation

@article{costello2020dragonfly,
  title={Dragonfly - Advances in Non-Speaker Annotation for Low Resource Languages},
  author={Costello, Cash and Anderson, Shelby and Bishop, Caitlyn and Mayfield, James and McNamee, Paul},
  journal={LREC},
  pages={6983--6987},
  year={2020}
}
