Digitala Vetenskapliga Arkivet

MasakhaNER: Named Entity Recognition for African Languages
Spoken Language Systems Group (LSV), Saarland University, Germany; Masakhane NLP.
Retro Rabbit, South Africa; Masakhane NLP.
Language Technologies Institute, Carnegie Mellon University, United States.
ProQuest, United States; Masakhane NLP.
2021 (English). In: Transactions of the Association for Computational Linguistics, E-ISSN 2307-387X, Vol. 9, p. 1116-1131. Article in journal (Refereed). Published.
Abstract [en]

We take a step towards addressing the under-representation of the African continent in NLP research by bringing together different stakeholders to create the first large, publicly available, high-quality dataset for named entity recognition (NER) in ten African languages. We detail the characteristics of these languages to help researchers and practitioners better understand the challenges they pose for NER tasks. We analyze our datasets and conduct an extensive empirical evaluation of state-of-the-art methods across both supervised and transfer learning settings. Finally, we release the data, code, and models to inspire future research on African NLP.

Place, publisher, year, edition, pages
MIT Press, 2021. Vol. 9, p. 1116-1131
Keywords [en]
NER, Low resource, NLP
National Category
Language Technology (Computational Linguistics)
Research subject
Machine Learning
Identifiers
URN: urn:nbn:se:ltu:diva-87532
DOI: 10.1162/tacl_a_00416
ISI: 000751952200066
Scopus ID: 2-s2.0-85119703625
OAI: oai:DiVA.org:ltu-87532
DiVA, id: diva2:1603787
Funder
EU, Horizon 2020, 3081705
Note

Validated; 2021; Level 1; 2021-10-25 (alebob)

Available from: 2021-10-18 Created: 2021-10-18 Last updated: 2024-01-08 Bibliographically approved

Open Access in DiVA

No full text in DiVA

By author/editor
Adewumi, Tosin
By organisation
Embedded Internet Systems Lab