MasakhaNER: Named Entity Recognition for African Languages

David Ifeoluwa Adelani; Jade Abbott; Graham Neubig; Daniel D'souza; Julia Kreutzer; Constantine Lignos; Chester Palen-Michel; Happy Buzaaba; Shruti Rijhwani; Sebastian Ruder; Stephen Mayhew; Israel Abebe Azime; Shamsuddeen Muhammad; Chris Chinenye Emezue; Joyce Nakatumba-Nabende; Perez Ogayo; Anuoluwapo Aremu; Catherine Gitau; Derguene Mbaye; Jesujoba Alabi; Seid Muhie Yimam; Tajuddeen Gwadabe; Ignatius Ezeani; Rubungo Andre Niyongabo; Jonathan Mukiibi; Verrah Otiende; Iroro Orife; Davis David; Samba Ngom; Tosin Adewumi; Paul Rayson; Mofetoluwa Adeyemi; Gerald Muriuki; Emmanuel Anebi; Chiamaka Chukwuneke; Nkiruka Odu; Eric Peter Wairagala; Samuel Oyerinde; Clemencia Siro; Tobius Saul Bateesa; Temilola Oloyede; Yvonne Wambui; Victor Akinode; Deborah Nabagereka; Maurice Katusiime; Ayodele Awokoya; Mouhamadane MBOUP; Dibora Gebreyohannes; Henok Tilaye; Kelechi Nwaike; Degaga Wolde; Abdoulaye Faye; Blessing Sibanda; Orevaoghene Ahia; Bonaventure F. P. Dossou; Kelechi Ogueji; Thierno Ibrahima DIOP; Abdoulaye Diallo; Adewale Akinfaderin; Tendai Marengereke; Salomey Osei

TACL (2021)

Abstract

We take a step towards addressing the underrepresentation of the African continent in NLP research by creating the first large publicly available high-quality dataset for named entity recognition (NER) in ten African languages, bringing together a variety of stakeholders. We detail characteristics of the languages to help researchers understand the challenges that these languages pose for NER. We analyze our datasets and conduct an extensive empirical evaluation of state-of-the-art methods across both supervised and transfer learning settings. We release the data, code, and models in order to inspire future research on African NLP.
