Bootstrapping Language Description: The case of Mpiemo (Bantu A, Central African Republic)
Department of Computing Science, Chalmers University, Gothenburg.
Department of African Languages, Gothenburg University, Gothenburg.
Department of African Languages, Gothenburg University, Gothenburg.
Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology.
2008 (English). Conference paper, Published paper (Refereed)
Abstract [en]

Linguists have long been producing grammatical descriptions of as yet undescribed languages. This is a time-consuming process, which has already adapted to improved technology for recording and storage. We present here a novel application of NLP techniques to bootstrap the analysis of collected data and speed up manual selection work. To be more precise, we argue that unsupervised induction of morphology and part-of-speech analysis from raw text data is mature enough to produce useful results. Experiments with Latent Semantic Analysis were less fruitful. We exemplify this on Mpiemo, a so far essentially undescribed Bantu language of the Central African Republic, for which raw text data was available.

Place, publisher, year, edition, pages
2008.
Keywords [en]
Mpiemo, Bantu A, Central African Republic, NLP, Latent Semantic Analysis, bootstrapping
National Category
Specific Languages; Language Technology (Computational Linguistics)
Identifiers
URN: urn:nbn:se:uu:diva-126666
OAI: oai:DiVA.org:uu-126666
DiVA, id: diva2:326014
Conference
Sixth international conference on Language Resources and Evaluation, LREC 2008, 28-30 May 2008, Marrakech
Available from: 2010-06-30. Created: 2010-06-21. Last updated: 2018-12-06. Bibliographically approved.

Open Access in DiVA

fulltext (153 kB), 181 downloads
File information
File name: FULLTEXT01.pdf
File size: 153 kB
Checksum SHA-512:
1003785b34ed450fe11dcc96e0a7b606b55f1b40919d417fb02764670007aa786fb2ae8302fccdfa937ea1083e0262933fb53a4be264757235f8218b498d8324
Type: fulltext
Mimetype: application/pdf

Other links

http://www.lrec-conf.org/proceedings/lrec2008/pdf/848_paper.pdf

Search in DiVA

By author/editor
Hammarström, Harald; Westerlund, Torbjörn
By organisation
Department of Linguistics and Philology
Specific Languages; Language Technology (Computational Linguistics)
