Digitala Vetenskapliga Arkivet

Contribution to Terminology Internationalization by Word Alignment in Parallel Corpora
INSERM, U729, Paris, France.
Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
INSERM, U729, Paris, France.
2006 (English). In: AMIA 2006 Symposium Proceedings, Washington D.C., USA: AMIA, 2006, p. 185-189. Conference paper, Published paper (Refereed)
Abstract [en]

Background and objectives

Creating a complete translation of a large vocabulary is a time-consuming task that requires skilled and knowledgeable medical translators. Our goal is to examine to what extent this task can be eased by a specific natural language processing technique: word alignment in parallel corpora. We experiment with translation from English to French.

Methods

We built a large corpus of parallel English-French documents and automatically aligned it at the document, sentence and word levels using state-of-the-art alignment methods and tools. We then projected English terms from existing controlled vocabularies onto the aligned word pairs and examined the number and quality of the putative French translations obtained. We considered three American vocabularies present in the UMLS, each with a different translation status: the MeSH, SNOMED CT, and the MedlinePlus Health Topics.
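The projection step can be illustrated with a minimal Python sketch. It assumes the alignment tools have already produced a list of (English, French) aligned pairs; the function name `project_terms`, the `min_count` threshold, and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def project_terms(terms, aligned_pairs, min_count=1):
    """Collect candidate French translations for each English term.

    terms: English terms from a controlled vocabulary.
    aligned_pairs: (english, french) pairs output by a word aligner.
    min_count: hypothetical frequency threshold for keeping a candidate.
    """
    wanted = {t.lower() for t in terms}
    counts = defaultdict(Counter)
    for en, fr in aligned_pairs:
        key = en.lower()
        if key in wanted:
            counts[key][fr.lower()] += 1
    # Rank candidates by how often the aligner proposed them.
    return {
        en: [(fr, n) for fr, n in c.most_common() if n >= min_count]
        for en, c in counts.items()
    }


# Toy data for illustration only; not results from the study.
pairs = [
    ("heart failure", "insuffisance cardiaque"),
    ("heart failure", "insuffisance cardiaque"),
    ("heart failure", "défaillance cardiaque"),
    ("asthma", "asthme"),
    ("the", "le"),
]
terms = ["Heart Failure", "Asthma", "Diabetes"]
candidates = project_terms(terms, pairs)
```

Terms that never occur on the English side of an aligned pair ("Diabetes" here) yield no candidates. The ranked lists are only putative translations, and in the workflow described above they would still go to a human reviewer.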

Results

We obtained several thousand new translations of our input terms, this number being closely linked to the number of terms in the input vocabularies.

Conclusion

Our study shows that alignment methods can extract a number of new term translations from large bodies of text with a moderate human reviewing effort, and can thus help a human translator obtain better translation coverage of an input vocabulary. Short-term perspectives include applying these methods to a corpus 20 times larger than the one used here, together with more focused methods for term extraction.

Place, publisher, year, edition, pages
Washington D.C., USA: AMIA , 2006. p. 185-189
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:liu:diva-35773
PubMedID: 17238328
Local ID: 28508
OAI: oai:DiVA.org:liu-35773
DiVA, id: diva2:256621
Conference
AMIA 2006 Annual Symposium, Washington, DC, USA, November 11, 2006 - November 15, 2006
Available from: 2009-10-10 Created: 2009-10-10 Last updated: 2018-01-13

By author/editor
Merkel, Magnus