Exploiting Syntax when Detecting Protein Names in Text
Number of Authors: 5
2002 (English). In: Proceedings of EFMI Workshop on Natural Language Processing in Biomedical Applications, 2002, 1, 6 p. Conference paper (Refereed)
This paper presents work on a method to detect names of proteins in running text. Our system - Yapex - uses a combination of lexical and syntactic knowledge, heuristic filters and a local dynamic dictionary. The syntactic information given by a general-purpose off-the-shelf parser supports the correct identification of the boundaries of protein names, and the local dynamic dictionary finds protein names in positions incompletely analysed by the parser. We present the different steps involved in our approach to protein tagging, and show how combinations of them influence recall and precision. We evaluate the system on a corpus of MEDLINE abstracts and compare it with the KeX system (Fukuda et al., 1998) along four different notions of correctness.
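The abstract reports recall and precision under several notions of correctness. As a rough illustration only, the sketch below computes both measures for predicted name spans against gold-standard spans. The two matching criteria (exact boundaries and any overlap) are assumptions chosen for the example and are not necessarily among the four notions used in the paper. The names `precision_recall` and `overlaps` and the sample offsets are hypothetical.

```python
from typing import Set, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


def overlaps(a: Span, b: Span) -> bool:
    """Return True if the two spans share at least one character."""
    return a[0] < b[1] and b[0] < a[1]


def precision_recall(predicted: Set[Span], gold: Set[Span],
                     exact: bool = True) -> Tuple[float, float]:
    """Compute (precision, recall) for predicted spans against gold spans.

    exact=True  : a prediction counts only if both boundaries match.
    exact=False : a prediction counts if it overlaps any gold span
                  (an illustrative looser criterion, assumed here).
    """
    if exact:
        correct_predictions = predicted & gold
        found_gold = correct_predictions
    else:
        correct_predictions = {p for p in predicted
                               if any(overlaps(p, g) for g in gold)}
        found_gold = {g for g in gold
                      if any(overlaps(p, g) for p in predicted)}
    precision = len(correct_predictions) / len(predicted) if predicted else 0.0
    recall = len(found_gold) / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical offsets: one exact hit, one boundary error, one spurious span.
gold = {(0, 5), (10, 18), (30, 36)}
predicted = {(0, 5), (10, 15), (40, 44)}

strict = precision_recall(predicted, gold, exact=True)
loose = precision_recall(predicted, gold, exact=False)
```

With these sample spans the boundary error counts as a miss under the strict criterion and as a hit under the loose one, which is why a system's scores can differ across notions of correctness.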
Place, publisher, year, edition, pages
2002, 1, 6 p.
Computer and Information Science
Identifiers: URN: urn:nbn:se:ri:diva-22240; OAI: oai:DiVA.org:ri-22240; DiVA: diva2:1041785
EFMI Workshop on Natural Language Processing in Biomedical Applications