Pattern Acquisition Methods for Information Extraction Systems
Independent thesis, Advanced level (degree of Master (One Year)). Student thesis
This master's thesis addresses Event Recognition in the reports of Polish stockholders. Event Recognition is one of the tasks of Information Extraction. The thesis compares two approaches to Event Recognition: manual and automatic. The manual approach uses regular expressions, which also serve as the baseline for the automatic approach. In the automatic approach, three Machine Learning methods are applied. The initial experiment compares Decision Trees, naive Bayes and Memory Based Learning. A modification of the standard Memory Based Learning method is presented, whose goal is to create a classifier that uses only positive examples in the classification task. The performance of the modified Memory Based Learning method is compared to the baseline and to the other Machine Learning methods. The initial experiment uses a single annotation type, the meeting date. The final experiment uses three annotation types: the meeting time, the meeting date and the meeting place. The experiments show that classification can be performed using only one class of instances with the same level of performance.
Place, publisher, year, edition, pages
2007. 68 p.
Natural Language Processing, Information Extraction, Patterns Acquisition, Linguistic Patterns, Memory Based Learning, Event Recognition
Computer Science Software Engineering
Identifiers
URN: urn:nbn:se:bth-4291
Local ID: oai:bth.se:arkivex449AF033BA2EF5ACC125736E0045D807
OAI: oai:DiVA.org:bth-4291
DiVA: diva2:831623