Instance-based ontology alignment using decision trees
Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
Ontologies are a key technology of the semantic web. The semantic web helps people store their data on the web, build vocabularies, and write rules for handling these data; it also helps search engines distinguish more easily between the pieces of information they want to access on the web. To use multiple ontologies created by different experts, we need matchers that find similar concepts in them, so that these correspondences can be used to merge the ontologies.
Text-based searches use string similarity functions to find equivalent concepts in ontologies by comparing their names. This is the method used in lexical matchers. However, a global standard for naming concepts in different research areas either does not exist or has not been adopted. The same name may refer to different concepts, while different names may describe the same concept.
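The limitation of name-based matching can be illustrated with a minimal sketch. It assumes Python's standard `difflib.SequenceMatcher` as a stand-in similarity function; the record does not say which string similarity functions the thesis uses, and the helper name `name_similarity` and the example labels are hypothetical.

```python
from difflib import SequenceMatcher


def name_similarity(name_a: str, name_b: str) -> float:
    """Return a similarity score in [0, 1] for two concept names.

    Assumption: case-insensitive comparison with difflib's ratio;
    a real lexical matcher may use a different function.
    """
    return SequenceMatcher(None, name_a.lower(), name_b.lower()).ratio()


# Names that differ only in case are scored as identical.
same_spelling = name_similarity("Heart", "heart")

# Two names for the same medical concept share few characters,
# so a purely lexical matcher would score them low.
synonyms = name_similarity("heart attack", "myocardial infarction")

print(same_spelling, synonyms)
```

The second pair shows why name comparison alone can miss equivalent concepts, which is the motivation for the other matcher categories.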
To solve this problem, we can use another approach to calculating the similarity value between concepts, which is used in structural and constraint-based matchers. It uses relations between concepts, synonyms, and other information stored in the ontologies. Another category of matchers is instance-based: these use additional information, such as a corpus of documents related to the concepts of the ontologies, to calculate the similarity value for the concepts.
In data mining, decision trees are used for many kinds of classification tasks. Using decision trees in an instance-based matcher is the main idea of this thesis. The results of the implemented matcher, which uses the C4.5 algorithm, are discussed. The matcher is also compared with other matchers and combined with them to obtain better results.
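C4.5 chooses the attribute to split on by its gain ratio (information gain divided by split information). The sketch below computes that criterion for a toy corpus. It rests on assumptions not stated in this record: documents are represented as sets of words, each document is labelled with the concept it belongs to, and the attribute is the presence of a single word. The function names and the toy data are hypothetical and do not describe the thesis implementation.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_ratio(docs, labels, word):
    """Gain ratio of splitting the documents on whether `word` occurs."""
    present = [lab for doc, lab in zip(docs, labels) if word in doc]
    absent = [lab for doc, lab in zip(docs, labels) if word not in doc]
    if not present or not absent:
        return 0.0  # the word does not separate the documents at all
    total = len(labels)
    # Expected entropy after the split, weighted by partition size.
    remainder = sum(len(part) / total * entropy(part)
                    for part in (present, absent))
    info_gain = entropy(labels) - remainder
    # Split information penalises very unbalanced partitions.
    split_info = entropy(["in"] * len(present) + ["out"] * len(absent))
    return info_gain / split_info


# Toy corpus (hypothetical): each document is a set of words,
# labelled with the ontology concept it is associated with.
docs = [{"heart", "muscle"}, {"heart", "valve"},
        {"lung", "airway"}, {"lung", "tissue"}]
labels = ["Heart", "Heart", "Lung", "Lung"]

for word in ("heart", "muscle"):
    print(word, round(gain_ratio(docs, labels, word), 3))
```

In this toy corpus "heart" separates the two concepts perfectly, while "muscle" occurs in only one document and separates them only partly, so a tree builder using this criterion would split on "heart" first.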
Place, publisher, year, edition, pages
2012. 50 p.
Biomedical ontologies, ontology alignment, decision tree classifier
Identifiers
URN: urn:nbn:se:liu:diva-84918
ISRN: LIU-IDA/LITH-EX-A--12/055--SE
OAI: oai:DiVA.org:liu-84918
DiVA: diva2:562968
Subject / course
Computer and information science at the Institute of Technology
2012-10-17, Muhammad al-Khwarizmi, Building B, ground floor (level 2), Linköping, 13:00 (English)
Lambrix, Patrick, professor