The effect of background knowledge in graph-based learning in the chemoinformatics domain
Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
University of Skövde, Sweden.
2008 (English). In: Trends in Intelligent Systems and Computer Engineering / [ed] Oscar Castillo, Li Xu, Sio-Iong Ao, Springer, 2008, pp. 141-153. Chapter in book, part of anthology (Refereed)
Abstract [en]

Typical machine learning systems often use a set of previous experiences (examples) to learn concepts, patterns, or relations hidden within the data [1]. Current machine learning approaches are challenged by the growing size of the data repositories and the growing complexity of those data [1, 2]. In order to accommodate the requirement of being able to learn from complex data, several methods have been introduced in the field of machine learning [2]. Based on the way the input and resulting hypotheses are represented, two main categories of such methods exist, namely, logic-based and graph-based methods [3]. The demarcation line between logic- and graph-based methods lies in the differences of their data representation methods, hypothesis formation, and testing as well as the form of the output produced.

The main purpose of our study is to investigate the effect of incorporating background knowledge into graph learning methods. The ability of graph learning methods to obtain accurate theories with a minimum of background knowledge is of course a desirable property, but not being able to effectively utilize additional knowledge that is available and has been proven important is clearly a disadvantage. Therefore we examine how far additional, already available, background knowledge can be effectively used for increasing the performance of a graph learner. Another contribution of our study is that it establishes a neutral ground to compare classification accuracies of the two closely related approaches, making it possible to study whether graph learning methods actually would outperform ILP methods if the same background knowledge were utilized [9].

The rest of this chapter is organized as follows. The next section discusses related work concerning the contribution of background knowledge when learning from complex data. Section 10.3 provides a description of the graph learning method that is used in our study. The experimental setup, empirical evaluation, and the results from the study are described in Sect. 10.4. Finally, Sect. 10.5 provides conclusions from the experiments and points out interesting extensions of the work reported in this study.

Place, publisher, year, edition, pages
Springer, 2008. pp. 141-153.
Series
Lecture Notes in Electrical Engineering, ISSN 1876-1100 ; 6
National subject category
Research subject
Computer and Systems Sciences
Identifiers
URN: urn:nbn:se:su:diva-101073
DOI: 10.1007/978-0-387-74935-8_10
ISBN: 978-0-387-74934-1 (print)
ISBN: 978-0-387-74935-8 (print)
OAI: oai:DiVA.org:su-101073
DiVA: diva2:698681
Available from: 2014-02-24 Created: 2014-02-24 Last updated: 2014-02-26 (bibliographically approved)
In thesis
1. Learning predictive models from graph data using pattern mining
2014 (English). Doctoral thesis, with articles (Other academic)
Abstract [en]

Learning from graphs has become a popular research area due to the ubiquity of graph data representing web pages, molecules, social networks, protein interaction networks, etc. However, standard graph learning approaches are often challenged by the computational cost involved in the learning process, due to the richness of the representation. Attempts made to improve their efficiency are often associated with the risk of degrading the performance of the predictive models, creating tradeoffs between the efficiency and effectiveness of the learning. Such a situation is analogous to an optimization problem with two objectives, efficiency and effectiveness, where improving one objective without making the other worse is a better solution, called a Pareto improvement. In this thesis, it is investigated how to improve the efficiency and effectiveness of learning from graph data using pattern mining methods. Two objectives are set: one concerns how to improve the efficiency of pattern mining without reducing the predictive performance of the learning models, and the other concerns how to improve predictive performance without increasing the complexity of pattern mining. The employed research method mainly follows a design science approach, including the development and evaluation of artifacts. The contributions of this thesis include a data representation language that can be characterized as a form in between sequences and itemsets, where the graph information is embedded within items. Several studies, each of which looks for Pareto improvements in efficiency and effectiveness, are conducted using sets of small graphs. Summarizing the findings, some of the proposed methods, namely maximal frequent itemset mining and constraint-based itemset mining, result in a dramatically increased efficiency of learning, without decreasing the predictive performance of the resulting models. It is also shown that additional background knowledge can be used to enhance the performance of the predictive models, without increasing the complexity of the graphs.

Place, publisher, year, edition, pages
Stockholm: Department of Computer and Systems Sciences, Stockholm University, 2014. 118 pp.
Series
Report Series / Department of Computer & Systems Sciences, ISSN 1101-8526 ; 14-003
Keywords
Machine Learning, Graph Data, Pattern Mining, Classification, Regression, Predictive Models
National subject category
Research subject
Computer and Systems Sciences
Identifiers
URN: urn:nbn:se:su:diva-100713
ISBN: 978-91-7447-837-2
Public defence
2014-03-25, room B, Forum, Isafjordsgatan 39, Kista, 13:00 (English)
Opponent
Supervisor
Available from: 2014-03-03 Created: 2014-02-11 Last updated: 2014-03-04 (bibliographically approved)
By author/editor
Karunaratne, Thashmee; Boström, Henrik