Classification of tumor samples from expression data using decision trunks
University of Skövde, School of Life Sciences. University of Skövde, The Systems Biology Research Centre.
2013 (English). In: Cancer Informatics, ISSN 1176-9351, Vol. 12, 53-66 p. Article in journal (Refereed), Published
Abstract [en]

We present a novel machine learning approach for the classification of cancer samples using expression data. We refer to the method as "decision trunks," since it is loosely based on decision trees, but contains several modifications designed to achieve an algorithm that: (1) produces smaller and more easily interpretable classifiers than decision trees; (2) is more robust in varying application scenarios; and (3) achieves higher classification accuracy. The decision trunk algorithm has been implemented and tested on 26 classification tasks, covering a wide range of cancer forms, experimental methods, and classification scenarios. This comprehensive evaluation indicates that the proposed algorithm performs at least as well as the current state-of-the-art algorithms in terms of accuracy, while producing classifiers that include on average only 2-3 markers. We suggest that the resulting decision trunks have clear advantages over other classifiers due to their transparency, interpretability, and their correspondence with human decision-making and clinical testing practices. © the author(s), publisher and licensee Libertas Academica Ltd.
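
To make the idea of a small, trunk-shaped classifier concrete, the following is a minimal Python sketch. It is not the published decision trunk algorithm. The node layout (one marker per level, a lower and an upper threshold, unresolved samples passed on to the next level), the marker names `marker_a` and `marker_b`, the class labels, and all threshold values are assumptions chosen only for illustration. No training procedure is shown.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TrunkNode:
    """One level of a hypothetical trunk: a single marker with two thresholds."""
    marker: str       # name of the expression marker tested at this level
    low: float        # values strictly below `low` receive `low_label`
    high: float       # values strictly above `high` receive `high_label`
    low_label: str
    high_label: str


def classify(trunk: List[TrunkNode],
             sample: Dict[str, float],
             default: str = "undetermined") -> str:
    """Walk down the trunk and return the first label that a level assigns.

    A sample whose value falls between `low` and `high` (inclusive) is not
    decided at that level and moves on to the next one. If no level decides,
    `default` is returned.
    """
    for node in trunk:
        value = sample[node.marker]
        if value < node.low:
            return node.low_label
        if value > node.high:
            return node.high_label
    return default


# Illustrative two-marker trunk; names and thresholds are invented.
example_trunk = [
    TrunkNode("marker_a", low=2.0, high=5.0, low_label="normal", high_label="tumor"),
    TrunkNode("marker_b", low=1.0, high=1.0, low_label="tumor", high_label="normal"),
]

if __name__ == "__main__":
    sample = {"marker_a": 3.0, "marker_b": 0.5}
    print(classify(example_trunk, sample))
```

Under these assumptions the whole classifier reads as a short sequence of single-marker tests, which is the kind of transparency the abstract argues for.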

Place, publisher, year, edition, pages
Libertas Academica Ltd., 2013. Vol. 12, 53-66 p.
Keyword [en]
Biomarkers, Classification, Gene expression, Machine learning, accuracy, article, classification algorithm, controlled study, decision making, decision tree, intermethod comparison, learning algorithm
National Category
Natural Sciences
Research subject
Natural sciences
URN: urn:nbn:se:his:diva-8394
DOI: 10.4137/CIN.S10356
PubMedID: 23467331
ScopusID: 2-s2.0-84874202131
OAI: diva2:639970
Available from: 2013-08-12 Created: 2013-08-12 Last updated: 2016-01-22
In thesis
1. Bioinformatics tools for discovery and evaluation of biomarkers: Applications in clinical assessment of cancer
2016 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Cancer is a disease characterized by abnormal proliferation of cells in the body and ranks as the second leading cause of death worldwide. In order to improve cancer patient care, a major focus of cancer research is to discover biomarkers. A biomarker is a biological molecule found in tissues or body fluids and can be used to predict or assess disease states. The aim of this thesis is to develop bioinformatics tools for discovery and evaluation of novel biomarkers from high-throughput datasets.

MicroRNAs (miRNAs) are short non-coding RNAs that function as negative regulators of gene expression. Dysregulation of miRNAs in cancer is frequently reported, making them interesting as biomarker candidates. GenoScan was developed for genome-wide discovery of miRNA-coding genes, as a first step in the identification of novel miRNA biomarkers.

High-throughput technologies such as microarrays allow researchers to measure the expression of thousands of genes or miRNAs simultaneously. The Decision Trunk Classifier (DTC) algorithm has been developed to screen datasets from these experiments for biomarker candidates. When applied to a miRNA expression dataset of endometrial cancer (EC) samples vs. controls, DTC generated a two-marker model with 98% accuracy. The two miRNAs in this model (hsa-miR-183-5p and hsa-miRPlus-C1070) are promising as biomarkers for EC screening.

The miREC database was developed to store gene and miRNA data from curated expression profiling studies of EC, as well as gene-miRNA regulatory connections. Using gene-miRNA interaction networks from miREC, the roles of miRNAs in cancer hallmark acquisition can be clarified. To further support exploratory analysis of expression data, DTC was extended with partial least squares regression models. The resulting PLS-DTC algorithm can be used to gain deeper insights into the perturbation of biological processes and pathways.

Place, publisher, year, edition, pages
Örebro: Örebro University, 2016. 75 p.
Örebro Studies in Medicine, ISSN 1652-4063; 130
Keyword [en]
Algorithms, biomarkers, machine learning, classification, cancer, microRNA database, microRNA discovery, partial least squares
National Category
Medical and Health Sciences
Research subject
Medical sciences
URN: urn:nbn:se:his:diva-11824
ISBN: 978-91-7529-111-6
Public defence
2016-02-03, Insikten (Portalen), Skövde, 23:05 (English)
Available from: 2016-01-22 Created: 2016-01-12 Last updated: 2016-01-22 Bibliographically approved

By author/editor
Ulfenborg, Benjamin; Klinga-Levan, Karin; Olsson, Björn