A systematic comparison of genome-scale clustering algorithms
Jackson Lab, USA.
Oak Ridge National Lab, USA.
Pioneer HiBred Int Inc, USA.
Linköping University, Department of Clinical and Experimental Medicine. Linköping University, Faculty of Health Sciences.
2012 (English)
In: BMC Bioinformatics, ISSN 1471-2105, Vol. 13
Article in journal (Refereed) Published
Abstract [en]

Background: A wealth of clustering algorithms has been applied to gene co-expression experiments. These algorithms cover a broad range of approaches, from conventional techniques such as k-means and hierarchical clustering, to graphical approaches such as k-clique communities, weighted gene co-expression networks (WGCNA) and paraclique. Comparing these methods to evaluate their relative effectiveness provides guidance for algorithm selection, development and implementation. Most prior work on comparative clustering evaluation has focused on parametric methods. Graph theoretical methods are recent additions to the tool set for the global analysis and decomposition of microarray co-expression matrices, and they have not generally been included in earlier methodological comparisons. In the present study, a variety of parametric and graph theoretical clustering algorithms are compared using well-characterized transcriptomic data at a genome scale from Saccharomyces cerevisiae.
Methods: For each clustering method under study, a variety of parameters were tested. Jaccard similarity was used to measure each cluster's agreement with every GO and KEGG annotation set, and the highest Jaccard score was assigned to the cluster. Clusters were grouped into small, medium, and large bins, and the Jaccard scores of the top five scoring clusters in each bin were averaged and reported as the best average top 5 (BAT5) score for the particular method.
Results: Clusters produced by each method were evaluated based on their positive match to known pathways. This produces a readily interpretable ranking of the relative effectiveness of clustering on the genes. Methods were also tested to determine whether they were able to identify clusters consistent with those identified by other clustering methods.
Conclusions: Validation of clusters against known gene classifications demonstrates that, for these data, graph-based techniques outperform conventional clustering approaches, suggesting that further development and application of combinatorial strategies is warranted.
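
The scoring procedure described under Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the bin cutoffs (`small_max`, `medium_max`), the handling of bins with fewer than five clusters, and the reading of "best" as the maximum over the tested parameter settings are all assumptions made for illustration, since the abstract does not specify them.

```python
from typing import Dict, List, Set

Cluster = Set[str]
BINS = ("small", "medium", "large")


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity |a & b| / |a | b|; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_annotation_score(cluster: Cluster,
                          annotations: Dict[str, Set[str]]) -> float:
    """Highest Jaccard score of a cluster against any annotation set."""
    return max((jaccard(cluster, genes) for genes in annotations.values()),
               default=0.0)


def size_bin(n: int, small_max: int = 10, medium_max: int = 100) -> str:
    """Assign a cluster size to a bin. The cutoffs are hypothetical."""
    if n <= small_max:
        return "small"
    if n <= medium_max:
        return "medium"
    return "large"


def average_top_scores(clusters: List[Cluster],
                       annotations: Dict[str, Set[str]],
                       top_n: int = 5) -> Dict[str, float]:
    """Average the top_n cluster scores within each size bin.

    Assumption: a bin with fewer than top_n clusters is averaged over
    the clusters it has, and an empty bin scores 0.0.
    """
    scores: Dict[str, List[float]] = {name: [] for name in BINS}
    for cluster in clusters:
        score = best_annotation_score(cluster, annotations)
        scores[size_bin(len(cluster))].append(score)
    result = {}
    for name, values in scores.items():
        top = sorted(values, reverse=True)[:top_n]
        result[name] = sum(top) / len(top) if top else 0.0
    return result


def bat5(runs: List[List[Cluster]],
         annotations: Dict[str, Set[str]]) -> Dict[str, float]:
    """Best average top 5 per bin over several parameter settings.

    Assumption: "best" means the maximum over the runs of one method.
    """
    per_run = [average_top_scores(clusters, annotations) for clusters in runs]
    return {name: max((run[name] for run in per_run), default=0.0)
            for name in BINS}
```

Here each element of `runs` would hold the clusters produced by one parameter setting of a single method, and `annotations` would map GO or KEGG identifiers to gene sets. Scoring each cluster only against its single best-matching annotation keeps the measure simple to interpret, at the cost of ignoring clusters that partially match several pathways.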

Place, publisher, year, edition, pages
BioMed Central, 2012. Vol. 13
National Category
Medical and Health Sciences
URN: urn:nbn:se:liu:diva-79697
DOI: 10.1186/1471-2105-13-S10-S7
ISI: 000306140100007
OAI: diva2:544111
Available from: 2012-08-13 Created: 2012-08-13 Last updated: 2012-10-31

By author/editor
Benson, Mikael