MotifLab: a tools and data integration workbench for motif discovery and regulatory sequence analysis
Norwegian University of Science and Technology, Faculty of Medicine, Department of Cancer Research and Molecular Medicine.
2013 (English). In: BMC Bioinformatics, ISSN 1471-2105, Vol. 14, article no. 9. Article in journal (Refereed), Published
Abstract [en]

Background: Traditional methods for computational motif discovery often suffer from poor performance. In particular, methods that search for sequence matches to known binding motifs tend to predict many non-functional binding sites because they fail to take into consideration the biological state of the cell. In recent years, genome-wide studies have generated large amounts of data with the potential to improve our ability to identify functional motifs and binding sites, such as information about chromatin accessibility and epigenetic states in different cell types. However, it is not always trivial to make use of this data in combination with existing motif discovery tools, especially for researchers who are not skilled in bioinformatics programming.

Results: Here we present MotifLab, a general workbench for analysing regulatory sequence regions and discovering transcription factor binding sites and cis-regulatory modules. MotifLab supports comprehensive motif discovery and analysis by allowing users to integrate several popular motif discovery tools as well as different kinds of additional information, including phylogenetic conservation, epigenetic marks, DNase hypersensitive sites, ChIP-Seq data, positional binding preferences of transcription factors, transcription factor interactions and gene expression. MotifLab offers several data-processing operations that can be used to create, manipulate and analyse data objects. Complete analysis workflows, including graphical presentation of the results, can be constructed and automatically executed within MotifLab.

Conclusions: We have developed MotifLab as a flexible workbench for motif analysis in a genomic context. The flexibility and effectiveness of this workbench have been demonstrated on selected test cases, in particular two previously published benchmark data sets for single motifs and modules, and a realistic example of genes responding to treatment with forskolin. MotifLab is freely available at

Place, publisher, year, edition, pages
BioMed Central, 2013. Vol. 14, article no. 9.
URN: urn:nbn:no:ntnu:diva-21232
DOI: 10.1186/1471-2105-14-9
ISI: 000314184700001
OAI: diva2:633063

© 2013 Klepper and Drabløs; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Available from: 2013-06-26. Created: 2013-06-26. Last updated: 2013-10-21. Bibliographically approved.
In thesis
1. Integrated approaches for motif discovery in genomic regions
2013 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Recipes for all the proteins that are needed by an organism are described in its genes, which are encoded in DNA. In order to create a new protein, a copy of the DNA recipe must first be transcribed into RNA, and this transcript is subsequently translated into protein. Since different proteins are used in different cell types and at different times, the process of creating new proteins must be tightly regulated. The first step in this process is mainly regulated by transcription factors, which bind to specific sequence patterns in the DNA, help recruit the transcriptional apparatus to the start of the gene, and initiate transcription. An important step in elucidating the gene regulatory networks of an organism is thus to determine which sequence pattern each transcription factor binds to (the binding motif) and also the sites where it binds.

Although such motifs and binding sites are best determined experimentally, computational tools for motif discovery seem to offer a convenient, fast and cost-effective alternative to experimental methods. Hundreds of software programs have therefore been developed for this purpose. These tools can broadly be divided into two classes. Motif scanning tools rely on predefined models of binding motifs and search sequences for matches to these motifs in order to identify potential binding sites. De novo motif discovery methods, on the other hand, aim to find new motifs and binding sites without such prior knowledge by looking for overrepresented patterns in sequences believed to be regulated by common factors.
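
To make the idea of motif scanning concrete, the following is a minimal sketch of scoring sequence windows against a predefined motif model. It is not taken from the thesis or from MotifLab: the count matrix, the pseudocount, the uniform background and the score threshold are all illustrative assumptions, and the function names are hypothetical.

```python
import math

# Hypothetical count matrix for a 4-bp motif (illustrative values only).
# Each entry is one motif position with observed counts for A, C, G, T.
COUNTS = [
    {"A": 8, "C": 0, "G": 1, "T": 1},
    {"A": 0, "C": 9, "G": 1, "T": 0},
    {"A": 0, "C": 1, "G": 9, "T": 0},
    {"A": 1, "C": 0, "G": 0, "T": 9},
]


def log_odds_matrix(counts, pseudocount=0.5, background=0.25):
    """Convert counts to log2-odds scores against a uniform background (assumed)."""
    matrix = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        matrix.append({
            base: math.log2(((column[base] + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return matrix


def scan(sequence, matrix, threshold):
    """Return (start, window, score) for every window scoring at or above threshold."""
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        # Skip windows containing ambiguous bases such as N.
        if any(base not in "ACGT" for base in window):
            continue
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits


hits = scan("TTACGTGG", log_odds_matrix(COUNTS), threshold=4.0)
```

Only the forward strand is scanned here; a practical scanner would typically also score the reverse complement and choose the threshold from a score distribution rather than a fixed number.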

However, independent assessment studies of computational motif discovery tools have shown that the performance of these methods is limited, especially with respect to predicting functionally active binding sites in real genomic sequences. One reason for this is that most of these tools only base their predictions on information in the DNA sequence itself, but many other aspects besides the presence of a binding motif can influence whether a transcription factor will actually be able to bind and exert its regulatory function, including for instance the local chromatin conformation around the binding site or the presence of cooperative factors binding nearby.

More recent approaches have demonstrated that binding site predictions can be improved by also considering additional information related to e.g. phylogenetic conservation, nucleosome occupancy, DNase hypersensitive sites, epigenetic features, gene expression and transcription factor interactions. To this end we have developed a new software workbench which is able to integrate additional information from a variety of sources into the motif discovery process in a coherent and flexible way.
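
One simple way such additional information can be combined with sequence-based predictions is to keep only predicted sites that overlap regions flagged by another data source, for example DNase hypersensitive sites. The sketch below illustrates that idea only; it does not describe how the workbench itself integrates data, and the coordinates, the half-open interval convention and the function names are assumptions made for the example.

```python
def overlaps(site, region):
    """True if two half-open intervals (start, end) share at least one position."""
    return site[0] < region[1] and region[0] < site[1]


def filter_by_regions(sites, regions):
    """Keep predicted sites that overlap at least one supporting region."""
    return [site for site in sites if any(overlaps(site, r) for r in regions)]


# Hypothetical predicted binding sites and accessible regions (illustrative values).
predicted_sites = [(10, 18), (120, 128), (300, 308)]
accessible_regions = [(0, 50), (290, 400)]

supported_sites = filter_by_regions(predicted_sites, accessible_regions)
```

A hard filter like this discards every site outside the supporting regions; an alternative is to use the extra track as a weight on each site's score, which keeps strong sequence matches in less accessible regions.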

Place, publisher, year, edition, pages
NTNU, 2013
Doctoral theses at NTNU, ISSN 1503-8181 ; 2013:143
National Category
Medical technology
URN: urn:nbn:no:ntnu:diva-21234
ISBN: 978-82-471-4390-2 (printed ver.)
ISBN: 978-82-471-4391-9 (electronic ver.)
Public defence
2013-05-21, 12:15
Available from: 2013-06-26. Created: 2013-06-26. Bibliographically approved.
