Mapping the human proteome using bioinformatic methods
KTH, School of Biotechnology (BIO), Proteomics. (Bioinformatics) ORCID iD: 0000-0003-0198-7137
2011 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

The fundamental goal of proteomics is to gain an understanding of the expression and function of the proteome on the level of individual proteins, on the level of defined cell types and on the level of the entire organism. In this thesis, the human proteome is explored using membrane protein topology prediction methods to define the human membrane proteome and by global protein expression profiling, which relies on a complex study of the location and expression levels of proteins in tissues and cells.

A whole-proteome analysis was performed based on the predicted protein-coding genes of humans using a selection of membrane protein topology prediction methods. The study used a majority decision-based method, which estimated that approximately 26% of human genes encode a membrane protein. The prediction results are displayed in a visualization tool to facilitate the selection of antigens to be used for antibody generation.

Global protein expression profiles in a large number of cells and tissues in the human body were analyzed for more than 4000 protein targets, based on data from the antibody-based immunohistochemistry and immunofluorescence methods within the framework of the Human Protein Atlas project. The results revealed few cell-type specific proteins and a high fraction of human proteins expressed in most cells, suggesting that cell and tissue specificity is attained by a fine-tuned regulation of protein levels. The expression profiles were also used to analyze the relationship between 45 cell lines by hierarchical clustering and principal component analysis. The global protein expression patterns overall reflected the tumor origin of the cells, and also allowed for identification of proteins of importance for distinguishing different categories of cell lines, as defined by the phenotype of the progenitor cell. In addition, the protein distribution in 16 subcellular compartments in three of the human cell lines was mapped. A large fraction of the proteins was localized in two or more compartments and, in line with previous results, a majority of proteins were detected in all three cell lines.

Finally, mass spectrometry-based protein expression levels were compared to RNA-seq-based transcript expression levels in three cell lines. Highly ubiquitous mRNA expression was found and the changes of expression levels between the cell lines showed high correlations between proteins and transcripts. Large general differences in abundance of proteins from various functional classes were observed. A comparison between categories based on expression levels revealed that, in general, genes with varying expression levels between the cell lines or only expressed in one cell line were highly enriched for cell-surface proteins.

These studies show a path for a systematic analysis to characterize the proteome in human cells, tissues and organs.

Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2011. 66 p.
Series
Trita-BIO-Report, ISSN 1654-2312 ; 2011:4
Keyword [en]
proteome, transcriptome, bioinformatics, membrane protein prediction, subcellular localization, protein expression level, cell line, immunohistochemistry, immunofluorescence
National Category
Bioinformatics and Systems Biology
Research subject
SRA - Molecular Bioscience
Identifiers
URN: urn:nbn:se:kth:diva-31477
ISBN: 978-91-7415-886-1
OAI: oai:DiVA.org:kth-31477
DiVA: diva2:404310
Public defence
2011-04-08, F3, Lindstedtsvägen 26, KTH, Stockholm, 14:41 (English)
Opponent
Supervisors
Projects
The Human Protein Atlas project
Funder
Knut and Alice Wallenberg Foundation
Note
QC 20110317
Available from: 2011-03-17 Created: 2011-03-16 Last updated: 2011-03-17. Bibliographically approved
List of papers
1. Prediction of the human membrane proteome
2010 (English). In: Proteomics, ISSN 1615-9853, E-ISSN 1615-9861, Vol. 10, no. 6, p. 1141-1149. Article in journal (Refereed) Published
Abstract [en]

Membrane proteins are key molecules in the cell, and are important targets for pharmaceutical drugs. Few three-dimensional structures of membrane proteins have been obtained, which makes computational prediction of membrane proteins crucial for studies of these key molecules. Here, seven membrane protein topology prediction methods based on different underlying algorithms, such as hidden Markov models, neural networks and support vector machines, have been used for analysis of the protein sequences from the 21 416 annotated genes in the human genome. The number of genes coding for a protein with predicted α-helical transmembrane region(s) ranged from 5508 to 7651, depending on the method used. Based on a majority decision method, we estimate 5539 human genes to code for membrane proteins, corresponding to approximately 26% of the human protein-coding genes. The largest fraction of these proteins has only one predicted transmembrane region, but there are also many proteins with seven predicted transmembrane regions, including the G-protein coupled receptors. A visualization tool displaying the topologies suggested by the eight prediction methods, for all predicted membrane proteins, is available on the public Human Protein Atlas portal (www.proteinatlas.org).
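
As an illustration of the majority decision idea in the abstract above, the following is a minimal sketch and not the published method. It assumes each predictor returns a simple yes/no call per gene; the gene identifiers, the calls, and the function name majority_vote are hypothetical. The last line only re-derives the 26% figure from the two counts quoted in the abstract.

```python
from typing import Dict, List


def majority_vote(calls: List[bool]) -> bool:
    """Return True when more than half of the predictors call a membrane protein."""
    return sum(calls) > len(calls) / 2


# Hypothetical yes/no calls from seven predictors for three made-up genes.
predictions: Dict[str, List[bool]] = {
    "GENE_A": [True, True, True, True, False, True, True],
    "GENE_B": [False, False, True, False, False, False, False],
    "GENE_C": [True, False, True, True, False, True, False],
}

# Keep the genes that a majority of the predictors agree on.
membrane_genes = [gene for gene, calls in predictions.items() if majority_vote(calls)]

# Counts quoted in the abstract: 5539 predicted membrane-protein genes of 21 416.
membrane_fraction = 5539 / 21416
```

With seven predictors, at least four positive calls are needed, so a tie cannot occur.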

Keyword
Bioinformatics, Human proteome, Membrane protein, Prediction
National Category
Biochemistry and Molecular Biology
Identifiers
URN: urn:nbn:se:kth:diva-28356
DOI: 10.1002/pmic.200900258
ISI: 000276337800004
Scopus ID: 2-s2.0-77949732088
Funder
Knut and Alice Wallenberg Foundation
Note
QC 20110120
Available from: 2011-01-20 Created: 2011-01-14 Last updated: 2011-03-17. Bibliographically approved
2. A global view of protein expression in human cells, tissues, and organs
2009 (English). In: Molecular Systems Biology, ISSN 1744-4292, Vol. 5. Article in journal (Refereed) Published
Abstract [en]

Defining the protein profiles of tissues and organs is critical to understanding the unique characteristics of the various cell types in the human body. In this study, we report on an anatomically comprehensive analysis of 4842 protein profiles in 48 human tissues and 45 human cell lines. A detailed analysis of over 2 million manually annotated, high-resolution, immunohistochemistry-based images showed a high fraction (>65%) of expressed proteins in most cells and tissues, with very few proteins (<2%) detected in any single cell type. Similarly, confocal microscopy in three human cell lines detected expression of more than 70% of the analyzed proteins. Despite this ubiquitous expression, hierarchical clustering analysis, based on global protein expression patterns, shows that the analyzed cells can still be subdivided into groups according to the current concepts of histology and cellular differentiation. This study suggests that tissue specificity is achieved by precise regulation of protein levels in space and time, and that different tissues in the body acquire their unique characteristics by controlling not which proteins are expressed but how much of each is produced. Molecular Systems Biology 5: 337; published online 22 December 2009; doi:10.1038/msb.2009.93

Keyword
antibody-based analysis, bioimaging, global protein expression, immunofluorescence, immunohistochemistry, human genome, antisense transcription, gene-expression, quantification, identification, association, microarrays, prediction, discovery
Identifiers
URN: urn:nbn:se:kth:diva-19103
DOI: 10.1038/msb.2009.93
ISI: 000273359200006
Note
QC 20100525
Available from: 2010-08-05 Created: 2010-08-05 Last updated: 2011-03-17. Bibliographically approved
3. The Global Protein Expression Pattern in Human Cell Lines
(English). Manuscript (preprint) (Other academic)
Abstract [en]

Human cancer cell lines grown in vitro are frequently used to decipher basic cell biological phenomena but also to specifically study different forms of cancer. Here we present the first large-scale study of protein expression patterns in cell lines using an antibody-based proteomics approach. We analyzed the expression pattern of 5436 proteins in 45 different cell lines using hierarchical clustering, principal component analysis and two-group comparisons for the identification of differentially expressed proteins. The results show that protein profiles of cell lines, as determined using immunohistochemistry, allow for a hierarchical clustering that overall reflects tumor tissues of origin. Hematological cell lines appear to retain their protein profiles to a higher degree than cell lines established from solid tumors, resulting in a clustering that reflects progenitor cell types well. The discrepancy may reflect different levels of in vitro induced alterations in adherent and suspension-grown cell lines, respectively. In addition, multiple myeloma cells and cells of myeloid origin were found to share a protein profile, relative to the protein profile of lymphoid leukemia and lymphoma cells, possibly reflecting their common dependency on the bone marrow microenvironment.
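
To make the hierarchical clustering step in the abstract above concrete, here is a minimal sketch of agglomerative clustering with average linkage on Euclidean distances. It is not the manuscript's pipeline: the distance measure, the linkage rule, the cell line names and the 0 to 3 staining scores are all assumptions chosen for illustration.

```python
from itertools import combinations
from math import sqrt


def euclidean(a, b):
    """Euclidean distance between two expression profiles of equal length."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(c1, c2, profiles):
    """Mean pairwise distance between the members of two clusters."""
    dists = [euclidean(profiles[i], profiles[j]) for i in c1 for j in c2]
    return sum(dists) / len(dists)


def agglomerate(profiles, n_clusters):
    """Merge the two closest clusters until n_clusters remain."""
    clusters = [(name,) for name in sorted(profiles)]
    while len(clusters) > n_clusters:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: average_linkage(pair[0], pair[1], profiles),
        )
        merged = tuple(sorted(a + b))
        clusters = [c for c in clusters if c not in (a, b)] + [merged]
    return sorted(clusters)


# Hypothetical staining scores (0 = negative, 3 = strong) for four proteins.
profiles = {
    "line_heme_1": [3, 0, 2, 0],
    "line_heme_2": [3, 1, 2, 0],
    "line_solid_1": [0, 3, 0, 2],
    "line_solid_2": [1, 3, 0, 3],
}

groups = agglomerate(profiles, n_clusters=2)
```

In this toy input the two made-up hematological lines merge first and the two made-up solid tumor lines second, which mirrors the kind of origin-based grouping the abstract describes.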

 

National Category
Other Industrial Biotechnology
Identifiers
urn:nbn:se:kth:diva-31510 (URN)
Available from: 2011-03-17 Created: 2011-03-17 Last updated: 2011-03-17. Bibliographically approved
4. Mapping the subcellular protein distribution in three human cell lines
2011 (English). In: Journal of Proteome Research, ISSN 1535-3893, E-ISSN 1535-3907, Vol. 10, no. 8, p. 3766-3777. Article in journal (Refereed) Published
Abstract [en]

The subcellular locations of proteins are closely related to their function and constitute an essential aspect for understanding the complex machinery of living cells. A systematic effort has been initiated to map the protein distribution in three functionally different cell lines with the aim of providing a subcellular localization index for at least one representative protein from all human protein-encoding genes. Here, we present the results of over 4,000 proteins mapped to 16 subcellular compartments. The results indicate a ubiquitous protein expression with a majority of the proteins found in all three cell lines and a large portion localized to two or more compartments. The inter-relationships between the subcellular compartments are visualized in a protein-compartment network based on all detected proteins. Hierarchical clustering was performed to determine how closely related the organelles are in terms of protein constituents and to compare the proteins detected in each cell type. Our results show distinct organelle proteomes, well conserved across the cell types, and demonstrate that biochemically similar organelles are grouped together.

Keyword
antibody, organelle, Human Protein Atlas, subcellular atlas, immunofluorescence, confocal microscopy
National Category
Industrial Biotechnology
Identifiers
URN: urn:nbn:se:kth:diva-31514
DOI: 10.1021/pr200379a
ISI: 000293487900041
Scopus ID: 2-s2.0-79961240625
Funder
Knut and Alice Wallenberg Foundation
Science for Life Laboratory - a national resource center for high-throughput molecular bioscience
Note
Updated from submitted to published
Available from: 2011-03-17 Created: 2011-03-17 Last updated: 2016-05-16. Bibliographically approved
5. Defining the transcriptome and proteome in three functionally different human cell lines
2010 (English). In: Molecular Systems Biology, ISSN 1744-4292, Vol. 6, 450. Article in journal (Refereed) Published
Abstract [en]

An essential question in human biology is how cells and tissues differ in gene and protein expression and how these differences delineate specific biological function. Here, we have performed a global analysis of both mRNA and protein levels based on sequence-based transcriptome analysis (RNA-seq), SILAC-based mass spectrometry analysis and antibody-based confocal microscopy. The study was performed in three functionally different human cell lines and, based on the global analysis, we estimated the fractions of mRNA and protein that are cell specific or expressed at similar or different levels in the cell lines. A highly ubiquitous RNA expression was found, with >60% of the gene products detected in all cells. The changes of mRNA and protein levels in the cell lines using SILAC and RNA ratios show high correlations, even though the genome-wide dynamic range is substantially higher for the proteins as compared with the transcripts. Large general differences in abundance for proteins from various functional classes are observed and, in general, the cell-type specific proteins are of low abundance and highly enriched for cell-surface proteins. Thus, this study shows a path to characterize the transcriptome and proteome in human cells from different origins.
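
The comparison of SILAC and RNA ratios described above can be sketched as a Pearson correlation of log2 ratios between two cell lines. This is a hedged illustration only: the abstract does not state which correlation measure was used, and the ratio values and the helper name pearson are made up for the example.

```python
from math import log2, sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical expression ratios (cell line A / cell line B) for five genes.
rna_ratios = [0.5, 1.0, 2.0, 4.0, 8.0]
protein_ratios = [0.3, 1.2, 3.5, 20.0, 50.0]

# Ratios are compared on a log2 scale so that up- and down-regulation are symmetric.
log_rna = [log2(r) for r in rna_ratios]
log_protein = [log2(r) for r in protein_ratios]

r = pearson(log_rna, log_protein)

# Spread of the log2 ratios, as a crude stand-in for dynamic range.
rna_range = max(log_rna) - min(log_rna)
protein_range = max(log_protein) - min(log_protein)
```

The made-up protein ratios span a wider log2 range than the RNA ratios while remaining strongly correlated, which is the pattern the abstract reports in qualitative terms.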

Keyword
cell lines, expression, human, proteome, transcriptome
National Category
Biochemistry and Molecular Biology
Identifiers
URN: urn:nbn:se:kth:diva-29529
DOI: 10.1038/msb.2010.106
ISI: 000285930400006
PubMed ID: 21179022
Scopus ID: 2-s2.0-78650642557
Funder
Knut and Alice Wallenberg Foundation
Note
QC 20110207
Available from: 2011-02-07 Created: 2011-02-07 Last updated: 2012-03-21. Bibliographically approved

By author/editor
Fagerberg, Linn