OnPLS: Orthogonal projections to latent structures in multiblock and path model data analysis
Umeå University, Faculty of Science and Technology, Department of Chemistry.
2012 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

The amounts of data collected from each sample of e.g. chemical or biological materials have increased by orders of magnitude since the beginning of the 20th century. Furthermore, the number of ways to collect data from observations is also increasing. Such configurations with several massive data sets increase the demands on the methods used to analyse them. Methods that handle such data are called multiblock methods and they are the topic of this thesis.

Data collected from advanced analytical instruments often contain variation from diverse, mutually independent sources, which may confound observed patterns and hinder interpretation of latent variable models. For this reason, new methods have been developed that decompose the data matrices, placing variation from different sources into separate parts. Such procedures are no longer merely pre-processing filters, as they initially were, but have become integral elements of model building and interpretation. One family of such methods, called OPLS, has been particularly successful since it is easy to use, understand and interpret.

This thesis describes the development of a new multiblock data analysis method called OnPLS, which extends the OPLS framework to the analysis of multiblock and path models with very general relationships between blocks in both rows and columns. OnPLS utilises OPLS to decompose sets of matrices, dividing each matrix into a globally joint part (a part shared with all the matrices it is connected to), several locally joint parts (parts shared with some, but not all, of the connected matrices) and a unique part that no other matrix shares.
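The decomposition described above can be written compactly. The notation below is an assumption made for illustration and is not taken from the thesis: $X_i$ is the $i$-th matrix, $X_i^{G}$ its globally joint part, $X_i^{L_k}$ its $k$-th locally joint part, $X_i^{U}$ its unique part, and $E_i$ a residual term that the abstract does not mention explicitly.

```latex
% Hypothetical notation: one matrix split into global, local and unique parts
X_i \;=\; \underbrace{X_i^{G}}_{\text{shared with all connected matrices}}
      \;+\; \underbrace{\sum_{k} X_i^{L_k}}_{\text{shared with some, but not all}}
      \;+\; \underbrace{X_i^{U}}_{\text{shared with none}}
      \;+\; E_i
```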

The OnPLS method was applied to several synthetic data sets and to data sets of real measurements. For the synthetic data sets, where the results could be compared to known, true parameters, the method generated global multiblock (and path) models that were more similar to the true underlying structures than models without such decompositions. That is, the globally joint, locally joint and unique models more closely resembled the corresponding true data. When applied to the real data sets, the OnPLS models revealed chemically or biologically relevant information in all kinds of variation, effectively increasing the interpretability since different kinds of variation are distinguished and separately analysed.

OnPLS thus improves the quality of the models and facilitates better understanding of the data since it separates and separately analyses different kinds of variation. Each kind of variation is purer and less tainted by other kinds. OnPLS is therefore highly recommended to anyone engaged in multiblock or path model data analysis.

Place, publisher, year, edition, pages
Umeå: Umeå universitet, 2012. 76 p.
Keyword [en]
OnPLS, OPLS, O2PLS, PLS, Multivariate analysis, Multiblock and path modelling, Chemometrics
National Category
Chemical Sciences
Identifiers
URN: urn:nbn:se:umu:diva-55438
ISBN: 978-91-7459-442-3 (print)
OAI: oai:DiVA.org:umu-55438
DiVA: diva2:526803
Public defence
2012-06-15, KBC-huset, KB3A9, Umeå universitet, Umeå, 10:00 (English)
Available from: 2012-05-16 Created: 2012-05-15 Last updated: 2012-05-15 Bibliographically approved
List of papers
1. OnPLS—a novel multiblock method for the modelling of predictive and orthogonal variation
2011 (English) In: Journal of Chemometrics, ISSN 0886-9383, E-ISSN 1099-128X, Vol. 25, no 8, 441-455 p. Article in journal (Refereed) Published
Abstract [en]

This paper presents a new multiblock analysis method called OnPLS, a general extension of O2PLS to the multiblock case. The proposed method is equivalent to O2PLS in cases involving only two matrices, but generalises to cases involving more than two matrices without giving preference to any particular matrix: the method is fully symmetric. OnPLS extracts a minimal number of globally predictive components that exhibit maximal covariance and correlation. Furthermore, the method can be used to study orthogonal variation, i.e. local phenomena captured in the data that are specific to individual combinations of matrices or to individual matrices. The method's utility was demonstrated by its application to three synthetic data sets. It was shown that OnPLS affords a reduced number of globally predictive components and increased intercorrelations of scores, and that it greatly facilitates interpretation of the predictive model.
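To make the idea of "globally predictive components that exhibit maximal covariance" concrete for the two-matrix case, here is a minimal sketch. It is not the OnPLS algorithm from the paper: it only shows the familiar step of taking the leading singular vectors of the cross-product matrix of two column-centred blocks, which gives the pair of weight vectors whose scores have maximal covariance. The function name `first_joint_component` and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def first_joint_component(X, Y):
    """Weights and scores of the first maximal-covariance component.

    X and Y are column-centred blocks with the same number of rows.
    The leading singular vectors of X^T Y maximise cov(Xw, Yc)
    over unit-length w and c.
    """
    U, s, Vt = np.linalg.svd(X.T @ Y, full_matrices=False)
    w, c = U[:, 0], Vt[0, :]
    return w, c, X @ w, Y @ c

# Synthetic blocks (illustrative only): one shared latent variable
# plus block-specific ("orthogonal") variation and a little noise.
rng = np.random.default_rng(0)
n = 200
shared = rng.standard_normal(n)
X = (np.outer(shared, [1.0, 0.8, 0.0, 0.0])
     + np.outer(rng.standard_normal(n), [0.0, 0.0, 1.0, 0.5])
     + 0.05 * rng.standard_normal((n, 4)))
Y = (np.outer(shared, [0.9, 0.0, 1.0])
     + np.outer(rng.standard_normal(n), [0.0, 1.0, 0.0])
     + 0.05 * rng.standard_normal((n, 3)))
X -= X.mean(axis=0)
Y -= Y.mean(axis=0)

w, c, t, u = first_joint_component(X, Y)
score_correlation = np.corrcoef(t, u)[0, 1]
```

In this toy setting the joint scores `t` and `u` mainly track the shared latent variable, so their correlation is high even though each block also carries variation of its own. The orthogonal-variation filtering that the paper describes is deliberately left out of the sketch.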

Place, publisher, year, edition, pages
John Wiley & Sons, Ltd, 2011
Keyword
OnPLS, multiblock analysis, O2PLS, orthogonal projections to latent structures, PLS
National Category
Analytical Chemistry
Identifiers
urn:nbn:se:umu:diva-46406 (URN) 10.1002/cem.1388 (DOI)
Note

Article first published online: 25 APR 2011

Available from: 2011-09-01 Created: 2011-09-01 Last updated: 2017-12-08
2. OnPLS path modelling
2012 (English) In: Chemometrics and Intelligent Laboratory Systems, ISSN 0169-7439, E-ISSN 1873-3239, Vol. 118, 139-149 p. Article in journal (Refereed) Published
Abstract [en]

OnPLS was recently presented as a general extension of O2PLS to the multiblock case. OnPLS is equivalent to O2PLS in the case of two matrices, but generalises symmetrically to cases with more than two matrices, i.e. without giving preference to any one of the matrices.

This article presents a straightforward extension to this method and thereby also introduces the OPLS framework to the field of PLS path modelling. Path modelling links a number of data blocks to each other, thereby establishing a set of paths along which information is considered to flow between blocks, representing for instance a known time sequence, an assumed causality order, or some other chosen organising principle. Compared to existing methods for path analysis, OnPLS path modelling extracts a minimum number of predictive components that are maximally covarying with maximised correlation. This is a significant contribution to path modelling, because other methods may yield score vectors with variation that obstructs the interpretation. The method achieves this by extracting a set of "orthogonal" components that capture local phenomena orthogonal to the variation shared with all the connected blocks.

Two applications are used to illustrate the method. The first is based on a simulated data set that shows how the interpretation is improved by removing orthogonal variation, and the second on real process data from the monitoring of protein structure changes during cheese ripening by analysis of infrared data.

Place, publisher, year, edition, pages
Elsevier, 2012
Keyword
OnPLS, OPLS, Orthogonal variation, PLS, PLS path model
National Category
Chemical Sciences
Research subject
Statistics; Analytical Chemistry
Identifiers
urn:nbn:se:umu:diva-55431 (URN) 10.1016/j.chemolab.2012.08.009 (DOI)
Funder
Swedish Research Council, 2008-3588
Available from: 2012-05-14 Created: 2012-05-14 Last updated: 2017-12-07 Bibliographically approved
3. Bi-modal OnPLS
2012 (English) In: Journal of Chemometrics, ISSN 0886-9383, E-ISSN 1099-128X, Vol. 26, no 6, 236-245 p. Article in journal (Refereed) Published
Abstract [en]

This paper presents an extension to the recently published OnPLS data analysis method. Bi-modal OnPLS allows for arbitrary block relationships in both columns and rows and is able to extract orthogonal variation in both columns and rows without bias towards any particular direction or matrix: the method is fully symmetric with regard to both rows and columns.

Bi-modal OnPLS extracts a minimal number of globally predictive score vectors that exhibit maximal covariance and correlation in the column space and a corresponding set of predictive loading vectors that exhibit maximal correlation in the row space. The method also extracts orthogonal variation (i.e. variation that is not related to all other matrices) in both columns and rows. The method was applied to two synthetic datasets and one real data set regarding sensory information and consumer likings of dairy products. It was shown that Bi-modal OnPLS greatly improves the intercorrelations between both loadings and scores while still finding the correct variation. This facilitates interpretation of the predictive components and makes it possible to study the orthogonal variation in the data.

Place, publisher, year, edition, pages
John Wiley & Sons, 2012
Keyword
PLS, OnPLS, bi-modal analysis, OPLS
National Category
Chemical Sciences
Identifiers
urn:nbn:se:umu:diva-54278 (URN) 10.1002/cem.2448 (DOI) 000305510100006 ()
Available from: 2012-04-23 Created: 2012-04-23 Last updated: 2017-12-07 Bibliographically approved
4. Global, local and unique decompositions in OnPLS for multiblock data analysis
2013 (English) In: Analytica Chimica Acta, ISSN 0003-2670, E-ISSN 1873-4324, Vol. 791, 13-24 p. Article in journal (Other academic) Published
Abstract [en]

Background: OnPLS is an extension of O2PLS that decomposes a set of matrices, in either multiblock or path model analysis, such that each matrix consists of two parts: a globally joint part containing variation shared with all other connected matrices, and another containing unique or locally joint variation, i.e. variation that is specific to a particular matrix or shared with some, but not all, other connected matrices.

Results: A further extension of OnPLS suggested here decomposes the non-globally joint parts into locally joint and unique parts, using the OnPLS method to first find and extract a globally joint model, and then applying OnPLS recursively to subsets of matrices containing the non-globally joint variation remaining after the globally joint variation has been extracted. This results in a set of locally joint models. The variation that is left after the globally joint and locally joint variation has been extracted is not related (by definition) to the other matrices and thus represents the strictly unique variation specific to each matrix. The method's utility is demonstrated by its application to both a simulated data set and a real data set acquired from metabolomic, proteomic and transcriptomic profiling of three genotypes of hybrid aspen.

Conclusions: The results show that OnPLS can successfully decompose each matrix into global, local and unique models, resulting in lower numbers of globally joint components and higher intercorrelations of scores. OnPLS also increases the interpretability of models of connected matrices, because of the locally joint and unique models it generates.
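The recursive step in the Results paragraph works on "subsets of matrices". As a rough sketch of what that bookkeeping might look like, the snippet below enumerates the candidate subsets for locally joint models among three blocks. The ordering (larger subsets first), the function name `candidate_local_subsets` and the block labels are assumptions made for illustration; the abstract does not specify how the subsets are visited.

```python
from itertools import combinations

def candidate_local_subsets(blocks):
    """Yield proper subsets with at least two blocks, largest first.

    The full set is excluded because variation shared by all blocks
    belongs to the globally joint model; single blocks are excluded
    because variation shared with no other block is unique.
    """
    for size in range(len(blocks) - 1, 1, -1):
        for subset in combinations(blocks, size):
            yield subset

# Hypothetical labels for the three omics blocks named in the abstract.
blocks = ["metabolomics", "proteomics", "transcriptomics"]
subsets = list(candidate_local_subsets(blocks))
```

With three blocks there are three candidate pairs; with four blocks the same function would yield the four triples followed by the six pairs.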

Keyword
OnPLS, OPLS, O2PLS, Orthogonal variation, PLS, Decomposition
National Category
Chemical Sciences
Research subject
Statistics; Analytical Chemistry; Genetics
Identifiers
urn:nbn:se:umu:diva-55433 (URN) 10.1016/j.aca.2013.06.026 (DOI)
Funder
Swedish Research Council, 2011-6044
eSSENCE - An eScience Collaboration
Available from: 2012-05-14 Created: 2012-05-14 Last updated: 2017-12-07 Bibliographically approved

Open Access in DiVA

OnPLS (1437 kB), 1251 downloads
File information
File name: FULLTEXT01.pdf
File size: 1437 kB
Checksum (SHA-512): 0d8ab3c51919ae731f3640c76d0592ac1e8b1805967d0172c90a77541c15a4b846803f56114d90453d773377fb88c9420e310b3c366200d1f7c4c0518e867156
Type: fulltext
Mimetype: application/pdf

Search in DiVA

By author/editor
Löfstedt, Tommy
By organisation
Department of Chemistry
Chemical Sciences
