Digitala Vetenskapliga Arkivet

Interoperable and scalable data analysis with microservices: Applications in metabolomics
Uppsala universitet, Medicinska och farmaceutiska vetenskapsområdet, Medicinska fakulteten, Institutionen för medicinska vetenskaper, Klinisk kemi. ORCID iD: 0000-0002-4137-5517
Uppsala universitet, Medicinska och farmaceutiska vetenskapsområdet, Medicinska fakulteten, Institutionen för neurovetenskap, Landtblom: Neurologi. ORCID iD: 0000-0002-7045-1806
2019 (English). In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 35, no. 19, pp. 3752-3760. Journal article (peer-reviewed). Published.
Abstract [en]

Motivation

Developing a robust and performant data analysis workflow that integrates all necessary components whilst still being able to scale over multiple compute nodes is a challenging task. We introduce a generic method based on the microservice architecture, where software tools are encapsulated as Docker containers that can be connected into scientific workflows and executed using the Kubernetes container orchestrator.

Results

We developed a Virtual Research Environment (VRE) that facilitates the rapid integration of new tools and the development of scalable and interoperable workflows for metabolomics data analysis. The environment can be launched on demand on cloud resources and desktop computers. IT-expertise requirements on the user side are kept to a minimum, and workflows can be re-used effortlessly by any novice user. We validated our method in the field of metabolomics on two mass spectrometry studies, one nuclear magnetic resonance spectroscopy study and one fluxomics study. We showed that the method scales dynamically with increasing availability of computational resources. We demonstrated that the method facilitates interoperability by integrating the major software suites, resulting in a turn-key workflow encompassing all steps for mass-spectrometry-based metabolomics, including preprocessing, statistics and identification. Microservices is a generic methodology that can serve any scientific discipline and opens the way for new types of large-scale integrative science.

Place, publisher, year, edition, pages
2019. Vol. 35, no. 19, pp. 3752-3760
Keywords [en]
Bioinformatics, e-infrastructure, microservices, metabolomics, kubernetes, Docker, container
Identifiers
URN: urn:nbn:se:uu:diva-390670; DOI: 10.1093/bioinformatics/btz160; ISI: 000499322300026; PubMedID: 30851093; OAI: oai:DiVA.org:uu-390670; DiVA, id: diva2:1342450
Funders
EU, Horizon 2020, 654241; Swedish Research Council Formas; Åke Wiberg Foundation; Swedish National Infrastructure for Computing (SNIC)
Note

Title in thesis list of papers: Interoperable and scalable metabolomics data analysis with microservices

Available from: 2019-03-09. Created: 2019-08-13. Last updated: 2020-01-07. Bibliographically approved.
In thesis
1. Enabling Scalable Data Analysis on Cloud Resources with Applications in Life Science
2019 (English). Doctoral thesis, with articles (Other academic)
Abstract [en]

Over the past 20 years, the rise of high-throughput methods in life science has enabled research laboratories to produce massive datasets of biological interest. When dealing with this "data deluge" of modern biology, researchers encounter two major challenges: first, there is a need for substantial technical skills for dealing with Big Data; and second, infrastructure procurement becomes difficult. In connection with this second challenge, the computing model and business trend that was originally popularized by Amazon under the name of cloud computing represents an interesting opportunity. Instead of buying computing infrastructure upfront, cloud providers enable the allocation and release of virtual resources on demand. These resources are then billed with a pay-per-use pricing model, and physical infrastructure management is delegated to the provider. In this thesis, we introduce a number of methods for running Big Data analyses of biological interest using cloud computing. Considerable efforts were made in enabling the application of trusted bioinformatics software to Big Data scenarios, as opposed to reimplementing the existing codebase. Further, we improve the accessibility of the technology with the aim of reducing the entry barrier for biologists. The thesis includes 5 papers. In Papers I and II, we explore the applicability of Apache Spark, one of the leading Big Data analytics platforms in cloud environments, to two drug-discovery use cases. In Paper III, we present a general method for running bioinformatics analyses on the cloud using the microservices-oriented architecture. In Paper IV, we introduce a method that combines microservices and Apache Spark with the aim of providing the best of both technologies. In Paper V, we discuss how to reduce the entry barrier for the allocation of cloud research environments. We show that all of the developed methods scale well and we provide high-level programming interfaces for improving accessibility. We have also made the developed software publicly available.

Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2019. p. 71
Series
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology, ISSN 1651-6214 ; 1846
Keywords
cloud computing, bioinformatics, Big Data, microservices, containers, MapReduce
Research subject
Scientific Computing
Identifiers
urn:nbn:se:uu:diva-390666 (URN); 978-91-513-0730-5 (ISBN)
Public defence
2019-10-10, B42, Uppsala Biomedicinska Centrum, Husargatan 3, Uppsala, 13:15 (English)
Available from: 2019-09-17. Created: 2019-08-22. Last updated: 2019-10-15.
2. Proteomics Studies of Subjects with Alzheimer’s Disease and Chronic Pain
2017 (English). Doctoral thesis, with articles (Other academic)
Abstract [en]

Alzheimer’s disease (AD) is a neurodegenerative disease and the major cause of dementia, affecting more than 50 million people worldwide. Chronic pain is long-lasting, persistent pain that affects more than 1.5 billion people worldwide. Overlapping and heterogeneous symptoms of AD and chronic pain conditions complicate their diagnosis, emphasizing the need for more specific biomarkers to improve the diagnosis and understand the disease mechanisms.

To characterize disease pathology of AD, we measured the protein changes in the temporal neocortex region of the brain of AD subjects using mass spectrometry (MS). We found proteins involved in exo-endocytic and extracellular vesicle functions displaying altered levels in the AD brain, potentially resulting in neuronal dysfunction and cell death in AD.

To detect novel biomarkers for AD, we used MS to analyze cerebrospinal fluid (CSF) of AD patients and found decreased levels of eight proteins compared to controls, potentially indicating abnormal activity of the complement system in AD.

By integrating new proteomics markers with absolute levels of Aβ42, total tau (t-tau) and p-tau in CSF, we improved the prediction accuracy of early diagnosis of AD from 83% to 92%. We found increased levels of chitinase-3-like protein 1 (CH3L1) and decreased levels of neurosecretory protein VGF (VGF) in AD compared to controls.

By exploring the CSF proteome of neuropathic pain patients before and after successful spinal cord stimulation (SCS) treatment, we found altered levels of twelve proteins, involved in neuroprotection, synaptic plasticity, nociceptive signaling and immune regulation.

To detect biomarkers for diagnosing a chronic pain state known as fibromyalgia (FM), we analyzed the CSF of FM patients using MS. We found altered levels of four proteins, representing novel biomarkers for diagnosing FM. These proteins are involved in inflammatory mechanisms, energy metabolism and neuropeptide signaling.

Finally, to facilitate fast and robust large-scale omics data handling, we developed an e-infrastructure. We demonstrated that the e-infrastructure provides high scalability and flexibility and can be applied in virtually any field, including proteomics. This thesis demonstrates that proteomics is a promising approach for gaining deeper insight into mechanisms of nervous system disorders and for finding biomarkers for the diagnosis of such diseases.

Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2017. p. 82
Series
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Medicine, ISSN 1651-6206 ; 1385
Keywords
Bioinformatics, microservices, biomarkers, Alzheimer's disease, chronic pain, fibromyalgia, neuropathic pain, spinal cord stimulation, cloud computing, proteomics, metabolomics, software, workflows, data analysis, mass spectrometry
Research subject
Bioinformatics; Neurology; Geriatrics
Identifiers
urn:nbn:se:uu:diva-331748 (URN); 978-91-513-0111-2 (ISBN)
Public defence
2017-12-05, Rosénsalen, Akademiska sjukhuset, Ing 95/96, nbv, Uppsala, 09:00 (English)
Available from: 2017-11-14. Created: 2017-10-17. Last updated: 2020-01-07.

Open Access in DiVA

fulltext (4324 kB), 33 downloads
File information
File name: FULLTEXT01.pdf; File size: 4324 kB; Checksum (SHA-512):
e97458d11297f07b9de22f888c6d3d596d2de96780ab6547cb5c796822dd46a217cd447af52d35ff409211b2068a70d65c2e0aafe5b6fcda360fa2bac3a57d1c
Type: fulltext; Mimetype: application/pdf

Search in DiVA

By author/editor
Emami Khoonsari, Payam; Burman, Joachim; Capuccini, Marco; Herman, Stephanie; Larsson, Anders; Kultima, Kim; Spjuth, Ola
Total: 189 hits