Digitala Vetenskapliga Arkivet

miRFA: an automated pipeline for microRNA functional analysis with correlation support from TCGA and TCPA expression data in pancreatic cancer
Umeå University, Faculty of Medicine, Department of Surgical and Perioperative Sciences, Surgery.
Umeå University, Faculty of Medicine, Department of Surgical and Perioperative Sciences, Surgery. ORCID iD: 0000-0002-7516-9543
2019 (English) In: BMC Bioinformatics, E-ISSN 1471-2105, Vol. 20, article id 393. Article in journal (Refereed), Published
Abstract [en]

Background: MicroRNAs (miRNAs) are small RNAs that regulate gene expression at the post-transcriptional level and are emerging as potentially important biomarkers for various disease states, including pancreatic cancer. In silico functional analysis of miRNAs usually consists of miRNA target prediction followed by functional enrichment analysis of the predicted targets. Since miRNA target prediction methods generate a large number of false-positive target genes, further validation is needed to narrow down interesting candidate miRNA targets. One commonly used method correlates miRNA and mRNA expression to assess the regulatory effect of a particular miRNA.
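The correlation step described above can be illustrated with a minimal sketch. The abstract does not state which correlation coefficient or cut-off the authors used, so the Pearson coefficient, the threshold of -0.3, the function names, and the toy gene names are all assumptions made for illustration. The sketch keeps only predicted targets whose expression is negatively correlated with the miRNA, on the common assumption that a miRNA represses its targets.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant vector is treated as uncorrelated
    return sxy / sqrt(sxx * syy)


def supported_targets(mirna_expr, target_expr, threshold=-0.3):
    """Return {gene: r} for predicted targets with r <= threshold.

    `threshold` is a hypothetical cut-off, not a value from the paper.
    """
    kept = {}
    for gene, expr in target_expr.items():
        r = pearson(mirna_expr, expr)
        if r <= threshold:
            kept[gene] = r
    return kept


# Toy expression values across five samples (invented for illustration).
mirna = [1.0, 2.0, 3.0, 4.0, 5.0]
targets = {
    "GENE_A": [5.0, 4.0, 3.0, 2.0, 1.0],  # falls as the miRNA rises
    "GENE_B": [1.0, 2.0, 3.0, 4.0, 5.0],  # rises with the miRNA
    "GENE_C": [2.0, 2.0, 2.0, 2.0, 2.0],  # constant expression
}
kept = supported_targets(mirna, targets)
print(kept)
```

In this toy input only GENE_A survives the filter. A rank-based coefficient such as Spearman's could be substituted where expression values are far from normally distributed.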

The aim of this study was to build a bioinformatics pipeline in R for miRNA functional analysis, including correlation analyses between the expression level of a miRNA and the mRNA and protein expression levels of its targets, using data available from The Cancer Genome Atlas (TCGA) and The Cancer Proteome Atlas (TCPA). TCGA-derived expression data for specific mature miRNA isoforms from pancreatic cancer tissue were used.

Results: Fifteen circulating miRNAs with significantly altered expression levels detected in pancreatic cancer patients were queried separately in the pipeline. The pipeline generated predicted miRNA target genes, enriched Gene Ontology (GO) terms, and enriched Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. Predicted miRNA targets were evaluated by correlation analyses between each miRNA and its predicted targets. MiRNA functional analysis in combination with Kaplan-Meier survival analysis suggests that hsa-miR-885-5p could act as a tumor suppressor and should be validated as a potential prognostic biomarker in pancreatic cancer.
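Functional enrichment of a target gene list against a GO term or KEGG pathway is commonly assessed with a one-sided hypergeometric (over-representation) test. The abstract does not say which test or R package the pipeline uses, so the following is only a sketch of that general technique; the function name and the toy counts are assumptions.

```python
from math import comb


def enrichment_p_value(n_universe, n_in_term, n_targets, n_overlap):
    """One-sided hypergeometric p-value P(X >= n_overlap).

    n_universe: number of genes in the background set
    n_in_term:  background genes annotated to the term or pathway
    n_targets:  genes in the miRNA target list
    n_overlap:  target genes that are annotated to the term
    """
    upper = min(n_in_term, n_targets)
    total = comb(n_universe, n_targets)
    # Sum the probabilities of observing n_overlap or more annotated genes.
    tail = sum(
        comb(n_in_term, k) * comb(n_universe - n_in_term, n_targets - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy numbers: 20 background genes, 5 in the pathway,
# 4 predicted targets of which 3 fall in the pathway.
p = enrichment_p_value(20, 5, 4, 3)
print(f"p = {p:.4f}")
```

When many terms are tested for one miRNA, the resulting p-values would normally be adjusted for multiple testing before terms are reported as enriched.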

Conclusions: Our miRNA functional analysis (miRFA) pipeline can serve as a valuable tool in biomarker discovery involving mature miRNAs associated with pancreatic cancer and could be developed to cover additional cancer types. Results for all mature miRNAs in the TCGA pancreatic adenocarcinoma dataset can be studied and downloaded through a Shiny web application at https://emmbor.shinyapps.io/mirfa/.

Place, publisher, year, edition, pages
BioMed Central, 2019. Vol. 20, article id 393
Keywords [en]
miRNA functional analysis, miRNA target prediction, Functional enrichment, Mature miRNA, TCGA, TCPA, Pancreatic cancer
National Category
Bioinformatics and Computational Biology
Identifiers
URN: urn:nbn:se:umu:diva-161899
DOI: 10.1186/s12859-019-2974-3
ISI: 000475761100001
PubMedID: 31311505
Scopus ID: 2-s2.0-85069159500
OAI: oai:DiVA.org:umu-161899
DiVA, id: diva2:1341174
Available from: 2019-08-07 Created: 2019-08-07 Last updated: 2025-02-07 Bibliographically approved
In thesis
1. In search of early biomarkers in pancreatic ductal adenocarcinoma using multi-omics and bioinformatics
2022 (English) Doctoral thesis, comprehensive summary (Other academic)
Alternative title [sv]
På jakt efter tidiga biomarkörer i bukspottkörtelcancer med hjälp av multi-omik och bioinformatik
Abstract [en]

Background: Pancreatic ductal adenocarcinoma (PDAC) is a very aggressive malignancy with a 5-year survival of 10 %. Surgery is the only curative treatment. Unfortunately, few patients are eligible for surgery due to late detection. Thus, ways to detect the disease at an earlier stage are needed, and good screening biomarkers could serve that purpose. Previous studies have analyzed circulating analytes in prospective studies to identify early PDAC signals. One such class is microRNAs (miRNAs). MicroRNAs are non-coding RNAs of around 22 nucleotides that act as post-transcriptional regulators by interacting with messenger RNAs (mRNAs). The function of a miRNA can be elucidated by target prediction, to identify its potential targets, followed by enrichment analysis of the predicted targets. Challenges with this approach include the large number of false positives generated and the fact that miRNAs can perform their role in a tissue- or disease-specific manner. Other classes of analytes that have previously been studied in prospective PDAC cohorts are metabolites and proteins.

Aims: This thesis has three aims. First, to build a miRNA functional analysis pipeline with correlation support between miRNA and its predicted target genes. Second, to identify potential circulating biomarkers for early detection of PDAC using multi-omics. Third, to identify potential prognostic metabolites in a prospective PDAC cohort.

Methods: We used publicly available data from The Cancer Genome Atlas pancreatic adenocarcinoma cohort (TCGA-PAAD) and pre-diagnostic plasma samples from the Northern Sweden Health and Disease Study. We built a pipeline in R including miRNA, mRNA, and protein expression data from TCGA-PAAD for in silico miRNA functional analysis. Pre-diagnostic plasma samples from future PDAC patients as well as matched healthy controls were analyzed using multi-omics. Tissue polypeptide specific antigen (TPS) was analyzed by enzyme-linked immunosorbent assay in 267 future PDAC samples and 320 healthy controls. Metabolomics and clinical biomarkers (carbohydrate antigen (CA) 19-9, carcinoembryonic antigen (CEA), and CA 15-3) were profiled in 100 future PDAC samples and 100 healthy controls using liquid chromatography-mass spectrometry (MS), gas chromatography-MS, and multiplex technology. Of these, a subset of 39 future PDAC patients and 39 healthy controls were profiled for 2083 microRNAs using targeted sequencing and 644 proteins using proximity extension assays. Circulating levels of multi-omics analytes were analyzed using conditional or unconditional logistic regression. Least absolute shrinkage and selection operator (LASSO) in combination with 500 bootstrap iterations identified the most informative variables. The prognostic value of metabolites was assessed using Cox regression. Multi-omics factor analysis (MOFA) and data integration analysis for biomarker discovery using latent components (DIABLO) were used for multi-omics integration analyses.
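The combination of LASSO with bootstrap iterations can be sketched as follows: the model is refitted on resampled data, and each variable is scored by how often it receives a non-zero coefficient. This is an illustration only. It swaps in a squared-error (linear) LASSO solved by coordinate descent to stay dependency-free, whereas a case-control analysis like the one described would more plausibly use a penalised logistic model. The penalty `lam`, the toy data, and all function names are assumptions; only the default of 500 iterations is taken from the text.

```python
import random


def soft_threshold(z, lam):
    """Soft-thresholding operator used in the LASSO coordinate update."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X, y, lam, n_iter=30):
    """Minimise (1/2n)*||y - Xb||^2 + lam*||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - Xb; equals y while all coefficients are 0
    for _ in range(n_iter):
        for j in range(p):
            col = [row[j] for row in X]
            zj = sum(v * v for v in col) / n
            if zj == 0.0:
                continue
            # Covariance of feature j with the residual that excludes feature j.
            rho = sum(c * (r + c * beta[j]) for c, r in zip(col, resid)) / n
            new = soft_threshold(rho, lam) / zj
            delta = new - beta[j]
            if delta != 0.0:
                resid = [r - c * delta for c, r in zip(col, resid)]
                beta[j] = new
    return beta


def selection_frequency(X, y, lam, n_boot=500, seed=0):
    """Fraction of bootstrap resamples in which each variable is selected."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = [0] * p
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        beta = lasso_cd([X[i] for i in idx], [y[i] for i in idx], lam)
        for j, b in enumerate(beta):
            if b != 0.0:
                counts[j] += 1
    return [c / n_boot for c in counts]


# Toy data: only the first of three variables drives the outcome.
data_rng = random.Random(1)
X = [[data_rng.gauss(0, 1) for _ in range(3)] for _ in range(40)]
y = [2.0 * row[0] + data_rng.gauss(0, 0.1) for row in X]
freq = selection_frequency(X, y, lam=0.3, n_boot=100)
print(freq)
```

Variables selected in a large share of resamples are the ones such a procedure would treat as most informative; the cut-off on that share is a further analysis choice not specified here.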

Results: An automated pipeline was built consisting of 1) miRNA target prediction, 2) correlation analyses between a miRNA and its targets at the mRNA and protein expression levels, and 3) functional enrichment of correlated targets to identify enriched Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways and Gene Ontology (GO) terms for a specific miRNA. The pipeline was run for all microRNAs (~700) detected in the TCGA-PAAD cohort. These results can be downloaded from a Shiny app (https://emmbor.shinyapps.io/mirfa/). TPS was not altered in pre-diagnostic PDAC patients up to 24 years prior to diagnosis, but increased at diagnosis (OR = 1.03, 95 % CI: 1.01-1.05). Internal areas under the curve (AUCs) of 0.74, 0.80, and 0.88 were achieved for five metabolites, two proteins, and two miRNAs that were selected by LASSO and bootstrap iterations, in combination with CA 19-9. Neither MOFA nor DIABLO clearly separated future PDAC cases from healthy controls.
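The area under the ROC curve reported above can be read as the probability that a randomly chosen future case receives a higher model score than a randomly chosen control. A minimal sketch of that pairwise definition follows; the thesis summary does not describe how its AUCs were computed, and the scores below are invented for illustration.

```python
def auc(case_scores, control_scores):
    """AUC as the share of case-control pairs ranked correctly (ties count 0.5)."""
    if not case_scores or not control_scores:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for c in case_scores:
        for h in control_scores:
            if c > h:
                wins += 1.0
            elif c == h:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical predicted risks for three cases and three controls.
cases = [0.9, 0.8, 0.4]
controls = [0.7, 0.3, 0.4]
value = auc(cases, controls)
print(round(value, 3))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. An AUC estimated on the same samples used for variable selection tends to be optimistic compared with one from an independent cohort.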

Conclusions: Our bioinformatics pipeline for in silico functional analysis of microRNAs successfully identifies enriched KEGG pathways and GO terms for miRNA isoforms. The investigated plasma samples are heterogeneous, but among the analyzed variables, we identified five metabolites, two proteins, and two microRNAs with the highest potential for early PDAC detection. CA 19-9 levels increased closer to diagnosis. We identified five fatty acids that could be studied in a diagnostic PDAC cohort as prognostic biomarkers.

Place, publisher, year, edition, pages
Umeå: Umeå University, 2022. p. 69
Series
Umeå University medical dissertations, ISSN 0346-6612 ; 2211
Keywords
Pancreatic cancer, microRNA functional analysis, TPS, miRNomics, metabolomics, proteomics, plasma samples, early detection, risk, survival
National Category
Cancer and Oncology; Bioinformatics and Computational Biology
Identifiers
urn:nbn:se:umu:diva-201158 (URN); 978-91-7855-929-9 (ISBN); 978-91-7855-928-2 (ISBN)
Public defence
2022-12-16, Hörsal B Norrlands universitetssjukhus, Unod T9, Norrlands universitetssjukhus, Umeå, 13:00 (English)
Available from: 2022-11-25 Created: 2022-11-22 Last updated: 2025-02-05 Bibliographically approved

Open Access in DiVA

fulltext (4584 kB), 425 downloads
File information
File name: FULLTEXT01.pdf
File size: 4584 kB
Checksum (SHA-512):
1f99c597231fe35dbf345d045b941232030cea5a20e9d2b2acc21cbbf69cf3e8c0cbbef8ac150d7f83fa33fca18ec25dc1968c74122567d2b3f25b9c0f0bb839
Type: fulltext
Mimetype: application/pdf

Search in DiVA

By author/editor
Borgmästars, Emmy; Sund, Malin