Digitala Vetenskapliga Arkivet

Weather Forecasting with Multimodal Fusion Architecture Search
KTH, School of Electrical Engineering and Computer Science (EECS).
2024 (English)
Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
Alternative title
Väderprognos med multimodal fusion och arkitektursökning (Swedish)
Abstract [en]

Weather forecasting plays a crucial role in both science and society, with accurate predictions offering significant social and economic benefits. However, existing weather forecasting systems based on Numerical Weather Prediction (NWP) models are computationally expensive and suffer from long inference times, limiting their practical applications. In recent years, deep learning-based models have emerged as a promising alternative, particularly for short-term precipitation forecasting. Despite this promise, these models have yet to outperform NWP systems for longer-term predictions due to the complexity of the task. Recent research has focused on developing multimodal video prediction models for weather forecasting by integrating various input modalities. One of the latest proposed architectures in this domain is CrevNet. In this thesis, we introduce a novel architecture that combines CrevNet with the multimodal neural architecture search framework BM-NAS. Since weather data can be treated as a form of video, the proposed architecture may also be suitable for other multimodal video prediction applications. Our results indicate that the proposed CrevNet combined with BM-NAS did not outperform the standard CrevNet for weather forecasting. This may be attributed to the increased complexity introduced by BM-NAS and the need for further hyperparameter tuning. However, given the growing interest in models capable of incorporating data from multiple modalities, we believe the proposed CrevNet-BM-NAS architecture serves as a valuable starting point for exploring multimodal fusion architecture search in video prediction applications beyond weather forecasting.

Abstract [sv]

Weather forecasts play a crucial role in both science and society, and accurate forecasts offer significant social and economic benefits. However, current weather forecasting systems, which are based on Numerical Weather Prediction (NWP) models, are computationally expensive and have long inference times, which limits their practical usefulness. In recent years, deep learning-based models have emerged as a promising alternative, particularly for short-term precipitation forecasts. Owing to the complexity of the problem, however, these models have not yet outperformed NWP systems for long-term forecasts. Recently, researchers have begun developing multimodal video prediction models for weather forecasting by combining different modalities. One of the most recently proposed architectures in this area is CrevNet. In this master's thesis, we introduce a new architecture that combines CrevNet with the multimodal neural architecture search framework BM-NAS. Since weather data can be regarded as video, the proposed architecture is also expected to be suitable for other multimodal video prediction applications. Our results show that the proposed CrevNet combined with BM-NAS did not outperform the standard CrevNet for weather forecasting. This may be attributed to the increased complexity introduced by BM-NAS and the need for further hyperparameter tuning. But since the ability to integrate input from different modalities in a single model is of great interest to the research community, we believe the proposed CrevNet-BM-NAS can serve as a starting point for exploring multimodal fusion architecture search in video prediction beyond weather forecasting.

Place, publisher, year, edition, pages
2024. p. 46
Series
TRITA-EECS-EX ; 2024:918
Keywords [en]
Weather Prediction, Multimodal Data Fusion, Neural Architecture Search, Video Prediction
Keywords [sv]
Väderprognos, Multimodal datafusion, Neural arkitektursökning, Videoprediktion
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:kth:diva-361031
OAI: oai:DiVA.org:kth-361031
DiVA, id: diva2:1943523
External cooperation
Peltarion
Supervisors
Examiners
Available from: 2025-03-14 Created: 2025-03-11 Last updated: 2025-03-14 Bibliographically approved

Open Access in DiVA

fulltext (984 kB), 67 downloads
File information
File name: FULLTEXT02.pdf
File size: 984 kB
Checksum SHA-512: 3f892ae3594ed7d565f001a96c496416b9bb1ceaa7d3fb0ffad1260fd3027de46afe6013bf720077ac8c03a7ee382d70fdce2a8a664f43aa8a1e3714d7af56fb
Type: fulltext
Mimetype: application/pdf

By organisation
School of Electrical Engineering and Computer Science (EECS)
Computer and Information Sciences

Total: 67 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 446 hits