Gating Networks in Learning Machines for Multimodal Data: Decision Fusion on Single Modality Classifiers
KTH, School of Electrical Engineering and Computer Science (EECS).
2019 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

Different architectures of gating networks that aggregate information from multiple modalities, and their suitability for decision fusion, are investigated. The research question, how a gating network for decision fusion in multimodal classification problems compares to other alternatives, is answered by a quantitative and inductive reasoning approach. This is done by training different machine learning methods on individual modalities and fusing their predictions for the final classification using M-MNIST, a new data set with three modalities (image, audio, and text). The gating networks achieve greater classification accuracy when fusing information from all modalities than when considering only one modality or when no fusion is used. The potential of the gating network is demonstrated by training it on modalities with different levels of classification accuracy: it achieves the highest average normalized gain while also scoring the highest validation accuracy of the three fusion methods, and the results indicate that the gating network can suppress noise in the data. Moreover, adding an additional weak modality to the gating network improves the classification accuracy, which hints that there might be an incentive to use many weak modalities instead of a few strong ones.
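
The decision fusion step described in the abstract can be illustrated with a minimal sketch. This record does not specify the thesis's gating architectures, so everything below is an assumption made for illustration: a single linear layer followed by a softmax produces one mixing weight per modality, and the fused prediction is the weighted sum of the single-modality class distributions. The names `gate_weights`, `fuse`, `gate_matrix`, `gate_bias` and the probability values are hypothetical, not taken from the thesis.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def gate_weights(modality_probs, gate_matrix, gate_bias):
    """Assumed gate: one linear layer plus softmax that maps the concatenated
    per-modality predictions to one mixing weight per modality."""
    features = [p for probs in modality_probs for p in probs]
    logits = [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(gate_matrix, gate_bias)
    ]
    return softmax(logits)

def fuse(modality_probs, weights):
    """Weighted sum of the per-modality class distributions."""
    n_classes = len(modality_probs[0])
    return [
        sum(w * probs[c] for w, probs in zip(weights, modality_probs))
        for c in range(n_classes)
    ]

# Hypothetical predictions over 3 classes from image, audio and text classifiers.
image_probs = [0.7, 0.2, 0.1]
audio_probs = [0.4, 0.4, 0.2]
text_probs = [0.1, 0.8, 0.1]
modality_probs = [image_probs, audio_probs, text_probs]

# An untrained gate (all-zero parameters) yields uniform weights,
# which reduces the fusion to plain averaging of the three classifiers.
gate_matrix = [[0.0] * 9 for _ in range(3)]
gate_bias = [0.0, 0.0, 0.0]

weights = gate_weights(modality_probs, gate_matrix, gate_bias)
fused = fuse(modality_probs, weights)
predicted_class = max(range(len(fused)), key=fused.__getitem__)
```

In a trained system the gate parameters would be learned from data, so that less reliable modalities receive smaller weights for a given input; the zero-initialised gate here only shows the mechanics of the fusion.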

Abstract [sv]

Different architectures for gating networks that aggregate information from several modalities are investigated here, along with their suitability for decision fusion. The research question, "How well does a gating network hold up for fusing decisions in multimodal classification problems?", is answered with a quantitative and inductive approach. Different machine learning methods were trained on single modalities and their predictions were then fused for classification on M-MNIST, a new data set with three modalities (image, audio, and text). The network achieves better classification results when information from all modalities is used than when only one modality is used (or without fusion). The potential of the network is illustrated by training on modalities with different levels of classification ability. It obtains the best result, measured as the highest average normalized gain, together with the highest validation result of the three fusion methods. Here the results indicate that the gating network can suppress noise in the data. Adding a further (weak) modality to the network improves classification quality, which suggests that there may be reason to use many weak modalities instead of a few strong ones.

Place, publisher, year, edition, pages
2019, p. 31
Series
TRITA-EECS-EX ; 2019:159
Keywords [en]
Multimodality; Gating Networks; Decision Fusion; Learning Machines; Internet-Based Cognitive Therapy
Keywords [sv]
Multimodal, Gating networks, Decision Fusion, learning machines, Internet-based cognitive therapy
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:kth:diva-252921
OAI: oai:DiVA.org:kth-252921
DiVA, id: diva2:1322736
Examiners
Available from: 2019-06-11. Created: 2019-06-11. Last updated: 2019-06-11. Bibliographically approved.

Open Access in DiVA

Full text (887 kB)
File name: FULLTEXT01.pdf
File size: 887 kB
Type: fulltext
Mimetype: application/pdf
Checksum (SHA-512): 82702f88ec7be8b62b9e744b1c90ac3a4074de68b620fbffdc4e03f2fee021ef1ba8929b2e5aadeb17971979edc419c7004fa97edf14080cd40833c3ad8d4cf2

