Adaptive Fusion Techniques for Effective Multimodal Deep Learning

dc.contributor.advisor: Vechtomova, Olga
dc.contributor.author: Sahu, Gaurav
dc.date.accessioned: 2020-08-28T19:40:04Z
dc.date.available: 2020-08-28T19:40:04Z
dc.date.issued: 2020-08-28
dc.date.submitted: 2020-08-18
dc.description.abstract: Effective fusion of data from multiple modalities, such as video, speech, and text, is a challenging task due to the heterogeneous nature of multimodal data. In this work, we propose fusion techniques that aim to model context from different modalities effectively. Instead of defining a deterministic fusion operation, such as concatenation, for the network, we let the network decide “how” to combine given multimodal features more effectively. We propose two networks: 1) Auto-Fusion network, which aims to compress information from different modalities while preserving the context, and 2) GAN-Fusion, which regularizes the learned latent space given context from complementing modalities. A quantitative evaluation on the tasks of multimodal machine translation and emotion recognition suggests that our adaptive networks can better model context from other modalities than all existing methods, many of which employ massive transformer-based networks. [en]
dc.identifier.uri: http://hdl.handle.net/10012/16194
dc.language.iso: en [en]
dc.pending: false
dc.publisher: University of Waterloo [en]
dc.subject: multimodal deep learning [en]
dc.subject: multimodal fusion [en]
dc.subject: generative adversarial networks [en]
dc.subject: multimodal machine translation [en]
dc.subject: speech emotion recognition [en]
dc.title: Adaptive Fusion Techniques for Effective Multimodal Deep Learning [en]
dc.type: Master Thesis [en]
uws-etd.degree: Master of Mathematics [en]
uws-etd.degree.department: David R. Cheriton School of Computer Science [en]
uws-etd.degree.discipline: Computer Science [en]
uws-etd.degree.grantor: University of Waterloo [en]
uws.contributor.advisor: Vechtomova, Olga
uws.contributor.affiliation1: Faculty of Mathematics [en]
uws.peerReviewStatus: Unreviewed [en]
uws.published.city: Waterloo [en]
uws.published.country: Canada [en]
uws.published.province: Ontario [en]
uws.scholarLevel: Graduate [en]
uws.typeOfResource: Text [en]

Files

Original bundle

Name: Sahu_Gaurav.pdf
Size: 2.35 MB
Format: Adobe Portable Document Format
Description: Main article

License bundle

Name: license.txt
Size: 6.4 KB
Format: Item-specific license agreed upon to submission
Description: