Publication | Closed Access
Jointly Trained Variational Autoencoder for Multi-Modal Sensor Fusion
Citations: 17
References: 18
Year: 2019
Unknown Venue
Sensor Setup, Machine Vision, Machine Learning, Data Science, Engineering, Autoencoders, Spatiotemporal Data Fusion, Variational Autoencoder, Multimodal Sensor Fusion, Fusion Learning, Multi-sensor Information Fusion, Multimodal Signal Processing, Computer Science, Bayesian Information Fusion, Coherent Latent, Sensor Fusion, Deep Learning, Computer Vision
This work presents M²VAE, a novel multi-modal Variational Autoencoder approach derived from the complete marginal joint log-likelihood. This allows end-to-end training of Bayesian information fusion on raw data for all subsets of a sensor setup. Furthermore, we introduce the concept of in-place fusion, applicable to distributed sensing, where latent embeddings of observations need to be fused with new data. To facilitate in-place fusion even on raw data, we introduce a re-encoding loss that stabilizes the decoding and makes visualization of latent statistics possible. We also show that the M²VAE finds a coherent latent embedding, such that a single naïve Bayes classifier performs equally well on all permutations of a bi-modal Mixture-of-Gaussians signal. Finally, we show that our approach outperforms current VAE approaches on a bi-modal MNIST & fashion-MNIST data set and works sufficiently well as a preprocessing step on a tri-modal simulated camera & LiDAR data set from the Gazebo simulator.
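As a rough illustration of what fusing a stored latent embedding with new data can look like, the sketch below combines two diagonal Gaussian latent estimates by a precision-weighted product. This is an assumption made for illustration only: the abstract does not state which fusion rule the M²VAE uses, and the function name `fuse_gaussians` and the example values are hypothetical.

```python
import numpy as np

def fuse_gaussians(mu_a, var_a, mu_b, var_b):
    """Fuse two diagonal Gaussians by multiplying their densities.

    Illustrative assumption, not the paper's confirmed method:
    the product of two Gaussians is again Gaussian (up to scale),
    with summed precisions and a precision-weighted mean.
    """
    mu_a, var_a = np.asarray(mu_a, dtype=float), np.asarray(var_a, dtype=float)
    mu_b, var_b = np.asarray(mu_b, dtype=float), np.asarray(var_b, dtype=float)
    prec_a = 1.0 / var_a          # precision of the stored embedding
    prec_b = 1.0 / var_b          # precision of the new observation
    var = 1.0 / (prec_a + prec_b) # fused variance is never larger than either input
    mu = var * (prec_a * mu_a + prec_b * mu_b)
    return mu, var

# Hypothetical 2-D latent: a stored embedding fused with a new estimate.
mu, var = fuse_gaussians([0.0, 2.0], [1.0, 1.0], [2.0, 2.0], [1.0, 0.25])
```

With equal variances the fused mean lands halfway between the two inputs, and a more certain input (smaller variance) pulls the result toward itself.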