Multimodal spatio-temporal-spectral fusion for deep learning applications in physiological time series processing

Processing physiological signals poses challenges of dimensionality (due to the number of channels), heterogeneity (due to differing value ranges), and multimodality (due to the different sources). The current study had two aims. First, it used time-frequency ridge mapping to explore whether fused information from joint EEG-ECG recordings can track transitions between different states of anesthesia. Second, it investigated how effectively pre-trained state-of-the-art deep learning architectures learn discriminative features from the fused data in order to classify states during anesthesia. Experimental data from patients with healthy brains undergoing surgery (N = 20) were used. The data were recorded with the BrainStatus device, using one ECG channel and 10 EEG channels. The results support the hypothesis that ridge fusion not only captures temporal-spectral progression patterns across all modalities and channels, but that this simplified interpretation of the time-frequency representation also accelerates training and significantly improves the efficiency of deep models. Classification outcomes demonstrate that this fusion yields better performance, with 94.14% precision and a 0.28 s prediction time, than commonly used data-level fusion methods. In conclusion, the proposed fusion technique makes it possible to embed time-frequency information, as well as spatial dependencies across modalities and channels, in a single 2D array. This integration is of significant benefit in obtaining a more unified and global view of the different aspects of the physiological data at hand while maintaining the desired level of decision-making performance.
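To make the idea of a 2D ridge array concrete, the following is a minimal sketch, not the study's implementation. It assumes that a "ridge" can be approximated by the dominant frequency in each short-time Fourier frame, and that fusion amounts to stacking one ridge per channel into a channels-by-time array. The function name `ridge_map`, the window and hop lengths, the sampling rate, and the synthetic sinusoids standing in for the 10 EEG channels and one ECG channel are all illustrative assumptions.

```python
import numpy as np


def ridge_map(signals, fs, win_len=256, hop=128):
    """Stack per-channel dominant-frequency ridges into one 2D array.

    signals : array of shape (n_channels, n_samples)
    fs      : sampling rate in Hz
    Returns an array of shape (n_channels, n_frames) holding, for every
    channel and time frame, the frequency (Hz) with the highest power.
    NOTE: the argmax ridge is a simplifying assumption for illustration.
    """
    signals = np.asarray(signals, dtype=float)
    n_channels, n_samples = signals.shape
    if n_samples < win_len:
        raise ValueError("signal is shorter than one analysis window")

    window = np.hanning(win_len)
    freqs = np.fft.rfftfreq(win_len, d=1.0 / fs)
    starts = np.arange(0, n_samples - win_len + 1, hop)

    ridges = np.empty((n_channels, len(starts)))
    for j, s in enumerate(starts):
        frame = signals[:, s:s + win_len]
        # Remove each channel's mean so the DC bin does not dominate.
        frame = frame - frame.mean(axis=1, keepdims=True)
        power = np.abs(np.fft.rfft(frame * window, axis=1)) ** 2
        # Ridge value: frequency of the strongest spectral component.
        ridges[:, j] = freqs[np.argmax(power, axis=1)]
    return ridges


# Synthetic stand-in for 11 channels (hypothetical data, not the study's):
# each channel is a pure tone at a different frequency between 2 and 12 Hz.
fs = 256
t = np.arange(4 * fs) / fs
tones = np.arange(2, 13)
signals = np.sin(2 * np.pi * tones[:, None] * t[None, :])

fused = ridge_map(signals, fs)  # one row per channel, one column per frame
```

Each row of `fused` traces how one channel's dominant frequency evolves over time, so the whole array can be passed to an image-style deep model as a single 2D input. Channels with different amplitude ranges become comparable because only frequency positions, not raw amplitudes, are stored.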