Abstract
In our quest to decode the visual processing of the human brain, we aim to reconstruct dynamic visual experiences from brain activities, a task both challenging and intriguing. Although recent advances have made significant strides in reconstructing static images from non-invasive brain recordings, translating continuous brain activity into video has not been extensively explored. Our study introduces NeuralFlix, a simple but effective dual-phase framework designed to address the inherent challenges in decoding fMRI data, such as noise, spatial redundancy, and temporal lags. The framework employs spatial and temporal augmentation for contrastive learning of fMRI representations, and a diffusion model enhanced with dependent prior noise for generating videos. Tested on a publicly available fMRI dataset, NeuralFlix demonstrates promising results, significantly outperforming previous state-of-the-art models by margins of 20.97%, 31.00%, and 12.30%, respectively, in decoding the brain activities of three subjects individually, as measured by SSIM.
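The paper itself is not reproduced here, but the two phases named in the abstract can be illustrated. Below is a minimal PyTorch-style sketch, assuming a `(batch, time, voxels)` fMRI layout; the function names (`augment_fmri`, `info_nce`, `dependent_prior_noise`), the masking/shift augmentations, and the noise-mixing coefficient `alpha` are illustrative assumptions, not NeuralFlix's actual implementation.

```python
import torch
import torch.nn.functional as F

def augment_fmri(x: torch.Tensor, mask_ratio: float = 0.2, max_shift: int = 1) -> torch.Tensor:
    """Hypothetical augmentations: spatial = randomly zero out voxels
    (redundancy/noise); temporal = roll the window by a few TRs
    (haemodynamic lag). x has shape (batch, time, voxels)."""
    spatial_mask = (torch.rand_like(x) > mask_ratio).float()
    shift = int(torch.randint(-max_shift, max_shift + 1, (1,)))
    return torch.roll(x * spatial_mask, shifts=shift, dims=1)

def info_nce(z1: torch.Tensor, z2: torch.Tensor, tau: float = 0.1) -> torch.Tensor:
    """Standard InfoNCE contrastive loss between the embeddings of two
    augmented views; matching batch indices are the positive pairs."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / tau                       # (batch, batch) similarities
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, labels)

def dependent_prior_noise(batch: int, frames: int, shape: tuple, alpha: float = 0.5) -> torch.Tensor:
    """One common construction of frame-correlated prior noise: mix a
    base noise shared across frames with independent per-frame noise,
    so adjacent frames start from similar diffusion latents. The mix
    preserves unit variance; the paper's exact construction may differ."""
    shared = torch.randn(batch, 1, *shape).expand(-1, frames, *shape)
    indep = torch.randn(batch, frames, *shape)
    return alpha ** 0.5 * shared + (1 - alpha) ** 0.5 * indep
```

In this reading of the abstract, phase one trains an fMRI encoder by minimizing `info_nce` over two `augment_fmri` views of the same recording, and phase two initializes the video diffusion latents with `dependent_prior_noise` rather than i.i.d. Gaussian noise.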
| Original language | English |
| --- | --- |
| Title of host publication | Annual AAAI Conference on Artificial Intelligence |
| Publication status | Accepted/In press - 25 Feb 2025 |