Inter-individual deep image reconstruction
The sensory cortex is characterized by general organizational principles such as topography and hierarchy. However, measured brain activity given identical input exhibits substantially different patterns across individuals. While anatomical and functional alignment methods have been proposed in functional magnetic resonance imaging (fMRI) studies, it remains unclear whether and how hierarchical and fine-grained representations can be converted between individuals while preserving the encoded perceptual contents. In this study, we evaluated machine learning models called neural code converters that predict one's brain activity pattern (target) from another's (source) given the same stimulus by the decoding of hierarchical visual features and the reconstruction of perceived images. The training data for converters consisted of fMRI data obtained with identical sets of natural images presented to pairs of individuals. Converters were trained using the whole visual cortical voxels from V1 through the ventral object areas, without explicit labels of visual areas. We decoded the converted brain activity patterns into hierarchical visual features of a deep neural network (DNN) using decoders pre-trained on the target brain and then reconstructed images via the decoded features. Without explicit information about visual cortical hierarchy, the converters automatically learned the correspondence between the visual areas of the same levels. DNN feature decoding at each layer showed higher decoding accuracies from corresponding levels of visual areas, indicating that hierarchical representations were preserved after conversion. The viewed images were faithfully reconstructed with recognizable silhouettes of objects even with relatively small amounts of data for converter training. The conversion also allows pooling data across multiple individuals, leading to stably high reconstruction accuracy compared to those converted between individuals. These results demonstrate that the conversion learns hierarchical correspondence and preserves the fine-grained representations of visual features, enabling visual image reconstruction using decoders trained on other individuals.