Learning Cortical Parcellations Using Graph Neural Networks

Edited by: Jordi Solé-Casals, Universitat de Vic - Universitat Central de Catalunya, Spain
Reviewed by: Sun Zhe, RIKEN, Japan; Shijie Zhao, Northwestern Polytechnical University, China
*Correspondence: David R. Haynor, haynor@uw.edu
Specialty section: This article was submitted to Brain Imaging Methods, a section of the journal Frontiers in Neuroscience
Received: 18 October 2021; Accepted: 03 December 2021; Published: 24 December 2021
Citation: Eschenburg KM, Grabowski TJ and Haynor DR (2021) Learning Cortical Parcellations Using Graph Neural Networks. Front. Neurosci. 15:797500. doi: 10.3389/fnins.2021.797500

Deep learning has been applied to magnetic resonance imaging (MRI) for a variety of purposes, ranging from the acceleration of image acquisition and image denoising to tissue segmentation and disease diagnosis. Convolutional neural networks have been particularly useful for analyzing MRI data due to the regularly sampled spatial and temporal nature of the data. However, advances in the field of brain imaging have led to network- and surface-based analyses that are often better represented in the graph domain. In this analysis, we propose a general-purpose cortical segmentation method that, given resting-state connectivity features readily computed during conventional MRI pre-processing and a set of corresponding training labels, can generate cortical parcellations for new MRI data. We applied recent advances in the field of graph neural networks to the problem of cortical surface segmentation, using resting-state connectivity to learn discrete maps of the human neocortex. We found that graph neural networks accurately learn low-dimensional representations of functional brain connectivity that can be naturally extended to map the cortices of new datasets. After optimizing over algorithm type, network architecture, and training features, our approach yielded mean classification accuracies of 79.91% relative to a previously published parcellation. We describe how hyperparameter choices, including training and testing data duration, network architecture, and algorithm choice, affect model performance.

Keywords: graph neural network, parcellation, functional connectivity, representation learning, segmentation, brain, human

1. INTRODUCTION

Neural network approaches such as multi-layer feed-forward networks have been applied to a wide variety of tasks in medical imaging, ranging from disease classification to tissue segmentation. However, these networks do not always take into account the true spatial relationships between data points. Convolutional neural network approaches, such as those applied to static images or dynamic video streams, learn translationally-invariant, multidimensional kernel filters over the data domain. Both of these methods assume that the data is sampled regularly in space, allowing convolution and pooling of information from fixed neighborhood topologies. However, real-world data, such as graph-structured data, is often sampled on irregular domains. Data sampled from graph domains often contains non-uniform topology—individual data points can vary in their neighborhood structure, and notions of direction (e.g., up, down, left, right) do not generalize
well to graphs. This makes learning filters to process graph-structured data very difficult with conventional neural network approaches.

Graph neural networks are a class of neural network models that operate on data distributed over a graph domain. Data are sampled from a graph with an explicit structure defined by a set of nodes and edges. These models have been shown to be useful for graph and node classification tasks, along with learning generative models of data distributed over graphs (Kipf and Welling, 2016b; Hamilton et al., 2017; Zhao et al., 2019; Zeng et al., 2020). Graph convolution networks (GCN), proposed in Defferrard et al. (2016), generalized the idea of convolutional networks on grid-like data to data distributed over irregular domains by applying Chebyshev polynomial approximations of spectral filters to graph data. Graph attention networks (GAT) are based on the idea of an attention function, a learned global function that selectively aggregates information across node neighborhoods. The attention function maps a query and a set of key-value pairs to an output (Vaswani et al., 2017). The output is defined as a weighted sum of the values, where weights are computed using some similarity or kernel function of the key-value pairs.

It is believed that biological signals distributed over the cortical manifold are locally stationary. Given a small cortical patch, voxels sampled from the patch will display similar functional and structural connectivity patterns, cortical thickness and myelin density measures, and gene expression profiles, among various other signals (Glasser and van Essen, 2011; Amunts et al., 2020; Wagstyl et al., 2020). Prior studies have attempted to delineate and map the cortex by identifying contiguous cortical subregions that are characterized by relative uniformity of these signals (Blumensath et al., 2013; Arslan et al., 2015; Baldassano et al., 2015; Gordon et al., 2016). This work is based on the fundamental idea that contiguous regions of the cortex with similar connectivity and histological properties will tend to function as coherent units. Biological signals distributed over the cortex exhibit local but not global stationarity, so any attempt to parcellate the cortex must take both properties into account.

Most brain imaging studies utilize cortical atlases—template maps of the cortex that can be deformed and mapped to individual subjects' brains—to discretize the cortical manifold and simplify downstream analyses (Fischl et al., 2004; Bullmore and Sporns, 2012). However, it remains an open question how to "apply" existing cortical maps to unmapped data. A recent study identified considerable variability in the size, topological organization, and existence of cortical areas defined by functional connectivity across individuals, raising the question of how best to utilize the biological properties of any given unmapped dataset to drive the application of a cortical atlas to this new data (Glasser et al., 2016).

Here, we developed an approach to perform cortical segmentation—a node classification problem—using graph neural networks. The cerebral cortex is often represented as a folded sheet, and a usable parcellation approach must be applicable to this sort of data. Neural networks can be extended to account for non-stationarity in MRI volumes by incorporating 3D-volumetric convolution kernels. However, these approaches are not easily applied to data distributed over 2-D manifolds like the cortical surface. Additionally, more recent large-scale studies interpolate neurological signals, like cortical activation patterns or various histological scalar measures, onto the cortical manifold to mitigate the potential for mixing signals from anatomically close yet geodesically distant cortical regions, e.g., across sulci (Yeo et al., 2011; Glasser et al., 2013). These studies could also benefit from methods that operate directly on graphs.

With the growth of large-scale open-source brain imaging databases [ADNI (Petersen et al., 2010), ABCD (Hagler et al., 2019), HCP (Glasser et al., 2013)], neuroscientists now have access to high-quality data that can be used for training models that can then be applied to new datasets. We leveraged the statistical properties of these high-quality datasets to inform the segmentation of new data using multiple variants of graph neural networks. We considered graph convolution networks and two variants of graph attention networks: standard attention networks (Velickovic et al., 2018), and attention networks with adaptive network depth weighting (a.k.a. jumping-knowledge networks, Xu et al., 2018). We examined how algorithm choice and network parameterization affect cortical segmentation performance. We trained our classification models on high-quality open-source imaging data, and tested them on two datasets with unique spatial and temporal resolutions and different pre-processing pipelines. Other methods have been proposed for delineating the cortex using various registration (Fischl et al., 2004; Robinson et al., 2018), neural network (Hacker et al., 2013; Glasser et al., 2016), label fusion (Asman and Landman, 2012, 2014; Liu et al., 2016), and even graph neural network approaches (Cucurull et al., 2018; Gopinath et al., 2019). To the best of our knowledge, this is the first attempt to examine the performance of common variants of graph neural networks in a whole-brain cortical classification setting and explore their ability to generalize to new datasets using functional magnetic resonance imaging (fMRI). While other studies have proposed the use of graph neural networks to delineate cortical areas, these studies did not perform in-depth analyses on how network architecture, algorithm parameter choices, feature type, and training and testing data parameters impact the predicted cortical maps (Cucurull et al., 2018; Gopinath et al., 2019). To this end, we studied how each of these different variables impacts model performance and prediction reliability.

2. BACKGROUND

2.1. Graph Convolution Networks
Convolution filters over graphs using spectral graph theory were introduced by Defferrard et al. (2016). For a graph G = (V, E) with N nodes and symmetric normalized graph Laplacian L, define the eigendecomposition L = UΛU^T, where the columns of U are the spectral eigenfunctions of G. Given a graph signal x ∈ R^N distributed over G, the graph Fourier transform of x is defined as x̃ = U^T x, and its inverse graph Fourier transform as x = Ux̃. Graph filtering of x is then defined as g_θ(L)x = U g_θ(Λ) U^T x, where g_θ is an arbitrary function of the eigenvalues.
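To make the spectral filtering operation above concrete, the following NumPy sketch (our own illustration, not code from the paper) builds a symmetric normalized Laplacian from a toy adjacency matrix, computes its eigendecomposition, and applies an example low-pass filter g_θ(Λ) to a graph signal; the graph and filter choice are illustrative assumptions.

```python
import numpy as np

def normalized_laplacian(A):
    """Symmetric normalized Laplacian L = I - D^{-1/2} A D^{-1/2}."""
    d = A.sum(axis=1)
    d_inv_sqrt = np.where(d > 0, 1.0 / np.sqrt(d), 0.0)
    return np.eye(A.shape[0]) - (A * d_inv_sqrt[:, None]) * d_inv_sqrt[None, :]

# Toy 4-node path graph (illustrative only)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = normalized_laplacian(A)

# Eigendecomposition L = U diag(lam) U^T
lam, U = np.linalg.eigh(L)

# Graph Fourier transform of a signal x, spectral filtering, inverse transform
x = np.array([1.0, 0.0, 0.0, -1.0])
x_hat = U.T @ x                 # forward graph Fourier transform
g = np.exp(-2.0 * lam)          # example low-pass filter g_theta(Lambda)
x_filtered = U @ (g * x_hat)    # U g_theta(Lambda) U^T x
print(x_filtered)
```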
FIGURE 2 | Graph attention network employing a jumping-knowledge mechanism. The network takes as input the graph adjacency structure and the nodewise feature matrix, and outputs a node-by-label logit matrix. Each GATConv block is composed of multiple attention heads. Arrows indicate the direction of processing. The aggregation function, g(x), which takes as input the embeddings from each GATConv block, learns a convex combination of the layer-wise embeddings.
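The following is a minimal sketch of the jumping-knowledge graph attention architecture shown in Figure 2, written with PyTorch Geometric. The layer count, attention heads, hidden width, and dropout mirror the "optimal" settings reported in section 5.1, but the activation function, the LSTM-based aggregation mode, and the linear classifier head are our own assumptions rather than the authors' released implementation.

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import GATConv, JumpingKnowledge

class JKGAT(torch.nn.Module):
    """Jumping-knowledge graph attention network (sketch of Figure 2)."""
    def __init__(self, in_channels, hidden_channels, num_classes,
                 num_layers=6, heads=4, dropout=0.1):
        super().__init__()
        self.convs = torch.nn.ModuleList()
        dims = in_channels
        for _ in range(num_layers):
            # concat=False keeps each block's output at hidden_channels
            self.convs.append(GATConv(dims, hidden_channels, heads=heads,
                                      concat=False, dropout=dropout))
            dims = hidden_channels
        # LSTM-based aggregation learns per-node weights over layer embeddings
        self.jump = JumpingKnowledge(mode='lstm', channels=hidden_channels,
                                     num_layers=num_layers)
        self.classifier = torch.nn.Linear(hidden_channels, num_classes)

    def forward(self, x, edge_index):
        layer_outputs = []
        for conv in self.convs:
            x = F.elu(conv(x, edge_index))
            layer_outputs.append(x)
        x = self.jump(layer_outputs)   # aggregate layer-wise embeddings
        return self.classifier(x)      # node-by-label logit matrix
```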
3. DATA

The data used in this study come from the Human Connectome Project (HCP) (Glasser et al., 2013, 2016) and from the Midnight Scan Club (MSC) (Gordon et al., 2017). We were specifically interested in examining how models trained on one dataset would perform on another dataset. Specifically, we trained models on data from the HCP (Glasser et al., 2013), one of the highest quality MRI datasets to date in terms of spatial and temporal sampling of brain signals. We then tested our models on images from both the HCP and MSC datasets.

3.1. HCP Dataset
The HCP consortium collected data on a set of 1,200 young adult subjects 21–35 years of age. We utilized a subset of 268 of these datasets (22–35 years; 153 female) from the S500 data release. The HCP acquired high-resolution 0.7 mm isotropic T1w (TI = 1,000 ms, TR = 2,400 ms, TE = 2.14 ms, FA = 8°, FOV = 224 mm, matrix = 320, 256 sagittal slices) and T2w images (TR = 3,200 ms, TE = 565 ms, FOV = 224 mm, matrix = 320). T1w and T2w data were pre-processed using a custom pipeline developed by the HCP (Glasser et al., 2013) using FreeSurfer (Fischl et al., 2004) to generate highly refined cortical surface meshes at the white/gray and pial/CSF interfaces. The surface meshes were spatially normalized to Montreal Neurological Institute (MNI) space and resampled to have 32k vertices. The pipeline also generated four surface-based scalar maps: cortical thickness, Gaussian curvature along the cortical manifold, sulcal depth of the cortical gyri and sulci, and a myelin density map characterizing the spatially-varying myelin content of the gray matter (Glasser and van Essen, 2011).

For each subject, the HCP acquired four resting-state functional MRI (rs-fMRI) images: TR = 0.720 s, TE = 33 ms, multi-band factor = 8, FA = 52°, FOV = 208 × 180 mm, matrix = 104 × 90 × 72, voxel size = 2 × 2 × 2 mm. The authors refer to these four acquisitions as REST1_LR, REST1_RL, REST2_LR, and REST2_RL. The images were acquired over two separate days, such that REST1_LR / REST1_RL were acquired on one day, and REST2_LR / REST2_RL were acquired on another. Each session acquired 1,200 time-points, such that each BOLD session was roughly 15 min in length. These images were pre-processed using a custom pipeline developed by the HCP (Glasser et al., 2013). BOLD images were denoised using subject-ICA (Beckmann et al., 2005) and FIX (Salimi-Khorshidi et al., 2014) to automatically identify and remove spurious noise components, and motion parameters were regressed out. No additional global signal regression, tissue regression, temporal filtering, or motion scrubbing was performed. Denoised voxel time series were interpolated onto the fsaverage_LR32k surface mesh using a barycentric averaging algorithm, and then smoothed at FWHM = 2 mm to avoid the mixing of signals across gyri. Surface-mapped BOLD signals were brought into register across subjects using a multi-modal surface matching algorithm (Robinson et al., 2014) to the fsaverage_LR32k space and vectorized to CIFTI format, mapping each surface vertex to an index in a vector (toward the end of this work, we learned that different HCP data releases were processed using different versions of this surface registration algorithm; we discuss this in more depth in section 5.5). CIFTI vector indices, referred to as "grayordinates" by the HCP, are in spatial correspondence across subjects (i.e., index i in subjects s and t corresponds to roughly the same anatomical location), such that each subject shares the same mesh topology and adjacency structure. Time-series for each session were demeaned and temporally concatenated.

The HCP consortium developed a pipeline to generate high-resolution multi-modal cortical parcellations (MMP) with 180 cortical areas using a spatial derivative-based algorithm (Glasser et al., 2016) computed from resting and task-based fMRI signals, cortical thickness, myelin content, and cortical curvature. Manual editing was performed on the group-average gradient-based parcellation to ensure that boundaries conformed across feature types. Using a set of 210 independent subjects as training data, the authors trained a 3-layer neural network model to learn these boundary-based regions. The authors trained 180 classifiers, one for each cortical area, to distinguish a single cortical area from its immediately adjacent neighborhood (using a 30 mm radius neighborhood size) in a binary classification setting. At test time, the authors compared the probabilities of the predicted areal class across all classifiers in a single find-the-biggest operation. Label predictions were regularized to minimize
spurious predictions and "holes" in the final parcellation. Apart from the 30 mm radius around each group-level area, the classifiers did not incorporate any spatial information at training or test time. Predictions generated from subjects in the training set were used to compute a group-average multi-modal parcellation which can be freely downloaded here: https://balsa.wustl.edu/DLabel/show/nn6K. The individual parcellations and the classifier itself have not yet been publicly released.

We utilized the subject-level cortical parcellations generated by the HCP as the training set for our models. Subject-level parcellations for a subset of 449 subjects were made available by an HCP investigator (see Acknowledgments).

3.2. Midnight Scan Club Dataset
The Midnight Scan Club dataset consists of MRI data acquired on ten individual subjects (5 female) ranging in age from 24 to 34 years: https://openneuro.org/datasets/ds000224/versions/1.0.3 (Gordon et al., 2017). The MSC study acquired 5 h of resting-state data on each participant in ten 30-min acquisitions, with the goal being to develop high-precision, individual-specific functional connectomes to yield deeper insight into the reproducibility and inter-subject differences in functional connectivity.

The MSC dataset preprocessing followed a roughly similar pipeline to that of the HCP dataset. Four 0.8 mm isotropic T1w images (TI = 1,000 ms, TR = 2,400 ms, TE = 3.74 ms, FA = 8°, matrix = 224, sagittal) and four 0.8 mm isotropic T2w images (TR = 3,200 ms, TE = 479 ms, matrix = 224 slices, sagittal) were acquired. T1w images were processed using FreeSurfer to generate refined cortical mesh representations of the white/gray and pial/CSF tissue interfaces, which were subsequently warped to the fsaverage_LR brain surface using the FreeSurfer shape-based spherical registration method, and resampled to 164k and 32k vertex resolutions. The authors performed myelin mapping by computing the volumetric T1/T2 ratio and interpolating the voxel-wise myelin densities onto the 32k surface mesh.

MSC resting-state data were acquired using gradient-echo EPI sequences with the following parameters: TR = 2.2 s, TE = 27 ms, FA = 90°, voxel size = 4 × 4 × 4 mm. The MSC study applied slice-timing correction, and distortion correction using subject-specific mean field maps. Images were demeaned and detrended, and global, ventricular, and white matter signals were regressed out. Images were interpolated using least squares spectral estimation and band-pass filtered (0.009 Hz < f < 0.08 Hz), and then scrubbed of high-motion volumes. Denoised volumetric resting-state data were then interpolated onto the midthickness 32k vertex mesh. The MSC study did not perform subject-ICA and FIX to remove spurious noise components from the temporal signals.

4. METHODS

Here, we describe processing steps applied to the HCP and MSC fMRI datasets for this analysis. We begin with the minimally pre-processed BOLD and scalar data interpolated onto the 32k surface mesh.

4.1. Regional Functional Connectivity
As mentioned above in sections 3.1 and 3.2, the MSC and HCP studies aligned cortical surfaces to the fsaverage_LR surface space. The result is such that, given two meshes S and T, the anatomical location of grayordinate i in mesh S corresponds to generally the same anatomical location as grayordinate i in mesh T, allowing for direct comparisons between the same grayordinates across individual surfaces.

In cases where spatial normalization of surfaces has not been performed, it would be incorrect to assume that two grayordinate indices correspond to the same anatomical locations across subjects. In order to alleviate the requirement of explicit vertex-wise correspondence across training, validation, and testing datasets, we assume that most imaging studies will first run FreeSurfer to generate subject-specific folding-based cortical parcellations (Desikan et al., 2006; Destrieux et al., 2010). We can then aggregate the high-dimensional vertex-wise connectivity features over one of these cortical atlases, as in Eschenburg et al. (2018), and simultaneously reduce the feature vector dimension. This guarantees that column indices of feature vectors represent anatomically comparable variables across individuals, corresponding to connectivity to whole cortical areas rather than explicit vertex-vertex connections. These low-dimensional vectors are agnostic to the original mesh resolution and degree of spatial normalization. As long as resting-state data are collected for a given study, and good spatial correspondence between the T1w and BOLD image can be achieved, we can apply our processing steps to this data.

Given a BOLD time series matrix T ∈ R^{32k×t} and a cortical atlas with K regions, we consider the set of vertices assigned to region k and compute the mean time-series of region k as:

T̂_{k,t} = (1/|k|) Σ_{i∈k} T_{i,t}        (4)

where T̂ ∈ R^{K×t} is the matrix of mean regional time-series. We compute R ∈ R^{32k×K}, the Pearson cross-correlation between T and T̂, where R_{i,k} represents the temporal correlation between a vertex i and cortical region k. These cross-correlation vectors are used as features to train our models.

In this analysis, we generated connectivity features using the Destrieux atlas (Destrieux et al., 2010) with 75 regions per hemisphere, as it is computed by FreeSurfer and represents a reasonably high-resolution partition of the cortical surface that we hypothesize captures vertex-to-vertex functional variability well. In section 5.5, we show how classification performance depends on which cortical atlas we regionalize over, and on which representation of functional connectivity models are trained on.

We also examined segmentation performance when models were trained on continuous representations of functional connectivity, computed by group-ICA and dual regression. As part of their preprocessing, the HCP applied group-ICA to a set of 1,003 subjects using MELODIC's Incremental Group PCA (MIGP) algorithm to compute group-ICA components of dimensions 15, 25, 50, and 100 (Smith et al., 2014). We dual-regressed these group-level components onto each subject's resting-state data to generate subject-level ICA components.
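As a concrete illustration of the regionalization step in Equation (4) and the vertex-to-region correlation features described in section 4.1, the following NumPy sketch (our own; array names and sizes are hypothetical) computes mean regional time-series from a vertex-wise BOLD matrix and an atlas label vector, and then correlates each vertex with each region.

```python
import numpy as np

def regional_connectivity(bold, labels):
    """Compute vertex-to-region correlation features.

    bold   : (n_vertices, n_timepoints) BOLD time-series matrix T
    labels : (n_vertices,) integer atlas labels (e.g., Destrieux regions)
    returns: (n_vertices, n_regions) correlation matrix R
    """
    regions = np.unique(labels)
    # Mean regional time-series, Eq. (4): one row per atlas region
    t_hat = np.vstack([bold[labels == r].mean(axis=0) for r in regions])

    # Pearson correlation between each vertex and each regional mean
    bold_z = (bold - bold.mean(axis=1, keepdims=True)) / bold.std(axis=1, keepdims=True)
    t_hat_z = (t_hat - t_hat.mean(axis=1, keepdims=True)) / t_hat.std(axis=1, keepdims=True)
    return (bold_z @ t_hat_z.T) / bold.shape[1]

# Example with random data: 100 vertices, 1,200 time-points, 5 regions
rng = np.random.default_rng(0)
T = rng.standard_normal((100, 1200))
labels = rng.integers(0, 5, size=100)
R = regional_connectivity(T, labels)
print(R.shape)  # (100, 5)
```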
These subject-level regression coefficients were fed into our models as alternative representations of functional connectivity.

4.2. Markers of Global Spatial Position
We also included measures of position in grayordinate space (global spatial position) as model features (Cucurull et al., 2018; Gopinath et al., 2019). Surface mesh Laplacian eigenvectors represent a spatial variance decomposition of the cortical mesh into orthogonal bases along the cortical manifold. We retained the first three eigenvectors, corresponding to eigenvalues λ1, λ2, λ3. The eigenfunctions represent an intrinsic coordinate system of the surface that is invariant to rotations and translations of the surface mesh.

The eigendecomposition computes eigenvectors up to a sign flip (that is, the positive/negative direction of an eigenvector is arbitrary), and eigenvector ordering is not guaranteed to be equivalent across individuals. We chose a template subject and flipped (multiplied by −1) and reordered the eigenvectors of all remaining subjects with respect to this template subject via the Hungarian algorithm, to identify the lowest cost vector matching for every template-test pair (here, we minimized the Pearson correlation distance).

4.4. Parcel Homogeneity
We then applied the singular value decomposition R = USV^T, where S is the diagonal matrix of singular values σ1, σ2, ..., σN. Gordon et al. (2016) defined homogeneity as ρ_l = 100 · σ1² / Σ_i σ_i², the percent of variance explained by the first principal component. The variance captured by the first component describes how well a single vector explains the functional connectivity profiles of a given cortical parcel—the larger the variance explained, the more homogeneous the parcel connectivity. We computed an estimate of functional homogeneity for each parcel and averaged the estimates across all parcels.

For scalar features (e.g., myelin density), we estimated homogeneity as the ratio of within-parcel variance to between-parcel variance. For each parcel l ∈ L and feature F ∈ R^{32k}, we computed the mean, µ_l, and variance, σ_l², of the parcel-wise features. Homogeneity is estimated as Σ_{l=1}^{L}(σ_l² − σ̄²) / Σ_{l=1}^{L}(µ_l − µ̄)², where σ̄² and µ̄ are the average variance and average mean estimates across all parcels. A smaller value represents more homogeneous parcels. This measure of homogeneity is a dimensionless quantity that allows for the comparison of estimates across datasets and features.
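The functional homogeneity measure ρ_l defined above can be sketched in a few lines of NumPy; this is our own illustration with hypothetical inputs, not code released with the paper.

```python
import numpy as np

def functional_homogeneity(R, labels, parcel):
    """Percent variance explained by the first singular value of the
    connectivity profiles within one parcel.

    R      : (n_vertices, n_regions) vertex-to-region correlation matrix
    labels : (n_vertices,) parcel assignments of a cortical map
    parcel : parcel label of interest
    """
    profiles = R[labels == parcel]                 # connectivity rows for this parcel
    s = np.linalg.svd(profiles, compute_uv=False)  # singular values sigma_1 >= sigma_2 >= ...
    return 100.0 * s[0] ** 2 / np.sum(s ** 2)

# Mean homogeneity across all parcels of a map (toy example)
rng = np.random.default_rng(1)
R = rng.standard_normal((1000, 75))
labels = rng.integers(0, 180, size=1000)
rho = np.mean([functional_homogeneity(R, labels, p) for p in np.unique(labels)])
print(rho)
```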
5. RESULTS
We first examine the best performing model of those we considered in our analysis, and discuss the classification accuracy and reproducibility of parcellations predicted by this model in relation to parcellations computed by Glasser et al. (2016), which we call "ground truth" in what follows. We define classification accuracy as the percentage of correctly predicted vertex labels relative to the ground truth maps. We then show broadly how algorithm choice, network architecture, and training and testing image scan duration affect overall model performance. Finally, we illustrate how classification performance is related to the features used during model training and testing.

5.1. Prediction Accuracy in the Best Performing Model
Network optimization was performed using labels provided by Matthew Glasser (see section Acknowledgments) using subject data from the S500 HCP release. As mentioned in section 3.1, the S1200 data release uses a different surface registration algorithm, producing subject-level resting-state data that is better aligned with the labels provided by Glasser. Final model evaluation was performed using this S1200 data. The best performing model was the 6-layer graph attention network (GAT), with 4 attention heads per layer, 32 hidden channels per layer, and a dropout rate of 0.1, and incorporated a spatial prior at test time. When trained on features computed using ICA, this model achieved a mean classification accuracy of 79.91% on the S1200 subjects. We henceforth refer to this model as the "optimal" model, and discuss results associated with this model below.

In Figure 3A, we show predicted parcellations computed using this model for exemplar HCP and MSC test subjects. Predicted subject-level parcellations closely resemble the "ground truth" maps generated by Glasser et al. (2016) (see Supplementary Material for additional examples of predictions generated by each model). No specific contiguity constraint was imposed on the parcellations; contiguity is instead inherent in the graph neural network models. Subjects from the MSC dataset do not have corresponding ground truth maps against which to compare their predictions. In Figure 3B, we show consensus predictions for each dataset, compared against the publicly released HCP-MMP atlas. Consensus predictions were computed by assigning a vertex to the label most frequently assigned to that vertex across the individual test subject predictions. We see that both consensus predictions closely resemble the HCP-MMP atlas—however, the consensus map derived from the MSC subjects shows noisy parcel boundaries and disconnected areal components (lateral and medial prefrontal areas).

Figure 4 shows the spatial distribution of classification accuracy rates averaged across all subjects in the HCP test set. Average accuracy is shown as a map distributed over the cortex, with values ranging between 0 (blue; vertex incorrectly classified in all test subjects) and 1 (red; vertex correctly classified in all test subjects).
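A minimal sketch (our own, with toy labels) of the consensus-labelling step described above: each vertex receives the label most frequently assigned to it across the individual test-subject predictions.

```python
import numpy as np

def consensus_parcellation(predictions):
    """Majority-vote consensus map from subject-level predictions.

    predictions : (n_subjects, n_vertices) integer label matrix
    returns     : (n_vertices,) consensus label per vertex
    """
    n_subjects, n_vertices = predictions.shape
    consensus = np.empty(n_vertices, dtype=predictions.dtype)
    for v in range(n_vertices):
        labels, counts = np.unique(predictions[:, v], return_counts=True)
        consensus[v] = labels[np.argmax(counts)]
    return consensus

# Example: 10 subjects, 5 vertices, labels in {0, 1, 2}
preds = np.array([[0, 1, 2, 1, 0]] * 6 + [[1, 1, 2, 2, 0]] * 4)
print(consensus_parcellation(preds))  # [0 1 2 1 0]
```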
FIGURE 5 | Mean model probabilities for a subset of cortical areas for the HCP (top) and MSC (middle) datasets computed using the optimal model, and the MMP
binary class probabilities from Glasser et al. (2016) and Coalson et al. (2018) (bottom). Probabilistic maps are illustrated for areas V1, 46, TE1a, LIPv, MT, RSC, and
10r. These maps are thresholded at a minimum probability value of 0.005, the probability of randomly assigning a vertex to one of the 180 cortical areas.
The Dice coefficient between sets J and K is defined as

Dice(J, K) = 2|J ∩ K| / (|J| + |K|)        (5)

Figure 6 shows the mean areal Dice coefficients for each dataset from predictions computed using the optimal model. Predictions made on the HCP dataset were more reproducible across the entire cortex than predictions on the MSC dataset. In both datasets, sensory/motor areas and areas near the angular and supramarginal gyri were most reproducible. The visual cortex showed high reproducibility in area V1, while areas V2-V4 were less reproducible.

Figure 7A shows mean reproducibility estimates computed on the HCP and MSC datasets. Predictions for both datasets were highly reproducible across repeated scanning sessions, and reproducibility increased with increasing scan duration. Mean Dice coefficient estimates in the HCP dataset were 0.81 and 0.86 for the 15- and 30-min durations. In the MSC dataset, the mean Dice coefficients were 0.69, 0.76, and 0.82 for the 30-, 60-, and 150-min durations. When fixing scan duration (e.g., 30-min durations), HCP data were more reproducible than the MSC data. One feature that we could not evaluate directly was the reproducibility of the ground truth maps. Glasser et al. (2016) reported maximum and median Dice coefficient estimates of 0.75 and 0.72 for repeated scans on HCP participants, indicating that our classifier learned parcellations that were more reproducible than those generated by the binary classifier.

Figure 7B illustrates subject-level reproducibility estimates in the MSC dataset. Predictions for subject MSC08 were significantly less reliable, relative to the other subjects. Gordon et al. (2017) also identified MSC08 as having low reproducibility with respect to various graph theoretical metrics computed from the functional connectivity matrices. They noted that subject MSC08 reported restlessness, displayed considerable head motion, and repeatedly fell asleep during the scanning sessions.

Area-level topologies were also reproducible across scanning sessions (Supplementary Material). Glasser et al. (2016) identified three unique topologies of area 55b, corresponding to a "typical," "shifted," or "split" organization pattern, relative to the group-average cortical map. We were able to identify these same unique topologies in individual subjects, indicating that graph neural networks are identifying the unique connectivity fingerprints of each cortical area, and not simply learning where the parcel is. When we examined the predictions generated by the optimal model on the four independent 15-min scanning sessions, we found that, within a given subject, the topological organization of area 55b was reproducible. Allowing for some variability in prediction boundaries and location due to resampling of the connectivity data and partial volume effects, this indicates that the graph neural networks are learning subject-specific topological layouts that incorporate their unique connectivity and histology patterns.
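The areal reproducibility estimates above follow the Dice overlap in Equation (5); the sketch below (our own, with hypothetical label vectors) computes the Dice coefficient for each cortical area between parcellations predicted from two repeated sessions and averages across areas.

```python
import numpy as np

def areal_dice(labels_a, labels_b, area):
    """Dice overlap of a single cortical area between two parcellations.

    labels_a, labels_b : (n_vertices,) predicted labels from two sessions
    area               : label of the cortical area of interest
    """
    a = labels_a == area
    b = labels_b == area
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom > 0 else np.nan

def mean_areal_dice(labels_a, labels_b):
    """Mean Dice across all areas present in either session."""
    areas = np.union1d(np.unique(labels_a), np.unique(labels_b))
    return np.nanmean([areal_dice(labels_a, labels_b, k) for k in areas])

session1 = np.array([1, 1, 2, 2, 3, 3])
session2 = np.array([1, 2, 2, 2, 3, 1])
print(mean_areal_dice(session1, session2))
```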
FIGURE 6 | Mean areal Dice coefficient estimates, computed using the optimal model on 15-min HCP data (4 repeated sessions) and 30-min MSC data (10 repeated
sessions), normalized with the same color map. Estimates are computed for each area, and averaged across all subjects.
FIGURE 7 | Reproducibility of predicted maps generated by the optimal model, as measured using the Dice coefficient. We show mean reproducibility estimates for
each dataset (A), and subject-level estimates in the Midnight Scan Club (B). Estimates for 60 min (HCP) and 300 min (MSC) durations are not shown in (A) because
there is only one image per subject for these durations. Similarly, estimates for 150 min durations are not shown in (B) because there is only a single scalar estimate
per subject.
5.3. Parcellations Learned by GNNs Are Homogeneous in Their Scalar and Connectivity Measures
If a model is in fact learning unique, discrete areas, the distribution of biological features in these areas should be relatively homogeneous. Unsupervised learning clustering algorithms designed to parcellate the cortex often incorporate objective functions that attempt to maximize within-parcel similarity and minimize between-parcel similarity. On the other hand, gradient-based approaches, like those proposed in Gordon et al. (2016), Wig et al. (2014), and Schaefer et al. (2018), do not directly maximize an objective function in this manner, but rather identify putative areal boundaries by identifying where biological properties change dramatically in a small local neighborhood. It is assumed that this biological gradient captures differences in homogeneity between adjacent cortical areas. In order to group cortical voxels together, these voxels must inherently share some physical or biological traits.

We computed homogeneity estimates as described in section 4.4. In order to compare the homogeneity and variance estimates between predicted parcellations, we fixed the features used to compute these estimates. For a given subject, we computed functional homogeneity using that subject's 60-min BOLD signal (HCP), or the 300-min BOLD signal (MSC). In this way, the only variable that changed with respect to the homogeneity estimate is the cortical map itself. We could then make meaningful quantitative comparisons between estimates for different maps, with respect to a given dataset.

Cortical maps predicted in the HCP dataset explained, on average, 67.03% of the functional variation while MSC predictions explained 72.90% (t: −3.137, p: 0.007) (Figure 8). We hypothesized that parcellations predicted in the HCP dataset would be more homogeneous, relative to those learned in the MSC dataset, due to the fact that the MSC imaging data were acquired with lower spatial resolution than that acquired by the HCP and therefore subject to greater partial volume effects. Homogeneity of myelin (t: −0.910, p: 0.377) and sulcal depth (t: 1.043, p: 0.320) was not statistically different between the two datasets, while curvature was less variable in the HCP dataset (t: −2.423, p: 0.029). Contrary to our hypothesis, cortical thickness was less variable in the MSC dataset (t: 11.562, p: 0.000). This is likely a consequence of using a dimensionless representation of homogeneity, which is internally normalized for each dataset as a ratio of the within-to-between parcel variances. This metric allows for the direct comparison of homogeneity estimates across datasets, instead of representing the raw variance estimates.

FIGURE 8 | Homogeneity of predicted parcellations in the HCP and MSC datasets using the optimal model. (A) Predicted parcels in the HCP test set explained as much variability in the functional connectivity as the ground truth parcels. (B–E) Predictions in the MSC had more variable myelin content and less variable cortical thickness estimates, relative to the HCP predictions.

We compared homogeneity estimates in the predicted HCP parcellations to estimates computed for the ground truth maps using paired t-tests. Predicted and ground truth maps both explained roughly 67% of the functional variation (t: −0.305, p: 0.761). Myelin (t: 0.176, p: 0.860) and curvature (t: −1.746, p: 0.083) variation were not statistically different between the two groups. However, predictions were more homogeneous than the ground truth maps with respect to sulcal depth (t: −4.442, p: 0.000) and cortical thickness (t: −2.553, p: 0.012).

5.4. Network Architecture Impacts Model Performance
As noted in section 5, we first optimized over network algorithms and architectures using the S500 dataset, and then utilized the S1200 dataset for model evaluation. We fixed the features used for network optimization to the regionalized connectivity features. We examined how varying each network parameter impacted model classification accuracy (Table 1). As mentioned in section 5.1, the best performing model was the GAT network with 6 layers, with a classification accuracy of 67.60% on the S500 dataset (significantly inferior to the performance of the same network on S1200 data, with an accuracy of 79.91%). We found that optimal performance for the GAT and GCN networks was achieved with 6 layers, 9 layers for the JKGAT, and 3 layers for the baseline model. In general, classification accuracy increased with the number of attention heads and the number of hidden channels, while classification accuracy decreased with increasing feature dropout rates. Using an LSTM aggregation function rather than a simple concatenation marginally decreased classification accuracy for the jumping-knowledge networks. In contrast to our predictions, we found that the GAT networks slightly outperformed the more flexible JKGAT networks for most parameterizations.

We used a fixed validation dataset of 20 subjects to determine when to stop model training and evaluated the performance of our models using a fixed test dataset of 148 subjects. In order to determine the reliability of our accuracy estimates, we computed the standard error of classification accuracy for each model using a bootstrapped approach (Supplementary Material). We randomly sampled 100 test subjects, with replacement, out of the 148, and computed the mean accuracy for each sample, for each model. We repeated this process 1,000 times, and computed the variability of these bootstrapped estimates. Standard error estimates were less than 0.5%, indicating that test set accuracy estimates are robust with respect to resampling of the test dataset.

We examined how classification accuracy in the HCP dataset was related to the scanning duration of training and testing datasets using the default model parameters (as defined in section 5). When fixing test scan duration, classification accuracy improved as the training dataset size increased for all model types, with maximum accuracy achieved by graph attention network models trained on 400 15-min duration datasets (Supplementary Material). When training dataset size and training scan duration were fixed, longer test image duration yielded more accurate predictions across the board. Predictions on 60-min test data were more accurate than those computed on 30-min images, which in turn were more accurate than those generated from 15-min images (Supplementary Material).
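A sketch of the bootstrapped standard-error procedure described earlier in this section; the per-subject accuracies here are simulated, and the sample size, replicate count, and seed are assumptions matching the description above.

```python
import numpy as np

def bootstrap_standard_error(subject_accuracies, sample_size=100, n_boot=1000, seed=0):
    """Standard error of mean test accuracy under resampling of test subjects.

    subject_accuracies : (n_subjects,) per-subject classification accuracies
    """
    rng = np.random.default_rng(seed)
    means = np.empty(n_boot)
    for b in range(n_boot):
        sample = rng.choice(subject_accuracies, size=sample_size, replace=True)
        means[b] = sample.mean()
    return means.std()

# Simulated accuracies for 148 test subjects centered near 0.80
rng = np.random.default_rng(42)
accs = rng.normal(loc=0.80, scale=0.05, size=148)
print(bootstrap_standard_error(accs))
```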
Parameter             Value    GAT (%)   JKGAT (%)
Attention heads       4        67.02     66.71
                      8        67.39     67.30
                      12       67.56     67.29
Aggregation function  concat   —         66.85
                      lstm     —         66.71

Models were trained on 400 15-min datasets, and tested on 60-min test data using the S500 dataset. Boxed values indicate the default parameter values. The best performing model was the GAT network with 6 layers, achieving a mean classification accuracy of 67.60%. Values in bold are the mean classification accuracy of the best model, trained on resting-state connectivity features computed by regionalizing time-series over the Destrieux cortical atlas (see Section 4.1).
However, models trained on 15-min data performed best when tested on 15-min data, and models trained on 60-min data performed best when tested on 60-min data (Supplementary Material), indicating an interaction between training and testing scan duration. Similarly, when fixing training and testing scan duration, we found that including the spatial prior significantly improved classification accuracy in all architectures.

5.5. Incorporating Functional Connectivity Improves Model Performance Beyond Spatial Location and Scalar Metrics
After identifying the optimal network architecture, we examined how model performance varied as a function of which features the model was trained on. Briefly, we delineated three broad feature types: (1) scalar features corresponding to myelin, cortical thickness, sulcal depth, and cortical curvature; (2) global location features corresponding to the spectral coordinates computed from the graph Laplacian; and (3) connectivity features computed from the resting-state signal. In our primary analysis, we utilized connectivity features computed by regionalizing over the Destrieux atlas (75 folding-based cortical areas). We compared these features against those computed using the Desikan-Killiany atlas (35 folding-based cortical areas) and the Yeo-17 resting-state network atlas (Yeo et al., 2011). The Yeo-17 atlas is a functional atlas of discretized resting-state networks, computed via independent component analysis. We identified the connected components of each of the 17 resting-state networks and excluded component regions with sizes smaller than 10 vertices, resulting in a map of 55 discrete functionally-derived subregions of the cortex. We also examined the performance of models trained on continuous, overlapping connectivity features representing resting-state networks computed using group-ICA and dual regression.

Computing connectivity features over the Destrieux atlas yielded increased classification accuracy over the Desikan-Killiany atlas (72.01 vs. 70.08%; paired t: 25.197, p: 0.000; see models "Full-DX" and "Full-DK"). We hypothesized that computing connectivity features over a functionally-aware parcellation (Yeo-17) would yield a significant improvement in classification accuracy, relative to the Destrieux atlas, but this was not the case (see "Full-DX" vs. "Full-YEO" in Figure 9). Models trained on the Yeo-17 features had a mean classification accuracy of 71.58% (paired t: 1.916, p: 0.057). Training on spatial location or histological features alone yielded mean classification accuracies of 44.10 and 54.45%, respectively (Figure 9A). However, training on features defined by resting-state ICA components had clear performance benefits. Models trained on ICA dimensions of 15, 25, 50, and 100 generated mean classification accuracies of 75.34, 77.79, 79.68, and 79.91%, respectively (Figure 9C). Similarly, incorporating the prior mask also improved model performance.
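To illustrate the three feature types enumerated above, the following NumPy sketch (our own; array names, sizes, and placeholder values are hypothetical) assembles a per-vertex input matrix by concatenating the scalar maps, the spectral coordinates, and the regionalized connectivity features.

```python
import numpy as np

n_vertices = 32492  # approximate size of one fsaverage_LR32k hemisphere

# Hypothetical per-vertex inputs (shapes only; values are placeholders)
scalars = np.zeros((n_vertices, 4))           # myelin, thickness, sulcal depth, curvature
spectral_coords = np.zeros((n_vertices, 3))   # first three Laplacian eigenvectors
connectivity = np.zeros((n_vertices, 75))     # vertex-to-region correlations (Destrieux)

# "Full" feature set: scalars + global position + functional connectivity
features_full = np.hstack([scalars, spectral_coords, connectivity])
print(features_full.shape)  # (32492, 82)
```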
FIGURE 9 | Classification accuracy as a function of model features, using the optimal model architecture for (A) single feature types, (B) regionalization over different
cortical atlases, and (C) independent component analysis features. Refer to Table 2 for a description of each feature set.
Feature sets
Thickness + + + + +
Curvature + + + + +
Myelin + + + + +
Sulcal depth + + + + +
Laplacian + + + + +
Desikan (DK) +
Destrieux (DX) + +
Yeo-17 (YEO) +
ICA-RSN +
Features included in a model are marked by a “+.” “Full” models include histological features, global position information, and functional connectivity signals.
However, the mask offered diminishing returns, with the better-performing models benefiting less from its inclusion. Models trained on higher-dimensional ICA resting-state networks (50 and 100 networks) performed almost as well without the spatial prior as they did with it.

Late into our analysis, we learned of differences in the preprocessing steps used to generate the minimally-preprocessed HCP resting-state data, and to generate the subject- and group-level HCP-MMP parcellations. Specifically, the S500 and S1200 data releases were preprocessed using different surface registration algorithms: MSMSulc and MSMAll (Robinson et al., 2014, 2018). A consequence of these preprocessing differences is that data from the S1200 release is better aligned with the subject-level labels provided by Glasser. After performing network optimization using the S500 data, we evaluated final model performance on the S1200 dataset. Figure 10 illustrates model performance after training on each independent dataset.
FIGURE 10 | Classification accuracy as a function of HCP data release and corresponding multi-modal surface matching algorithm. S500: MSMSulc (Robinson et al.,
2014), S1200: MSMAll (Robinson et al., 2018).
We found that utilizing the S1200 dataset yielded significant improvements in mean classification accuracy, by upwards of 5%, relative to the S500 dataset. This indicates that the surface registration algorithm choice plays a critical role in cortical segmentation quality.

6. DISCUSSION

In this analysis, we presented a general cortical segmentation approach that, given functional connectivity information and a set of corresponding training labels, can generate cortical parcellations for individual participants. This approach to segmenting the cortex requires accessible MRI acquisition sequences and standard morphological parcellations as inputs. We compared three different graph neural network variants to a baseline fully-connected network. We found that, in all cases, graph neural networks consistently and significantly outperformed a baseline neural network that excluded adjacency information. We identified the best performing model and explored its performance with respect to various metrics like segmentation accuracy, prediction reliability, and areal homogeneity in two independent datasets.

Predictions generated for both the HCP and MSC datasets were highly reproducible. However, we found that nearly twice as much resting-state data was required in MSC subjects to achieve the same reproducibility estimates as in the HCP data. Predictions generated on the HCP dataset were more reproducible than the ground truth maps themselves (Glasser et al., 2016), while predictions in the MSC data were roughly as reproducible as the ground-truth parcellations. This may in part be due to the way we trained our models. Models were trained on repeated samples of BOLD images, such that for a given training subject, models were shown four BOLD datasets. This likely enabled the models to better learn the mapping between a given subject's unique BOLD signature and its cortical map. Another possible explanation is that the ground truth maps were generated using a linear perceptron model, which does not take into account any spatial relationships between data points, while graph neural networks do take this spatial structure into account. It is likely the case that the perceptron model could not adapt to utilize spatial dependencies in the BOLD signal in local neighborhoods and thereby failed to fully learn unique subject-specific connectivity fingerprints, and consequently learned more variable parcellations.

The optimal model predicted parcellations that were as homogeneous as the ground truth maps when considering multidimensional connectivity features and univariate scalar features. Though the models considered in this analysis are capable of learning parcels that capture inter-areal variation of functional brain connectivity and other cortical features, it is worth noting that homogeneity as a measure of parcellation quality is an imperfect metric and should be used judiciously. For example, the primary sensory areas can be further divided into five somatotopic subregions corresponding to the upper and lower limbs, trunk, ocular, and face areas (Glasser et al., 2013). These subdivisions correspond well with task-based fMRI activity and gradients in myelin content, indicating that the parcels learned by GNNs in our analysis still incorporate significant variability due to the aggregation of signals from different somatosensory areas. While learning homogeneous regions is important in order to effectively capture spatial biological variation, maximizing homogeneity was not the training criterion for this analysis.
As noted in section 3, the MSC study applied different preprocessing steps than the HCP. Specifically, the MSC did not perform FIX-ICA to remove noise components from the BOLD images and utilized the FreeSurfer spherical surface registration to bring surfaces into spatial correspondence with one another, instead of the multi-modal surface matching algorithm (Robinson et al., 2014, 2018). Given that the MSC dataset did not have "ground truth" labels against which we could compare predictions made on the MSC data, we compared predictions against the HCP-MMP atlas (Glasser et al., 2016). As expected, predictions generated on the HCP dataset more closely resembled the HCP-MMP atlas than predictions made from the MSC dataset (the HCP-MMP atlas was derived as a group-average of individual ground truth parcellations). Nevertheless, we found that correspondence of MSC predictions with the atlas followed similar trends with respect to testing image duration. We believe some discrepancy in results between the HCP and MSC datasets can be attributed to the differences in dataset-specific preprocessing choices noted above, although the relationship between methodological choices and parcellation outcome requires future analyses. Performance differences across the two datasets are also possibly a result of the models learning characteristics inherent to the training (HCP) dataset, and thereby performing better on held-out subjects from that same dataset.

Our optimal model was the 6-layer graph attention network, trained and tested on resting-state network components computed using a 50-dimensional ICA. This model performed as well with the spatial prior as it did without. However, models trained on regionalized connectivity features benefited from including the spatial prior. We believe it would be prudent for future studies to include a spatial prior of some form into their classification frameworks. Interestingly, predictions on HCP test subjects resembled the HCP-MMP atlas more closely than they resembled their ground truth counterparts, which might in part be driven by the specific form of the prior. We made the assumption that cortical map topology is relatively conserved across individuals. This assumption may be too conservative and may reduce model sensitivity to atypical cortical connectivity patterns. Nevertheless, there is evidence our GNN models learn subject-specific topologies of cortical areas, rather than simply learning where a cortical parcel usually is. Importantly, we found that the optimal GAT model could identify three unique topologies for area 55b (typical, shifted, and split) and that predictions generated by our model replicated, with high fidelity, the same spatial organization patterns as identified in Glasser et al. (2016). This indicates that the model is capable of learning unique connectivity fingerprints of each cortical area on a subject-by-subject basis, rather than simply learning the group average fingerprint. As such, we do not believe that including the spatial prior in its current form inhibits the ability of the graph neural network models used in this analysis to identify atypical cortical topologies.

We compared three different graph neural networks: graph convolution networks, standard attention networks, and jumping-knowledge networks. We hypothesized that JKGAT networks would significantly outperform GAT networks due to the increased flexibility to learn optimized node-specific network depths. In their original formulation of the jumping-knowledge network architecture, Xu et al. (2018) found that including the jumping-knowledge mechanism improved model performance relative to the GAT in almost all of their comparisons. However, we found this not to be the case. This may be a consequence of the increased number of estimated parameters in the JKGAT networks, relative to the GAT—the jumping-knowledge aggregation layer learns the parameters for the aggregation function cells in addition to the attention head and projection matrix weights learned in the GAT networks. The lower classification accuracy at test time is possibly the result of model over-fitting, necessitating a larger training dataset. It is possible that the jumping-knowledge mechanism is generally more useful in the case where graph topologies vary considerably across a network, as opposed to more regular graphs such as cortical surface data.

As expected, network performance was dependent on both the size and duration of the training set, and the duration of the testing data. Classification accuracy increased when models were trained on larger datasets consisting of shorter-duration images. Conversely, accuracy increased when models were deployed on longer-duration test data. It is important to note that we examined performance of our models on images of long scanning durations by concatenating multiple sessions together (30/60-min in the HCP, and 60/150/300-min in the MSC). It is unrealistic to expect study participants to be able to lie in an MRI scanner for single sessions of these lengths. However, it is useful to examine how model performance is impacted by tunable parameters like scan duration in order to best guide image acquisition in future studies. We found that utilizing repeated scans on individual subjects as independent training examples, rather than concatenating repeated scans together into single datasets, significantly improved our classification frameworks. This likely speaks to the ability of neural network models to generalize better to noise in the datasets. Training models on multiple samples of shorter-duration images more accurately captures the individual variability in the resting-state signal than fewer longer-duration images, thereby allowing the networks to more accurately learn a mapping between functional connectivity and cortical areal assignments.

Our methodology could be improved in a variety of ways. We chose not to perform intensive hyperparameter optimization, and instead focused our efforts on the overall performance of the various network architectures as a function of network parameters and data parameters, and the applicability of trained models to new datasets. However, in the case where a classification model is meant to be distributed to the research community for open-source use, it would be prudent to perform a more extensive search over the best possible parameter choices.

The utility of functional connectivity has been shown in a variety of studies for delineating cortices (Blumensath et al., 2013; Arslan et al., 2015; Baldassano et al., 2015; Gordon et al., 2016). However, in recent years, diffusion tractography has been underutilized for learning whole-brain cortical maps, relative to functional connectivity (Gorbach et al., 2011; Parisot et al., 2015; Bajada et al., 2017). Given cortical maps defined
independently by tractography and functional connectivity, it is difficult to "match" cortical areas across maps to compare biological properties, so heuristics are often applied. Few studies have simultaneously combined functional connectivity and tractography to better inform the prediction of cortical maps. Recent work has extended the idea of variational auto-encoders to the case of multi-modal data by training coupled auto-encoders to jointly learn embeddings of multiple data types. In Gala et al. (2021), the authors apply this approach to jointly learn embeddings defined by transcriptomics and electrophysiology that allow them to identify cell clusters with both similar transcriptomic and electrophysiology properties. Future work could apply similar ideas to aggregate functional and diffusion-based connectivity signals.

The majority of recent studies have approached the cortical mapping problem from the perspective of generating new parcellations from underlying neurobiological data using unsupervised clustering or spatial gradient methods. These approaches attempt to delineate areal boundaries by grouping cortical voxels together on the basis of similarity between their features. Spatial gradient-based methods explicitly define areal boundaries, while clustering methods define these boundaries implicitly. However, both approaches are distinct from methods that utilize pre-existing or pre-computed parcellations as templates for mapping new data. In the current analysis, we were concerned with the latter problem.

Clustering and spatial gradient studies are often interested in relating newly-generated cortical maps to underlying in vitro measures, such as transcriptomics or cytoarchitectural results. Clearly, it is impossible to acquire this data in human subjects simultaneously with in vivo data. Various projects have attempted to build cytoarchitectural datasets from post-mortem subjects to use as a basis of comparison for maps generated in vivo (Amunts et al., 2020). While some cortical areas have been recapitulated using both in vitro and in vivo features, this is not a general rule across the cortex. As such, cross-modal verification is often difficult, and leaves room for methods and datasets that can improve upon the validation of cortical mapping studies.

One limitation of our analysis concerns the use of different versions of the multi-modal surface matching algorithm for cortical surface alignment for the S500 HCP data release (Glasser et al., 2013; Robinson et al., 2014), the S1200 release (Robinson et al., 2018), and for the subject-level HCP-MMP parcellations (Glasser et al., 2016), which used a different regularization term. These differences between the three registration methods result in a slight spatial misalignment between the training labels and the cortical features. While the S500 data release utilized MSMSulc, a spherical surface registration driven by cortical folding patterns, the S1200 release utilized MSMAll, which incorporated functional connectivity into the spatial resampling step. Glasser et al. (2016) used a prototypical version of MSMAll in addition to MSMSulc, and thereby incorporated additional features derived from resting-state networks to drive the surface matching process. Importantly, this discrepancy between the training labels and training features is not a flaw in our methodology itself, and correcting for this difference in the registration approach would only improve the results of our analysis. As we showed in Figure 10, incorporating MSMAll-processed data from the S1200 dataset, instead of MSMSulc-processed data from the S500 dataset, improved model classification accuracy by nearly 5%. We hypothesize that this improvement would only increase if we had access to the data processed with the prototypical version of MSMAll. Based on the comparisons of subject-level predictions with the subject-level ground truth MMP maps, our models performed well in spite of these registration discrepancies. Our results lend evidence to the robustness of graph neural networks for learning cortical maps from functional connectivity.

Finally, participants in both the HCP and MSC studies were healthy young adults, and the datasets had been extensively quality controlled. Little to no work has been done on extending connectivity-based classifiers to atypical populations, such as to individuals with neurodegeneration. It is unknown how a model trained on connectivity properties from healthy individuals would perform in populations where connectivity is known to degrade. While our model (and that developed by Glasser et al., 2016) predicts maps based on healthy individuals, it is possible that some studies would need to train population-specific models.

DATA AVAILABILITY STATEMENT

The original contributions presented in the study are included in the article/Supplementary Material, further inquiries can be directed to the corresponding author/s.

AUTHOR CONTRIBUTIONS

KE conceptualized this study, developed the code, performed the analyses, and wrote the bulk of the document. TG provided comments and neuroscientific insight into the analysis, and contributed to the editing and organizational structure of the manuscript. DH provided extensive neuroscientific and technical guidance for this work, and contributed to the editing and organizational structure of the manuscript. All authors contributed to the article and approved the submitted version.

FUNDING

This project was supported by grant NSF BCS 1734430, titled Collaborative Research: Relationship of Cortical Field Anatomy to Network Vulnerability and Behavior (TG, PI).

ACKNOWLEDGMENTS

We thank Matthew F. Glasser for making subject-level Human Connectome Project multi-modal parcellations available for this analysis, and for his helpful and extensive comments on a draft version of the manuscript.

SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fnins.2021.797500/full#supplementary-material
REFERENCES

Amunts, K., Mohlberg, H., Bludau, S., and Zilles, K. (2020). Julich-Brain: a 3D probabilistic atlas of the human brain's cytoarchitecture. Science 369, 988–992. doi: 10.1126/science.abb4588

Arslan, S., Parisot, S., and Rueckert, D. (2015). Joint spectral decomposition for the parcellation of the human cerebral cortex using resting-state fMRI. Inf. Process. Med. Imaging 24, 85–97. doi: 10.1007/978-3-319-19992-4_7

Asman, A. J., and Landman, B. A. (2012). Non-local statistical label fusion for multi-atlas segmentation. Med. Image Anal. 17, 194–208. doi: 10.1016/j.media.2012.10.002

Asman, A. J., and Landman, B. A. (2014). Hierarchical performance estimation in the statistical label fusion framework. Med. Image Anal. 18, 1070–1081. doi: 10.1016/j.media.2014.06.005

Bajada, C. J., Jackson, R. L., Haroon, H. A., Azadbakht, H., Parker, G. J., Lambon Ralph, M. A., et al. (2017). A graded tractographic parcellation of the temporal lobe. Neuroimage 155, 503–512. doi: 10.1016/j.neuroimage.2017.04.016

Baldassano, C., Beck, D. M., and Fei-Fei, L. (2015). Parcellating connectivity in spatial maps. PeerJ 3:e784. doi: 10.7717/peerj.784

Beckmann, C. F., DeLuca, M., Devlin, J. T., and Smith, S. M. (2005). Investigations into resting-state connectivity using independent component analysis. Philos. Trans. R. Soc. Lond. B Biol. Sci. 360, 1001–1013. doi: 10.1098/rstb.2005.1634

Blumensath, T., Jbabdi, S., Glasser, M. F., Van Essen, D. C., Ugurbil, K., Behrens, T. E., et al. (2013). Spatially constrained hierarchical parcellation of the brain with resting-state fMRI. Neuroimage 76, 313–324. doi: 10.1016/j.neuroimage.2013.03.024

Bullmore, E., and Sporns, O. (2012). The economy of brain network organization. Nat. Rev. Neurosci. 13, 336–349. doi: 10.1038/nrn3214

Coalson, T. S., Van Essen, D. C., and Glasser, M. F. (2018). The impact of traditional neuroimaging methods on the spatial localization of cortical areas. Proc. Natl. Acad. Sci. U.S.A. 115, E6356–E6365. doi: 10.1073/pnas.1801582115

Cucurull, G., Wagstyl, K., Casanova, A., Velickovic, P., Jakobsen, E., Drozdzal, M., et al. (2018). "Convolutional neural networks for mesh-based parcellation of the cerebral cortex," in Med. Imaging with Deep Learn. Available online at: https://openreview.net/pdf?id=rkKvBAiiz

Defferrard, M., Bresson, X., and Vandergheynst, P. (2016). "Convolutional neural networks on graphs with fast localized spectral filtering," in NIPS'16: Proceedings of the 30th International Conference on Neural Information Processing Systems. Barcelona: ACM, 187–98.

Desikan, R. S., Segonne, F., Fischl, B., Quinn, B. T., Dickerson, B. C., Blacker, D., et al. (2006). An automated labeling system for subdividing the human cerebral cortex on MRI scans into gyral based regions of interest. Neuroimage 31, 968–980. doi: 10.1016/j.neuroimage.2006.01.021

Destrieux, C., Fischl, B., Dale, A., and Halgren, E. (2010). Automatic parcellation of human cortical gyri and sulci using standard anatomical nomenclature. Neuroimage 53, 1–15. doi: 10.1016/j.neuroimage.2010.06.010

Eschenburg, K., Haynor, D., and Grabowski, T. (2018). "Automated connectivity-based cortical mapping using registration-constrained classification," in Medical Imaging 2018: Biomedical Applications in Molecular, Structural, and Functional Imaging (Houston, TX). doi: 10.1117/12.2293968

Fischl, B., van der Kouwe, A., Destrieux, C., Halgren, E., Segonne, F., Salat, D. H., et al. (2004). Automatically parcellating the human cerebral cortex. Cereb. Cortex 14, 11–22. doi: 10.1093/cercor/bhg087

Gala, R., Budzillo, A., Baftizadeh, F., Miller, J., Gouwens, N., Arkhipov, A., et al. (2021). Consistent cross-modal identification of cortical neurons with coupled autoencoders. Nat. Comput. Sci. 1, 120–127. doi: 10.1038/s43588-021-00030-1

Glasser, M. F., Coalson, T. S., Robinson, E. C., Hacker, C. D., Harwell, J., Yacoub, E., et al. (2016). A multi-modal parcellation of human cerebral cortex. Nature 536, 171–178. doi: 10.1038/nature18933

Glasser, M. F., Sotiropoulos, S. N., Wilson, J. A., Coalson, T. S., Fischl, B., Andersson, J., et al. (2013). The minimal preprocessing pipelines for the Human Connectome Project. Neuroimage 80, 105–124. doi: 10.1016/j.neuroimage.2013.04.127

Glasser, M. F., and van Essen, D. C. (2011). Mapping human cortical areas in vivo based on myelin content as revealed by T1- and T2-weighted MRI. J. Neurosci. 31, 11597–11616. doi: 10.1523/JNEUROSCI.2180-11.2011

Gopinath, K., Desrosiers, C., and Lombaert, H. (2019). Graph convolutions on spectral embeddings for cortical surface parcellation. Med. Image Anal. 54, 297–305. doi: 10.1016/j.media.2019.03.012

Gorbach, N. S., Schutte, C., Melzer, C., Goldau, M., Sujazow, O., Jitsev, J., et al. (2011). Hierarchical information-based clustering for connectivity-based cortex parcellation. Front. Neuroinform. 5:18. doi: 10.3389/fninf.2011.00018

Gordon, E. M., Laumann, T. O., Adeyemo, B., Huckins, J. F., Kelley, W. M., and Petersen, S. E. (2016). Generation and evaluation of a cortical area parcellation from resting-state correlations. Cereb. Cortex 26, 288–303. doi: 10.1093/cercor/bhu239

Gordon, E. M., Laumann, T. O., Gilmore, A. W., Newbold, D. J., Greene, D. J., Berg, J. J., et al. (2017). Precision functional mapping of individual human brains. Neuron 95, 791.e7–807.e7. doi: 10.1016/j.neuron.2017.07.011

Hacker, C. D., Laumann, T. O., Szrama, N. P., Baldassarre, A., Snyder, A. Z., Leuthardt, E. C., et al. (2013). Resting state network estimation in individual subjects. Neuroimage 15, 616–633. doi: 10.1016/j.neuroimage.2013.05.108

Hagler, D. J., Hatton, S., Cornejo, M. D., Makowski, C., Fair, D. A., Dick, A. S., et al. (2019). Image processing and analysis methods for the Adolescent Brain Cognitive Development Study. Neuroimage 202:116091. doi: 10.1016/j.neuroimage.2019.116091

Hamilton, W. L., Ying, R., and Leskovec, J. (2017). "Representation learning on graphs: methods and applications," in IEEE Data Engineering Bulletin (California, CF: IEEE). arXiv:1709.05584.

Hochreiter, S., and Schmidhuber, J. (1997). Long short-term memory. Neural Comput. 9, 1735–1780. doi: 10.1162/neco.1997.9.8.1735

Kipf, T. N., and Welling, M. (2016a). Semi-Supervised Classification With Graph Convolutional Networks. Technical report, University of Amsterdam.

Kipf, T. N., and Welling, M. (2016b). Variational Graph Auto-Encoders. Technical report, University of Amsterdam.

Liu, M., Kitsch, A., Miller, S., Chau, V., Poskitt, K., Rousseau, F., et al. (2016). Patch-based augmentation of expectation-maximization for brain MRI tissue segmentation at arbitrary age after premature birth. Neuroimage 127, 387–408. doi: 10.1016/j.neuroimage.2015.12.009

Parisot, S., Arslan, S., Passerat-Palmbach, J., Wells, W. M. III, Rueckert, D., et al. (2015). Tractography-driven groupwise multi-scale parcellation of the cortex. Inf. Process. Med. Imaging 24, 600–612. doi: 10.1007/978-3-319-19992-4_47

Petersen, R. C., Aisen, P. S., Beckett, L. A., Donohue, M. C., Gamst, A. C., Harvey, D. J., et al. (2010). Alzheimer's Disease Neuroimaging Initiative (ADNI) clinical characterization. Neurology 74, 201–209. doi: 10.1212/WNL.0b013e3181cb3e25

Robinson, E. C., Garcia, K., Glasser, M. F., Chen, Z., Coalson, T. S., Makropoulos, A., et al. (2018). Multimodal surface matching with higher-order smoothness constraints. Neuroimage 167, 453–465. doi: 10.1016/j.neuroimage.2017.10.037

Robinson, E. C., Jbabdi, S., Glasser, M. F., Andersson, J., Burgess, G. C., Harms, M. P., et al. (2014). MSM: a new flexible framework for multimodal surface matching. Neuroimage 100, 414–426. doi: 10.1016/j.neuroimage.2014.05.069

Salimi-Khorshidi, G., Douaud, G., Beckmann, C. F., Glasser, M. F., Griffanti, L., and Smith, S. M. (2014). Automatic denoising of functional MRI data: combining independent component analysis and hierarchical fusion of classifiers. Neuroimage 90, 449–468. doi: 10.1016/j.neuroimage.2013.11.046

Schaefer, A., Kong, R., Gordon, E. M., Laumann, T. O., Zuo, X.-N., Holmes, A. J., et al. (2018). Local-global parcellation of the human cerebral cortex from intrinsic functional connectivity MRI. Cereb. Cortex 28, 3095–3114. doi: 10.1093/cercor/bhx179

Smith, S. M., Hyvärinen, A., Varoquaux, G., Miller, K. L., and Beckmann, C. F. (2014). Group-PCA for very large fMRI datasets. Neuroimage 101:738. doi: 10.1016/j.neuroimage.2014.07.051

Vaswani, A., Brain, G., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., et al. (2017). Attention Is All You Need. Technical report, Google.

Velickovic, P., Cucurull, G., Casanova, A., Romero, A., Liò, P., and Bengio, Y. (2018). "Graph attention networks," in International Conference on Learning Representations (Vancouver, BC: ICLR). arXiv:1710.10903.

Wagstyl, K., Larocque, S., Cucurull, G., Lepage, C., Cohen, J. P., Bludau, S., et al. (2020). BigBrain 3D atlas of cortical layers: cortical and laminar thickness gradients diverge in sensory and motor cortices. PLoS Biol. 18:e3000678. doi: 10.1371/journal.pbio.3000678

Wang, G., Ying, R., Huang, J., and Leskovec, J. (2019). Improving Graph Attention Networks with Large Margin-based Constraints. Technical report, Stanford University, Mountain View.

Wang, M., Zheng, D., Ye, Z., Gan, Q., Li, M., Song, X., et al. (2020). Deep Graph Library: A Graph-Centric, Highly-Performant Package for Graph Neural Networks. Technical report, New York University.

Wig, G. S., Laumann, T. O., and Petersen, S. E. (2014). An approach for parcellating human cortical areas using resting-state correlations. Neuroimage 93(Pt 2), 276–291. doi: 10.1016/j.neuroimage.2013.07.035

Xu, K., Li, C., Tian, Y., Sonobe, T., Kawarabayashi, K.-I., and Jegelka, S. (2018). Representation Learning on Graphs with Jumping Knowledge Networks. Technical report, MIT.

Yeo, B. T. T., Krienen, F. M., Sepulcre, J., Sabuncu, M. R., Lashkari, D., Hollinshead, M., et al. (2011). The organization of the human cerebral cortex estimated by intrinsic functional connectivity. J. Neurophysiol. 106, 1125–1165. doi: 10.1152/jn.00338.2011

Zeng, H., Zhou, H., Srivastava, A., Kannan, R., and Prasanna, V. (2020). GraphSAINT: Graph Sampling Based Inductive Learning Method. Technical report, University of Southern California.

Zhao, Q., Adeli, E., Honnorat, N., Leng, T., and Pohl, K. M. (2019). Variational autoencoder for regression: application to brain aging analysis. Technical report, Stanford. doi: 10.1007/978-3-030-32245-8_91

Conflict of Interest: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher's Note: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Copyright © 2021 Eschenburg, Grabowski and Haynor. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.