Main
Whole-brain image volumes at the micron scale are helping scientists characterize neuron-level morphology and connectivity, and discover new neuronal subtypes. These volumes require intense computational processing to uncover the rich neuronal information they contain. Currently, however, image acquisition is outstripping the availability and throughput of analysis pipelines. The steps in analyzing these images include registration, axon segmentation, soma detection, visualization, and analysis of results. Several tools exist for these individual steps, but they are rarely integrated into a single pipeline or able to facilitate cloud-based collaboration (Tyson & Margrie, 2022; Pisano et al., 2021). Further, many existing machine-learning-based tools are highly tuned to their training data and perform poorly when they encounter out-of-distribution artifacts or signal levels (Geisa et al., 2021).
To address these challenges, we present BrainLine, an open-source, fully-integrated pipeline that performs registration, axon segmentation, soma detection, visualization, and analysis on whole-brain fluorescence volumes (Figure 1a). BrainLine combines state-of-the-art, already available open-source tools such as CloudReg (Chandrashekhar et al., 2021) and ilastik (Berg et al., 2019) with brainlit, our Python package developed here. The BrainLine pipeline accommodates images that are hundreds of gigabytes in size and uses generalizable machine learning training schemes that adapt to out-of-distribution samples.
To share and interact with data across multiple institutions, BrainLine uses Amazon S3 to store data in precomputed format, so it can be viewed using Neuroglancer (n.d.). Specifically, we use CloudReg (Chandrashekhar et al., 2021) for file conversion of the stitched image, and for image registration to the Allen atlas (Wang et al., 2020).
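The precomputed format makes cloud access practical for volumes of this scale by storing the image as a grid of fixed-size chunks whose filenames encode their voxel bounds. The sketch below enumerates those chunk names for a given volume; it is illustrative only, not CloudReg's actual implementation, and real pipelines use libraries that handle this (and the accompanying metadata) automatically.

```python
def chunk_names(volume_size, chunk_size):
    """Enumerate chunk filenames in the precomputed convention
    'xmin-xmax_ymin-ymax_zmin-zmax', clipping each chunk to the
    volume bounds so edge chunks may be smaller."""
    names = []
    xs, ys, zs = volume_size
    cx, cy, cz = chunk_size
    for z0 in range(0, zs, cz):
        for y0 in range(0, ys, cy):
            for x0 in range(0, xs, cx):
                x1 = min(x0 + cx, xs)
                y1 = min(y0 + cy, ys)
                z1 = min(z0 + cz, zs)
                names.append(f"{x0}-{x1}_{y0}-{y1}_{z0}-{z1}")
    return names

# A 100^3 volume with 64^3 chunks yields a 2x2x2 grid of 8 chunks.
print(len(chunk_names((100, 100, 100), (64, 64, 64))))   # 8
print(chunk_names((100, 100, 100), (64, 64, 64))[0])     # 0-64_0-64_0-64
```

Because each chunk is an independent object, a viewer such as Neuroglancer can fetch only the chunks intersecting the current view, rather than the whole multi-hundred-gigabyte volume.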
For axon segmentation and soma detection, we sought to leverage recent machine learning advances but experienced two major constraints. First, as generating ground truth image annotations is labor intensive, we wanted the approach to be effective on a small amount of training data. Second, images were provided to us in a sequential manner, and new samples would sometimes have unique artifacts or different levels of image quality (Figure 1b, c, e, f). We therefore sought a learning algorithm that could be quickly retrained on new data. Many learning algorithms assume that all training and testing data come from the same distribution and fail when this is not the case (Quinonero-Candela et al., 2009). However, using our closed-loop training paradigm with ilastik (Berg et al., 2019), we were able to use a single ilastik project for all samples, only occasionally adding training data when difficult samples arose.
We used an ilastik pixel classification workflow for both axon segmentation and soma detection, but in the latter case we applied a size threshold to the connected components following segmentation. In both cases, the training approach was the same. For each new whole-brain volume, we identified a set of subvolumes (\(99^3\) voxels for axons, \(49^3\) for somas) across a variety of brain regions, and annotated only a few slices (three for axons, five for somas) in each subvolume for our validation set. This strategy is similar to that employed in Friedmann et al. (2020). If our model could not achieve a satisfactory f-score on this validation dataset, we would annotate more subvolumes from the sample and add them to the training set until satisfactory performance was achieved.
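To illustrate the post-processing step that distinguishes soma detection from axon segmentation, the sketch below labels connected components in a small binary mask and discards those below a size threshold. It uses a 2D mask and 4-connectivity for brevity; the pipeline operates on 3D segmentations, and all names here are our own.

```python
from collections import deque

def size_filtered_components(mask, min_size):
    """Label 4-connected components in a binary 2D mask and keep
    only those with at least `min_size` pixels (a 2D stand-in for
    the 3D size threshold applied after pixel classification)."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                # Breadth-first flood fill from this seed pixel.
                comp, queue = [], deque([(r, c)])
                seen[r][c] = True
                while queue:
                    i, j = queue.popleft()
                    comp.append((i, j))
                    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ni, nj = i + di, j + dj
                        if 0 <= ni < rows and 0 <= nj < cols \
                                and mask[ni][nj] and not seen[ni][nj]:
                            seen[ni][nj] = True
                            queue.append((ni, nj))
                if len(comp) >= min_size:
                    components.append(comp)
    return components

mask = [[1, 1, 0, 0],
        [1, 0, 0, 1],
        [0, 0, 0, 0],
        [0, 0, 0, 0]]
# One 3-pixel component and one isolated pixel; min_size=2 keeps one.
print(len(size_filtered_components(mask, min_size=2)))  # 1
```

The size threshold suppresses speckle that survives pixel classification, so only blob-like detections of plausible soma size remain.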
We observed that this heterogeneous training procedure (i.e. training on multiple brain samples) often improved performance on other samples as well. In an experiment where we controlled the number of subvolumes used for training, this approach was at least as good as a homogeneous approach, where all training subvolumes came from a single brain sample (Figure 1d, g).
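The closed-loop criterion described above hinges on a standard f-measure between predicted and annotated labels on the validation slices. A minimal sketch, in pure Python on flattened binary labels (the exact metric configuration and satisfaction threshold used in the pipeline may differ):

```python
def f_score(pred, truth, beta=1.0):
    """F-measure between two flat binary label sequences; beta=1
    gives the usual F1 (harmonic mean of precision and recall)."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if t and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Perfect recall, one false positive: precision 2/3, recall 1, F1 = 0.8.
print(round(f_score([1, 1, 0, 1], [1, 0, 0, 1]), 3))  # 0.8
```

In the closed loop, a score below the chosen threshold triggers annotation of additional subvolumes and retraining; a score above it admits the model for that sample.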
The pipeline can display the axon segmentation and soma detection results in a variety of ways, including brain-region-based bar charts accompanied by statistical tests (Fig. 1a.i), 2D plots with the atlas borders (Fig. 1a.ii), and 3D visualizations using brainrender (Fig. 1a.iii) (Claudi et al., 2021). Since every experimental design is unique, we designed our pipeline in a modular way, so investigators can pick and choose which components they want to incorporate in their own analyses. We also leverage existing software and file formats to facilitate interoperability (Tyson & Margrie, 2022).
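As a sketch of how segmentation results feed a region-based bar chart, the snippet below normalizes per-region axon voxel counts by region volume to obtain a density per brain region. All numbers and region names are invented for illustration; the actual pipeline derives region assignments from the registered Allen atlas labels.

```python
# Hypothetical counts of segmented axon voxels per atlas region, and
# each region's total volume in voxels (illustrative values only).
axon_voxels = {"Thalamus": 1200, "Hypothalamus": 300, "Midbrain": 900}
region_volume = {"Thalamus": 60000, "Hypothalamus": 30000, "Midbrain": 90000}

# Normalize to axon density per region, the quantity a region-based
# bar chart would display; this controls for region size.
density = {region: axon_voxels[region] / region_volume[region]
           for region in axon_voxels}
print(density["Thalamus"])  # 0.02
```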
BrainLine enables accelerated analysis of brain-wide connectivity through parallel programming, the use of cloud-compliant file formats, and a machine learning training scheme that generalizes across brain samples. As a result, BrainLine alleviates the need for investigators to build custom analysis pipelines from scratch, helping them characterize the morphology and connectivity profiles of neurons, and discover new neuronal subtypes. BrainLine is available as a set of thoroughly documented notebooks and scripts in our Python package brainlit: http://brainlit.neurodata.io/.
References
Berg, S., Kutra, D., Kroeger, T., et al. (2019). ilastik: interactive machine learning for (bio)image analysis. Nature Methods, 16(12):1226–1232.
Chandrashekhar, V., Tward, D. J., Crowley, D., et al. (2021). CloudReg: automatic terabyte-scale cross-modal brain volume registration. Nature Methods, 18(8):845–846.
Claudi, F., Tyson, A. L., Petrucco, L., et al. (2021). Visualizing anatomically registered data with brainrender. eLife, 10:e65751.
Friedmann, D., Pun, A., Adams, E. L., et al. (2020). Mapping mesoscale axonal projections in the mouse brain using a 3D convolutional network. Proceedings of the National Academy of Sciences, 117(20):11068–11075.
Geisa, A., Mehta, R., Helm, H. S., et al. (2021). Towards a theory of out-of-distribution learning. arXiv preprint arXiv:2109.14501.
Neuroglancer. (n.d.). https://github.com/google/neuroglancer
Pisano, T. J., Dhanerawala, Z. M., Kislin, M., et al. (2021). Homologous organization of cerebellar pathways to sensory, motor, and associative forebrain. Cell reports, 36(12):109721.
Quinonero-Candela, J., Sugiyama, M., Schwaighofer, A., & Lawrence, N. D. (2009). Dataset Shift in Machine Learning. MIT Press.
Tyson, A. L., & Margrie, T. W. (2022). Mesoscale microscopy and image analysis tools for understanding the brain. Progress in Biophysics and Molecular Biology, 168:81–93.
Wang, Q., Ding, S.-L., Li, Y., et al. (2020). The Allen Mouse Brain Common Coordinate Framework: a 3D reference atlas. Cell, 181(4):936–953.
Acknowledgements
This work is supported by NIH Grants RF1MH121539, U19AG033655, and R01AG066184-01, NSF grants 2031985 and 2014862, and the NSF CAREER award. M.W. is supported by NIMH Grant K08MH113039. K.D. is supported by NIMH, NIDA, the NIH BRAIN Initiative, the Integrated Circuit Cracking NeuroNex Technology Hub funded by the National Science Foundation, the NOMIS Foundation, the Else Kröner Fresenius Foundation, the Gatsby Foundation, and the AE Foundation.
Ethics declarations
Conflicts of Interest
M.I.M. owns a significant share of Anatomy Works with the arrangement being managed by Johns Hopkins University in accordance with its conflict of interest policies. V.C. owns a significant share of Neurosimplicity, LLC, which is a medical device and technology company focusing on medical image processing. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Athey, T.L., Wright, M.A., Pavlovic, M. et al. BrainLine: An Open Pipeline for Connectivity Analysis of Heterogeneous Whole-Brain Fluorescence Volumes. Neuroinform 21, 637–639 (2023). https://doi.org/10.1007/s12021-023-09638-2