Abstract
We present a quantitative strategy to identify all projection neuron types from a given region with statistically different patterns of anatomical targeting. We first validate the technique with mouse primary motor cortex layer 6 data, yielding two clusters consistent with cortico-thalamic and intra-telencephalic neurons. We next analyze the presubiculum, a less-explored region, identifying five classes of projecting neurons with unique patterns of divergence, convergence, and specificity. We report several findings: individual classes target multiple subregions along defined functions; all hypothalamic regions are exclusively targeted by the same class also invading midbrain and agranular retrosplenial cortex; Cornu Ammonis receives input from a single class of presubicular axons also projecting to granular retrosplenial cortex; path distances from the presubiculum to the same targets differ significantly between classes, as do the path distances to distinct targets within most classes; the identified classes have highly non-uniform abundances; and presubicular somata are topographically segregated among classes. This study thus demonstrates that statistically distinct projections shed light on the functional organization of their circuit.
Introduction
The classification of neurons in the mammalian nervous system has long been a focus of intensive investigation. While local features from slice preparations in vitro may suffice to infer the circuit roles of GABAergic interneurons1,2,3, long-range projecting axons are crucial architectural elements of neural organization4,5 constituting the conceptual and physical nexus between brain-wide circuits and synaptic communication6. Thus, projection axons have long been digitally traced from serial sections after in vivo labeling and light microscopy imaging7,8,9,10. At the same time, their macroscopic extent (~1 cm span; ~1 m cable length) and microscopic caliber (~100 nm branch thickness) combine into a formidable technological challenge for large-scale collection11,12. As a result, the number of completely reconstructed projection axons in any mammalian neural system has until recently remained in the low tens.
A source brain region projecting to N targets (where N typically ranges between 10 and 50 in the mouse cortex) could contain any combination of 2^N − 1 distinct axonal projection types. Such a combinatorics challenge requires a large-scale data collection for proper classification. Projects based on fluorescent Micro-Optical Sectioning Tomography (fMOST) technology13,14,15 or the Janelia MouseLight platform16, launched in recent years to address this need, produced nearly 10,000 mouse whole-brain single neuron reconstructions registered to a 3D Common Coordinate Framework (CCF)17 with consensus anatomical labeling18. However, these newly available data do not themselves generate novel scientific insights, explain brain circuitry, or even disprove that axons might simply invade a random subset of the regional target areas19. Rigorous methods are needed to test the hypothesis that specific projection types exist, to characterize their identities, and to quantify their population sizes20.
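To give a sense of scale, the number of non-empty subsets of N target regions can be evaluated directly for the two ends of the range quoted above. This is only an arithmetic illustration; the function name is hypothetical.

```python
def projection_type_count(n_targets: int) -> int:
    """Number of non-empty subsets of n_targets regions, i.e. 2^N - 1."""
    return 2 ** n_targets - 1


for n in (10, 50):
    print(n, projection_type_count(n))
# 10 1023
# 50 1125899906842623
```

Even at the low end of the range, the number of conceivable projection types exceeds the number of neurons typically reconstructed from a single source region.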
This study introduces an original technique to objectively identify projection-based neuronal classes. To ascertain whether a collection of axonal projections might result from essentially random variation within the constraints of regional connectivity or likely reflects distinct neuron types, we begin from the foundational criterion for classification: if a set of items belongs to segregated classes, their pairwise inter-individual differences must be on average larger between than within classes. In other words, two items from the same class should tend to be more similar to each other than two items from separate classes. To implement this logic into a classification framework, we couple rigorous statistical testing with unsupervised hierarchical clustering. A unique strength of this approach is its entirely data-driven granularity: the continuous accumulation of new tracings will progressively refine the classification details with increasing statistical power. We can then characterize the identified projection classes by quantifying their population size, topographic soma distributions, and convergence and divergence patterns.
In the remainder of this article, we first propose a formal definition of and a quantitative solution for the classification problem. We validate our approach by applying it to layer 6 of the primary motor cortex, and then utilize it to study the presubiculum, a rather under-investigated region of the mouse brain. We next quantify the neuronal population sizes of the presubicular projection classes and characterize the spatial distribution of their somata. Finally, we analyze the patterns of divergence and convergence of presubicular projection classes. We conclude by discussing the biological interpretations of these results.
Results
Quantitative solution of the classification problem
The axonal projections of each neuron in a source region can be represented as k-dimensional vectors, where k is the number of target regions invaded by the source region. Each of the k components of the vector quantifies the number of axonal points within the corresponding region (Fig. 1; see “Choice of metric to quantify axonal extent” in “Methods”). We explore the null hypothesis, H0, that all neurons from a source region belong to a single projection class (Fig. 2a), as opposed to the alternative hypothesis, HA, that distinct projection classes exist from that source region (Fig. 2b). If two hypothetical classes exist, the projections will be more similar between neurons within a class and more different across classes (Fig. 2c). In such a two-class scenario, the combined within- and across-class distances would thus form a wider distribution than the distribution generated if all neurons belong to just a single class (Fig. 2d). To formally test HA, we measure all pairwise differences between neurons (as arccosine vector distances, see “Methods”). We then generate the distribution of distances for H0 by randomizing the projection patterns while preserving total axonal extent both by neuron and target region. We achieve this single-class continuum by iterative stochastic swapping of axonal points between neurons across two target regions (see Fig. 2e and “Methods”). We can then apply Levene’s one-tail statistical test to ascertain whether the original distribution of pairwise distances has significantly larger variance than the randomized distribution. If the answer is positive, we must discard H0 and accept HA. Starting from the top node in an unsupervised hierarchical clustering tree, we can thus repeat Levene’s test on the neurons of each of the two subtrees, continuing the process until none of the variance differences are statistically significant (Fig. 2f). 
When Levene’s test fails (i.e., it provides a negative answer), the precise cutoff is determined independently of the other points of failure. Therefore, all neurons within a cluster (i.e., under the same Levene failure point) are statistically equivalent with respect to the axonal projection patterns across the target regions, but each cluster is independent of the other clusters. Moreover, there is no correspondence between the cutoff levels and the resulting number of neurons in each cluster.
Validation of the approach
To validate the above research design, we first analyzed 52 MouseLight layer 6 neurons from the primary motor cortex21 (Source data are provided as a Source Data file). This anatomical area is known to contain two distinct projection classes with well-defined subdivisions: cortico-thalamic (CT) and cortico-cortical or intra-telencephalic (IT) neurons22. The variance of the distribution of pairwise axonal projection differences of these neurons was significantly larger than that of the randomized projections (p = 6.46 × 10^−51; variance of real data = 373.4; variance of randomized data = 195.7), indicating the existence of distinct clusters. However, both subtrees after the first split of unsupervised hierarchical clustering returned a non-significant Levene’s test (IT: p = N/A; variance of real data = 219.9; variance of randomized data = 240.0; CT: p = 0.24; variance of real data = 295.1; variance of randomized data = 264.0), revealing exactly two clusters (Fig. 3a). The first cluster, consisting of 21 neurons, projected almost exclusively to motor cortical targets; the second cluster of 31 neurons projected primarily to thalamic targets (Fig. 3b–d). These patterns were fully consistent with the axonal pathways of the IT and CT neuronal classes, respectively. This finding thus corroborates the validity of employing Levene’s test of variance on pairwise difference distributions to identify statistically distinct classes in unsupervised hierarchical clustering.
Classification of projection neurons from mouse presubiculum
We then applied our analytic technique to a lesser-explored source region of the mouse brain: the presubiculum. Unsupervised clustering and the test of variance demonstrated that the 93 MouseLight neurons from the presubiculum form five distinct projection classes (Fig. 4a–c). We designate each class by a letter (A-E) followed by the number of neurons in the class (Fig. 4c). The first class, A38, primarily targets the lateral entorhinal cortex (LEC), accounting for 82% of axonal extent outside of the presubiculum. This class also invades the dorsoventral (granular) retrosplenial cortex as well as the hippocampal formation (dentate gyrus, CA3, CA2, CA1, and subiculum). The second class, B27, mainly targets the dorsal portion of the medial entorhinal cortex (dMEC), accounting for 92.5% of extra-presubicular axonal extent, as well as retrohippocampal zone and parasubiculum. Class C3 neurons mostly target the contralateral dMEC (42%) and LEC (40%), subiculum (14%), and parasubiculum (4%) through extensive callosal and commissural fibers. Class D19 has the most complex (and unreported) pattern of innervation: in addition to major projections to the subiculum (40.8%) and dentate gyrus (16.3%), it is the sole source of projections to the lateral (agranular) retrosplenial cortex, to the hypothalamus (including the lateral mammillary nucleus and 18 additional nuclei), and to the superior and inferior colliculi in the midbrain. This neuronal class also projects to a subset of 8 thalamic nuclei, including the medial part of the anterior thalamic nucleus (ATN) and the lateral geniculate nucleus. Lastly, class E6 projects to a complementary set of 14 other thalamic nuclei, including the ventral, dorsal, anterior, and lateral parts of the ATN and the medial geniculate nucleus. Neurons from all five projection classes also have substantial collaterals within the presubiculum. Examples of projection neurons from each of the presubicular projection classes are depicted in Fig. 4d–e.
Presubicular classes have non-uniform population sizes
Next, we quantified the proportion of neurons in the mouse presubiculum that belong to each projection class. To this aim, we extracted the anterograde tract tracing density distributions from the Allen Institute regional connectivity atlas and matched the fractions of neurons in every class based on their axonal patterns by numerical optimization (see “Non-negative least squares” in “Methods”; Source data are provided as a Source Data file). The optimization converged with a very small residual error (<0.0006%), indicating a near-exact correspondence between single-neuron and regional projections. Across the whole presubicular population, class D19, which reaches the midbrain, hypothalamus, lateral (agranular) retrosplenial cortex, and lateral geniculate nucleus (visual thalamus), accounted for the greatest portion (38.1%) of neurons. Class A38, targeting the hippocampus, subiculum, dorsoventral (granular) retrosplenial cortex, and lateral entorhinal cortex (the “what” pathway), accounted for the second largest share (30.6%). Class B27, projecting to the parasubiculum and medial entorhinal cortex (the “where” pathway), comprised 16.3% of projection neurons. Class E6, focused on other thalamic nuclei including the medial geniculate nucleus (auditory thalamus), accounted for 13.7% of presubicular neurons. Class C3, with its diffuse contralateral projections, made up the remaining 1.3%.
When accounting for these relative proportions together with the MouseLight axonal projections, we can estimate the contribution of each class to the presubicular projections in each collection of target regions. In particular, the dentate gyrus receives 21% of its presubicular afferents from class A38 and 79% from class D19. The subiculum receives 69% of its presubicular afferents from class D19, 30% from class A38, and 1% from class C3. The lateral entorhinal cortex receives 99% of presubicular afferents from class A38 and 1% from class C3. The dorsal medial entorhinal cortex and parasubiculum receive 99% of presubicular afferents from class B27 and 1% from class C3. All other regions are targeted by individual classes: CA3, CA1, and the dorsoventral (granular) retrosplenial cortex by A38; the midbrain, hypothalamus, lateral (agranular) retrosplenial cortex, and part of the thalamic nuclei including medial ATN and lateral geniculate nucleus by D19; and the rest of the thalamic nuclei including dorsoventral ATN and medial geniculate nucleus by E6.
Somata distribution reveals class topographic organization
Computational geometry analysis of soma locations within the presubiculum demonstrated a clear spatial separation among the four main projection classes: A38, B27, D19, and E6 (the smallest class, C3, is largely contralateral projecting). Specifically, the convex hull volume of each neuron class overlapped only minimally (~5–20%) with that of other neuron classes (Fig. 5a–c). In particular, class A38 was positioned more rostrally and dorsally relative to the caudal-ventral position of class B27, with approximately 14% of overlap (Fig. 5a). The overlap of A38 was maximal with D19 (21%); however, while most A38 neurons had a selective somatic concentration in layer 2 (34/38: 89.5%), D19 had a somatic distribution across all 3 presubicular layers: 21% in layer 1 and 26% in layer 3 (Fig. 5b). Class E6 had the most lateral positioning resulting in almost complete segregation from the other projection classes: there were so few overlapping somata that a proper convex hull volume of the overlap could not be calculated (Fig. 5c, d).
Efferent path distances from the same neurons vary by target
We tested whether the path distances from presubicular neurons of a given projection class differed across their divergent target regions (Fig. 6). In these analyses of divergence, ipsilateral and contralateral targets were considered separately, as the latter are systematically farther than the former. For class A38 neurons, projection distances to the ipsilateral lateral entorhinal cortex, subiculum, and dentate gyrus are significantly shorter than those to the ipsilateral hippocampus; moreover, projection distances to the ipsilateral lateral entorhinal cortex are significantly longer than those to the ipsilateral subiculum and dentate gyrus. Similarly, projection distances to the contralateral subiculum and lateral entorhinal cortex are significantly shorter than those to the contralateral hippocampus. Thus, presubicular efferent path distances differ less between ipsilateral and contralateral hippocampus than between other targets across brain hemispheres (Fig. 6a). For class B27, projections to the ipsilateral parasubiculum have significantly shorter paths than those to medial entorhinal cortex, dorsal zone, but the distances are comparable in the contralateral case (Fig. 6b). Finally, for class D19, projections both to the ipsilateral medial anterior thalamic nucleus and lateral geniculate nucleus, and to the ipsilateral hypothalamus and lateral mammillary nucleus combined have significantly longer paths than those to the ipsilateral midbrain (Fig. 6c).
Afferent path distances to the same region vary by class
Next, we asked whether the axons from neurons of distinct projection classes converging onto their shared targets had different path distances. With the sole exception of the dentate gyrus, all target regions displayed a significant dependence of path distance on the presubicular neuron class (Fig. 7). For the ipsilateral medial entorhinal cortex, dorsal zone, projections from E6 and D19 have shorter distances than those from B27 and A38, and projections from B27 have significantly shorter distances than those from A38. For the contralateral medial entorhinal cortex, in contrast, projections from B27 have significantly longer distances than those from A38 (Fig. 7a). For the ipsilateral parasubiculum, path distances from D19 are significantly longer than those from B27 (Fig. 7b). Finally, for the contralateral subiculum, parasubiculum, and lateral entorhinal cortex, path distances from B27 are significantly longer than those from A38 (Fig. 7b–d).
Discussion
This study introduced an original method to objectively identify projection-based neuronal classes by pairing Levene’s test with unsupervised hierarchical clustering. We first conducted a confirmatory study on layer 6 of the primary motor cortex to verify that the proposed technique could reproduce known projection types in a previously explored area of the mammalian brain. The results yielded two clusters with axonal projections consistent with those of the corticothalamic and intratelencephalic neuron classes found in past studies, thereby confirming the validity of the technique23.
Levene’s test was chosen because it does not depend on the data being normally distributed. Given the size of currently available data, normality cannot be assured. As the available data grow by several orders of magnitude, other statistical tests, such as an F-test, could be used instead. Another form of unsupervised clustering, such as K-means, could also be used to similar ends. The important aspect is that the method be unsupervised, such that the data themselves direct the clustering without any user input.
To test whether the technique could lead to novel insights, we then applied it to the presubiculum, a region with crucial cognitive function24, yet few studies on its circuitry25. The results yielded five clusters, indicating distinct neuron classes, which led us to reject the null hypothesis that projection neurons exhibit random variation within the constraints of regional connectivity from the presubiculum. In an earlier study26, retrograde tracing identified five classes of neurons projecting from the presubiculum, which target the retrosplenial cortex (corresponding to our class A38), contralateral subiculum (class C3), medial entorhinal cortex (class B27), anterior thalamic nucleus (class E6), and lateral mammillary nucleus (class D19). Our results confirm the existence of these five classes and add new information that reveals patterns of divergence (e.g., class A38 projects to the retrosplenial cortex, dentate gyrus, subiculum, and entorhinal cortex), convergence (e.g., the subiculum receives projections from classes A38, contralateral C3, and D19), and specificity (e.g., class E6 projects exclusively to the medial geniculate nucleus, and all hypothalamic regions receive projections solely from class D19; see summary Fig. 8).
The proposed clustering technique correctly distinguishes cortical (classes A38, B27, and C3) from subcortical (D19 and E6) pathways in the second binary split in the hierarchical classification. These results also add cellular level details to previously reported presubicular projections to retrosplenial cortex and thalamic reticular nuclei27, as well as a broader circuit context to the characterization of individual presubicular neurons targeting the medial entorhinal cortex28.
Furthermore, our findings reveal that several target regions are spatially subdivided according to the differing inputs between classes. These regions include the entorhinal cortex (lateral projections mainly from class A38 and medial projections primarily from class B27), retrosplenial cortex (dorsoventral granular projections almost exclusively from class A38 and lateral agranular projections solely from class D19), and thalamus (medial anterior thalamic nucleus and lateral geniculate nucleus projections principally from class D19 and dorsoventral anterior thalamic nucleus and medial geniculate nucleus projections predominantly from class E6). Some of these regional subdivisions also have known functional distinctions: for instance, the medial entorhinal cortex specializes in spatial representation while the lateral entorhinal cortex specializes in integrating sensory input29. Among the thalamic geniculate nuclei, the medial geniculate nucleus is part of the auditory pathway, whereas the lateral geniculate nucleus is part of the visual pathway4.
From a comparison of divergent path distances from one presubicular class to its major targets, along with a comparison of convergent path distances from each presubicular class to collectively major targets, we found that path distances to the same targets were significantly different between classes, as were the path distances to distinct targets within most classes. This might imply that electrical impulses reach different targets with varying delays, both within the same class and between classes.
Topographic analysis of presubicular classes revealed spatial separation between the somata of each class. Grid cells are co-localized with head-direction and border cells in dorsal presubiculum as compared to the ventral presubiculum30, in a manner similar to that found in the deeper layers of the medial entorhinal cortex31, implying that grid cells are more likely to be found in class A38 and E6 neurons than in class B27 neurons. Topographic analysis also suggests the possibility of anatomically mapping the input and output of the circuitry specializing in head direction computations32. Our reported topography of presubicular projections classes is consistent with the recently observed local modularity of the head-direction microcircuit33, and may help clarify the relationship between the egocentric and allocentric spatial and episodic representations of the cortico-hippocampal system34. Previous studies found head-direction cells in layer 3 of dorsal presubiculum33. Since class D19 neurons are found in layer 3, whereas class A38 neurons are mostly confined to layer 2, this would imply that head-direction cells make up part of the composition of class D19, but less so for class A38.
As with many secondary data analyses, we have limited knowledge of, and control over, artifactual shortcomings in the utilized datasets due to possible idiosyncrasies in labeling, imaging, tracing, registration, and mapping. However, the technique introduced with this work is applicable to many disparate sources of data besides MouseLight, including fMOST13,14,15 and even MapSeq/BarSeq35,36. These data sources follow separate experimental and computational protocols, allowing independent validation for the source regions in which these datasets overlap. Our results so far, in the cases of the mouse primary motor cortex and presubiculum, indicate that the executed analysis is robust to these possible confounding variables22.
Overall, this study revealed that neurons can be divided into distinct classes based on axonal projection patterns, as demonstrated in layer 6 of the primary motor cortex and the presubiculum. Our applied analyses can be used to similarly analyze neurons projecting from all other mouse brain regions with sufficient data. There are currently approximately 40 regions fitting this criterion in the existing datasets, but this number is expected to grow in the near future. Furthermore, we suggest the application of pairing Levene’s test and unsupervised hierarchical clustering to other complementary datasets, such as single-cell transcriptomic datasets, to classify neurons across a molecular domain, in addition to an anatomical domain, as demonstrated here. Moreover, all these complementary datasets are broadly expected to continue to grow in sample size, brain coverage, and acquisition pace37,38, supporting a call to establish cloud-based, community accessible pipelines for robust, rigorous, and systematic neuronal characterization39,40.
Methods
Choice of metric to quantify axonal extent
The axonal reconstructions utilized in this study are represented in the Janelia MouseLight public repository21 (available at http://ml-neuronbrowser.janelia.org) as SWC-formatted files41. This standard data structure captures each neuronal tracing point with a set of numerical values that include the three-dimensional coordinates, the local neurite radius, and the identity of the next point in the path to the root42. The spacing between consecutive points can be computed as 3D Euclidean distance of their locations, and the length of the axon as the sum of those distances.
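The length computation described above can be sketched as follows. This is a minimal illustration assuming the standard seven-column SWC layout (identifier, type, x, y, z, radius, parent identifier, with a parent of -1 marking the root); the function name and the three-point example are hypothetical and are not taken from the MouseLight repository.

```python
import math


def swc_cable_length(swc_text: str) -> float:
    """Sum of 3D Euclidean distances between each point and its parent."""
    points = {}
    for line in swc_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and comments
            continue
        node_id, _type, x, y, z, _radius, parent = line.split()[:7]
        points[int(node_id)] = ((float(x), float(y), float(z)), int(parent))

    total = 0.0
    for xyz, parent in points.values():
        if parent in points:  # the root (parent -1) contributes no segment
            total += math.dist(xyz, points[parent][0])
    return total


example = """# id type x y z radius parent
1 1 0 0 0 1 -1
2 2 3 4 0 0.5 1
3 2 3 4 12 0.5 2
"""
print(swc_cable_length(example))  # 17.0
```

The two segments in the example have lengths 5 and 12, so the total is 17 in whatever unit the coordinates use.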
It may be tempting to assume that length constitutes the most natural metric to quantify axonal extent in each brain region. However, it is important to remember that this dataset was collected by light microscopy and does not capture the distribution of presynaptic boutons. Therefore, it is not directly possible to distinguish synapse-bearing portions of the axonal arborization from fibers of passage. It is arguably the connectivity target regions that should guide classification rather than the regions through which the projection simply travels to reach its destinations. This can be a critical confounding factor as the longest unbranching stretches of cortical projecting neurons often correspond precisely to fibers of passage43.
In our own axonal reconstruction experience, we noticed that, while tracking branches from the image stack, it is natural to use a higher density of tracing points when the arbor meanders in the synaptic neuropil than when it traverses layers devoid of potential postsynaptic partners10. Moreover, when we painstakingly identified and annotated the position of all axonal boutons in a different study, we found a tendency to utilize more tracing points per unit of length in bouton-rich branches than otherwise44. These observations are consistent with the need for greater sampling rates in the presence of larger signal gradients or first derivatives in terms of axonal curvature (neuropil meandering), radius (bouton swelling vs. shaft), or both (bifurcation points).
In the MouseLight dataset analyzed here (Source data are provided as a Source Data file), the number of points in an axonal branch and the corresponding branch length are significantly linearly correlated (Pearson R = 0.742; N = 19,847; p < 10^−99). To determine whether the average spacing between points varies non-uniformly between supposed fibers of passage and putative synapse-bearing axons, we separated the axonal branches in each presubiculum neuron based on Strahler (centripetal) order, namely order 1–3 (terminal, pre-terminal, and pre-pre-terminal branches) from order 4–6 (those more than 2 bifurcations away from an ending). This choice is justified by converging experimental evidence that cortical axons make most presynaptic contacts at Strahler orders 1–3, while boutons are substantially sparser at orders 4–645,46. This is also consistent with the strongly non-uniform distribution of average branch length in the dataset investigated in this study, indicating more likely fibers of passage at Strahler order 4–6 (964.4 ± 1037.1 µm) than at Strahler order 1–3 (144.7 ± 80.2 µm; one-tail t-test p = 4.8 × 10^−11; t-value = −7.35; df = 88). We found indeed that the average spacing of tracing points is significantly smaller at order 1–3 (22.54 ± 10.61 µm) than at Strahler order 4–6 (39.07 ± 27.43 µm; one-tail t-test p = 3.5 × 10^−7; t-value = −5.26; df = 111). This again supports the notion that the number of tracing points is a better proxy indicator of synapse-bearing axonal extent than total length. We thus chose to utilize the number of tracing points, and not arbor length, as the metric to classify axonal projections.
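As an illustration of how branches can be assigned a centripetal order, the sketch below applies the textbook Strahler rule to a rooted tree: endings receive order 1, and a parent takes the highest child order, incremented by one when at least two children share that highest order. This is an assumption about the convention; the exact ordering rule used for the analysis above may differ in detail. The tree encoding and all names are hypothetical.

```python
def strahler_orders(children: dict, root) -> dict:
    """Return {node: Strahler order} for a tree given as {node: [child, ...]}."""
    order = {}
    # Iterative post-order traversal, so long axons do not hit the recursion limit.
    stack = [(root, False)]
    while stack:
        node, visited = stack.pop()
        kids = children.get(node, [])
        if not kids:
            order[node] = 1  # terminal branch
        elif not visited:
            stack.append((node, True))  # revisit once all children are done
            stack.extend((kid, False) for kid in kids)
        else:
            kid_orders = sorted((order[kid] for kid in kids), reverse=True)
            top = kid_orders[0]
            tied = len(kid_orders) > 1 and kid_orders[1] == top
            order[node] = top + 1 if tied else top
    return order


# Hypothetical toy arbor: each key is a branch, each value lists its daughters.
tree = {"root": ["a", "b"], "a": ["a1", "a2"], "b": ["b1", "b2"], "b2": ["c"]}
orders = strahler_orders(tree, "root")
print(orders["a"], orders["b"], orders["root"])  # 2 2 3
```

With orders in hand, branches can be grouped into the low-order and high-order sets before comparing their point spacing.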
Data extraction and storage
The location of each axonal data point for nearly 1100 neurons was extracted from JSON files from the MouseLight dataset21 using the freeware JSONLab v1.5 (v2.0 is now available at https://sourceforge.net/projects/iso2mesh/files/jsonlab/2.0%20%28Magnus%20Prime%29/jsonlab-2.0.zip/download), where the three-dimensional coordinates and parcel information were provided for each axonal point of the neuron. The numbers of axonal points in each brain parcel were tabulated for all neurons and stored in a matrix, in which each row represents a neuron, each column represents a parcel, and the value in each cell represents the axonal point count of a particular neuron in a particular region (Fig. 1; Source data are provided as a Source Data file).
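The tabulation step can be sketched as below, assuming that the parcel label of every axonal point has already been read from the JSON files. The input structure, the neuron identifiers, and the function name are hypothetical; the actual MouseLight JSON schema is not reproduced here.

```python
from collections import Counter


def build_count_matrix(neuron_points: dict):
    """Tabulate axonal points per parcel.

    neuron_points maps a neuron id to a list of parcel labels,
    one label per axonal tracing point.
    Returns (neuron ids, parcel names, matrix of counts).
    """
    parcels = sorted({p for labels in neuron_points.values() for p in labels})
    neurons = sorted(neuron_points)
    matrix = []
    for neuron in neurons:
        counts = Counter(neuron_points[neuron])
        matrix.append([counts.get(parcel, 0) for parcel in parcels])
    return neurons, parcels, matrix


toy = {
    "neuron_1": ["LEC", "LEC", "CA1"],
    "neuron_2": ["MEC", "MEC", "MEC", "LEC"],
}
neurons, parcels, matrix = build_count_matrix(toy)
print(parcels)  # ['CA1', 'LEC', 'MEC']
print(matrix)   # [[1, 2, 0], [0, 1, 3]]
```

Each row sum equals the total number of axonal points of that neuron, and each column sum equals the total number of points landing in that parcel.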
Hypothesis design
To determine whether distinct projection classes of neurons exist from a particular parcel of the brain, hypothesis HA, we tested the pairwise differences between neurons from the experimental matrices described above. If only a single class of neurons exists, then only a single distribution of differences between neurons will be generated (Fig. 2a). If two hypothetical classes exist, then the differences between neurons, evaluated two at a time, will be smaller within a given class than across the two classes (Fig. 2b, c). In a multi-class scenario, a histogram of the differences between neurons should be wider than the distribution generated when all the neurons belong to just a single class (Fig. 2d). To generate the distribution of differences for the null hypothesis, H0, a randomized control matrix was generated from the original experimental matrix through multiple iterations of the stochastic pairwise swapping of axonal counts from two neurons across two target regions (Fig. 2e). This method randomized the projection patterns, yielding a continuum consistent with the regional connectivity of Fig. 2a, while preserving axonal sizes (row sums) and regional targeting (column sums) of the original experimental matrix.
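One way to implement a swap that preserves both row and column sums is sketched below: for two randomly chosen neurons and two randomly chosen regions, a number of points is moved in opposite directions so that every marginal total is unchanged. This is a minimal illustration under that assumption, not the published implementation; the function name, the number of swaps, and the rule for choosing how many points to move are all hypothetical.

```python
import random


def randomize_preserving_margins(matrix, n_swaps, seed=0):
    """Shuffle counts between neuron pairs and region pairs.

    Row sums (axonal size per neuron) and column sums (points per region)
    are left unchanged, and no count becomes negative.
    """
    rng = random.Random(seed)
    m = [row[:] for row in matrix]  # work on a copy
    n_rows, n_cols = len(m), len(m[0])
    for _ in range(n_swaps):
        i, j = rng.sample(range(n_rows), 2)  # two distinct neurons
        a, b = rng.sample(range(n_cols), 2)  # two distinct regions
        limit = min(m[i][a], m[j][b])
        if limit == 0:
            continue  # nothing can be moved for this choice
        d = rng.randint(1, limit)
        # Neuron i moves d points from region a to region b;
        # neuron j moves d points from region b to region a.
        m[i][a] -= d
        m[i][b] += d
        m[j][a] += d
        m[j][b] -= d
    return m


original = [[5, 0, 2], [1, 4, 0], [0, 3, 6]]
shuffled = randomize_preserving_margins(original, n_swaps=200, seed=1)
print([sum(r) for r in shuffled])        # [7, 5, 9]
print([sum(c) for c in zip(*shuffled)])  # [6, 7, 8]
```

Because each swap adds and subtracts the same amount within every affected row and column, the marginal totals are invariant regardless of how many swaps are applied.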
Levene’s test
We assessed the hypothesis that the variance of experimental data was significantly larger than the variance of randomized data (α = 0.05). For both the experimental and randomized matrices, we computed the arccosine between a pair of neuronal vectors, each composed of the axonal counts across all target regions (https://github.com/Projectomics/MATLAB). These angles measure the projection difference of two neurons across all brain parcels. We then performed a 1-tailed Levene’s test47 on the angle distributions of the experimental and randomized matrices to assess whether their variances differed significantly. To this aim, we used the MATLAB function vartestn with the TestType parameter set to LeveneAbsolute. If the experimental data had a greater variance than the randomized data, then the experimental data could be further divided into classes, consistent with the scenario presented in Fig. 2b.
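The two quantities involved can be sketched in plain Python: the angle between two projection vectors, and Levene's W statistic computed from absolute deviations around each group mean. Reporting the angle in degrees is an assumption, as are all function names. The sketch stops at the statistic; obtaining a p-value additionally requires the F distribution with (k − 1, n − k) degrees of freedom, for which a statistics library would normally be used, and a one-tailed decision would also check that the experimental variance is the larger of the two.

```python
import math
from itertools import combinations


def angle_deg(u, v):
    """Arccosine distance between two count vectors, in degrees.

    Assumes both vectors contain at least one non-zero count.
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    cosine = max(-1.0, min(1.0, dot / norm))  # guard against rounding error
    return math.degrees(math.acos(cosine))


def pairwise_angles(matrix):
    """Angles for every unordered pair of rows (neurons)."""
    return [angle_deg(u, v) for u, v in combinations(matrix, 2)]


def levene_w(*groups):
    """Levene's W using absolute deviations from each group mean."""
    means = [sum(g) / len(g) for g in groups]
    z = [[abs(x - mu) for x in g] for g, mu in zip(groups, means)]
    n = sum(len(g) for g in z)
    k = len(z)
    z_means = [sum(g) / len(g) for g in z]
    z_grand = sum(sum(g) for g in z) / n
    between = sum(len(g) * (zm - z_grand) ** 2 for g, zm in zip(z, z_means))
    within = sum((x - zm) ** 2 for g, zm in zip(z, z_means) for x in g)
    return (n - k) / (k - 1) * between / within


print(angle_deg([1, 0], [0, 1]))  # orthogonal projection patterns
print(levene_w([1, 2, 3], [0, 2, 4]))
```

In use, `pairwise_angles` would be applied to the experimental and the randomized matrices, and the two resulting lists passed to `levene_w` as the two groups.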
Unsupervised hierarchical clustering
We used unsupervised agglomerative hierarchical clustering to determine a biologically accurate division of neuron classes based on axonal projection patterns. Specifically, the MATLAB linkage function, with the average algorithm for computing distance between clusters, was utilized on the 93 MouseLight neurons originating in the presubiculum and the 52 MouseLight neurons originating in layer 6 of the primary motor cortex. The initial assumption (null hypothesis) was that all neurons were part of a single class. If Levene’s test yielded significant results, the number of class divisions was incremented, and the technique was again repeated on each class division. This iterative process continued until none of the subdivided classes yielded significant results, thereby yielding the final class divisions (Fig. 2f).
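The iterative descent through the clustering tree can be sketched as a short recursion. The tree is represented here as nested pairs with neuron identifiers at the leaves, and the significance decision is passed in as a function, so the sketch is independent of how the linkage tree and the variance test are computed. Both the representation and the toy predicate in the example are hypothetical.

```python
def split_until_not_significant(tree, is_significant):
    """Descend a binary clustering tree, splitting a node only when its
    neurons pass the significance test; return the final list of clusters."""

    def leaves(node):
        if not isinstance(node, tuple):
            return [node]
        return leaves(node[0]) + leaves(node[1])

    members = leaves(tree)
    # Stop at single neurons or when the test is not significant.
    if not isinstance(tree, tuple) or not is_significant(members):
        return [members]
    left, right = tree
    return (split_until_not_significant(left, is_significant)
            + split_until_not_significant(right, is_significant))


# Toy example: the predicate stands in for the comparison of experimental
# and randomized variances and simply calls groups of more than 3 significant.
toy_tree = ((("n1", "n2"), "n3"), ("n4", "n5"))
clusters = split_until_not_significant(toy_tree, lambda m: len(m) > 3)
print(clusters)  # [['n1', 'n2', 'n3'], ['n4', 'n5']]
```

Because each subtree is tested on its own members only, the stopping point in one branch does not depend on where the other branches stop.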
Non-negative least squares
To estimate the fractional counts of cells in each of k projection classes in each region, we matched their respective single-cell axonal patterns against the regional connectivity from anterograde tracing to the m known targets, as presented in the Allen Mouse Brain Connectivity Atlas (http://connectivity.brain-map.org/projection). The problem is equivalent to a set of constrained, weighted, linear equations that can be solved numerically by non-negative least-squares (NNLS) optimization48. NNLS finds the vector x that minimizes the Euclidean norm of (Ax - b) subject to the constraint x ≥ 0 (ref. 49), where x is the k-dimensional vector representing the fractions of neurons in each class; b is the m-dimensional vector representing the weights of the regional projections to each target; and A is an m-by-k matrix with rows representing target regions and columns representing the projections of each class (the sum of the summary vectors of the corresponding neurons). NNLS was computed using the lsqnonneg function in MATLAB.
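For illustration, the same optimization can be posed in Python with `scipy.optimize.nnls`, used here as a stand-in for MATLAB's lsqnonneg. The matrix below is a hypothetical toy example with regions as rows and classes as columns, matching the layout described for the presubiculum matrix. Note that `nnls` returns the residual norm itself, whereas lsqnonneg reports the squared norm.

```python
import numpy as np
from scipy.optimize import nnls

# Hypothetical toy problem: m = 4 target regions (rows), k = 2 projection classes (columns).
A = np.array([[0.7, 0.0],
              [0.3, 0.1],
              [0.0, 0.5],
              [0.0, 0.4]])
true_x = np.array([0.25, 0.75])  # class fractions used to synthesize b
b = A @ true_x                   # regional projection weights implied by those fractions

x, residual_norm = nnls(A, b)         # minimizes ||A x - b||_2 subject to x >= 0
squared_residual = residual_norm ** 2 # comparable to the resnorm output of lsqnonneg
fractions = x / x.sum()               # rescale so that the class fractions sum to one
```

Since b was generated from non-negative fractions, the solver should recover them up to numerical tolerance, which makes this a convenient sanity check before applying the method to real data.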
Matrix A and vector b were based on data from the MouseLight dataset (Source data are provided as a Source Data file) and the Allen Mouse Brain Connectivity Atlas, respectively. Setting the target region to the whole brain in the Connectivity Atlas and the source region to the presubiculum resulted in 7 tracing experiments, which included projection volumes and projection densities for each target brain region. Cross-referencing the targeted regions of the MouseLight axonal projections with the target regions that appeared in all 7 anterograde tracing experiments resulted in a list of 66 regions. Matrix A was created with rows representing these 66 brain regions and columns representing the 5 neuron classes found by pairing Levene’s test with unsupervised hierarchical clustering of the presubiculum data. The average projection volume and the average projection density for each of the 66 regions were calculated from the 7 experiments, and the two averages were multiplied to populate the corresponding entry of vector b.
To obtain the highest confidence in the NNLS analysis, matrix A was sequentially bi-normalized, first by axonal length and then by invaded region (Source data are provided as a Source Data file). Specifically, each cell in matrix A was first normalized so that each row summed to one. Next, each value was divided by the number of regions, 66, and multiplied by the number of clusters, 5, such that the sum of all values in matrix A equaled 5. Subsequently, each cell in matrix A was normalized so that each column summed to one. Vector b was normalized such that the sum of all its values equaled one. Finally, the squared Euclidean norm of the residual returned by the MATLAB function lsqnonneg was calculated as a proxy for the uncertainty of the analysis.
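The three normalization steps can be written out as a short Python sketch. The function name `bi_normalize` and the toy matrix are hypothetical, and the sketch assumes that no row or column of A is entirely zero.

```python
import numpy as np

def bi_normalize(A):
    """Apply the row, global, and column normalizations in sequence to a regions-by-classes matrix."""
    A = np.asarray(A, dtype=float)
    m, k = A.shape                          # m regions (rows), k classes (columns)
    A = A / A.sum(axis=1, keepdims=True)    # step 1: every row sums to one
    A = A * k / m                           # step 2: the whole matrix sums to k
    A = A / A.sum(axis=0, keepdims=True)    # step 3: every column sums to one
    return A

def normalize_vector(b):
    """Scale b so that its entries sum to one."""
    b = np.asarray(b, dtype=float)
    return b / b.sum()

# Hypothetical toy input: 3 regions by 2 classes.
toy_A = np.array([[2.0, 1.0],
                  [1.0, 3.0],
                  [4.0, 4.0]])
normalized_A = bi_normalize(toy_A)
normalized_b = normalize_vector([2.0, 6.0, 12.0])
```

With the 66-by-5 matrix of the text, step 2 corresponds to dividing by 66 and multiplying by 5.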
Soma analysis
To quantify the spatial separation of the somata among the neuron projection classes in the presubiculum, we performed a convex hull analysis of the soma center locations in each class using MATLAB. To create the convex hull, outliers were removed by iterating through all points in each class and calculating the volume of the convex hull without each point. If the volume differed by more than 1/n of the volume of the original convex hull, which included all points, the point was considered an outlier and removed from the dataset. This established an algorithmic thresholding that corresponded well with the visual inspection of potential outliers. However, if removing the outliers resulted in fewer than four somata, the minimal number of points required to conduct a convex hull analysis, all points were considered. Between each pair of convex hulls, the ratio of the volume of overlap to the volume of the union of the convex hulls was used to assess the similarity between topographic locations.
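The overlap-to-union ratio between two convex hulls can be approximated in Python as sketched below. How the overlap volume was computed is not described here, so the Monte Carlo estimate is a swapped-in technique rather than the authors' method. The function names, sample count, and seed are assumptions, and the outlier-removal step is not included.

```python
import numpy as np
from scipy.spatial import ConvexHull

def inside(hull, pts, tol=1e-12):
    """True for points that satisfy every facet inequality normal . x + offset <= 0."""
    normals, offsets = hull.equations[:, :-1], hull.equations[:, -1]
    return np.all(pts @ normals.T + offsets <= tol, axis=1)

def hull_overlap_ratio(points_a, points_b, n_samples=20000, seed=0):
    """Monte Carlo estimate of volume(A and B) / volume(A or B) for two convex hulls."""
    a = np.asarray(points_a, dtype=float)
    b = np.asarray(points_b, dtype=float)
    hull_a, hull_b = ConvexHull(a), ConvexHull(b)
    both = np.vstack([a, b])
    lo, hi = both.min(axis=0), both.max(axis=0)  # bounding box enclosing both hulls
    rng = np.random.default_rng(seed)
    samples = rng.uniform(lo, hi, size=(n_samples, a.shape[1]))
    in_a, in_b = inside(hull_a, samples), inside(hull_b, samples)
    union = np.count_nonzero(in_a | in_b)
    return np.count_nonzero(in_a & in_b) / union if union else 0.0

# Hypothetical check: two unit cubes offset by half an edge overlap in 0.5 of 1.5 units of volume.
cube = np.array([[x, y, z] for x in (0.0, 1.0) for y in (0.0, 1.0) for z in (0.0, 1.0)])
ratio = hull_overlap_ratio(cube, cube + np.array([0.5, 0.0, 0.0]))
```

A ratio of 1 indicates identical hulls and a ratio of 0 indicates fully segregated soma locations. The estimate for the shifted cubes should be close to 1/3, with an error that shrinks as `n_samples` grows.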
Analysis of divergence and convergence
Utilizing the original JSON data files, for every neuron in each presubiculum class, we measured the path distance from the soma to each axonal point in the target region. We then calculated the median path distance to each target region across all neurons in the class, and performed a Wilcoxon rank-sum test50, using the MATLAB function ranksum, to assess whether the path distances to each characteristic target of a particular class were significantly different. Using the same data files, we also performed a Wilcoxon rank-sum test to assess whether the path distances to each characteristic target differed significantly between clusters. In both sets of comparisons, the resultant p-values were corrected for multiple testing by controlling the False Discovery Rate.
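A Python sketch of the comparison and correction steps follows. SciPy's `mannwhitneyu` is used as the counterpart of MATLAB's ranksum, which performs a rank-sum comparison of two independent samples. The specific False Discovery Rate procedure is not named in the text, so the Benjamini-Hochberg step-up rule is an assumption, as are the function names.

```python
import numpy as np
from scipy.stats import mannwhitneyu

def benjamini_hochberg(p_values, alpha=0.05):
    """Boolean mask of p-values declared significant at false discovery rate alpha."""
    p = np.asarray(p_values, dtype=float)
    n = len(p)
    order = np.argsort(p)
    # The i-th smallest p-value is compared with alpha * i / n.
    below = p[order] <= alpha * np.arange(1, n + 1) / n
    significant = np.zeros(n, dtype=bool)
    if below.any():
        cutoff = np.max(np.nonzero(below)[0])  # largest rank that passes its threshold
        significant[order[:cutoff + 1]] = True # step-up: all smaller p-values pass too
    return significant

def compare_path_distances(pairs, alpha=0.05):
    """Rank-sum test for each (sample_a, sample_b) pair, followed by FDR correction."""
    p_values = [mannwhitneyu(a, b, alternative="two-sided").pvalue for a, b in pairs]
    return p_values, benjamini_hochberg(p_values, alpha)
```

Each pair would hold the path distances to two targets within one class, or to the same target in two classes, depending on which of the two sets of comparisons is being run.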
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Code availability
All code is available in the GitHub repository at https://github.com/Projectomics.
References
Fishell, G. & Kepecs, A. Interneuron types as attractors and controllers. Annu. Rev. Neurosci. 43, 1–30 (2020).
Jiang, X. et al. Principles of connectivity among morphologically defined cell types in adult neocortex. Science 350, aac9462 (2015).
DeFelipe, J. et al. New insights into the classification and nomenclature of cortical GABAergic interneurons. Nat. Rev. Neurosci. 14, 202–216 (2013).
Sherman, S. M. & Guillery, R. W. Distinct functions for direct and transthalamic corticocortical connections. J. Neurophysiol. 106, 1068–1077 (2011).
Winnubst, J., Spruston, N. & Harris, J. A. Linking axon morphology to gene expression: a strategy for neuronal cell-type classification. Curr. Opin. Neurobiol. 65, 70–76 (2020).
Ascoli, G. A. Trees of the brain, roots of the mind. (The MIT Press, 2015).
François, C., Tande, D., Yelnik, J. & Hirsch, E. C. Distribution and morphology of nigral axons projecting to the thalamus in primates. J. Comp. Neurol. 447, 249–260 (2002).
Rochefort, N. L. et al. Functional selectivity of interhemispheric connections in cat visual cortex. Cereb. Cortex 19, 2451–2465 (2009).
Rojas-Piloni, G. et al. Relationships between structure, in vivo function and long-range axonal target of cortical pyramidal tract neurons. Nat. Commun. 8, 870 (2017).
Ropireddy, D., Scorcioni, R., Lasher, B., Buzsáki, G. & Ascoli, G. A. Axonal morphometry of hippocampal pyramidal neurons semi-automatically reconstructed after in vivo labeling in different CA3 locations. Brain Struct. Funct. 216, 1–15 (2011).
DeFelipe, J. From the connectome to the synaptome: an epic love story. Science 330, 1198–1201 (2010).
Helmstaedter, M. & Mitra, P. P. Computational methods and challenges for large-scale circuit mapping. Curr. Opin. Neurobiol. 22, 162–169 (2012).
Gao, L. et al. Single-neuron projectome of mouse prefrontal cortex. Nat. Neurosci. 25, 515–529 (2022).
Gong, H. et al. High-throughput dual-colour precision imaging for brain-wide connectome with cytoarchitectonic landmarks at the cellular level. Nat. Commun. 7, 12142 (2016).
Peng, H. et al. Morphological diversity of single neurons in molecularly defined cell types. Nature 598, 174–181 (2021).
Economo, M. N. et al. A platform for brain-wide imaging and reconstruction of individual neurons. Elife 5, e10566 (2016).
Wang, Q. et al. The Allen mouse brain common coordinate framework: a 3D reference atlas. Cell 181, 936–953.e20 (2020).
Chon, U., Vanselow, D. J., Cheng, K. C. & Kim, Y. Enhanced and unified anatomical labeling for a common mouse brain atlas. Nat. Commun. 10, 5067 (2019).
Reimann, M. W. et al. A null model of the mouse whole-neocortex micro-connectome. Nat. Commun. 10, 3903 (2019).
Armañanzas, R. & Ascoli, G. A. Towards the automatic classification of neurons. Trends Neurosci. 38, 307–318 (2015).
Winnubst, J. et al. Reconstruction of 1000 projection neurons reveals new cell types and organization of long-range connectivity in the mouse brain. Cell 179, 268–281.e13 (2019).
Muñoz-Castañeda, R. et al. Cellular anatomy of the mouse primary motor cortex. Nature 598, 159–166 (2021).
Shepherd, G. M. G. Corticostriatal connectivity and its role in disease. Nat. Rev. Neurosci. 14, 278–291 (2013).
Angelaki, D. E. & Laurens, J. The head direction cell network: attractor dynamics, integration within the navigation system, and three-dimensional properties. Curr. Opin. Neurobiol. 60, 136–144 (2020).
Jacobs, H. I. L. et al. The presubiculum links incipient amyloid and tau pathology to memory function in older persons. Neurology 94, e1916–e1928 (2020).
Preston-Ferrer, P., Coletta, S., Frey, M. & Burgalossi, A. Anatomical organization of presubicular head-direction circuits. Elife 5, e14592 (2016).
Vantomme, G. et al. A thalamic reticular circuit for head direction cell tuning and spatial navigation. Cell Rep. 31, 107747 (2020).
Honda, Y. & Furuta, T. Multiple patterns of axonal collateralization of single layer III neurons of the rat presubiculum. Front. Neural Circuits 13, 45 (2019).
Knierim, J. J., Neunuebel, J. P. & Deshmukh, S. S. Functional correlates of the lateral and medial entorhinal cortex: objects, path integration and local-global reference frames. Philos. Trans. R. Soc. Lond. B Biol. Sci. 369, 20130369 (2014).
Boccara, C. N. et al. Grid cells in pre- and parasubiculum. Nat. Neurosci. 13, 987–994 (2010).
Sargolini, F. et al. Conjunctive representation of position, direction, and velocity in entorhinal cortex. Science 312, 758–762 (2006).
Taube, J. S. The head direction signal: origins and sensory-motor integration. Annu. Rev. Neurosci. 30, 181–207 (2007).
Balsamo, G. et al. Modular microcircuit organization of the presubicular head-direction map. Cell Rep. 39, 110684 (2022).
Wang, C., Chen, X. & Knierim, J. J. Egocentric and allocentric representations of space in the rodent brain. Curr. Opin. Neurobiol. 60, 12–20 (2020).
Kebschull, J. M. & Zador, A. M. Cellular barcoding: lineage tracing, screening and beyond. Nat. Methods 15, 871–879 (2018).
Sun, Q. et al. A whole-brain map of long-range inputs to GABAergic interneurons in the mouse medial prefrontal cortex. Nat. Neurosci. 22, 1357–1370 (2019).
David, K. K., Fang, H. Y., Peng, G. C. Y. & Gnadt, J. W. NIH BRAIN circuits programs: an experiment in supporting team neuroscience. Neuron 108, 1020–1024 (2020).
Ecker, J. R. et al. The BRAIN initiative cell census consortium: lessons learned toward generating a comprehensive brain cell atlas. Neuron 96, 542–557 (2017).
Hsu, N. S. et al. The promise of the BRAIN initiative: NIH strategies for understanding neural circuit function. Curr. Opin. Neurobiol. 65, 162–166 (2020).
Litvina, E. et al. BRAIN initiative: cutting-edge tools and resources for the community. J. Neurosci. 39, 8275–8284 (2019).
Nanda, S. et al. Design and implementation of multi-signal and time-varying neural reconstructions. Sci. Data 5, 170207 (2018).
Mehta, K. et al. Online conversion of reconstructed neural morphologies into standardized SWC format. Nat. Commun. 14, 7429 (2023).
Anderson, J. C., Binzegger, T., Douglas, R. J. & Martin, K. A. C. Chance or design? Some specific considerations concerning synaptic boutons in cat visual cortex. J. Neurocytol. 31, 211–229 (2002).
Brown, K. M., Sugihara, I., Shinoda, Y. & Ascoli, G. A. Digital morphometry of rat cerebellar climbing fibers reveals distinct branch and bouton types. J. Neurosci. 32, 14670–14684 (2012).
Budd, J. M. L. et al. Neocortical axon arbors trade-off material and conduction delay conservation. PLoS Comput. Biol. 6, e1000711 (2010).
Qian, P., Manubens-Gil, L., Jiang, S. & Peng, H. Non-homogenous axonal bouton distribution in whole-brain single cell neuronal networks. Preprint at bioRxiv https://doi.org/10.1101/2023.08.07.552361 (2023).
Levene, H. Robust Tests for Equality of Variances. In Contributions to Probability and Statistics: Essays in Honor of Harold Hotelling 278–292 (Stanford University Press, 1960).
Lawson, C. L. & Hanson, R. J. Solving Least Squares Problems. (Society for Industrial and Applied Mathematics, 1995).
Lin, C.-J. Projected gradient methods for nonnegative matrix factorization. Neural Comput. 19, 2756–2779 (2007).
Wilcoxon, F. Individual comparisons by ranking methods. Biom. Bull. 1, 80–83 (1945).
Acknowledgements
We thank Dr. Rodrigo Muñoz-Castañeda for help with validating the mapping of neuronal reconstructions to anatomical coordinates. This work was supported in part by NIH grants R01NS39600, U01MH114829, and RF1MH128693, all to G.A.A.
Author information
Contributions
D.W.W., S.B., S.S., and S.V. contributed to the analysis and interpretation of data, to the writing of software, and to the writing of the manuscript. G.A.A. contributed to the conception of the project, to the analysis and interpretation of data, and to the writing of the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks Charles Gerfen and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Wheeler, D.W., Banduri, S., Sankararaman, S. et al. Unsupervised classification of brain-wide axons reveals the presubiculum neuronal projection blueprint. Nat Commun 15, 1555 (2024). https://doi.org/10.1038/s41467-024-45741-x
DOI: https://doi.org/10.1038/s41467-024-45741-x