A method for repeated neural circuit identification in noisy brain graph data
As larger neural circuit datasets become broadly available to the research community, researchers look to the brain to understand why humans perform certain tasks robustly and efficiently and to uncover the circuitry underlying some neurological disorders. In particular, the discovery of repeated structure in large, newly collected brain image volumes would support the conjecture that the brain is modularly organized. At the same time, information extracted from brain imaging is inherently noisy, both because errors arise at every stage of the reconstruction process and because humans cannot proofread or ground-truth the vast amount of data available. Robust methods for analyzing brain data could lead to the discovery of repeated brain structure even in the presence of such errors. We define a probabilistic approach for identifying significant subgraph structures within imperfect graph data, allowing us to capture uncertainty in the discovery process and perform inference over noisy data. Our approach uses graph data in which edges are not binary but instead carry a confidence level, or weight, such as one might obtain from a computer vision algorithm. While current methods often threshold the edges based on their weights, we instead use the edge weights to define a random graph model similar to an Erdős-Rényi model, but in which each edge has its own probability determined by its weight, thus creating a data-driven probabilistic graph. The intuition is that the true underlying graph, and small variations of it, would occur with high probability in this model. Once the random graph model is defined, we use standard probabilistic graph techniques and sampling to determine the distribution of a subgraph's occurrence in this data-driven model. This distribution may then be compared, using the Kolmogorov-Smirnov test, with the corresponding distribution from a standard Erdős-Rényi model of similar expected density.
In other words, we compare the existence of a subgraph in our data-driven random graph model with the existence of that subgraph in a purely random model. Thus, we work towards identifying structural motifs in the presence of unknown reconstruction errors. As an initial test of this approach, we apply our methods to a handful of small subgraphs and compare the results with those obtained from a thresholded graph.
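The comparison described above can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the edge weights are used directly as independent edge probabilities, the subgraph of interest is a triangle, the statistic is the triangle count per sampled graph, and the function names, toy graph, and sample size are hypothetical. The Kolmogorov-Smirnov comparison uses SciPy's `ks_2samp`, which is approximate for discrete counts with ties.

```python
import numpy as np
from scipy.stats import ks_2samp


def sample_adjacency(prob, rng):
    """Draw one undirected graph; edge (i, j) is present with probability prob[i, j]."""
    n = prob.shape[0]
    # Independent Bernoulli draws, keeping only entries above the diagonal.
    upper = np.triu(rng.random((n, n)) < prob, k=1)
    return (upper | upper.T).astype(np.int64)


def count_triangles(adj):
    """Triangle count of an undirected simple graph: trace(A^3) / 6."""
    return int(np.trace(adj @ adj @ adj) // 6)


def motif_count_samples(prob, n_samples, rng):
    """Triangle counts over n_samples graphs drawn from the model defined by prob."""
    return np.array(
        [count_triangles(sample_adjacency(prob, rng)) for _ in range(n_samples)]
    )


def compare_to_erdos_renyi(weights, n_samples=200, seed=0):
    """Compare triangle counts under the weighted model and a density-matched ER model."""
    rng = np.random.default_rng(seed)
    n = weights.shape[0]
    # Assumption: "similar expected density" is taken as the mean edge weight.
    density = weights[np.triu_indices(n, k=1)].mean()
    er_prob = np.full((n, n), density)
    data_counts = motif_count_samples(weights, n_samples, rng)
    er_counts = motif_count_samples(er_prob, n_samples, rng)
    result = ks_2samp(data_counts, er_counts)
    return data_counts, er_counts, result.statistic, result.pvalue


# Hypothetical toy input: 12 nodes, low-confidence edges everywhere,
# plus four nodes whose mutual edges have high confidence.
weights = np.full((12, 12), 0.1)
weights[:4, :4] = 0.9
np.fill_diagonal(weights, 0.0)

data_counts, er_counts, statistic, pvalue = compare_to_erdos_renyi(weights)
print(f"mean triangles (data-driven): {data_counts.mean():.2f}")
print(f"mean triangles (Erdős-Rényi): {er_counts.mean():.2f}")
print(f"KS statistic: {statistic:.3f}, p-value: {pvalue:.3g}")
```

A small p-value here would indicate only that the triangle-count distributions of the two models differ. Other subgraphs would need their own counting routine in place of `count_triangles`.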