







Skin lesions can be an early indicator of a wide range of infectious and other diseases. The use of deep learning (DL) models to diagnose skin lesions has great potential in assisting clinicians with prescreening patients. However, these models often learn biases inherent in training data, which can lead to a performance gap in the diagnosis of people with light and/or dark skin tones. To the best of our knowledge, limited work has been done on identifying, let alone reducing, model bias in skin disease classification and segmentation. In this paper, we examine DL fairness and demonstrate the existence of bias in classification and segmentation models for subpopulations with darker skin tones compared to individuals with lighter skin tones, for specific diseases including Lyme, Tinea Corporis, and Herpes Zoster. Then, we propose a novel preprocessing data-alteration method, called EDGEMIXUP, that improves model fairness by forming a linear combination of an input skin lesion image and a corresponding predicted edge detection mask, combined with color saturation alteration. For the task of skin disease classification, EDGEMIXUP outperforms much more complex competing methods such as adversarial approaches, achieving a 10.99% reduction in the accuracy gap between light and dark skin tone samples and resulting in 8.4% improved performance for an underrepresented subpopulation.
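
The core data-alteration step described above can be sketched in a few lines: a convex combination of a skin-lesion image with a predicted edge-detection mask. The blending weight `lam`, the toy 2x2 grayscale pixel values, and the omission of the color-saturation step are illustrative assumptions, not the paper's actual configuration.

```python
# Minimal sketch of the EDGEMIXUP blending idea: per-pixel linear
# combination of an input image with a predicted lesion-edge mask.

def edge_mixup(image, edge_mask, lam=0.7):
    """Blend each pixel: lam * image + (1 - lam) * edge mask."""
    return [
        [lam * px + (1.0 - lam) * mx for px, mx in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, edge_mask)
    ]

image = [[0.2, 0.8], [0.5, 0.1]]      # toy grayscale lesion image
edge_mask = [[0.0, 1.0], [1.0, 0.0]]  # toy predicted edge-detection mask

mixed = edge_mixup(image, edge_mask, lam=0.7)
```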

Frontiers in Neuroinformatics

Neuroscientists can leverage technological advances to image neural tissue across a range of different scales, potentially forming the basis for the next generation of brain atlases and circuit reconstructions at submicron resolution, using Electron Microscopy and X-ray Microtomography modalities. However, there is variability in data collection, annotation, and storage approaches, which limits effective comparative and secondary analysis. There has been great progress in standardizing interfaces for large-scale spatial image data, but more work is needed to standardize annotations, especially metadata associated with neuroanatomical entities. Standardization will enable validation, sharing, and replication, greatly amplifying investment throughout the connectomics community. We share key design considerations and a metadata use case developed for a recent large-scale dataset.

ICML 2022

We propose an approach to solving partial differential equations (PDEs) using a set of neural networks which we call Neural Basis Functions (NBF). This NBF framework is a novel variation of the POD DeepONet operator learning approach, where we regress a set of neural networks onto a reduced order Proper Orthogonal Decomposition (POD) basis. These networks are then used in combination with a branch network that ingests the parameters of the prescribed PDE to compute a reduced order approximation to the PDE. This approach is applied to the steady state Euler equations for high speed flow conditions (Mach 10-30), where we consider the 2D flow around a cylinder which develops a shock condition. We then use the NBF predictions as initial conditions to a high fidelity Computational Fluid Dynamics (CFD) solver (CFD++) to show faster convergence. We also present lessons learned for training and implementing this algorithm.

ICML 2022

Properties of interest for crystals and molecules, such as band gap, elasticity, and solubility, are generally related to each other: they are governed by the same underlying laws of physics. However, when state-of-the-art graph neural networks attempt to predict multiple properties simultaneously (the multi-task learning (MTL) setting), they frequently underperform a suite of single property predictors. This suggests graph networks may not be fully leveraging these underlying similarities. Here we investigate a potential explanation for this phenomenon – the curvature of each property’s loss surface significantly varies, leading to inefficient learning. This difference in curvature can be assessed by looking at spectral properties of the Hessians of each property’s loss function, which is done in a matrix-free manner via randomized numerical linear algebra. We evaluate our hypothesis on two benchmark datasets (Materials Project (MP) and QM8) and consider how these findings can inform the training of novel multi-task learning models.
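
The matrix-free spirit of the curvature analysis described above can be illustrated with Hutchinson's randomized trace estimator, which touches the Hessian only through Hessian-vector products and never materializes the matrix. The tiny quadratic loss and its hand-coded Hessian-vector product are stand-ins for what autodiff would supply in practice; this is a sketch, not the paper's pipeline.

```python
import random

def hvp(v):
    # Hessian-vector product for the quadratic loss 0.5*(3*x^2 + y^2),
    # whose Hessian is diag(3, 1); in practice this comes from autodiff.
    return [3.0 * v[0], 1.0 * v[1]]

def hutchinson_trace(hvp_fn, dim, num_samples=1000, seed=0):
    """Estimate tr(H) = E[z^T H z] with Rademacher probe vectors z."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        z = [rng.choice((-1.0, 1.0)) for _ in range(dim)]  # Rademacher probe
        hz = hvp_fn(z)
        total += sum(zi * hzi for zi, hzi in zip(z, hz))   # z^T H z
    return total / num_samples

# True trace is 3 + 1 = 4; for a diagonal Hessian every probe is exact,
# while in general the estimate concentrates as samples grow.
trace_estimate = hutchinson_trace(hvp, dim=2)
```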

ICML 2022

One of the most fundamental design choices in neural networks is layer width: it affects the capacity of what a network can learn and determines the complexity of the solution. This latter property is often exploited when introducing information bottlenecks, forcing a network to learn compressed representations. However, such an architecture decision is typically immutable once training begins; switching to a more compressed architecture requires retraining. In this paper we present a new layer design, called Triangular Dropout, which does not have this limitation. After training, the layer can be arbitrarily reduced in width to exchange performance for narrowness. We demonstrate the construction and potential use cases of such a mechanism in three areas. Firstly, we describe the formulation of Triangular Dropout in autoencoders, creating an MNIST autoencoder with selectable compression after training. Secondly, we add Triangular Dropout to VGG19 on ImageNet, creating a powerful network which, without retraining, can be significantly reduced in parameters with only small changes to classification accuracy. Lastly, we explore the application of Triangular Dropout to reinforcement learning (RL) policies on selected control problems, showing that it can be used to characterize the complexity of RL tasks, a critical measurement in multitask learning and lifelong-learning domains.
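
A minimal sketch of the mechanism described above, under the assumption that training samples a random prefix width and zeroes the remaining units so that earlier units carry the coarsest information; the sampling distribution and any rescaling are assumptions, not the paper's exact formulation.

```python
import random

def triangular_dropout(activations, rng):
    """Keep a random prefix of units; zero everything after index k."""
    k = rng.randint(1, len(activations))
    return [a if i < k else 0.0 for i, a in enumerate(activations)], k

def truncate(activations, width):
    # After training, the layer can simply be narrowed to `width` units,
    # trading performance for narrowness without retraining.
    return activations[:width] + [0.0] * (len(activations) - width)

rng = random.Random(0)
acts = [0.5, -1.2, 0.3, 0.9]
dropped, k = triangular_dropout(acts, rng)
narrow = truncate(acts, 2)
```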

Frontiers in Neuroinformatics

Technological advances in imaging and data acquisition are leading to the development of petabyte-scale neuroscience image datasets. These large-scale volumetric datasets pose unique challenges since analyses often span the entire volume, requiring a unified platform to access it. In this paper, we describe the Brain Observatory Storage Service and Database (BossDB), a cloud-based solution for storing and accessing petascale image datasets. BossDB provides support for data ingest, storage, visualization, and sharing through a RESTful Application Programming Interface (API). A key feature is the scalable indexing of spatial data and automatic and manual annotations to facilitate data discovery. Our project is open source and can be easily and cost effectively used for a variety of modalities and applications, and has effectively worked with datasets over a petabyte in size.

ICRA 2022

Recent research has enabled fixed-wing unmanned aerial vehicles (UAVs) to maneuver in constrained spaces through the use of direct nonlinear model predictive control (NMPC). However, this approach has been limited to a priori known maps and ground truth state measurements. In this paper, we present a direct NMPC approach that leverages NanoMap, a light-weight point-cloud mapping framework to generate collision-free trajectories using onboard stereo vision. We first explore our approach in simulation and demonstrate that our algorithm is sufficient to enable vision-based navigation in urban environments. We then demonstrate our approach in hardware using a 42-inch fixed-wing UAV and show that our motion planning algorithm is capable of navigating around a building using a minimalistic set of goal-points. We also show that storing a point-cloud history is important for navigating these types of constrained environments.


Network science is a powerful tool that can be used to better explore the complex structure of brain networks. Leveraging graph and motif analysis tools, we interrogate C. elegans connectomes across multiple developmental time points and compare the resulting graph characteristics and substructures over time. We show the evolution of the networks and highlight stable invariants and patterns as well as those that grow or decay unexpectedly, providing a substrate for additional analysis.

ICRA 2022

Robot navigation traditionally relies on building an explicit map that is used to plan collision-free trajectories to a desired target. In deformable, complex terrain, using geometric-based approaches can fail to find a path due to mischaracterizing deformable objects as rigid and impassable. Instead, we learn to predict an estimate of traversability of terrain regions and to prefer regions that are easier to navigate (e.g., short grass over small shrubs). Rather than predicting collisions, we instead regress on realized error compared to a canonical dynamics model. We train with an on-policy approach, resulting in successful navigation policies using as little as 50 minutes of training data split across simulation and real world. Our learning-based navigation system is a sample efficient short-term planner that we demonstrate on a Clearpath Husky navigating through a variety of terrain including grassland and forest.
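
The supervision signal described above can be sketched as follows: rather than a collision label, the regression target is the realized deviation between what a canonical (nominal) dynamics model predicted and where the robot actually ended up. The unicycle-style model and the slip values are assumptions for illustration only.

```python
import math

def canonical_step(x, y, heading, v, dt=0.1):
    """Nominal rigid-ground dynamics: one straight-line rollout step."""
    return (x + v * math.cos(heading) * dt,
            y + v * math.sin(heading) * dt)

def traversability_cost(predicted, realized):
    # Regression target: Euclidean deviation of the realized pose from
    # the canonical model's prediction; more slip means higher cost.
    return ((predicted[0] - realized[0]) ** 2
            + (predicted[1] - realized[1]) ** 2) ** 0.5

pred = canonical_step(0.0, 0.0, 0.0, v=1.0)        # nominal pose (0.1, 0.0)
grass = traversability_cost(pred, (0.098, 0.001))  # mild slip: low cost
shrub = traversability_cost(pred, (0.040, 0.020))  # heavy slip: high cost
```

A planner would then prefer regions with low predicted cost, e.g. short grass over small shrubs.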


Wilson loop diagrams are an important tool in studying scattering amplitudes of N = 4 SYM theory and are known by previous work to be associated to positroids. We characterize the conditions under which two Wilson loop diagrams give the same positroid, prove that an important subclass of subdiagrams (exact subdiagrams) corresponds to uniform matroids, and enumerate the number of different Wilson loop diagrams that correspond to each positroid cell. We also give a correspondence between those positroids which can arise from Wilson loop diagrams and directions in associahedra.

2021 43rd Annual International Conference of the IEEE Engineering in Medicine Biology Society (EMBC)

As neuroimagery datasets continue to grow in size, the complexity of data analyses can require a detailed understanding and implementation of systems computer science for storage, access, processing, and sharing. Currently, several general data standards (e.g., Zarr, HDF5, precomputed) and purpose-built ecosystems (e.g., BossDB, CloudVolume, DVID, and Knossos) exist. Each of these systems has advantages and limitations and is most appropriate for different use cases. Using datasets that don’t fit into RAM in this heterogeneous environment is challenging, and significant barriers exist to leveraging underlying research investments. In this manuscript, we outline our perspective for how to approach this challenge through the use of community-provided, standardized interfaces that unify various computational backends and abstract computer science challenges away from the scientist. We introduce desirable design patterns and share our reference implementation called intern.

Nature Scientific Reports

Brock A. Wester; Jennifer A. Stiso; Elizabeth P. Reilly; Jordan K. Matelsky; Erik C. Johnson; William R. Gray Roncal

Recent advances in neuroscience have enabled the exploration of brain structure at the level of individual synaptic connections. These connectomics datasets continue to grow in size and complexity; methods to search for and identify interesting graph patterns offer a promising approach to quickly reduce data dimensionality and enable discovery. These graphs are often too large to be analyzed manually, presenting significant barriers to searching for structure and testing hypotheses. We combine graph database and analysis libraries with an easy-to-use neuroscience grammar suitable for rapidly constructing queries and searching for subgraphs and patterns of interest. Our approach abstracts many of the computer science and graph theory challenges associated with nanoscale brain network analysis and allows scientists to quickly conduct research at scale. We demonstrate the utility of these tools by searching for motifs on simulated data and real public connectomics datasets, and we share simple and complex structures relevant to the neuroscience community. We contextualize our findings and provide case studies and software to motivate future neuroscience exploration.
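
The motif-search idea described above can be illustrated by counting one simple motif, the feed-forward triangle, in a directed graph given as an edge set. Real connectome queries use a richer grammar over a graph database; the node names and edges here are purely illustrative.

```python
def feedforward_triangles(edges):
    """Count node triples (a, b, c) with edges a->b, b->c, and a->c."""
    edge_set = set(edges)
    nodes = {n for e in edges for n in e}
    return sum(
        1
        for a in nodes for b in nodes for c in nodes
        if (a, b) in edge_set and (b, c) in edge_set and (a, c) in edge_set
    )

# Toy directed "connectome": n1 -> n2 -> n3 closes a triangle via n1 -> n3.
edges = [("n1", "n2"), ("n2", "n3"), ("n1", "n3"), ("n3", "n4")]
count = feedforward_triangles(edges)
```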

International Conference on Learning Representations (ICLR)

Kiran Karra; Clayton C. Ashcraft

In this paper, we propose a new data poisoning attack and apply it to deep reinforcement learning agents. Our attack centers on what we call in-distribution triggers, which are triggers native to the data distributions the model will be trained on and deployed in. We outline a simple procedure for embedding these, and other, triggers in deep reinforcement learning agents following a multi-task learning paradigm, and demonstrate in three common reinforcement learning environments. We believe that this work has important implications for the security of deep learning models.

AAAI Workshop on Artificial Intelligence and Safety (SafeAI)

I-Jeng Wang; Jared J. Markowitz; Marie Chau

Current deep reinforcement learning (DRL) methods fail to address risk in an intelligent manner, potentially leading to unsafe behaviors when deployed. One strategy for improving agent risk management is to mimic human behavior. While imperfect, human risk processing displays two key benefits absent from standard artificial agents: accounting for rare but consequential events and incorporating context. The former ability may prevent catastrophic outcomes in unfamiliar settings while the latter results in asymmetric processing of potential gains and losses. These two attributes have been quantified by behavioral economists and form the basis of cumulative prospect theory (CPT), a leading model of human decision-making. We introduce a two-step method for training DRL agents to maximize the CPT-value of full episode rewards accumulated from an environment, rather than the standard practice of maximizing expected discounted rewards. We quantitatively compare the distribution of outcomes when optimizing full-episode expected reward, CPT-value, and conditional value-at-risk (CVaR) in the CrowdSim robot navigation environment, elucidating the impacts of different objectives on the agent’s willingness to trade safety for speed. We find that properly-configured maximization of CPT-value allows for a reduction of the frequency of negative outcomes with only a slight degradation of the best outcomes, compared to maximization of expected reward.
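
The two CPT ingredients mentioned above can be sketched directly: an S-shaped value function that treats losses asymmetrically from gains, and an inverse-S probability weighting function that overweights rare events. The parameter values follow Tversky and Kahneman's commonly cited estimates and need not match the paper's configuration.

```python
def cpt_value(x, alpha=0.88, lam=2.25):
    """Prospect-theory value: gains compressed, losses amplified by lam."""
    if x >= 0:
        return x ** alpha
    return -lam * ((-x) ** alpha)

def prob_weight(p, gamma=0.61):
    """Inverse-S weighting: small probabilities are overweighted."""
    return p ** gamma / ((p ** gamma + (1 - p) ** gamma) ** (1 / gamma))

gain = cpt_value(10.0)    # muted relative to the raw reward of 10
loss = cpt_value(-10.0)   # amplified: losses loom larger than gains
w = prob_weight(0.01)     # a 1% event is weighted well above 0.01
```

Maximizing the CPT-value of full-episode returns, rather than expected discounted reward, is what lets the agent account for rare but consequential outcomes.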

Neural Computation

I-Jeng Wang; William A. Paul; Philippe M. Burlina

This work focuses on the ability to control, via latent space factors, semantic image attributes in generative models, and the faculty to discover mappings from factors to attributes in an unsupervised fashion. The discovery of controllable semantic attributes is of special importance, as it would facilitate higher level tasks such as unsupervised representation learning to improve anomaly detection, or the controlled generation of novel data for domain shift and imbalanced datasets. The ability to control semantic attributes is related to the disentanglement of latent factors, which dictates that latent factors be "uncorrelated" in their effects. Unfortunately, despite past progress, the connection between control and disentanglement remains, at best, confused and entangled, requiring clarifications we hope to provide in this work. To this end, we study the design of algorithms for image generation that allow unsupervised discovery and control of semantic attributes. We make several contributions: a) We bring order to the concepts of control and disentanglement by providing an analytical derivation that connects mutual information maximization, which promotes attribute control, to total correlation minimization, which relates to disentanglement. b) We propose hybrid generative model architectures that use mutual information maximization with multi-scale style transfer. c) We introduce a novel metric to characterize the performance of semantic attribute control. We report experiments that appear to demonstrate, quantitatively and qualitatively, the ability of the proposed model to perform satisfactory control while still preserving competitive visual quality. We compare with other state-of-the-art methods (e.g., achieving a Fréchet inception distance (FID) of 9.90 on CelebA and 4.52 on EyePACS).

Conference on Computer Vision and Pattern Recognition (CVPR)

Anshu Saksena; Marisa J. Hughes; Sally A. Matson; Ryan N. Mukherjee; Derek M. Rollend; Armin Hadzic; Gordon A. Christie

Road transportation is one of the largest sectors of greenhouse gas (GHG) emissions affecting climate change. Tackling climate change as a global community will require new capabilities to measure and inventory road transport emissions. However, the large scale and distributed nature of vehicle emissions make this sector especially challenging for existing inventory methods. In this work, we develop machine learning models that use satellite imagery to perform indirect top-down estimation of road transport emissions. Our initial experiments focus on the United States, where a bottom-up inventory was available for training our models. We achieved a mean absolute error (MAE) of 39.5 kg CO2 of annual road transport emissions, calculated on a pixel-by-pixel (100 m2) basis in Sentinel-2 imagery. We also discuss key model assumptions and challenges that need to be addressed to develop models capable of generalizing to global geography. We believe this work is the first published approach for automated indirect top-down estimation of road transport sector emissions using visual imagery and represents a critical step towards scalable, global, near real-time road transportation emissions inventories that are measured both independently and objectively.


Miller L. Wilt; Brock A. Wester; Jordan K. Matelsky; William R. Gray Roncal; Joseph T. Downs; Caitlyn A. Bishop

The nanoscale connectomics community has recently generated automated and semi-automated “wiring diagrams” of brain subregions from terabytes and petabytes of dense 3D neuroimagery. This process involves many challenging and imperfect technical steps, including dense 3D image segmentation, anisotropic nonrigid image alignment and coregistration, and pixel classification of each neuron and their individual synaptic connections. As data volumes continue to grow in size, and connectome generation becomes increasingly commonplace, it is important that the scientific community is able to rapidly assess the quality and accuracy of a connectome product to promote dataset analysis and reuse. In this work, we share our scalable toolkit for assessing the quality of a connectome reconstruction via targeted inquiry and large-scale graph analysis, and for providing insights into how such connectome proofreading processes may be improved and optimized in the future. We illustrate the applications and ecosystem on a recent reference dataset.

Conference on Computer Vision and Pattern Recognition (CVPR)

Onyekachi O. Odoemene; Neil M. Fendley; Nathan G. Drenkow; Philippe M. Burlina

Searching for small objects in large images is a task that is both challenging for current deep learning systems and important in numerous real-world applications, such as remote sensing and medical imaging. Thorough scanning of very large images is computationally expensive, particularly at resolutions sufficient to capture small objects. The smaller an object of interest, the more likely it is to be obscured by clutter or otherwise deemed insignificant. We examine these issues in the context of two complementary problems: closed-set object detection and open-set target search. First, we present a method for predicting pixel-level objectness from a low resolution gist image, which we then use to select regions for performing object detection locally at high resolution. This approach has the benefit of not being fixed to a predetermined grid, thereby requiring fewer costly high-resolution glimpses than existing methods. Second, we propose a novel strategy for open-set visual search that seeks to find all instances of a target class which may be previously unseen and is defined by a single image. We interpret both detection problems through a probabilistic, Bayesian lens, whereby the objectness maps produced by our method serve as priors in a maximum-a-posteriori approach to the detection step. We evaluate the end-to-end performance of both the combination of our patch selection strategy with this target search approach and the combination of our patch selection strategy with standard object detection methods. Both elements of our approach are seen to significantly outperform baseline strategies.
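
The Bayesian reading described above can be sketched as follows: per-region objectness from the low-resolution gist image serves as a prior, combined with a detector's likelihood score into an unnormalized log-posterior used to rank candidate regions. The region names and all scores are illustrative assumptions.

```python
import math

def log_posterior(detector_score, objectness_prior):
    # MAP over regions: argmax log p(object | region)
    #                 = argmax [log likelihood + log prior].
    return math.log(detector_score) + math.log(objectness_prior)

candidates = {
    "region_a": (0.60, 0.90),  # (detector likelihood, objectness prior)
    "region_b": (0.70, 0.05),  # stronger detection, but implausible location
}
best = max(candidates, key=lambda r: log_posterior(*candidates[r]))
```

The prior lets the system discount confident detections in regions the gist image deems unlikely to contain objects, so fewer costly high-resolution glimpses are needed.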


I-Jeng Wang; Edward W. Staley; Corban G. Rivera; Ji H. Pak; Olivia J. Lyons; Ashley J. Llorens; Aryeh L. Englander; Robert W. Chalmers

The ability to create artificial intelligence (AI) capable of performing complex tasks is rapidly outpacing our ability to ensure the safe and assured operation of AI-enabled systems. Fortunately, a landscape of AI safety research is emerging in response to this asymmetry and yet there is a long way to go. In particular, recent simulation environments created to illustrate AI safety risks are relatively simple or narrowly-focused on a particular issue. Hence, we see a critical need for AI safety research environments that abstract essential aspects of complex real-world applications. In this work, we introduce the AI safety TanksWorld as an environment for AI safety research with three essential aspects: competing performance objectives, human-machine teaming, and multi-agent competition. The AI safety TanksWorld aims to accelerate the advancement of safe multi-agent decision-making algorithms by providing a software framework to support competitions with both system performance and safety objectives. As a work in progress, this paper introduces our research objectives and learning environment with reference code and baseline performance metrics to follow in a future work.

2020 IEEE Physical Assurance and Inspection of Electronics (PAINE)

Miller L. Wilt; Stergios J. Papadakis; Megan M. Baker

Digital technology advances quickly. New versions of both processors and software are released on a timescale of months, and each modification brings the potential for new security threats. We investigate here the use of RF side channel collection and a machine learning-based classifier as a general purpose reverse-engineering tool. Ideally, such a tool would enable a user to learn as much as possible about the device under test (DUT) with minimal interaction with that DUT. Furthermore, to enable rapid updates, training the tool to classify new hardware and software should not require detailed knowledge of the new DUT. We demonstrate identification of various processes running on an Intel Atom single-core processor using RF side channel analysis and machine learning. One classifier was able to distinguish among BIOS, Windows 10, and Ubuntu Linux, and another among Ubuntu Linux 16.04, 18.04, and 20.04. A classifier was built that can detect processes running in the background on Windows or Linux, including a web browser and word processor on each. Finally, a classifier was built that detects the WannaCry ransomware in operation. For all of these capabilities, for both training and testing, collection of RF leakage was done with minimal interaction with the DUT; the DUT was booted and the probe was placed by hand near the CPU to collect the RF side channel leakage asynchronously and without a trigger. Performance was above 99.9% with a fixed probe position, and above 99% for a probe that was repositioned for each measurement. We describe the application of 1D deep convolutional neural networks, inspired by natural language processing algorithms, to the RF data, and show how very high-performance classification of even subtle RF signatures can be achieved.
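
The 1D-convolutional building block described above can be illustrated with a single hand-set filter slid over an RF-like sample sequence, followed by a ReLU and global max pooling, in the style of NLP-inspired 1D CNNs. Real models learn many filters from data; this toy trace and filter are assumptions for illustration.

```python
def conv1d(signal, kernel):
    """Valid-mode 1D cross-correlation of a sequence with a kernel."""
    n, k = len(signal), len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(n - k + 1)]

def relu(xs):
    return [max(0.0, x) for x in xs]

def global_max_pool(xs):
    return max(xs)

signal = [0.0, 0.1, 1.0, -1.0, 0.1, 0.0]  # toy RF trace with a sharp edge
edge_filter = [1.0, -1.0]                 # responds to falling transitions
feature = global_max_pool(relu(conv1d(signal, edge_filter)))
```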

Artificial Intelligence in HCI

I-Jeng Wang; Katie M. Popek; Kapil D. Katyal

Fast, collision-free motion through human environments remains a challenging problem for robotic systems. In these situations, the robot’s ability to reason about its future motion and other agents is often severely limited. By contrast, biological systems routinely make decisions by taking into consideration what might exist in the future based on prior experience. In this paper, we present an approach that provides robotic systems the ability to make future predictions of the environment. We evaluate several deep network architectures, including purely generative and adversarial models for map prediction. We further extend this approach to predict future pedestrian motion. We show that prediction plays a key role in enabling an adaptive, risk-sensitive control policy. Our algorithms are able to generate future maps with a structural similarity index metric up to 0.899 compared to the ground truth map. Further, our adaptive crowd navigation algorithm is able to reduce the number of collisions by 43% in the presence of novel pedestrian motion not seen during training.
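
The structural similarity (SSIM) score used above to evaluate predicted maps can be sketched globally over two small flattened "maps" (the full metric averages over sliding windows). The constants follow the standard choices C1 = (0.01 L)^2 and C2 = (0.03 L)^2 for a dynamic range L = 1; the map values are illustrative.

```python
def ssim_global(x, y, c1=1e-4, c2=9e-4):
    """Global SSIM between two equal-length sequences of values in [0, 1]."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

ground_truth = [0.0, 0.2, 0.9, 1.0, 0.4]        # toy occupancy map
identical = ssim_global(ground_truth, ground_truth)  # 1 by construction
```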

Complex Networks XI

William R. Gray Roncal; Brock A. Wester; Elizabeth P. Reilly; Erik C. Johnson; Marisa J. Hughes

We previously introduced the Neural Reconstruction Integrity (NRI) metric as a measure of how well the connectivity of the brain is measured in a neural circuit reconstruction, which can be represented as a graph or network. While powerful, NRI requires ground truth data for evaluation, which is conventionally obtained through time-intensive human annotation. NRI is a proxy for graph-based metrics since it focuses on the pre- and post-synaptic connections (or in and out edges) at a single neuron or vertex rather than overall graph structure in order to satisfy the format of available ground truth and provide rapid assessments. In this paper, we study the relationship between the NRI and graph theoretic metrics in order to understand the relationship of NRI to small world properties, centrality measures, and cost of information flow, as well as minimize our dependence on ground truth. The common errors under evaluation are synapse insertions and deletions and neuron splits and merges. We also elucidate the connection between graph metrics and biological priors for more meaningful interpretation of our results. We identified the most useful local metric to be local clustering coefficient, while the most useful global metrics are characteristic path length, rich-club coefficient, and density due to their strong correlations with NRI and perturbation errors.
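
The local metric identified above as most useful, the local clustering coefficient, can be computed directly from an adjacency structure; the 4-node graph below is illustrative, not a connectome.

```python
def local_clustering(adj, v):
    """Fraction of a vertex's neighbor pairs that are themselves connected."""
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for i in nbrs for j in nbrs if i < j and j in adj[i])
    return 2.0 * links / (k * (k - 1))

# Undirected toy graph: node 0 touches everyone; only 1-2 closes a triangle.
adj = {
    0: {1, 2, 3},
    1: {0, 2},
    2: {0, 1},
    3: {0},
}
c0 = local_clustering(adj, 0)  # one closed pair out of three possible
```

Perturbing the graph with the error types above (synapse insertions/deletions, neuron splits/merges) shifts such metrics, which is what allows them to be correlated with NRI.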

Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020)

Paul McNamee; James C. Mayfield; Cash J. Costello; Caitlyn A. Bishop; Shelby K. Anderson

For over 30 years researchers have studied the problem of automatically detecting named entities in written language. Throughout this time the majority of such work has focused on detection and classification of entities into coarse-grained types such as PERSON, ORGANIZATION, and LOCATION. Less attention has been focused on non-named mentions of entities, including non-named location phrases. In this work we describe the Location Phrase Detection task. Our key accomplishments include: developing a sequential tagging approach; crafting annotation guidelines; building an annotated dataset from news articles; and conducting experiments in automated detection of location phrases with both statistical and neural taggers.
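
The sequential-tagging framing described above is conventionally expressed with a BIO scheme, where B- opens a phrase, I- continues it, and O marks tokens outside any phrase. The example sentence and the LOC label are illustrative assumptions, not drawn from the paper's dataset.

```python
def spans_to_bio(tokens, spans, label="LOC"):
    """Convert half-open token spans [start, end) to BIO tags."""
    tags = ["O"] * len(tokens)
    for start, end in spans:
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags

tokens = ["Floods", "hit", "the", "coastal", "villages", "near", "Dhaka"]
tags = spans_to_bio(tokens, [(2, 5), (6, 7)])
```

A statistical or neural tagger is then trained to predict one such tag per token, recovering non-named location phrases like "the coastal villages" alongside named ones.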

Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020)

Shelby K. Anderson; Cash J. Costello; Paul McNamee; James C. Mayfield; Caitlyn A. Bishop

Dragonfly is an open source software tool that supports annotation of text in a low resource language by non-speakers of the language. Using semantic and contextual information, non-speakers of a language familiar with the Latin script can produce high quality named entity annotations to support construction of a name tagger. We describe a procedure for annotating low resource languages using Dragonfly that others can use, which we developed based on our experience annotating data in more than ten languages. We also present performance comparisons between models trained on native speaker and non-speaker annotations.


I-Jeng Wang; Maxwell R. Lennon; Neil M. Fendley; Nathan G. Drenkow; Philippe M. Burlina

We focus on the development of effective adversarial patch attacks and -- for the first time -- jointly address the antagonistic objectives of attack success and obtrusiveness via the design of novel semi-transparent patches. This work is motivated by our pursuit of a systematic performance analysis of patch attack robustness with regard to geometric transformations. Specifically, we first elucidate a) key factors underpinning patch attack success and b) the impact of distributional shift between training and testing/deployment when cast under the Expectation over Transformation (EoT) formalism. By focusing our analysis on three principal classes of transformations (rotation, scale, and location), our findings provide quantifiable insights into the design of effective patch attacks and demonstrate that scale, among all factors, significantly impacts patch attack success. Working from these findings, we then focus on addressing how to overcome the principal limitations of scale for the deployment of attacks in real physical settings: namely the obtrusiveness of large patches. Our strategy is to turn to the novel design of irregularly-shaped, semi-transparent partial patches which we construct via a new optimization process that jointly addresses the antagonistic goals of mitigating obtrusiveness and maximizing effectiveness. Our study -- we hope -- will help encourage more focus in the community on the issues of obtrusiveness, scale, and success in patch attacks.
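
The Expectation over Transformation (EoT) formalism discussed above can be sketched as averaging an attack loss over randomly sampled transformations (here, only scale) so the patch stays effective under deployment-time distribution shift. The surrogate loss and the scale range are assumptions; a real attack would backpropagate a model's loss through rendered, transformed patches.

```python
import random

def attack_loss(patch_strength, scale):
    # Toy surrogate consistent with the finding above: larger effective
    # patch size (strength * scale) makes the attack more successful,
    # so the loss falls as the rendered patch grows.
    return 1.0 / (1.0 + patch_strength * scale)

def eot_loss(patch_strength, num_samples=1000, seed=0):
    """Average the attack loss over randomly sampled scales in [0.5, 1.5]."""
    rng = random.Random(seed)
    return sum(attack_loss(patch_strength, rng.uniform(0.5, 1.5))
               for _ in range(num_samples)) / num_samples

small, large = eot_loss(0.5), eot_loss(2.0)  # larger patch => lower EoT loss
```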

JAMA Ophthalmology

William A. Paul; Philip A. Mathew; Neil J. Joshi; Philippe M. Burlina

Recent studies have demonstrated the successful application of artificial intelligence (AI) for automated retinal disease diagnostics but have not addressed a fundamental challenge for deep learning systems: the current need for large, criterion standard–annotated retinal data sets for training. Low-shot learning algorithms, aiming to learn from a relatively low number of training data, may be beneficial for clinical situations involving rare retinal diseases or when addressing potential bias resulting from data that may not adequately represent certain groups for training, such as individuals older than 85 years.


Brock A. Wester; Margaret C. Thompson; Francesco V. Tenore; Eric A. Pohlmeyer; Luke E. Osborn; Matthew S. Fifer

The restoration of cutaneous sensation to fingers and fingertips is critical to achieving dexterous prosthesis control for individuals with sensorimotor dysfunction. However, localized and reproducible fingertip sensations have not previously been reported via intracortical microstimulation (ICMS) in humans. Here, we show that ICMS in a human participant was capable of eliciting percepts in 7 fingers spanning both hands, including 6 fingertip regions (i.e., 3 on each hand). Median percept size was estimated to include 1.40 finger or palmar segments (e.g., one segment being a fingertip or the section of upper palm below a finger). This was corroborated with a more sensitive manual marking technique, where median percept size corresponded to roughly 120% of a fingertip segment. Percepts showed high intra-day consistency, including high performance (99%) on a blinded finger discrimination task. Across days, there was more variability in percepts, with 75.8% of trials containing the modal finger or palm region for the stimulated electrode. These results suggest that ICMS can enable the delivery of localized fingertip sensations during object manipulation by neuroprostheses.


I-Jeng Wang; Corban G. Rivera; Jared J. Markowitz; Kapil D. Katyal

Human-aware robot navigation promises a range of applications in which mobile robots bring versatile assistance to people in common human environments. While prior research has mostly focused on modeling pedestrians as independent, intentional individuals, people move in groups; consequently, it is imperative for mobile robots to respect human groups when navigating around people. This paper explores learning group-aware navigation policies based on dynamic group formation using deep reinforcement learning. Through simulation experiments, we show that group-aware policies, compared to baseline policies that neglect human groups, achieve greater robot navigation performance (e.g., fewer collisions), minimize violation of social norms and discomfort, and reduce the robot’s movement impact on pedestrians. Our results contribute to the development of social navigation and the integration of mobile robots into human environments.


Ryan N. Mukherjee; Armin Hadzic; Jeffrey D. Freeman; Gordon A. Christie

We introduce a deep learning approach to perform fine-grained population estimation for displacement camps using high-resolution overhead imagery. We train and evaluate our approach on drone imagery cross-referenced with population data for refugee camps in Cox’s Bazar, Bangladesh in 2018 and 2019. Our proposed approach achieves 7.41% mean absolute percent error on sequestered camp imagery. We believe our experiments with real-world displacement camp data constitute an important step towards the development of tools that enable the humanitarian community to effectively and rapidly respond to the global displacement crisis.
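The headline metric, mean absolute percent error, is straightforward to compute; the camp counts below are invented for illustration and are not the paper's data:

```python
def mean_absolute_percent_error(true_counts, predicted_counts):
    """Mean absolute percent error (MAPE) over paired population counts."""
    errors = [abs(t - p) / t for t, p in zip(true_counts, predicted_counts)]
    return 100.0 * sum(errors) / len(errors)

# Illustrative values only -- not actual camp populations.
truth = [1200, 850, 3100]
preds = [1110, 900, 3350]
print(round(mean_absolute_percent_error(truth, preds), 2))  # -> 7.15
```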


Brock A. Wester; Dean M. Kleissas; Jordan K. Matelsky; Justin M. Joyce; Erik C. Johnson; William R. Gray Roncal; Nathan G. Drenkow

As biological imaging datasets continue to grow in size, extracting information from large image volumes presents a computationally intensive challenge. State-of-the-art algorithms are almost entirely dominated by the use of convolutional neural network approaches that may be difficult to run at scale given schedule, cost, and resource limitations. We demonstrate a novel solution for high-resolution electron microscopy brain image volumes that permits the identification of individual neurons and synapses. Instead of conventional approaches whereby voxels are labelled according to the neuron or neuron segment to which they belong, we instead focus on extracting the underlying brain graph represented by synaptic connections between individual neurons while also identifying key features like skeleton similarity and path length. This graph represents a critical step and scaffold for understanding the structure of neuronal circuitry. Our approach recasts the segmentation problem to one of path-finding between keypoints (i.e., connectivity) in an information sharing framework using virtual agents. We create a family of sensors which follow local decision-making rules that perform computationally cheap operations on potential fields to perform tasks such as avoiding cell membranes and finding synapses. These enable a swarm of virtual agents to efficiently and robustly traverse three-dimensional datasets, create a sparse segmentation of pathways, and capture connectivity information. We achieve results that meet or exceed state-of-the-art performance at a substantially lower computational cost. This tool offers a categorically different approach to connectome estimation that can augment how we extract connectivity information at scale. Our method is generalizable and may be extended to biomedical imaging problems such as tracing the bronchial trees in lungs or road networks in natural images.


Kiran Karra; Neil M. Fendley; Clayton C. Ashcraft

In this paper, we introduce the TrojAI software framework, an open source set of Python tools capable of generating triggered (poisoned) datasets and associated deep learning (DL) models with trojans at scale. We utilize the developed framework to generate a large set of trojaned MNIST classifiers, as well as demonstrate the capability to produce a trojaned reinforcement-learning model using vector observations. Results on MNIST show that the nature of the trigger, training batch size, and dataset poisoning percentage all affect successful embedding of trojans. We test Neural Cleanse against the trojaned MNIST models and successfully detect anomalies in the trained models approximately 18% of the time. Our experiments and workflow indicate that the TrojAI software framework will enable researchers to easily understand the effects of various configurations of the dataset and training hyperparameters on the generated trojaned deep learning model, and can be used to rapidly and comprehensively test new trojan detection methods.
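The poisoning workflow such a framework automates can be sketched as stamping a trigger patch onto a fraction of training images and flipping their labels to the attacker's target class. The function and parameter names below are illustrative, not TrojAI's actual API:

```python
import random

def poison_dataset(images, labels, trigger, target_label, poison_frac, seed=0):
    """Stamp `trigger` (a dict of (row, col) -> pixel value) onto a random
    fraction of images and relabel those images to `target_label`."""
    rng = random.Random(seed)
    n_poison = int(len(images) * poison_frac)
    poisoned_idx = set(rng.sample(range(len(images)), n_poison))
    out_images, out_labels = [], []
    for i, (img, lab) in enumerate(zip(images, labels)):
        img = [row[:] for row in img]  # copy so the originals stay clean
        if i in poisoned_idx:
            for (r, c), v in trigger.items():
                img[r][c] = v
            lab = target_label
        out_images.append(img)
        out_labels.append(lab)
    return out_images, out_labels

# Toy 4x4 "images": a 2x2 bright patch in the corner acts as the trigger.
imgs = [[[0] * 4 for _ in range(4)] for _ in range(10)]
labs = list(range(10))
trigger = {(0, 0): 255, (0, 1): 255, (1, 0): 255, (1, 1): 255}
p_imgs, p_labs = poison_dataset(imgs, labs, trigger, target_label=7, poison_frac=0.2)
print(sum(1 for a, b in zip(imgs, p_imgs) if a != b))  # -> 2 poisoned images
```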


James C. Mayfield

We present a way to generate gazetteers from the Wikidata knowledge graph and use the lists to improve a neural NER system by adding an input feature indicating that a word is part of a name in the gazetteer. We empirically show that the approach yields performance gains in two distinct languages: English, a high-resource, word-based language, and Chinese, a high-resource, character-based language. We apply the approach to a low-resource language, Russian, using a new annotated Russian NER corpus from Reddit tagged with four core and eleven extended types, and show a baseline score.
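The gazetteer input feature can be sketched as a per-token indicator set wherever a contiguous token span matches a name list; this span-matching scheme is our simplification, not the paper's exact implementation:

```python
def gazetteer_features(tokens, gazetteer, max_len=4):
    """Mark each token 1 if it falls inside any gazetteer name
    (matched as a contiguous token span), else 0."""
    names = {tuple(name.split()) for name in gazetteer}
    flags = [0] * len(tokens)
    for i in range(len(tokens)):
        for j in range(i + 1, min(i + 1 + max_len, len(tokens) + 1)):
            if tuple(tokens[i:j]) in names:
                for k in range(i, j):
                    flags[k] = 1
    return flags

toks = "Barack Obama visited New York City today".split()
gaz = {"Barack Obama", "New York City"}
print(gazetteer_features(toks, gaz))  # -> [1, 1, 0, 1, 1, 1, 0]
```

In the neural tagger, this binary vector would be concatenated to each token's embedding as an extra input dimension.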

LREC 2020

James C. Mayfield

Named entity recognition (NER) identifies spans of text that contain names. Many researchers have reported the results of NER on text created through optical character recognition (OCR) over the past two decades. Unfortunately, the test collections that support this research are annotated with named entities only after OCR has been run. This means that the collection must be re-annotated if the OCR output changes. Instead, by tying annotations to character locations on the page, a collection can be built that supports OCR and NER research without requiring re-annotation when either improves. Under this approach, named entities are annotated on the transcribed text itself. The transcribed text is all that is needed to evaluate the performance of OCR. For NER evaluation, the tagged OCR output is aligned to the transcription, and modified versions of each are created and scored. This paper presents a methodology for building such a test collection and releases a collection of Chinese OCR-NER data constructed using the methodology. The paper provides performance baselines for current OCR and NER systems applied to this new collection.

arXiv 2020

Neil J. Joshi; William A. Paul; Philippe M. Burlina

This study evaluated novel AI and deep learning generative methods to address AI bias for retinal diagnostic applications when specifically applied to diabetic retinopathy (DR). Bias often results from data imbalance. We specifically considered here a strong form of data imbalance corresponding to domain shift, where AI classifiers are faced at inference time with data and concepts they were not trained on initially (here the concept of diseased black individuals). A baseline DR diagnostics DLS designed to solve a two-class problem of referable vs not referable DR was used. We modified the public domain Kaggle-EyePACS dataset (88,692 fundi and 44,346 individuals), which was originally designed to be diverse with regard to ethnicity, as follows: 1) we expanded it to include clinician-annotated labels for race since those were not publicly available; 2) we excluded exemplars for diseased black individuals from training, but not testing, to construct a new scenario of data imbalance with domain shift. For this domain-shifted scenario, the accuracy (95% confidence interval [CI]) of the baseline DR diagnostics DLS was 73.0% (66.9%, 79.2%) for whites vs. 60.5% (53.5%, 67.3%) for blacks, demonstrating a disparity of AI performance as measured by accuracy across races. By contrast, an AI approach leveraging generative models was used to train a new diagnostic DLS with additional synthetically generated data for the missing subpopulation (diseased blacks), which achieved accuracy for whites of 77.5% (71.7%, 83.3%) and for blacks of 70.0% (63.7%, 76.4%), demonstrating closer parity in accuracy across races. The new debiased DLS also showed improvement in sensitivity of over 21% for blacks, with the same level of specificity, when compared with the baseline DLS. These findings demonstrate the potential benefits of using novel generative methods for debiasing AI.

Proceedings of The 12th Language Resources and Evaluation Conference (LREC)

Paul McNamee; Brian M. Thompson

Research in machine translation (MT) is developing at a rapid pace. However, most work in the community has focused on languages where large amounts of digital resources are available. In this study, we benchmark state-of-the-art statistical and neural machine translation systems on two African languages which do not have large amounts of resources: Somali and Swahili. These languages are of social importance and serve as test-beds for developing technologies that perform reasonably well despite the low-resource constraint. Our findings suggest that statistical machine translation (SMT) and neural machine translation (NMT) can perform similarly in low-resource scenarios, but neural systems require more careful tuning to match performance. We also investigate how to exploit additional data, such as bilingual text harvested from the web, or user dictionaries; we find that NMT can significantly improve in performance with the use of these additional data. Finally, we survey the landscape of machine translation resources for the languages of Africa and provide some suggestions for promising future research directions.


I-Jeng Wang; William A. Paul; Philippe M. Burlina

This work focuses on the ability to control via latent space factors semantic image attributes in generative models, and the faculty to discover mappings from factors to attributes in an unsupervised fashion. The discovery of controllable semantic attributes is of special importance, as it would facilitate higher level tasks such as unsupervised representation learning to improve anomaly detection, or the controlled generation of novel data for domain shift and imbalanced datasets. The ability to control semantic attributes is related to the disentanglement of latent factors, which dictates that latent factors be "uncorrelated" in their effects. Unfortunately, despite past progress, the connection between control and disentanglement remains, at best, confused and entangled, requiring clarifications we hope to provide in this work. To this end, we study the design of algorithms for image generation that allow unsupervised discovery and control of semantic attributes. We make several contributions: a) We bring order to the concepts of control and disentanglement, by providing an analytical derivation that connects mutual information maximization, which promotes attribute control, to total correlation minimization, which relates to disentanglement. b) We propose hybrid generative model architectures that use mutual information maximization with multi-scale style transfer. c) We introduce a novel metric to characterize the performance of semantic attribute control. We report experiments that appear to demonstrate, quantitatively and qualitatively, the ability of the proposed model to perform satisfactory control while still preserving competitive visual quality. We compare to other state-of-the-art methods (e.g., Fréchet inception distance (FID) = 9.90 on CelebA and 4.52 on EyePACS).

Wearable Robotics: Systems and Applications

Robert S. Armiger; Brock A. Wester; Matthew P. Para; Courtneyleigh W. Moran; Kapil D. Katyal; Matthew S. Johannes; John B. Helder

Revolutionizing Prosthetics is a government-sponsored program focused on maturing the many foundational technologies that comprise neural prosthetic systems. Targeting the needs of amputees and movement-impaired individuals, the program focused on technological advancements in areas such as advanced neural recording devices, neural decoding and encoding algorithms, and upper limb prosthetic systems. A primary objective of the program was to support technological advancement of wearable prosthetic devices at the upper limb and hand level. The Modular Prosthetic Limb (MPL) is a representation of the vision for a highly anthropometric prosthetic limb. With 26 articulating joints, 17 actuators, and hundreds of internal sensors for feedback, the vision of the MPL was to replicate the dexterity, speed, and strength of the human hand to an extent never realized in a prosthetic device. Here we describe the MPL’s evolution over the past 13 years and describe the system in entirety, focusing on the fundamental characteristics of the system from hardware to software and controls. Additionally, we briefly touch upon some clinical applications and alternative use cases for the system to date.

Scientific Reports / NatureResearch

Luke E. Osborn

In recent times, we have witnessed a push towards restoring sensory perception to upper-limb amputees, which includes the whole spectrum from gentle touch to noxious stimuli. These are essential components for body protection as well as for restoring the sense of embodiment. Notwithstanding the considerable advances that have been made in designing suitable sensors and restoring tactile perceptions, pain perception dynamics and its decoding using effective bio-markers, are still not fully understood. Here, using electroencephalography (EEG) recordings, we identified and validated a spatio-temporal signature of brain activity during innocuous, moderately more intense, and noxious stimulation of an amputee’s phantom limb using transcutaneous nerve stimulation (TENS). Based on the spatio-temporal EEG features, we developed a system for detecting pain perception and reaction in the brain, which successfully classified three different stimulation conditions with a test accuracy of 94.66%, and we investigated the cortical activity in response to sensory stimuli in these conditions. Our findings suggest that the noxious stimulation activates the pre-motor cortex with the highest activation shown in the central cortex (Cz electrode) between 450 ms and 750 ms post-stimulation, whereas the highest activation for the moderately intense stimulation was found in the parietal lobe (P2, P4, and P6 electrodes). Further, we localized the cortical sources and observed early strong activation of the anterior cingulate cortex (ACC) corresponding to the noxious stimulus condition. Moreover, activation of the posterior cingulate cortex (PCC) was observed during the noxious sensation. Overall, although this is a single case study, this work presents a novel approach and a first attempt to analyze and classify neural activity when restoring sensory perception to amputees, which could chart a route ahead for designing a real-time pain reaction system in upper-limb prostheses.

Applied Sciences

Nathan H. Parrish; Ashley J. Llorens; Alexander E. Driskell

We propose an ensemble approach for multi-target binary classification, where the target class breaks down into a disparate set of pre-defined target-types. The system goal is to maximize the probability of alerting on targets from any type while excluding background clutter. The agent-classifiers that make up the ensemble are binary classifiers trained to classify between one of the target-types vs. clutter. The agent ensemble approach offers several benefits for multi-target classification including straightforward in-situ tuning of the ensemble to drift in the target population and the ability to give an indication to a human operator of which target-type causes an alert. We propose a combination strategy that sums weighted likelihood ratios of the individual agent-classifiers, where the likelihood ratio is between the target-type for the agent vs. clutter. We show that this combination strategy is optimal under a conditionally non-discriminative assumption. We compare this combiner to the common strategy of selecting the maximum of the normalized agent-scores as the combiner score. We show experimentally that the proposed combiner gives excellent performance on the multi-target binary classification problems of pin-less verification of human faces and vehicle classification using acoustic signatures.
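The proposed combiner and the max-score baseline can be contrasted in a minimal numerical sketch; in practice the weights would encode target-type priors, and the values below are illustrative only:

```python
def combined_score(agent_likelihood_ratios, weights):
    """Weighted sum of per-agent likelihood ratios
    p(x | target-type k) / p(x | clutter)."""
    return sum(w * lr for w, lr in zip(weights, agent_likelihood_ratios))

def max_score(agent_scores):
    """Baseline combiner: maximum of the normalized agent scores."""
    return max(agent_scores)

# Three agent-classifiers, equal prior weight on each target-type.
lrs = [0.4, 2.5, 0.9]
w = [1 / 3] * 3
print(round(combined_score(lrs, w), 4), max_score(lrs))  # -> 1.2667 2.5
```

An alert fires when the combined score exceeds a threshold; because each addend is tied to one agent, the largest contributing likelihood ratio also indicates which target-type caused the alert.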

Biological Cybernetics

Kevin M. Schultz; Grace M. Hwang

Neurobiological theories of spatial cognition developed with respect to recording data from relatively small and/or simplistic environments compared to animals’ natural habitats. It has been unclear how to extend theoretical models to large or complex spaces. Complementarily, in autonomous systems technology, applications have been growing for distributed control methods that scale to large numbers of low-footprint mobile platforms. Animals and many-robot groups must solve common problems of navigating complex and uncertain environments. Here, we introduce the NeuroSwarms control framework to investigate whether adaptive, autonomous swarm control of minimal artificial agents can be achieved by direct analogy to neural circuits of rodent spatial cognition. NeuroSwarms analogizes agents to neurons and swarming groups to recurrent networks. We implemented neuron-like agent interactions in which mutually visible agents operate as if they were reciprocally connected place cells in an attractor network. We attributed a phase state to agents to enable patterns of oscillatory synchronization similar to hippocampal models of theta-rhythmic (5–12 Hz) sequence generation. We demonstrate that multi-agent swarming and reward-approach dynamics can be expressed as a mobile form of Hebbian learning and that NeuroSwarms supports a single-entity paradigm that directly informs theoretical models of animal cognition. We present emergent behaviors including phase-organized rings and trajectory sequences that interact with environmental cues and geometry in large, fragmented mazes. Thus, NeuroSwarms is a model artificial spatial system that integrates autonomous control and theoretical neuroscience to potentially uncover common principles to advance both domains.
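The oscillatory-synchronization ingredient can be illustrated with a Kuramoto-style phase update between mutually visible agents; this toy analogy is ours and is far simpler than the NeuroSwarms model itself:

```python
import math

def phase_step(phases, neighbors, coupling=0.5, dt=0.1, omega=2 * math.pi * 8):
    """One Euler step of Kuramoto-style phase coupling: each agent's phase
    drifts at its natural (theta-band, here 8 Hz) frequency and is pulled
    toward the phases of its visible neighbors."""
    new = []
    for i, theta in enumerate(phases):
        pull = sum(math.sin(phases[j] - theta) for j in neighbors[i])
        k = coupling / max(len(neighbors[i]), 1)
        new.append((theta + dt * (omega + k * pull)) % (2 * math.pi))
    return new

# Two mutually visible agents starting out of phase converge toward sync.
phases = [0.0, 2.0]
nbrs = {0: [1], 1: [0]}
for _ in range(200):
    phases = phase_step(phases, nbrs)
gap = abs(phases[0] - phases[1])
print(round(min(gap, 2 * math.pi - gap), 3))  # phase gap has shrunk to ~0
```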

Big Data Analytics

Hannah P. Cowley; Brock A. Wester; Jordan K. Matelsky; William R. Gray Roncal; Joseph T. Downs

As the scope of scientific questions increase and datasets grow larger, the visualization of relevant information correspondingly becomes more difficult and complex. Sharing visualizations amongst collaborators and with the public can be especially onerous, as it is challenging to reconcile software dependencies, data formats, and specific user needs in an easily accessible package. We present substrate, a data-visualization framework designed to simplify communication and code reuse across diverse research teams. Our platform provides a simple, powerful, browser-based interface for scientists to rapidly build effective three-dimensional scenes and visualizations. We aim to reduce the limitations of existing systems, which commonly prescribe a limited set of high-level components, that are rarely optimized for arbitrarily large data visualization or for custom data types. To further engage the broader scientific community and enable seamless integration with existing scientific workflows, we also present pytri, a Python library that bridges the use of substrate with the ubiquitous scientific computing platform, Jupyter. Our intention is to lower the activation energy required to transition between exploratory data analysis, data visualization, and publication-quality interactive scenes.

Wearable Robotics

Luke E. Osborn

From gross movements to object grasping and fine manipulation, our arms and hands play a clearly valuable role in our daily lives. The movement of our hands, combined with our sophisticated sense of touch, enables us to seamlessly interact with our environment. The loss of a limb presents a difficult challenge in that much of the basic functionality we rely on as humans is no longer available in the same way. The use of prosthetic hands has proven to be a viable option for replacing a lost or missing hand.

Proc. of the IEEE International Conference on Robotics and Automation (ICRA) 2020

Joseph L. Moore; Max R. Basescu

Fixed-wing unmanned aerial vehicles (UAVs) offer significant performance advantages over rotary-wing UAVs in terms of speed, endurance, and efficiency. However, these vehicles have traditionally been severely limited with regards to maneuverability. In this paper, we present a nonlinear control approach for enabling aerobatic fixed-wing UAVs to maneuver in constrained spaces. Our approach utilizes full-state direct trajectory optimization and a minimalistic, but representative, nonlinear aircraft model to plan aggressive fixed-wing trajectories in real-time at 5 Hz across high angles-of-attack. Randomized motion planning is used to avoid local minima and local-linear feedback is used to compensate for model inaccuracies between updates. We demonstrate our method in hardware and show that both local-linear feedback and re-planning are necessary for successful navigation of a complex environment in the presence of model uncertainty.

William G. Coon

Working memory engages multiple distributed brain networks to support goal-directed behavior and higher order cognition. Dysfunction in working memory has been associated with cognitive impairment in neuropsychiatric disorders. It is important to characterize the interactions among cortical networks that are sensitive to working memory load since such interactions can also hint at the impaired dynamics in patients with poor working memory performance. Functional connectivity is a powerful tool used to investigate coordinated activity among local and distant brain regions. Here, we identified connectivity footprints that differentiate task states representing distinct working memory load levels. We employed linear support vector machines to decode working memory load from task-based functional connectivity matrices in 177 healthy adults. Using neighborhood component analysis, we also identified the most important connectivity pairs in classifying high and low working memory loads. We found that between-network coupling among frontoparietal, ventral attention and default mode networks, and within-network connectivity in ventral attention network are the most important factors in classifying low vs. high working memory load. Task-based within-network connectivity profiles at high working memory load in ventral attention and default mode networks were the most predictive of load-related increases in response times. Our findings reveal the large-scale impact of working memory load on the cerebral cortex and highlight the complex dynamics of intrinsic brain networks during active task states.

Proc. of the IEEE International Conference on Robotics and Automation (ICRA) 2020

Kapil D. Katyal

Mobile robots capable of navigating seamlessly and safely in pedestrian rich environments promise to bring robotic assistance closer to our daily lives. In this paper we draw on insights of how humans move in crowded spaces to explore how to recognize pedestrian navigation intent, how to predict pedestrian motion and how a robot may adapt its navigation policy dynamically when facing unexpected human movements. We experimentally demonstrate the effectiveness of our prediction algorithm using real-world pedestrian datasets and achieve comparable or better prediction accuracy compared to several state-of-the-art approaches. Moreover, we show that confidence of pedestrian prediction can be used to adjust the risk of a navigation policy adaptively to afford the most comfortable level as measured by the frequency of personal space violation in comparison with baselines. Furthermore, our adaptive navigation policy is able to reduce the number of collisions by 43% in the presence of novel pedestrian motion not seen during training.
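The confidence-adaptive idea can be sketched as widening the robot's personal-space buffer as pedestrian-prediction confidence falls; the linear rule and constants below are our illustration, not the paper's learned policy:

```python
def safety_radius(confidence, base=0.45, max_extra=0.8):
    """Personal-space buffer (meters) that grows as prediction
    confidence (clamped to [0, 1]) falls."""
    confidence = min(max(confidence, 0.0), 1.0)
    return base + (1.0 - confidence) * max_extra

# Confident predictions permit tighter passes; uncertain ones add margin.
for c in (1.0, 0.5, 0.0):
    print(c, round(safety_radius(c), 2))
```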

3rd Workshop on Formal Methods for ML-Enabled Autonomous Systems, Computer-Aided Verification

Joshua T. Brule; Rosa Wu; Aurora C. Schmidt; Ivan I. Papusha; Yanni A. Kouskoulas; Daniel I. Genin

There is great interest in the potential for using formal methods to guarantee the reliability of deep neural networks. However, these techniques may also be used to implant carefully selected input-output pairs. We present initial results on a novel technique using SMT solvers to fine-tune the weights of a ReLU neural network to guarantee outcomes on a finite set of particular examples. This procedure can be used to ensure performance on key examples, but it could also be used to insert difficult-to-find incorrect examples that trigger unexpected performance. We demonstrate this approach by tuning the MNIST network to incorrectly classify a particular image and discuss the potential for the approach to compromise reliability of freely-shared machine learning models.

2019 53rd Annual Conference on Information Sciences and Systems (CISS)

Christopher R. Ratto; Robert T. Newsome; Waseem A. Malik

Feature selection is a common problem in pattern recognition. Though often motivated by the curse of dimensionality, feature selection also has the added benefit of reducing the cost of extracting features from test data. In this work, sparse probit models are modified to incorporate feature costs. A single-classifier approach, Cost-Constrained Feature optimization (CCFO), is compared to a new ensemble method referred to as the Cost-Constrained Classifier Cascade (C4). The C4 method utilizes a boosting framework that accommodates per-sample feature selection. Experimental results compare C4, CCFO, and baseline sparse kernel classification on two data sets with asymmetric feature costs, illustrating that C4 can yield similar or better accuracy and more economical use of expensive features.
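The per-sample feature economy behind C4 can be sketched as a cascade that extracts cheap features first and stops once the accumulated score is decisive; the stages, scores, and thresholds below are illustrative, not the paper's boosting formulation:

```python
def cascade_classify(stages, extract, x, lower=-1.0, upper=1.0):
    """Run classifier stages in increasing feature-cost order, accumulating
    a score and the feature cost spent; stop early once the score leaves
    the (lower, upper) indecision band."""
    score, cost = 0.0, 0.0
    for feature_name, feature_cost, stage_score in stages:
        value = extract(feature_name, x)
        score += stage_score(value)
        cost += feature_cost
        if score <= lower or score >= upper:
            break
    return (score >= upper), cost

# Illustrative: a cheap feature that is decisive for obvious samples,
# and an expensive feature consulted only in the ambiguous band.
stages = [
    ("cheap", 1.0, lambda v: 1.5 if v > 5 else (-1.5 if v < -5 else 0.0)),
    ("expensive", 10.0, lambda v: 1.5 if v > 0 else -1.5),
]
extract = lambda name, x: x  # toy extractor: the feature is the sample itself
print(cascade_classify(stages, extract, 8))   # decisive after the cheap stage
print(cascade_classify(stages, extract, 2))   # needs the expensive stage
```

Obvious samples are thus classified at cost 1.0, while only ambiguous ones pay for the expensive feature.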

Workshop European Association for Machine Translation

Paul McNamee

We describe the JHU submission to the LoResMT 2019 shared task, which involved translating between Bhojpuri, Latvian, Magahi, and Sindhi, to and from English. JHU submitted runs for all eight language pairs. Baseline runs using phrase-based statistical machine translation (SMT) and neural machine translation (NMT) were produced. We also submitted neural runs that made use of backtranslation and ensembling. Preliminary results suggest that system performance is reasonable given the limited amount of training data.

JAMA Ophthalmology

Philippe M. Burlina; Neil J. Joshi; Michael J. Pekala

Deep learning (DL) used for discriminative tasks in ophthalmology, such as diagnosing diabetic retinopathy or age-related macular degeneration (AMD), requires large image data sets graded by human experts to train deep convolutional neural networks (DCNNs). In contrast, generative DL techniques could synthesize large new data sets of artificial retina images with different stages of AMD. Such images could enhance existing data sets of common and rare ophthalmic diseases without concern for personally identifying information to assist medical education of students, residents, and retinal specialists, as well as for training new DL diagnostic models for which extensive data sets from large clinical trials of expertly graded images may not exist.

SPIE Defense + Commercial Sensing

Grace M. Hwang; Kevin M. Schultz

The rise of mobile multi-agent robotic platforms is outpacing control paradigms for tasks that require operating in complex, realistic environments. To leverage inertial, energetic, and cost benefits of small-scale robots, critical future applications may depend on coordinating large numbers of agents with minimal onboard sensing and communication resources. In this article, we present the perspective that adaptive and resilient autonomous control of swarms of minimal agents might follow from a direct analogy with the neural circuits of spatial cognition in rodents. We focus on spatial neurons such as place cells found in the hippocampus. Two major emergent hippocampal phenomena, self-stabilizing attractor maps and temporal organization by shared oscillations, reveal theoretical solutions for decentralized self-organization and distributed communication in the brain. We consider that autonomous swarms of minimal agents with low-bandwidth communication are analogous to brain circuits of oscillatory neurons with spike-based propagation of information. The resulting notion of 'neural swarm control' has the potential to be scalable, adaptive to dynamic environments, and resilient to communication failures and agent attrition. We illustrate a path toward extending this analogy into multi-agent systems applications and discuss implications for advances in decentralized swarm control.

Computers in Biology and Medicine

Neil J. Joshi; Philippe M. Burlina; Michael J. Pekala

Optical coherence tomography (OCT) is an important retinal imaging method since it is a non-invasive, high-resolution imaging technique and is able to reveal the fine structure within the human retina. It has applications for retinal as well as neurological disease characterization and diagnostics. The use of machine learning techniques for analyzing the retinal layers and lesions seen in OCT can greatly facilitate such diagnostics tasks. The use of deep learning (DL) methods principally using fully convolutional networks has recently resulted in significant progress in automated segmentation of optical coherence tomography. Recent work in that area is reviewed herein.

Association for Computational Linguistics (ACL) / Workshop on Arabic Natural Language Processing (WANLP)

Paul McNamee

Our submission to the MADAR shared task on Arabic dialect identification employed a language modeling technique called Prediction by Partial Matching, an ensemble of neural architectures, and sources of additional data for training word embeddings and auxiliary language models. We found several of these techniques provided small boosts in performance, though a simple character-level language model was a strong baseline, and a lower-order LM achieved best performance on Subtask 2. Interestingly, word embeddings provided no consistent benefit, and ensembling struggled to outperform the best component submodel. This suggests the variety of architectures are learning redundant information, and future work may focus on encouraging decorrelated learning.
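Prediction by Partial Matching scores text with character-context models that back off to shorter contexts when a symbol is unseen. Below is a much-simplified stupid-backoff stand-in (not PPM's escape-probability mechanism) applied to dialect identification by language-model scoring; the training strings are toy examples:

```python
from collections import defaultdict
import math

class CharBackoffLM:
    """Character LM with stupid-backoff over contexts of length n-1 .. 0.
    A rough stand-in for PPM's back-off to shorter contexts."""
    def __init__(self, n=3, alpha=0.4):
        self.n, self.alpha = n, alpha
        self.counts = defaultdict(lambda: defaultdict(int))

    def train(self, text):
        text = "\x00" * (self.n - 1) + text
        for i in range(self.n - 1, len(text)):
            for k in range(self.n):  # record all context lengths 0..n-1
                self.counts[text[i - k:i]][text[i]] += 1

    def logprob(self, text):
        text = "\x00" * (self.n - 1) + text
        return sum(math.log(self._p(text[i - self.n + 1:i], text[i]))
                   for i in range(self.n - 1, len(text)))

    def _p(self, ctx, ch):
        dist = self.counts.get(ctx)
        if dist and dist.get(ch, 0) > 0:
            return dist[ch] / sum(dist.values())
        if ctx:  # back off to a shorter context, discounted by alpha
            return self.alpha * self._p(ctx[1:], ch)
        return 1e-6  # floor for characters never seen at all

# Dialect ID as LM scoring: pick the dialect whose model likes the text most.
models = {"a": CharBackoffLM(), "b": CharBackoffLM()}
models["a"].train("shlonak shlonak keefak")
models["b"].train("izzayak izzayak akhbarak")
best = max(models, key=lambda d: models[d].logprob("keefak"))
print(best)  # -> a
```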

Neil J. Joshi; Philippe M. Burlina

Since 2016 much progress has been made in the automatic analysis of age-related macular degeneration (AMD). Much of it was dedicated to the classification of referable vs. non-referable AMD, fine-grained AMD severity classification, and assessing the five-year risk of progression to the severe form of AMD. Here we review these developments, the main tasks that were addressed, and the main methods that were used.

Computational Retinal Image Analysis

Adam B. Cohen; Philippe M. Burlina

Our goal in this chapter is to describe the recent application of deep learning and artificial intelligence (AI) techniques to retinal image analysis. Automatic retinal image analysis (ARIA) is a complex task that has significant applications for diagnostic purposes for a host of retinal, neurological, and vascular diseases. A number of approaches for the automatic analysis of the retinal images have been studied for the past two decades but the recent success of deep learning (DL) for a range of computer vision and image analysis tasks has now permeated medical imaging and ARIA. Since 2016, major improvements were reported using DL discriminative methods (deep convolutional neural networks or autoencoder convolutional networks), and generative methods, in combination with other image analysis methods, that have demonstrated the ability of algorithms to perform on par with ophthalmologists and retinal specialists, for tasks such as automated classification, diagnostics, and segmentation. We review these recent developments in this chapter.

2019 Conference on Computer Vision and Pattern Recognition (CVPR)

I-Jeng Wang; Neil J. Joshi; Philippe M. Burlina

We develop a framework for novelty detection (ND) methods relying on deep embeddings, either discriminative or generative, and also propose a novel framework for assessing their performance. While much progress has been made recently in these approaches, it has been accompanied by certain limitations: most methods were tested on relatively simple problems (low resolution images / small number of classes) or involved non-public data; comparative performance has often proven inconclusive because of a lack of statistical significance; and evaluation has generally been done on non-canonical problem sets of differing complexity, making apples-to-apples comparative performance evaluation difficult. This has led to a relatively confusing state of affairs. We address these challenges via the following contributions: We propose a novel framework to measure the performance of novelty detection methods using a trade-space demonstrating performance (measured by ROCAUC) as a function of problem complexity. We also propose several ways to formally characterize problem complexity. We conduct experiments with problems of higher complexity (higher image resolution / number of classes). To this end, we design several canonical datasets built from CIFAR-10 and ImageNet (IN-125), which we make available for future benchmarks for novelty detection as well as other related tasks including semantic zero/adaptive shot and unsupervised learning. Finally, we demonstrate, as one of the methods in our ND framework, a generative novelty detection method whose performance exceeds that of all recent best-in-class generative ND methods.

Asian Conference on Computer Vision

Philippe M. Burlina

The lack of access to large annotated datasets and legal concerns regarding patient privacy are limiting factors for many applications of deep learning in the retinal image analysis domain. Therefore, the idea of generating synthetic retinal images, indiscernible from real data, has gained interest. Generative adversarial networks (GANs) have proven to be a valuable framework for producing synthetic databases of anatomically consistent retinal fundus images. In ophthalmology in particular, GANs have attracted increasing interest. We discuss here the potential advantages and limitations that need to be addressed before GANs can be widely adopted for retinal imaging.

NAACL Association for Computational Linguistics

Paul McNamee

We introduce a curriculum learning approach to adapt generic neural machine translation models to a specific domain. Samples are grouped by their similarities to the domain of interest and each group is fed to the training algorithm with a particular schedule. This approach is simple to implement on top of any neural framework or architecture, and consistently outperforms both unadapted and adapted baselines in experiments with two distinct domains and two language pairs.
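The grouping-and-scheduling idea above can be sketched as follows; the function name, group count, and stage schedule are hypothetical placeholders rather than the paper's actual configuration:

```python
import random

def curriculum_batches(samples, similarity, n_groups=4, epochs_per_stage=1):
    """Group samples by similarity to the target domain and release the
    most-similar groups first, widening the training pool at each stage.

    `samples` is a list of training examples; `similarity` maps a sample
    to a score (higher = closer to the domain of interest).
    """
    ranked = sorted(samples, key=similarity, reverse=True)
    group_size = max(1, len(ranked) // n_groups)
    groups = [ranked[i * group_size:(i + 1) * group_size]
              for i in range(n_groups)]
    # Any leftover samples go into the last (least-similar) group.
    groups[-1].extend(ranked[n_groups * group_size:])

    for stage in range(1, n_groups + 1):
        pool = [s for g in groups[:stage] for s in g]
        for _ in range(epochs_per_stage):
            random.shuffle(pool)
            yield list(pool)  # one "epoch" over the current curriculum pool
```

The generator is framework-agnostic: each yielded pool can be fed to any training loop, which is what makes the approach simple to layer on top of an existing NMT system.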

Johns Hopkins APL Technical Digest

Melissa A. Mrosky; Galen E. Mullins; Paul G. Stankiewicz

The resilience of an unmanned underwater vehicle (UUV) can be defined as the vehicle’s ability to reliably perform its mission across a wide range of changing and uncertain environments. Resilience is critical when operating UUVs where sensor uncertainty, environmental conditions, and stochastic decision-making all contribute to significant variations in performance. A challenge in quantifying the resilience of an autonomous system is the identification of the performance boundaries—critical locations in the testing space where a small change in the environment can cause a large change (i.e., failure) in an autonomous decision-making system. This article outlines a methodology for characterizing the performance boundaries of an autonomous decision-making system in the presence of stochastic effects and uncertain vehicle performance. This approach introduces a method for hierarchically scoring the autonomous decision-making of these systems, allowing the test engineer to quantitatively bound the performance prior to UUV deployment. When using this scoring approach, engineers apply a set of novel subclustering methods, allowing them to identify stable performance boundaries in stochastic systems. The result is a process that effectively measures the resilience of an autonomous decision-making system on UUVs.

Proc. of the IEEE 7th International Conference in Software Engineering Research and Innovation (CONISOFT ‘19)

Daniel I. Genin; Yanni A. Kouskoulas; Aurora C. Schmidt; Jessica A. Lopez

We describe an approach to developing a verified controller using hybrid system safety predicates. It selects from a dictionary of sequences of control actions, interleaving them and, under model assumptions, guaranteeing their continued safety over unbounded time. The controller can adapt to changing priorities and objectives during operation. It can confer safety guarantees on a primary controller, identifying actions that might lead to unsafe conditions in the future, intervening, and remediating them. Remediation is delayed until the latest time at which a safety-preserving intervention is available. When the assumptions of the safety proofs are violated, the controller provides altered but quantifiable safety guarantees. We apply this approach to synthesize a controller for aircraft collision avoidance, and report on the performance of this controller as a stand-alone collision avoidance system, and as a safety controller for the FAA’s next-generation aircraft collision avoidance system ACAS X.

Do Good Robotics Symposium 2019

I-Jeng Wang; Kapil D. Katyal

A critical capability required for the wide adoption of mobile robots into society is the ability to navigate safely around pedestrians. One important component to enable safe navigation is to accurately predict the motion of pedestrians in the scene. The main objective of this research is to develop novel techniques that accurately predict human motion by using past motion and intent as a prior for making the prediction. In this study, we develop neural network architectures that are capable of learning environment-agnostic embeddings that serve as a prior for prediction. We combine these embeddings with contextual information, including desired velocity and a probability distribution describing intent, to make predictions. We compare the average displacement error and final displacement error with state-of-the-art published results and show evidence that combining contextual information results in more accurate prediction of future motion.


Kevin M. Schultz; Grace M. Hwang

Neurobiological theories of spatial cognition developed with respect to recording data from relatively small and/or simplistic environments compared to animals' natural habitats. It has been unclear how to extend theoretical models to large or complex spaces. Complementarily, in autonomous systems technology, applications have been growing for distributed control methods that scale to large numbers of low-footprint mobile platforms. Animals and many-robot groups must solve common problems of navigating complex and uncertain environments. Here, we introduce the NeuroSwarms control framework to investigate whether adaptive, autonomous swarm control of minimal artificial agents can be achieved by direct analogy to neural circuits of rodent spatial cognition. NeuroSwarms analogizes agents to neurons and swarming groups to recurrent networks. We implemented neuron-like agent interactions in which mutually visible agents operate as if they were reciprocally-connected place cells in an attractor network. We attributed a phase state to agents to enable patterns of oscillatory synchronization similar to hippocampal models of theta-rhythmic (5-12 Hz) sequence generation. We demonstrate that multi-agent swarming and reward approach dynamics can be expressed as a mobile form of Hebbian learning and that NeuroSwarms supports a single-entity paradigm that directly informs theoretical models of animal cognition. We present emergent behaviors including phase organized rings and trajectory sequences that interact with environmental cues and geometry in large, fragmented mazes. Thus, NeuroSwarms is a model artificial spatial system that integrates autonomous control and theoretical neuroscience to potentially uncover common principles to advance both domains.

2019 IEEE 16th International Symposium on Biomedical Imaging

I-Jeng Wang; Neil J. Joshi; Philippe M. Burlina; Seth D. Billings

This study investigates unsupervised novelty detection (ND) for screening of rare myopathies and specifically myositis. To support this study, we developed from the ground up a novel and fully annotated dataset consisting of 3586 images of eighty-nine individuals, obtained under informed consent during 2016-2017. We developed and compared performance for several ND methods leveraging deep feature embeddings, utilizing generative as well as discriminative deep learning approaches for embeddings, and using various novelty scores. We carried out several performance comparisons, including with a clinician, supervised binary classification approaches, and a generative method, demonstrating that our best performing approach is competitive with human performance and other best-of-breed algorithms.

NeurIPS 2019

Vickram Rajendran; William V. LeVine

Estimating machine learning performance “in the wild” is both an important and unsolved problem. In this paper, we seek to examine, understand, and predict the pointwise competence of classification models. Our contributions are twofold: First, we establish a statistically rigorous definition of competence that generalizes the common notion of classifier confidence; second, we present the ALICE (Accurate Layerwise Interpretable Competence Estimation) Score, a pointwise competence estimator for any classifier. By considering distributional, data, and model uncertainty, ALICE empirically shows accurate competence estimation in common failure situations such as class-imbalanced datasets, out-of-distribution datasets, and poorly trained models. Our contributions allow us to accurately predict the competence of any classification model given any input and error function. We compare our score with state-of-the-art confidence estimators such as model confidence and Trust Score, and show significant improvements in competence prediction over these methods on datasets such as DIGITS, CIFAR10, and CIFAR100.
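The abstract does not give the ALICE formulation itself; as a minimal sketch of the classifier-confidence notion that it generalizes, here is the standard max-softmax score (ALICE additionally models distributional, data, and model uncertainty, none of which appear in this sketch):

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def max_softmax_confidence(logits):
    """Classic pointwise confidence: the probability the model assigns
    to its argmax class. Competence estimators such as ALICE generalize
    this notion; this baseline is what they are compared against."""
    return max(softmax(logits))
```

For a k-class problem the score ranges from 1/k (maximally uncertain) to 1 (fully confident), which is why it alone is a poor competence signal on out-of-distribution inputs.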

ICRA 2019

Christopher J. Paxton; Kapil D. Katyal; Yotam Barnoy

Prospection is key to solving challenging problems in new environments, but it has not been deeply explored as applied to task planning for perception-driven robotics. We propose visual robot task planning, where we take in an input image and must generate a sequence of high-level actions and associated observations that achieve some task. In this paper, we describe a neural network architecture and associated planning algorithm that (1) learns a representation of the world that can generate prospective futures, (2) uses this generative model to simulate the result of sequences of high-level actions in a variety of environments, and (3) evaluates these actions via a variant of Monte Carlo Tree Search to find a viable solution to a particular problem. Our approach allows us to visualize intermediate motion goals and learn to plan complex activity from visual information; we use this to generate and visualize task plans on held-out examples of a block-stacking simulation.

ICRA 2019

Kapil D. Katyal; Katie M. Popek; Philippe M. Burlina

Efficient exploration through unknown environments remains a challenging problem for robotic systems. In these situations, the robot's ability to reason about its future motion is often severely limited by sensor field of view (FOV). By contrast, biological systems routinely make decisions by taking into consideration what might exist beyond their FOV based on prior experience. We present an approach for predicting occupancy map representations of sensor data for future robot motions using deep neural networks. We develop a custom loss function used to make accurate prediction while emphasizing physical boundaries. We further study extensions to our neural network architecture to account for uncertainty and ambiguity inherent in mapping and exploration. Finally, we demonstrate a combined map prediction and information-theoretic exploration strategy using the variance of the generated hypotheses as the heuristic for efficient exploration of unknown environments.
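The custom loss is described above only as emphasizing physical boundaries; one plausible instantiation (an assumption, not the paper's definition) treats occupied cells as a boundary proxy and up-weights them in a binary cross-entropy over the predicted occupancy grid:

```python
import math

def weighted_bce(pred, target, boundary_weight=5.0, eps=1e-7):
    """Binary cross-entropy over an occupancy grid, up-weighting cells the
    target marks as occupied (a proxy for physical boundaries).
    `pred` and `target` are flat lists of probabilities / {0, 1} labels.
    The weight value is illustrative, not taken from the paper."""
    total, norm = 0.0, 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        w = boundary_weight if t >= 0.5 else 1.0
        total += -w * (t * math.log(p) + (1 - t) * math.log(1 - p))
        norm += w
    return total / norm
```

Under this weighting, mispredicting an occupied (boundary) cell costs more than mispredicting a free cell, nudging the network toward crisp obstacle outlines.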

9th International IEEE EMBS Conference on Neural Engineering

Matthew S. Fifer

Transient muscle movements influence the temporal structure of myoelectric signal patterns, often leading to unstable prediction behavior from movement-pattern classification methods. We show that temporal convolutional network sequential models leverage the myoelectric signal’s history to discover contextual temporal features that aid in correctly predicting movement intentions, especially during interclass transitions. We demonstrate myoelectric classification using temporal convolutional networks to effect 3 simultaneous hand and wrist degrees-of-freedom in an experiment involving nine human subjects. Temporal convolutional networks yield significant (p < 0.001) performance improvements over other state-of-the-art methods in terms of both classification accuracy and stability.

Computers in biology and medicine

Neil J. Joshi; Philippe M. Burlina; Seth D. Billings

Lyme disease can lead to neurological, cardiac, and rheumatologic complications when untreated. Timely recognition of the erythema migrans rash of acute Lyme disease by patients and clinicians is crucial to early diagnosis and treatment. Our objective in this study was to develop deep learning approaches using deep convolutional neural networks for detecting acute Lyme disease from erythema migrans images of varying quality and acquisition conditions. This study used a cross-sectional dataset of images to train a model employing a deep convolutional neural network to perform classification of erythema migrans versus other skin conditions including tinea corporis and herpes zoster, and normal, non-pathogenic skin. Evaluation of the machine's ability to classify skin types was also performed on a validation set of images. Machine performance for detecting erythema migrans was further tested against a panel of non-medical humans. Online, publicly available images of both erythema migrans and non-Lyme confounding skin lesions were mined, and combined with erythema migrans images from an ongoing, longitudinal study of participants with acute Lyme disease enrolled in 2016 and 2017 who were recruited from primary and urgent care centers. The final dataset had 1834 images, including 1718 expert clinician-curated online images from unknown individuals with erythema migrans, tinea corporis, herpes zoster, and normal skin. It also included 116 images taken of 63 research participants from the Mid-Atlantic region. Two clinicians carefully annotated all lesion images. A convenience sample of 7 non-medically-trained humans was used as a panel to compare against machine performance. We calculated several performance metrics, including accuracy and Kappa (characterizing agreement with gold standard), as well as a receiver operating characteristic curve and associated area under the curve.
For detecting erythema migrans, the machine had an accuracy (95% confidence interval error margin) of 86.53% (2.70), ROCAUC of 0.9510 (0.0171) and Kappa of 0.7143. Our results suggested substantial agreement between machine and clinician criterion standard. Comparison of machine with non-medical expert human performance indicated that the machine almost always exceeded acceptable specificity, and could operate with higher sensitivity. This could have benefits for prescreening prior to physician referral, earlier treatment, and reductions in morbidity.

IEEE 53rd Asilomar Conference on Signals, Systems, and Computers

Raphael Norman-Tenazas; William R. Gray Roncal; Erik C. Johnson; Daniel Xenes; Luis M. Rodriguez

Neuroscientists are collecting Electron Microscopy (EM) datasets at increasingly faster rates. This modality offers an unprecedented map of brain structure at the resolution of individual neurons and their synaptic connections. Despite sophisticated image processing algorithms such as Flood Filling Networks, these huge datasets often require large amounts of hand-labeled data for algorithm training, followed by significant human proofreading. Many of these challenges are common across neuroscience modalities (and in other domains), but we use EM as a use case because the scale of this data emphasizes the opportunity and impact of rapidly transferring methods to new datasets. We investigate transfer learning for these workflows, exploring transfer to different regions within a dataset, between datasets from different species, and for datasets collected with different image acquisition techniques. For EM data, we investigate the impact of algorithm performance at different workflow stages. Finally, we assess the impact of candidate transfer learning strategies in environments with no training labels. This work provides a library of algorithms, pipelines, and baselines on established datasets. We enable rapid assessment and improvements to processing pipelines, and an opportunity to quickly and effectively analyze new datasets for the neuroscience community.

Human Brain Mapping

Clara A. Scholl

The grouping of sensory stimuli into categories is fundamental to cognition. Previous research in the visual and auditory systems supports a two-stage processing hierarchy that underlies perceptual categorization: (a) a “bottom-up” perceptual stage in sensory cortices where neurons show selectivity for stimulus features and (b) a “top-down” second stage in higher level cortical areas that categorizes the stimulus-selective input from the first stage. In order to test the hypothesis that the two-stage model applies to the somatosensory system, 14 human participants were trained to categorize vibrotactile stimuli presented to their right forearm. Then, during an fMRI scan, participants actively categorized the stimuli. Representational similarity analysis revealed stimulus selectivity in areas including the left precentral and postcentral gyri, the supramarginal gyrus, and the posterior middle temporal gyrus. Crucially, we identified a single category-selective region in the left ventral precentral gyrus. Furthermore, an estimation of directed functional connectivity delivered evidence for robust top-down connectivity from the second to first stage. These results support the validity of the two-stage model of perceptual categorization for the somatosensory system, suggesting common computational principles and a unified theory of perceptual categorization across the visual, auditory, and somatosensory systems.


Timothy C. Gion; William R. Gray Roncal; Robert T. Hider Jr.; Jordan K. Matelsky; Dean M. Kleissas; Brock A. Wester; Luis M. Rodriguez; Derek M. Pryor

Large volumetric neuroimaging datasets have grown in size over the past ten years from gigabytes to terabytes, with petascale data becoming available and more common over the next few years. Current approaches to store and analyze these emerging datasets are insufficient in their ability to scale in both cost-effectiveness and performance. Additionally, enabling large-scale processing and annotation is critical as these data grow too large for manual inspection. We provide a new cloud-native managed service for large and multi-modal experiments, with support for data ingest, storage, visualization, and sharing through a RESTful Application Programming Interface (API) and web-based user interface. Our project is open source and can be easily and cost-effectively used for a variety of modalities and applications.

Society for Neuroscience 2019

Morgan V. Schuyler; Elizabeth P. Reilly; Jordan K. Matelsky; William R. Gray Roncal

As larger neural circuit data becomes broadly available to the research community, researchers look to the brain to understand why humans perform certain tasks robustly and efficiently, and to identify the circuitry underlying some neurological disorders. In particular, discovery of repeated structure in large, newly collected brain image volumes would support the conjecture that the brain is modularly organized. At the same time, information extracted from brain imaging is inherently noisy due to errors manifested at all stages of the reconstruction process and the inability of humans to proofread or ground truth the vast amount of data available. Robust methods to analyze brain data could lead to the discovery of repeated brain structure, even in the presence of errors. We define a probabilistic approach to identify significant subgraph structures within imperfect graph data, allowing us to capture uncertainty in our discovery process and perform inference over noisy data. Our probabilistic approach uses graph data where edges are not binary, but rather have some confidence level, or weight, associated with them, as one might expect to obtain from a computer vision algorithm. While current methods often threshold the edges based on their weights, we instead use the edge weights to define a random graph model similar to an Erdős–Rényi model, but where the edges have varying probabilities based on the provided edge weight, thus creating a data-driven probabilistic graph. The intuition is that the true, underlying graph and small variations of it would occur with high probability in this model. Once the random graph model is defined, we use standard probabilistic graph techniques and sampling to determine the distribution of a subgraph occurring in this data-driven model. This distribution may then be compared with that from a standard Erdős–Rényi model with similar expected density using the Kolmogorov-Smirnov test.
In other words, we compare the existence of a subgraph in our data-driven random graph model with the existence of that subgraph in a purely random model. Thus, we work towards identifying structural motifs in the presence of unknown reconstruction errors. We apply our methods to a handful of small subgraphs for initial testing of this approach and compare to the results obtained using a thresholded graph.
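The data-driven sampling step described above can be sketched directly; the triangle serves as an example subgraph, edges are assumed stored as ordered node pairs, and the KS comparison against a matched Erdős–Rényi model is noted but not implemented here:

```python
import random
from itertools import combinations

def sample_graph(weights, rng):
    """Draw one graph from the data-driven model: edge (u, v) is included
    independently with probability equal to its weight in [0, 1]."""
    return {e for e, w in weights.items() if rng.random() < w}

def triangle_count(nodes, edges):
    """Count triangles, the example subgraph used in this sketch.
    Edges are assumed stored as ordered pairs (u, v) with u < v."""
    return sum(
        1 for a, b, c in combinations(sorted(nodes), 3)
        if {(a, b), (a, c), (b, c)} <= edges
    )

def subgraph_distribution(weights, nodes, n_samples=2000, seed=0):
    """Empirical distribution of subgraph counts under the weighted model.
    This distribution would then be compared, via a two-sample KS test,
    against the same counts from an Erdős–Rényi model of matching
    expected density (comparison not shown)."""
    rng = random.Random(seed)
    return [triangle_count(nodes, sample_graph(weights, rng))
            for _ in range(n_samples)]
```

Because each edge is an independent Bernoulli draw with probability equal to its weight, the true underlying graph and small variations of it dominate the sampled population, as the abstract's intuition suggests.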

Society for Neuroscience 2019

Raphael Norman-Tenazas; Erik C. Johnson; William R. Gray Roncal

Understanding the link between information processing in neural circuits and behavior remains a key goal in neuroscience. Network circuits extracted from the brain can be represented as a graph, where neurons are nodes and synapses are edges; attributes such as edge weights can provide additional context and information about the flow of information through the network. With the addition of dynamic neuronal models, these connectomes can be modeled as a system that takes sensory information as input and produces outputs that act on the surrounding environment. Simulating this dynamic model can help interpret experimental results and aid hypothesis development. Simulation can be done using a variety of tools with various levels of fidelity and objective functions (e.g., performance, low-level biological fidelity, concurrence with neuroscience theory). Given the extreme simplicity of the C. elegans nematode and the availability of its complete connectome, it is an optimal candidate for initial research discovery. Using simple neuron models, its entire nervous system can be readily simulated, and we have demonstrated this on low-level hardware. In a Python simulation environment, we investigate the exploration behavior of C. elegans by modifying the weights of its connectome with a genetic algorithm. We train the connectome to perform simple tasks in simulation using a simple integrate-and-fire neuron model and a simple kinematic model of an agent. Example tasks include following gradients, finding food, and avoiding collisions and noxious stimuli. We study our trained connectome under changing environmental conditions and investigate the effects of ablating neurotransmitter pathways on behavior.
This work has implications for low-complexity, bio-inspired robotic exploration algorithms which may be more robust than reinforcement learning methods using artificial neural networks.
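A minimal sketch of the two ingredients named above, under stated assumptions: a toy leaky integrate-and-fire update over a weighted connectome, and a GA-style weight mutation. The connectome layout, parameter values, and (omitted) fitness function are all hypothetical, not taken from the actual C. elegans model:

```python
import random

def step_lif(potentials, weights, inputs, threshold=1.0, leak=0.9):
    """One step of a leaky integrate-and-fire network over a weighted
    connectome. `weights[(i, j)]` is the synapse from neuron i to j.
    Returns updated potentials and the set of neurons that spiked."""
    spikes = {i for i, v in enumerate(potentials) if v >= threshold}
    # Spiking neurons reset; the rest leak toward zero.
    new = [0.0 if i in spikes else leak * v for i, v in enumerate(potentials)]
    for (i, j), w in weights.items():
        if i in spikes:
            new[j] += w  # propagate spike along the synapse
    return [v + x for v, x in zip(new, inputs)], spikes

def mutate(weights, rate=0.1, scale=0.2, rng=random):
    """Genetic-algorithm mutation: jitter a random fraction of synapse
    weights. A GA would evaluate each mutated connectome on a task
    (e.g., gradient following) and keep the fittest."""
    return {e: (w + rng.gauss(0.0, scale) if rng.random() < rate else w)
            for e, w in weights.items()}
```

The simulation loop alternates `step_lif` with a kinematic update of the agent, while the outer GA loop repeatedly applies `mutate` and selects on task fitness.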

Computers in Biology and Medicine

Neil J. Joshi; Seth D. Billings; Philippe M. Burlina; I-Jeng Wang

We address the challenge of finding anomalies in ultrasound images via deep learning, specifically applying this to screening for myopathies and finding rare presentations of myopathic disease. Among myopathic diseases, this study focuses on the use case of myositis given the spectrum of muscle involvement seen in these inflammatory muscle diseases, as well as the potential for treatment. For this study, we have developed a fully annotated dataset (called “Myositis3K”) which includes 3586 images of eighty-nine individuals (35 control and 54 with myositis) acquired with informed consent. We approach this challenge as one of performing unsupervised novelty detection (ND), and use tools leveraging deep embeddings combined with several novelty scoring methods. We evaluated these various ND algorithms and compared their performance against human clinician performance, against other methods including supervised binary classification approaches, and against unsupervised novelty detection approaches using generative methods. Our best performing approach resulted in a (ROC) AUC (and 95% CI error margin) of 0.7192 (0.0164), which is a promising baseline for developing future clinical tools for unsupervised prescreening of myopathies.

Do Good Robotics Symposium 2019

Edward W. Staley; Corban G. Rivera; Barton L. Paulhamus; Kapil D. Katyal

Assistive robotics holds the promise of bettering the lives of countless people throughout the world. As robots become more complex, the degrees-of-freedom for controlling robotic systems are rapidly outpacing the degrees-of-control that can be supplied by humans via conventional interfaces. In this paper, we describe a novel paradigm, amplified control, that strives to capture the adaptability of teleoperation while also leveraging the reduced user burden offered by shared control approaches. The novelty of this approach is that machine intelligence amplifies human intelligence for robotic control as opposed to replacing it, supplementing it, or augmenting it. If successful, our novel control paradigm will lower the barrier of entry (e.g., overcoming physical limitations and lessening cognitive load) for people to operate complex robotic systems for assistive robotics as well as other domains.


Marisa J. Hughes; Corban G. Rivera; William R. Gray Roncal; Erik C. Johnson; Dean M. Kleissas; Jordan K. Matelsky; Raphael Norman-Tenazas; Elizabeth P. Reilly; Luis M. Rodriguez; Brock A. Wester; Miller L. Wilt; Joseph T. Downs; Hannah P. Cowley; Nathan G. Drenkow

Emerging neuroimaging datasets (collected through modalities such as Electron Microscopy, Calcium Imaging, or X-ray Microtomography) describe the location and properties of neurons and their connections at unprecedented scale, promising new ways of understanding the brain. These modern imaging techniques used to interrogate the brain can quickly accumulate gigabytes to petabytes of structural brain imaging data. Unfortunately, many neuroscience laboratories lack the computational expertise or resources to work with datasets of this size: computer vision tools are often not portable or scalable, and there is considerable difficulty in reproducing results or extending methods. We developed an ecosystem of neuroimaging data analysis pipelines that utilize open source algorithms to create standardized modules and end-to-end optimized approaches. As exemplars we apply our tools to estimate synapse-level connectomes from electron microscopy data and cell distributions from X-ray microtomography data. To facilitate scientific discovery, we propose a generalized processing framework, that connects and extends existing open-source projects to provide large-scale data storage, reproducible algorithms, and workflow execution engines. Our accessible methods and pipelines demonstrate that approaches across multiple neuroimaging experiments can be standardized and applied to diverse datasets. The techniques developed are demonstrated on neuroimaging datasets, but may be applied to similar problems in other domains.

WMT 2018

Paul McNamee

To better understand the effectiveness of continued training, we analyze the major components of a neural machine translation system (the encoder, decoder, and each embedding space) and consider each component's contribution to, and capacity for, domain adaptation. We find that freezing any single component during continued training has minimal impact on performance, and that performance is surprisingly good when a single component is adapted while holding the rest of the model fixed. We also find that continued training does not move the model very far from the out-of-domain model, compared to a sensitivity analysis metric, suggesting that the out-of-domain model can provide a good generic initialization for the new domain.

Journal of Medical Imaging

Mehran Armand

Reproducibly achieving proper implant alignment is a critical step in total hip arthroplasty procedures that has been shown to substantially affect patient outcome. In current practice, correct alignment of the acetabular cup is verified in C-arm x-ray images that are acquired in an anterior–posterior (AP) view. Favorable surgical outcome is, therefore, heavily dependent on the surgeon’s experience in understanding the 3-D orientation of a hemispheric implant from 2-D AP projection images. This work proposes an easy to use intraoperative component planning system based on two C-arm x-ray images that are combined with 3-D augmented reality (AR) visualization that simplifies impactor and cup placement according to the planning by providing a real-time RGBD data overlay. We evaluate the feasibility of our system in a user study comprising four orthopedic surgeons at the Johns Hopkins Hospital and report errors in translation, anteversion, and abduction as low as 1.98 mm, 1.10 deg, and 0.53 deg, respectively. The promising performance of this AR solution shows that deploying this system could eliminate the need for excessive radiation, simplify the intervention, and enable reproducibly accurate placement of acetabular implants.


Mehran Armand

X-ray image guidance enables percutaneous alternatives to complex procedures. Unfortunately, the indirect view onto the anatomy in addition to projective simplification substantially increases the task load for the surgeon. Additional 3D information such as knowledge of anatomical landmarks can benefit surgical decision making in complicated scenarios. Automatic detection of these landmarks in transmission imaging is challenging since image-domain features characteristic of a certain landmark change substantially depending on the viewing direction. Consequently, and to the best of our knowledge, the above problem has not yet been addressed. In this work, we present a method to automatically detect anatomical landmarks in X-ray images independent of the viewing direction. To this end, a sequential prediction framework based on convolutional layers is trained on synthetically generated data of the pelvic anatomy to predict 23 landmarks in single X-ray images. View independence is contingent on training conditions and, here, is achieved on a spherical segment covering 120° × 90° in LAO/RAO and CRAN/CAUD, respectively, centered around AP. On synthetic data, the proposed approach achieves a mean prediction error of 5.6 ± 4.5 mm. We demonstrate that the proposed network is immediately applicable to clinically acquired data of the pelvis. In particular, we show that our intra-operative landmark detection together with pre-operative CT enables X-ray pose estimation which, ultimately, benefits initialization of image-based 2D/3D registration.


Mehran Armand

In percutaneous orthopedic interventions, the surgeon attempts to reduce and fixate fractures in bony structures. The complexity of these interventions arises when the surgeon performs the challenging task of navigating surgical tools percutaneously only under the guidance of 2D interventional X-ray imaging. Moreover, the intra-operatively acquired data is only visualized indirectly on external displays. In this work, we propose a flexible Augmented Reality (AR) paradigm using optical see-through head mounted displays. The key technical contribution of this work is a marker-less and dynamic tracking concept that closes the calibration loop between the patient, the C-arm, and the surgeon. This calibration is enabled using Simultaneous Localization and Mapping of the environment, i.e. the operating theater. As a result, the proposed solution provides in situ visualization of pre- and intra-operative 3D medical data directly at the surgical site. We demonstrate pre-clinical evaluation of a prototype system, and report errors for calibration and target registration. Finally, we demonstrate the usefulness of the proposed inside-out tracking system in achieving the “bull’s eye” view for C-arm-guided punctures. This AR solution provides an intuitive visualization of the anatomy and can simplify the hand-eye coordination for the orthopedic surgeon.

IEEE Robotics and Automation Letters

Rachel A. Hegeman; Mehran Armand

We present a generic data-driven method to address the problem of manipulating a three-dimensional (3-D) compliant object (CO) with heterogeneous physical properties in the presence of unknown disturbances. In this study, we do not assume prior knowledge of the deformation behavior of the CO or the type of the disturbance (e.g., internal or external). We also do not impose any constraints on the CO's physical properties (e.g., shape, mass, and stiffness). The proposed optimal iterative algorithm incorporates the provided visual feedback data to simultaneously learn and estimate the deformation behavior of the CO in order to accomplish the desired control objective. To demonstrate the capabilities and robustness of our algorithm, we fabricated two novel heterogeneous compliant phantoms and performed experiments on the da Vinci Research Kit. Experimental results demonstrated the adaptivity, robustness, and accuracy of the proposed method and, therefore, its suitability for a variety of medical and industrial applications involving CO manipulation.

2018 International Symposium on Medical Robotics

Mehran Armand

Fiber Bragg Grating (FBG) sensors have shown great potential in shape and force sensing of continuum manipulators (CM) and biopsy needles. In recent years, many researchers have studied different manufacturing and modeling techniques of FBG-based force and shape sensors for medical applications. These studies mainly focus on obtaining shape and force information in a static (or quasi-static) environment. In this paper, however, we study and evaluate dynamic environments where the FBG data is affected by vibration caused by a harmonic force, e.g., a rotational debriding tool harmonically exciting the CM and the FBG-based shape sensor. In such situations, appropriate pre-processing of the FBG signal is necessary in order to infer correct information from the raw signal. We look at an example of such dynamic environments in the less invasive treatment of osteolysis by studying the FBG data in both the time and frequency domains in the presence of vibration due to a debriding tool rotating inside the lumen of the CM.
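The need for such pre-processing can be illustrated with a toy example. The sketch below is not the authors' pipeline, and the sampling rate, vibration frequency, and amplitudes are invented; it shows how a simple moving-average low-pass filter, with its window matched to the vibration period, recovers a quasi-static drift from an FBG-like signal contaminated by a harmonic tool vibration:

```python
import math

def moving_average(signal, window):
    """Simple low-pass filter: average over a sliding window to suppress
    a harmonic vibration riding on a slowly varying component."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out

# Synthetic FBG reading: slow bending drift plus 50 Hz tool vibration.
fs = 1000.0                                  # sampling rate (Hz), invented
t = [i / fs for i in range(1000)]
drift = [0.5 * ti for ti in t]               # quasi-static component
vib = [0.2 * math.sin(2 * math.pi * 50.0 * ti) for ti in t]
raw = [d + v for d, v in zip(drift, vib)]

# Window of 20 samples spans one full 50 Hz period at 1 kHz, so the
# sinusoid averages out while the (locally linear) drift is preserved.
clean = moving_average(raw, window=20)
```

In a real system one would more likely inspect the FFT of the raw signal to locate the tool's excitation frequency before choosing a filter; the moving average here simply makes the time-domain effect visible.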

International Journal of Computer Assisted Radiology and Surgery

Mehran Armand; Ryan J. Murphy

Background: Periacetabular osteotomy (PAO) is the treatment of choice for younger patients with developmental hip dysplasia. The procedure aims to normalize the joint configuration, reduce the peak-pressure, and delay the development of osteoarthritis. The procedure is technically demanding and no previous study has validated the use of computer navigation with a minimally invasive transsartorial approach. Methods: Computer-assisted PAO was performed on ten patients. Patients underwent pre- and postoperative computed tomography (CT) scanning with a standardized protocol. Preoperative preparation consisted of outlining the lunate surface and segmenting the pelvis and femur from CT data. The Biomechanical Guidance System was used intra-operatively to automatically calculate diagnostic angles and peak-pressure measurements. Manual diagnostic angle measurements were performed based on pre- and postoperative CT. Differences in angle measurements were investigated with summary statistics, intraclass correlation coefficient, and Bland–Altman plots. The percentage postoperative change in peak-pressure was calculated. Results: Intra-operatively reported angle measurements show good agreement with manual angle measurements, with intraclass correlation coefficients between 0.94 and 0.98. Computer-navigation-reported angle measurements were significantly higher for the posterior sector angle (1.65°, p = 0.001) and the acetabular anteversion angle (1.24°, p = 0.004). No significant difference was found for the center-edge (p = 0.056), acetabular index (p = 0.212), and anterior sector angle (p = 0.452). Peak-pressure after PAO decreased by a mean of 13% and was significantly different (p = 0.008). Conclusions: We found that computer navigation can reliably be used with a minimally invasive transsartorial approach to PAO. Angle measurements generally agree with manual measurements and peak-pressure was shown to decrease postoperatively.
With further development, the system will become a valuable tool in the operating room for both experienced and less experienced surgeons performing PAO. Further studies with a larger cohort and follow-up will allow us to investigate the association with peak-pressure and postoperative outcome and pave the way to clinical introduction.


Mehran Armand

In unilateral pelvic fracture reductions, surgeons attempt to reconstruct the bone fragments such that bilateral symmetry in the bony anatomy is restored. We propose to exploit this “structurally symmetric” nature of the pelvic bone, and provide intra-operative image augmentation to assist the surgeon in repairing dislocated fragments. The main challenge is to automatically estimate the desired plane of symmetry within the patient’s pre-operative CT. We propose to estimate this plane using a non-linear optimization strategy, by minimizing Tukey’s biweight robust estimator, relying on the partial symmetry of the anatomy. Moreover, a regularization term is designed to enforce the similarity of bone density histograms on both sides of this plane, relying on the biological fact that, even if injured, the dislocated bone segments remain within the body. The experimental results demonstrate the performance of the proposed method in estimating this “plane of partial symmetry” using CT images of both healthy and injured anatomy. Examples of unilateral pelvic fractures are used to show how intra-operative X-ray images could be augmented with the forward-projections of the mirrored anatomy, acting as an objective road map for fracture reduction procedures.
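Tukey's biweight estimator mentioned above has a standard closed form; the sketch below (the tuning constant c = 4.685 is the conventional default for this loss, not necessarily the value used in the paper) shows why it is robust: the loss grows quadratically for small residuals but saturates for large ones, so voxels from dislocated fragments stop pulling on the symmetry-plane estimate:

```python
def tukey_biweight(r, c=4.685):
    """Tukey's biweight (bisquare) robust loss.

    Near zero it behaves like the quadratic r**2 / 2, but for |r| > c it
    saturates at the constant c**2 / 6, so gross outliers contribute a
    bounded cost no matter how large their residual is."""
    if abs(r) <= c:
        u = (r / c) ** 2
        return (c ** 2 / 6.0) * (1.0 - (1.0 - u) ** 3)
    return c ** 2 / 6.0

inlier_cost = tukey_biweight(0.5)    # grows roughly like r**2 / 2
outlier_cost = tukey_biweight(50.0)  # saturates at c**2 / 6
```

Minimizing a sum of these losses over candidate planes therefore downweights exactly the asymmetric (fractured) regions the method needs to ignore.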

2018 IEEE Integrated STEM Education Conference (ISEC)

Maria D. Roncal; Tammy A. Kolarik; Liem C. Huynh; Karla M. Gray Roncal; Mary Ann M. Saunders; William R. Gray Roncal

Data show that many high school students, especially those from underserved or underrepresented backgrounds, are unsuccessful in achieving a four-year college degree, particularly in Science, Technology, Engineering, and Math (STEM) careers. These young scholars often have strong potential and ambitious dreams, but face significant structural barriers in achieving their goals, resulting in substantial opportunity and knowledge gaps. We have developed a novel approach to help these students prepare for college admissions and achieve college success through intensive mentoring by STEM professionals and support through technology innovations. We developed a comprehensive, competency-based curriculum that includes academics, application preparation, essay development, financial aid, test preparation and college visits. Students complete a capstone portfolio project at the end of the summer intervention, and we continue to provide longitudinal support throughout their high school and college years. Over the past nine summers we implemented our model with an all-volunteer staff to achieve significant, measurable results: 168 students have participated in the program; 98% of surveyed program alumni are on-track to earning their four-year college degrees; and the majority plan to earn degrees in STEM and to pursue a graduate degree. Our outcomes significantly outperform both students from similar backgrounds and the overall student population. In this manuscript, we provide an overview of our model, insights into how this approach may be extended to other communities and suggestions for evaluating program efficacy in a data-driven framework.

Annals of Biomedical Engineering

Mehran Armand

We present a novel semi-autonomous clinician-in-the-loop strategy to perform the laparoscopic cryoablation of small kidney tumors. To this end, we introduce a model-independent bimanual tissue manipulation technique. In this method, instead of controlling the robot, which inserts and steers the needle in the deformable tissue (DT), the cryoprobe is introduced to the tissue after accurate manipulation of a target point on the DT to the desired predefined insertion location of the probe. This technique can potentially reduce the risk of kidney fracture, which occurs due to the incorrect insertion of the probe within the kidney. The main challenge of this technique, however, is the unknown deformation behavior of the tissue during its manipulation. To tackle this issue, we proposed a novel real-time deformation estimation method and a vision-based optimization framework, which do not require prior knowledge about the tissue deformation and the intrinsic/extrinsic parameters of the vision system. To evaluate the performance of the proposed method using the da Vinci Research Kit, we performed experiments on a deformable phantom and an ex vivo lamb kidney and evaluated our method using novel manipulability measures. Experiments demonstrated successful real-time estimation of the deformation behavior of these DTs while manipulating them to the desired insertion location(s).

SPIE Medical Imaging

Mehran Armand

Proper implant alignment is a critical step in total hip arthroplasty (THA) procedures. In current practice, correct alignment of the acetabular cup is verified in C-arm X-ray images that are acquired in an anterior–posterior (AP) view. Favorable surgical outcome is, therefore, heavily dependent on the surgeon’s experience in understanding the 3D orientation of a hemispheric implant from 2D AP projection images. This work proposes an easy-to-use intra-operative component planning system based on two C-arm X-ray images that are combined with 3D augmented reality (AR) visualization, which simplifies impactor and cup placement according to the planning by providing a real-time RGBD data overlay. We evaluate the feasibility of our system in a user study comprising four orthopedic surgeons at the Johns Hopkins Hospital, and also report errors in translation, anteversion, and abduction as low as 1.98 mm, 1.10°, and 0.53°, respectively. The promising performance of this AR solution shows that deploying this system could eliminate the need for excessive radiation, simplify the intervention, and enable reproducibly accurate placement of acetabular implants.


Mehran Armand

Machine learning-based approaches outperform competing methods in most disciplines relevant to diagnostic radiology. Interventional radiology, however, has not yet benefited substantially from the advent of deep learning, in particular because of two reasons: (1) Most images acquired during the procedure are never archived and are thus not available for learning, and (2) even if they were available, annotations would be a severe challenge due to the vast amounts of data. When considering fluoroscopy-guided procedures, an interesting alternative to true interventional fluoroscopy is in silico simulation of the procedure from 3D diagnostic CT. In this case, labeling is comparably easy and potentially readily available, yet, the appropriateness of resulting synthetic data is dependent on the forward model. In this work, we propose DeepDRR, a framework for fast and realistic simulation of fluoroscopy and digital radiography from CT scans, tightly integrated with the software platforms native to deep learning. We use machine learning for material decomposition and scatter estimation in 3D and 2D, respectively, combined with analytic forward projection and noise injection to achieve the required performance. On the example of anatomical landmark detection in X-ray images of the pelvis, we demonstrate that machine learning models trained on DeepDRRs generalize to unseen clinically acquired data without the need for re-training or domain adaptation. Our results are promising and promote the establishment of machine learning in fluoroscopy-guided procedures.
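The analytic forward-projection step at the heart of a DRR can be illustrated with a toy parallel-beam version (DeepDRR itself uses cone-beam geometry with material-dependent attenuation, learned scatter, and noise injection; everything below is a simplified, hypothetical sketch): each detector pixel integrates attenuation along its ray and applies the Beer–Lambert law:

```python
import math

def parallel_drr(volume):
    """Toy digitally reconstructed radiograph.

    volume: nested list indexed [z][y][x] of attenuation coefficients
    (per-voxel units). Each parallel ray runs along z; the detected
    intensity is exp(-line integral) by the Beer-Lambert law."""
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    drr = [[0.0] * nx for _ in range(ny)]
    for y in range(ny):
        for x in range(nx):
            path = sum(volume[z][y][x] for z in range(nz))  # ray integral
            drr[y][x] = math.exp(-path)                     # detected intensity
    return drr

# Toy 3x2x2 volume: uniform soft tissue with one dense "bone" voxel.
mu_tissue, mu_bone = 0.1, 1.0
volume = [[[mu_tissue] * 2 for _ in range(2)] for _ in range(3)]
volume[1][0][0] = mu_bone          # dense voxel on the (y=0, x=0) ray
image = parallel_drr(volume)       # bone ray comes out darker (lower intensity)
```

Training labels (e.g., projected landmark positions) come "for free" in such a simulation because the 3D geometry that generated each pixel is known exactly.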

SPIE Medical Imaging

Mehran Armand

Pre-operative CT data is available for several orthopedic and trauma interventions, and is mainly used to identify injuries and plan the surgical procedure. In this work we propose an intuitive augmented reality environment allowing visualization of pre-operative data during the intervention, with an overlay of the optical information from the surgical site. The pre-operative CT volume is first registered to the patient by acquiring a single C-arm X-ray image and using 3D/2D intensity-based registration. Next, we use an RGBD sensor on the C-arm to fuse the optical information of the surgical site with patient pre-operative medical data and provide an augmented reality environment. The 3D/2D registration of the pre- and intra-operative data allows us to maintain a correct visualization each time the C-arm is repositioned or the patient moves. An overall mean target registration error (mTRE) and standard deviation of 5.24 ± 3.09 mm was measured averaged over 19 C-arm poses. The proposed solution enables the surgeon to visualize pre-operative data overlaid with information from the surgical site (e.g. surgeon’s hands, surgical tools, etc.) for any C-arm pose, and negates issues of line-of-sight and long setup times, which are present in commercially available systems.

Investigative Ophthalmology & Visual Science

Kapil D. Katyal; Jason W. Harper; Paul E. Rosendall

Purpose: Visual scanning by sighted individuals is done using eye and head movements. In contrast, scanning using the Argus II is solely done by head movement, since eye movements can introduce localization errors. Here, we tested if a scanning mode utilizing eye movements increases visual stability and reduces head movements in Argus II users. Methods: Eye positions were measured in real-time and were used to shift the region of interest (ROI) that is sent to the implant within the wide field of view (FOV) of the scene camera. Participants were able to use combined eye-head scanning: shifting the camera by moving their head and shifting the ROI within the FOV by eye movement. Eight blind individuals implanted with the Argus II retinal prosthesis participated in the study. A white target appeared on a touchscreen monitor and the participants were instructed to report the location of the target by touching the monitor. We compared the spread of the responses, the time to complete the task, and the amount of head movements between combined eye-head and head-only scanning. Results: All participants benefited from the combined eye-head scanning mode. Better precision (i.e., narrower spread of the perceived location) was observed in six out of eight participants. Seven of eight participants were able to adopt a scanning strategy that enabled them to perform the task with significantly less head movement. Conclusions: Integrating an eye tracker into the Argus II is feasible, reduces head movements in a seated localization task, and improves pointing precision.

The Journal of Arthroplasty

Mehran Armand

Background: Laxity of soft tissues after total hip arthroplasty is considered to be a cause of accelerated wear of bearing surfaces and dislocation. The purpose of this study is to assess the contribution of the anterior and posterior capsular ligamentous complexes and the short external rotators, except the quadratus femoris, on the stability of the hip against axial traction. Methods: The study subjects comprised 7 fresh cadavers with 12 normal hip joints. In 6 hips, soft tissues surrounding the hip joint were resected in the following order to simulate the anterior approach: anterior capsule, posterior capsule, piriformis, conjoined tendon, and external obturator. In the remaining 6 hips, soft tissues were resected in the following order to simulate the posterior approach: piriformis, conjoined tendon, external obturator, posterior capsule, and anterior capsule. Soft tissue tension was measured by applying traction amounting to 250 N with joints in the neutral position. Results: The separation distance between the femoral head and acetabulum during axial leg traction significantly increased from 4.0 to 14.5 mm on average after circumferential resection of the capsule via the anterior approach. Subsequent resection of the short external rotators increased the separation distance up to 19.0 mm, but the differences did not reach statistical significance. Resection of the short external rotators via the posterior approach did not significantly increase the separation distance; it significantly increased from 6.0 to 11.4 mm after the resection of the anterior capsule and further to 20.5 mm after the resection of the posterior capsule. Conclusion: The posterior capsule, in addition to the anterior capsule, significantly contributes to hip joint stability in distraction regardless of whether the short external rotators, except the quadratus femoris, were preserved or resected.

Proceedings of ACL 2018, System Demonstrations

Cash J. Costello; Paul McNamee; James C. Mayfield

We demonstrate two annotation platforms that allow an English speaker to annotate names for any language without knowing the language. These platforms provided high-quality “silver standard” annotations for low-resource language name taggers (Zhang et al., 2017) that achieved state-of-the-art performance on two surprise languages (Oromo and Tigrinya) at LoreHLT2017 and ten languages at TAC-KBP EDL2017 (Ji et al., 2017). We discuss strengths and limitations and compare other methods of creating silver- and gold-standard annotations using native speakers. We will make our tools publicly available for research use.

Nature Medicine

Philippe M. Burlina

An artificial intelligence (AI) using a deep-learning approach can classify retinal images from optical coherence tomography for early diagnosis of retinal diseases and has the potential to be used in other image-based medical diagnoses.


Christopher R. Ratto; Michael J. Pekala; Neil M. Fendley; I-Jeng Wang

This paper considers attacks against machine learning algorithms used in remote sensing applications, a domain that presents a suite of challenges that are not fully addressed by current research focused on natural image data such as ImageNet. In particular, we present a new study of adversarial examples in the context of satellite image classification problems. Using a recently curated data set and associated classifier, we provide a preliminary analysis of adversarial examples in settings where the targeted classifier is permitted multiple observations of the same location over time. While our experiments to date are purely digital, our problem setup explicitly incorporates a number of practical considerations that a real-world attacker would need to take into account when mounting a physical attack. We hope this work provides a useful starting point for future studies of potential vulnerabilities in this setting.
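The abstract does not specify the attack construction, but the classic fast gradient sign method (FGSM) conveys the core idea of adversarial examples on the simplest possible model, a linear classifier, where the score's gradient with respect to the input is the weight vector itself (all weights, inputs, and the exaggerated epsilon below are invented for illustration):

```python
def predict(w, b, x):
    """Linear classifier: class 1 if the score w . x + b is positive."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

def fgsm_linear(w, x, y, eps):
    """Fast-gradient-sign perturbation for a linear scorer.

    The gradient of the score w.r.t. the input is w, so each component
    is nudged by eps in the direction that pushes the score away from
    the true class y."""
    direction = -1.0 if y == 1 else 1.0       # lower the score if truly class 1
    return [xi + direction * eps * ((wi > 0) - (wi < 0))
            for wi, xi in zip(w, x)]

# A clean input classified as 1 is flipped by a small uniform perturbation
# (eps = 1.0 is exaggerated here so the effect is obvious by hand).
w, b = [0.5, -0.3, 0.8], 0.0
x = [1.0, 1.0, 1.0]                           # score = 1.0 -> class 1
x_adv = fgsm_linear(w, x, y=1, eps=1.0)
```

For deep networks the same recipe applies with the loss gradient obtained by backpropagation; the temporal dimension studied in the paper (multiple observations of the same location) adds constraints a physical attacker must satisfy across revisits.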

Ryan N. Mukherjee; Neil M. Fendley; Gordon A. Christie

In this document, we provide: (1) full acknowledgments; (2) descriptions and distributions of metadata features; (3) additional collection details; (4) additional results; and (5) examples from our dataset.


Joseph L. Moore; Kevin C. Wolfe; Katie M. Popek; Christopher J. Paxton; Philippe M. Burlina; Kapil D. Katyal

Fast, collision-free motion through unknown environments remains a challenging problem for robotic systems. In these situations, the robot's ability to reason about its future motion is often severely limited by sensor field of view (FOV). By contrast, biological systems routinely make decisions by taking into consideration what might exist beyond their FOV based on prior experience. In this paper, we present an approach for predicting occupancy map representations of sensor data for future robot motions using deep neural networks. We evaluate several deep network architectures, including purely generative and adversarial models. Testing on both simulated and real environments, we demonstrated performance both qualitatively and quantitatively, with an SSIM similarity measure of up to 0.899. We showed that it is possible to make predictions about occupied space beyond the physical robot's FOV from simulated training data. In the future, this method will allow robots to navigate through unknown environments in a faster, safer manner.
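The SSIM measure used above to score predicted against actual occupancy maps can be sketched in a simplified single-window form (real SSIM averages the statistic over sliding local windows; the constants k1, k2 are the conventional defaults and the toy occupancy vectors are invented):

```python
def ssim_global(a, b, L=1.0, k1=0.01, k2=0.03):
    """Single-window SSIM between two equal-length image vectors.

    Compares luminance (means), contrast (variances), and structure
    (covariance) in one combined ratio; identical inputs score 1.0."""
    n = len(a)
    mu_a, mu_b = sum(a) / n, sum(b) / n
    var_a = sum((x - mu_a) ** 2 for x in a) / (n - 1)
    var_b = sum((x - mu_b) ** 2 for x in b) / (n - 1)
    cov = sum((x - mu_a) * (y - mu_b) for x, y in zip(a, b)) / (n - 1)
    c1, c2 = (k1 * L) ** 2, (k2 * L) ** 2   # stabilizers for flat regions
    return ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / \
           ((mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2))

truth = [0.0, 1.0, 1.0, 0.0, 1.0, 0.0]   # "actual" occupancy cells (toy)
pred = [0.1, 0.9, 1.0, 0.0, 0.8, 0.2]    # predicted occupancy (toy)
score = ssim_global(truth, pred)
```

Unlike per-cell error rates, SSIM rewards predictions that reproduce the spatial structure of the occupancy map, which is why it suits this evaluation.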

U.S. Patent

Kevin C. Wolfe; Derek M. Rollend; Matthew P. Para; Dean M. Kleissas; Kapil D. Katyal; Philippe M. Burlina; Seth D. Billings

An apparatus for improving performance of a retinal implant may include processing circuitry. The processing circuitry may be configured to receive image data corresponding to a camera field of view, determine whether a particular object is detected within the camera field of view, perform image data processing to enable a representation of a portion of the image data corresponding to an implant field of view to be provided on a retinal implant where the implant field of view is smaller than the camera field of view, and, responsive to the particular object being located outside the implant field of view, provide a directional indicator in the implant field of view to indicate a location of the particular object relative to the implant field of view.

JAMA Ophthalmology, Dec 2018

Neil J. Joshi; David E. Freund; Philippe M. Burlina

This study uses fundus images from a national data set to assess 2 deep learning methods for referability classification of age-related macular degeneration.

JAMA Ophthalmology, Dec 2018

Neil J. Joshi; Philippe M. Burlina

Although deep learning (DL) can identify the intermediate or advanced stages of age-related macular degeneration (AMD) as a binary yes or no, stratified gradings using the more granular Age-Related Eye Disease Study (AREDS) 9-step detailed severity scale for AMD provide a more precise estimate of 5-year progression to advanced stages. The complexity of the AREDS 9-step detailed scale, and its reliance on highly trained fundus photograph graders, has hampered its clinical use and motivated an alternate AREDS simple scale, which, although valuable, has less predictive ability.

2018 IEEE Integrated STEM Education Conference (ISEC)

Jordan K. Matelsky; Nathan G. Drenkow; Joseph T. Downs; Caitlyn A. Bishop; William R. Gray Roncal; Brock A. Wester

Programs that focus on student outreach are often disjoint from sponsored research efforts, despite the mutually beneficial opportunities that are possible with a combined approach. We designed and piloted a program to simultaneously meet the needs of underserved students and a large-scale sponsored research goal. Our program trained undergraduates to produce neuron maps for a major connectomics effort (i.e., single synapse brain maps), while providing these students with the resources and mentors to conduct novel research. Students were recruited from Johns Hopkins University to participate in a ten-week summer program. These students were trained in computational research, scientific communication skills and methods to map electron microscopy volumes. The students also had regular exposure to mentors and opportunities for guided, small group, independent discovery. A Learning-for-Use model was leveraged to provide the students with the tools, skills, and knowledge to pursue their research questions, while an Affinity Research Group model was adapted to provide students with mentorship in conducting cutting-edge research. A focus was placed on recruiting students who had limited opportunities and access to similar experiences. Program metrics demonstrated a substantial increase in knowledge (e.g., neuroscience, graph theory, machine learning, and scientific communication), while students also showed an overall increase in awareness and responsiveness to computational research after the program. Ultimately, the program positively impacted students' career choices and research readiness, and successfully achieved sponsor goals in a compact timeframe. This framework for combining outreach with sponsored research can be broadly leveraged for other programs across domains.

Medical Physics

Mehran Armand

Purpose: Cone-beam computed tomography (CBCT) is one of the primary imaging modalities in radiation therapy, dentistry, and orthopedic interventions. While CBCT provides crucial intraoperative information, it is bounded by a limited imaging volume, resulting in reduced effectiveness. This paper introduces an approach allowing real-time intraoperative stitching of overlapping and nonoverlapping CBCT volumes to enable 3D measurements on large anatomical structures. Methods: A CBCT-capable mobile C-arm is augmented with a red-green-blue-depth (RGBD) camera. An offline cocalibration of the two imaging modalities results in coregistered video, infrared, and x-ray views of the surgical scene. Then, automatic stitching of multiple small, nonoverlapping CBCT volumes is possible by recovering the relative motion of the C-arm with respect to the patient based on the camera observations. We propose three methods to recover the relative pose: RGB-based tracking of visual markers that are placed near the surgical site, RGBD-based simultaneous localization and mapping (SLAM) of the surgical scene which incorporates both color and depth information for pose estimation, and surface tracking of the patient using only depth data provided by the RGBD sensor. Results: On an animal cadaver, we show stitching errors as low as 0.33, 0.91, and 1.72 mm when the visual marker, RGBD SLAM, and surface data are used for tracking, respectively. Conclusions: The proposed method overcomes one of the major limitations of CBCT C-arm systems by integrating vision-based tracking and expanding the imaging volume without any intraoperative use of calibration grids or external tracking systems. We believe this solution to be most appropriate for 3D intraoperative verification of several orthopedic procedures. © 2018 American Association of Physicists in Medicine

IEEE Robotics and Automation Letters

Galen E. Mullins

To properly evaluate the ability of robots to operate autonomously in the real world, it is necessary to develop methods for quantifying their self-righting capabilities. Here, we improve upon a sampling-based framework for evaluating self-righting capabilities that was previously validated in two dimensions. To apply this framework to realistic robots in three dimensions, we require algorithms capable of scaling to high-dimensional configuration spaces. Therefore, we introduce a novel adaptive sampling approach that biases queries toward the transitional states of the system, thus, identifying the critical transitions of the system using substantially fewer samples. To demonstrate this improvement, we compare our approach to results that were generated via the previous framework and were validated on hardware platforms. Finally, we apply our technique to a high-fidelity three-dimensional model of a US Navy bomb-defusing robot, which was too complex for the previous framework to analyze.

Nature Methods

Jordan K. Matelsky; Priya J. Manavalan; Robert T. Hider Jr.; William R. Gray Roncal; Timothy C. Gion; Mark A. Chevillet; Brock A. Wester; Derek M. Pryor

Recent technological developments, such as high-throughput imaging and sequencing, enable experimentalists to collect increasingly large, complex, and heterogeneous ‘big’ data. These studies result in terabytes of data per day, yielding petabytes across experiments and laboratories. These experimental capabilities exceed the scale or feature set of existing software. For example, such data cannot be stored, processed, and visualized on a laptop or workstation. Instead, big data must be stored on data centers and processed on high-performance compute clusters. In 2011, we launched the Open Connectome Project, an open-access data repository powered by open-source web-services software applications that store, analyze, and visualize large imaging datasets. However, as technology changed, features were added, and scale increased, our academic development team and resources became overwhelmed. We overhauled our custom stack into a community-built and -maintained software ecosystem deployed in the commercial cloud, integrating multiple open-source projects and extending them for our needs. The ecosystem makes it possible to analyze disparate datasets by reusing components originally designed for other applications.

Frontiers in Neuroinformatics

Elizabeth P. Reilly; Jeffrey S. Garretson; William R. Gray Roncal; Dean M. Kleissas; Matthew J. Roos; Mark A. Chevillet; Brock A. Wester

Neuroscientists are actively pursuing high-precision maps, or graphs consisting of networks of neurons and connecting synapses in mammalian and non-mammalian brains. Such graphs, when coupled with physiological and behavioral data, are likely to facilitate greater understanding of how circuits in these networks give rise to complex information processing capabilities. Given that the automated or semi-automated methods required to achieve the acquisition of these graphs are still evolving, we developed a metric for measuring the performance of such methods by comparing their output with those generated by human annotators (“ground truth” data). Whereas classic metrics for comparing annotated neural tissue reconstructions generally do so at the voxel level, the metric proposed here measures the “integrity” of neurons based on the degree to which a collection of synaptic terminals belonging to a single neuron of the reconstruction can be matched to those of a single neuron in the ground truth data. The metric is largely insensitive to small errors in segmentation and more directly measures accuracy of the generated brain graph. It is our hope that use of the metric will facilitate the broader community's efforts to improve upon existing methods for acquiring brain graphs. Herein we describe the metric in detail, provide demonstrative examples of the intuitive scores it generates, and apply it to a synthesized neural network with simulated reconstruction errors. Demonstration code is available.
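A much-simplified sketch of the matching idea (not the authors' exact metric): treat each neuron as a set of synapse identifiers and score a reconstructed neuron by the fraction of its terminals that map onto its single best-matching ground-truth neuron:

```python
def integrity(recon_synapses, truth_neurons):
    """Best-match integrity of one reconstructed neuron.

    recon_synapses: set of synapse IDs assigned to the reconstructed neuron.
    truth_neurons: list of synapse-ID sets, one per ground-truth neuron.
    Returns the fraction of the reconstruction's terminals covered by a
    single ground-truth neuron (simplified sketch of the metric above)."""
    if not recon_synapses:
        return 0.0
    best = max(len(recon_synapses & t) for t in truth_neurons)
    return best / len(recon_synapses)

# A merge error fuses terminals from two true neurons into one reconstruction.
truth = [{1, 2, 3}, {4, 5}]
merged = {1, 2, 3, 4}
score = integrity(merged, truth)   # best single-neuron match covers 3 of 4
```

A merge error that fuses two true neurons lowers this score in proportion to the misassigned terminals, whereas a voxel-overlap metric might barely register the same mistake; small boundary errors in segmentation leave the score unchanged, matching the insensitivity property described above.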


Ashley J. Llorens; Ryan W. Gardner; Jared J. Markowitz

This paper provides a complexity analysis for the game of reconnaissance blind chess (RBC), a recently-introduced variant of chess where each player does not know the positions of the opponent's pieces a priori but may reveal a subset of them through chosen, private sensing actions. In contrast to many commonly studied imperfect information games like poker, an RBC player does not know what the opponent knows or has chosen to learn, exponentially expanding the size of the game's information sets (i.e., the number of possible game states that are consistent with what a player has observed). Effective RBC sensing and moving strategies must account for the uncertainty of both players, an essential element of many real-world decision-making problems. Here we evaluate RBC from a game theoretic perspective, tracking the proliferation of information sets from the perspective of selected canonical bot players in tournament play. We show that, even for effective sensing strategies, the game sizes of RBC compare to those of Go while the average size of a player's information set throughout an RBC game is much greater than that of a player in Heads-up Limit Hold 'Em. We compare these measures of complexity among different playing algorithms and provide cursory assessments of the various sensing and moving strategies.


Edward W. Staley; Kapil D. Katyal; Philippe M. Burlina

This work studies joint camera and robotic manipulator control for reaching tasks in complex environments with obstacles and occluders. We obviate the conventional challenges of complex perception, planning, and control modules and of careful calibration for sensing and actuation, and instead seek a solution leveraging deep reinforcement learning (DRL). Our method uses DRL and deep Q-learning to learn a policy for robot actuation and perception control, mapping raw image pixel inputs directly to camera motion and manipulator joint control actions. We show results comparing different training approaches and demonstrate competency for increasingly complex situations and degrees of freedom. These preliminary experiments suggest the effectiveness and robustness of the proposed approach.
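The deep Q-learning variant described above approximates action values from raw pixels with a neural network; the underlying update rule is easier to see in a tabular stand-in. The sketch below is not the paper's method — it is a minimal tabular Q-learning loop on a hypothetical one-dimensional chain environment, illustrating the same temporal-difference update that a DQN applies through a network.

```python
import random

def q_learning_chain(n_states=5, episodes=300, alpha=0.5, gamma=0.9, eps=0.1, seed=0):
    """Tabular Q-learning on a toy chain: start at state 0, reward 1 on
    reaching the last state. Actions: 0 = left, 1 = right. The update is the
    one a DQN approximates: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        while s != n_states - 1:
            # epsilon-greedy action selection; ties break toward "right"
            if rng.random() < eps:
                a = rng.randrange(2)
            else:
                a = max((0, 1), key=lambda b: (Q[s][b], b))
            s2 = max(0, s - 1) if a == 0 else min(n_states - 1, s + 1)
            r = 1.0 if s2 == n_states - 1 else 0.0
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s = s2
    return Q
```

After training, the greedy policy prefers moving toward the rewarded end of the chain from every state.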

2018 IEEE 14th International Conference on e-Science

Dean M. Kleissas; William R. Gray Roncal

We describe NDStore, a scalable multi-hierarchical data storage deployment for spatial analysis of neuroscience data on the AWS cloud. The system design is inspired by the requirement to maintain high I/O throughput for workloads that build neural connectivity maps of the brain from peta-scale imaging data using computer vision algorithms. We store all our data on the AWS object store S3 to limit our deployment costs; S3 serves as our base tier of storage. Redis, an in-memory key-value engine, is used as our caching tier. Data is dynamically moved between the storage tiers based on user access. All programming interfaces to this system are RESTful web services. We include a performance evaluation showing that our production system provides good performance for a variety of workloads by combining the assets of multiple cloud services.
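The cache-over-base-tier read path described above can be sketched in a few lines. This is not NDStore's actual implementation or API — just a hypothetical two-tier store where a fast in-memory cache (the Redis analog) fronts a slow durable tier (the S3 analog), with misses promoted into the cache:

```python
class TieredStore:
    """Toy two-tier read path: check the fast cache first; on a miss, fetch
    from the slow base tier and promote the value into the cache."""
    def __init__(self, base):
        self.base = base    # slow, durable tier (S3 analog)
        self.cache = {}     # fast, volatile tier (Redis analog)
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.cache:
            self.hits += 1
            return self.cache[key]
        self.misses += 1
        value = self.base[key]    # fetch from the base tier on a miss
        self.cache[key] = value   # promote for future reads
        return value
```

A production system would add write-through or eviction policies; this sketch shows only the dynamic movement of data between tiers driven by access patterns.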

Procedia Computer Science

Beth G. Magen; Jeffrey S. Lin; Lien T. Duong; Erhan Guven; Paul A. Hanke; Jeffrey S. Chavis; Matthew D. Dinmore

We explored unsupervised machine learning algorithms, specifically graph analytics, applied to behaviors observed in heterogeneous network sensor data for discovering anomalous behavior that could include novel attacks. In addition, we explored the potential difficulties of applying unsupervised machine learning to anomaly detection in a network-defense context, in order to understand how to integrate inherently imperfect anomaly-detection approaches into the workflow of a cyber defense infrastructure. Two general approaches can be used to discover anomalies: (1) detecting rarity, i.e., finding those activities that are observed least frequently in a set of observations, and (2) detecting novelty, i.e., finding activities with the lowest estimated probability of observation based on prior observations of baseline (presumably “normal”) data. This paper addresses the case of detecting rarity. We describe the entire pipeline: the data used, data ingest, quantization of features, application of graph analytics, post-processing to reduce results, and performance measurement. A network-penetration experiment was set up to conduct the network attacks and generate the data that serves as the input to this work. Baseline methods are proposed and compared to the main method described in this paper.
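The rarity criterion — flagging the least frequently observed activities — reduces to counting quantized behavior records. A minimal sketch, with a hypothetical frequency threshold and tuple encoding of an "activity" (the paper's actual quantization and graph analytics are far richer):

```python
from collections import Counter

def rare_activities(observations, threshold=2):
    """Rarity-based anomaly detection: flag each quantized activity tuple
    observed fewer than `threshold` times in the observation set."""
    counts = Counter(observations)
    return {obs for obs, c in counts.items() if c < threshold}
```

In practice the threshold would be tuned against the cyber-defense workflow's tolerance for false positives, since rare is not always malicious.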

2018 IEEE International Conference on Communications (ICC)

Clayton R. Fink; Aurora C. Schmidt

We investigate the use of spectral clustering of hashtag adoptions by Nigerian Twitter users between October 2013 and November 2014. This period is of interest due to the online campaign centered around the #BringBackOurGirls (BBOG) hashtag, which relates to the kidnapping of 276 Nigerian schoolgirls. We examine the adoption of hashtags during the six months before, the month after, and the six months following the kidnapping to test the informational value of behavior-based clusters discovered with unsupervised methods for predicting future hashtag usage. We demonstrate an efficient spectral clustering approach that leverages power iteration on symmetric adjacency matrices to group users based on hashtag adoptions prior to the kidnapping. Unlike follow-network-based clusters, these adoption-based clusters reveal groups of users with similar interests and prove more predictive of interest in future topics. We compare this unsupervised spectral clustering to spectral clustering based on symmetrized follow-network relations as well as to clusters induced by latent Dirichlet allocation (LDA) topics. We find that hashtag adoption-based clusters perform similarly to the more computationally expensive LDA approach at identifying interest groups that are more likely to adopt future topical tags. We also benchmark the spectral clustering approach against the popular Louvain clustering approach on a synthetic dataset, finding that the faster spectral clustering algorithm produces more balanced clusters with higher similarity to the true interest groupings used to synthesize the adoption data.
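The core trick — power iteration on a symmetric adjacency matrix in place of a full eigendecomposition — can be illustrated on a tiny two-community graph. This is a generic sketch, not the paper's algorithm: it finds the dominant eigenvector, deflates it out, and uses the sign pattern of the second eigenvector as a two-way partition.

```python
def matvec(A, v):
    """Multiply a dense matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def norm(v):
    return sum(x * x for x in v) ** 0.5

def power_iteration(A, v, iters=200):
    """Converge to the eigenvector of A with the largest-magnitude eigenvalue."""
    for _ in range(iters):
        w = matvec(A, v)
        n = norm(w)
        v = [x / n for x in w]
    return v

def two_way_spectral(A):
    """Two-way spectral partition of a symmetric adjacency matrix via power
    iteration: compute the dominant eigenvector, deflate it out, then use the
    sign pattern of the second eigenvector as cluster labels."""
    n = len(A)
    v0 = [float(i + 1) for i in range(n)]       # fixed, non-degenerate start
    v1 = power_iteration(A, v0)
    lam1 = sum(x * y for x, y in zip(v1, matvec(A, v1)))
    # deflation: A2 = A - lam1 * v1 v1^T removes the top eigenpair
    A2 = [[A[i][j] - lam1 * v1[i] * v1[j] for j in range(n)] for i in range(n)]
    v2 = power_iteration(A2, v0)
    return [1 if x > 0 else 0 for x in v2]
```

Each power-iteration step costs only a matrix-vector product, which is what makes this approach cheaper than LDA or a full eigensolve on large adoption matrices.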

2018 IEEE International Conference on Robotics and Automation (ICRA)

Joseph L. Moore; William R. Setzler

In this paper, we describe the design and analysis of a fixed-wing unmanned aerial-aquatic vehicle. Inspired by prior work in aerobatic post-stall maneuvers for fixed-wing vehicles [1], we explore the feasibility of executing a water-to-air transition with a fixed-wing vehicle using almost entirely commercial off-the-shelf components (excluding the fuselage). To do this, we first propose a conceptual design based on observations about the dominant forces and dimensionless analysis. We then further refine this concept by building a design tool based on simplified models to explore the design space. To verify the results of the design tool, we use a higher fidelity model along with a direct hybrid trajectory optimization approach to show via numerical simulation that the water-to-air transition is feasible. Finally, we successfully test our design experimentally by hand-piloting a prototype vehicle through the water-to-air transition and discuss our approach for replacing the human pilot with closed-loop control.

International Journal of Medical Robotics and Computer Assisted Surgery

Amit Banerjee

Background: Surgical management of colorectal cancer relies on accurate identification of tumor and possible metastatic disease. Hyperspectral (HS) sensing is a passive, non‐ionizing diagnostic method that has been considered for multiple tumor types. The ability to use HS for identification of tumor specimens during surgical resection of colorectal cancers was explored. Methods: Patients with colorectal cancer who underwent operative resection were enrolled. HS measurements were performed both intra‐ and extra‐luminally. Spectral results were correlated with pathologic evaluation. Results: Fifteen patient specimens were analyzed. For patients with confirmed colorectal cancer, extraluminal spectral analysis yielded 61.68% sensitivity with 90% specificity. For intraluminal specimens, sensitivity increased to 91.97% with 90% specificity. Conclusions: Hyperspectral sensing can reliably detect tumors in resected colon specimens. This research offers promising results for a diagnostic technology that is non‐ionizing and does not require the use of contrast agents to achieve accurate colorectal cancer detection.

Journal of Pediatric Surgery

Amit Banerjee

Purpose: The definitive diagnosis of necrotizing enterocolitis (NEC) is typically at an advanced stage, indicating the need for sensitive and noninvasive diagnostic modalities. Near infrared spectroscopy (NIRS) has been utilized to noninvasively measure intraabdominal tissue oxygenation and to diagnose NEC, but specificity is lacking, in part because sensors are limited to a narrow band of the electromagnetic spectrum. Here, we introduce the concept of broadband optical spectroscopy (BOS) as a noninvasive method to characterize NEC. Methods: NEC was induced in 7-day old mice by gavage feeding with formula supplemented with enteric bacteria plus hypoxia. Transabdominal spectroscopy was performed daily using a broad-spectrum halogen light source coupled with a spectroradiometer capable of detection from 400 to 1800 nm. Results: A feature extraction algorithm was developed based on the spectral waveforms from mice with NEC. When subsequently tested on cohorts of diseased and control mice by a blinded examiner, noninvasive BOS was able to detect disease with 100% specificity and sensitivity. Conclusions: We reveal that the use of BOS is able to accurately and noninvasively discriminate the presence of NEC in a mouse model, thus introducing a noninvasive early diagnostic modality for this devastating disease.

SPIE Defense + Security, 2018

Marina B. Johnson; Allan P. Rosenberg; Grace M. Hwang; Shane W. Lani

We examine the potential for low-intensity focused ultrasound to non-invasively produce small (< 1 mm³) focal acoustic fields for precise brain stimulation near the skull. Our goal is to utilize transcranial ultrasonic neuromodulation to transform communications and immersive gaming experiences and to optimize neuromodulation applications in medicine. To begin evaluating possible hardware design strategies for engineering ultrasonic brain interfaces, in the present study we evaluated the skull transmission properties of longitudinal and shear waves as a function of incidence angle for 0–2 MHz. We also employed K-wave and time-reversal numerical simulations to further inspect waveform interactions with modeled layers. Time-reversal focusing for single-layer and three-layer skull cases was simulated for three different bandwidth ranges (MHz): broadband (0–2), 1 MHz (0.4–1.4), and 0.2 MHz (0.4–0.6). The broadband and 1 MHz bandwidths emulate the performance of micromachined or piezo membrane ultrasonic arrays, while the 0.2 MHz bandwidth is representative of the performance of a conventional piezoelectric ultrasonic transducer. We found the 3 dB focal volume was ~0.6 mm for broadband and 1 MHz, with the latter showing a slightly larger sidelobe. In contrast, 0.2 MHz nearly doubled the size of the 3 dB focal volume while producing prominent sidelobes. Our results provide initial confirmation that a broadband, ultrasonic, linear array can access the first 15 mm of the human brain, which contains circuitry essential to sensory processing including pre-motor and motor planning, somatosensory feedback, and visual attention. These areas are critical targets for providing haptic feedback via non-invasive neural stimulation.

Journal of Digital Imaging

Jordan K. Matelsky; William R. Gray Roncal; Michael K. Toma; Corban G. Rivera; Erik C. Johnson

Medical imaging analysis depends on the reproducibility of complex computation. Linux containers enable the abstraction, installation, and configuration of environments so that software can be both distributed in self-contained images and used repeatably by tool consumers. While several initiatives in neuroimaging have adopted approaches for creating and sharing more reliable scientific methods and findings, Linux containers are not yet mainstream in clinical settings. We explore related technologies and their efficacy in this setting, highlight important shortcomings, demonstrate a simple use-case, and endorse the use of Linux containers for medical image analysis.

Current Opinion in Neurobiology

Grace M. Hwang; Shane W. Lani

Ultrasound (US) is recognized for its use in medical imaging as a diagnostic tool. As an acoustic energy source, US has become increasingly appreciated over the past decade for its ability to non-invasively modulate cellular activity including neuronal activity. Data obtained from a host of experimental models have shown that low-intensity US can reversibly modulate the physiological activity of neurons in peripheral nerves, spinal cord, and intact brain circuits. Experimental evidence indicates that acoustic pressures exerted by US act, in part, on mechanosensitive ion channels to modulate activity. While the precise mechanisms of action enabling US to both stimulate and suppress neuronal activity remain to be clarified, there are several advantages conferred by the physics of US that make it an appealing option for neuromodulation. For example, it can be focused with millimeter spatial resolutions through skull bone to deep-brain regions. By increasing our engineering capability to leverage such physical advantages while growing our understanding of how US affects neuronal function, a new generation of non-invasive neurotechnology can be developed using ultrasonic methods.

SPIE Defense + Security, 2018

Clara A. Scholl; Carlos A. Renjifo; Eyal Bar-Kochba; David W. Blodgett; Aaron T. Criss; Clare W. Lau; Grace M. Hwang; Jason R. Harper; Thomas B. Criss; Carissa L. Rodriguez

Optical neuroimaging technologies aim to observe neural tissue structure and function by detecting changes in optical signals (e.g., scatter and absorption) that accompany a range of anatomical and functional properties of brain tissue. At present, there is a tradeoff between spatial and temporal resolution that is not currently optimized in a single imaging modality. This work focuses on filling the gap between the spatio-temporal resolutions of existing neuroimaging technologies by developing a coherent optics-based imaging system that extracts anatomical and functional information across a measurement volume, providing both magnitude and phase information of the sample. We developed a digital holographic imaging (DHI) system capable of detecting these optical signals with a spatial resolution of better than 50 μm over a 25 mm² field of view at sampling rates of 300 Hz and higher. The DHI system operates in the near-infrared (NIR) at 1064 nm, facilitating increased light penetration depths while minimizing contributions from overt changes in oxy- and deoxy-hemoglobin concentration present at shorter NIR wavelengths. This label-free imaging method detects intrinsic signals driven by tissue motion, allowing for innately spatio-temporally registered extraction of anatomical and functional signals in vivo. In this work, we present in vivo results from rat whisker barrel cortex demonstrating signals reflecting anatomical structure and tissue dynamics.

Journal of Systems and Software

Robert C. Hawthorne; Paul G. Stankiewicz; Galen E. Mullins

In this paper we propose a new method for generating test scenarios for black-box autonomous systems that demonstrate critical transitions in performance modes. This method provides a test engineer with key insights into the software’s decision-making engine and how those decisions affect transitions between performance modes. We achieve this via adaptive, simulation-based testing of the autonomous system where each sample represents a simulated scenario. The test scenario, i.e., the system input, represents a given configuration of environmental or mission parameters, and the resulting outputs are the system’s performance based on high-level success criteria. For realistic testing scenarios, the dimensionality of the configuration space and the computational expense of high-fidelity simulations preclude exhaustive or uniform sampling. Thus, we have developed specialized adaptive search algorithms designed to discover performance boundaries of the autonomy using a minimal number of samples. Further, unsupervised clustering techniques are presented that can group test scenarios by the resulting performance modes and sort them by those which are most effective at diagnosing changes in the autonomous system’s behavior. The result is a testing framework that gives the test engineer a set of diverse scenarios that exercises the decision boundaries of the autonomous system under test.
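The paper's adaptive search algorithms are not reproduced here, but the sample-efficiency argument is easy to see in one dimension: when a single scenario parameter has one pass/fail transition, bisection locates the performance boundary with logarithmically few simulations. A hedged sketch, with `passes` standing in for one run of a black-box simulated scenario:

```python
def find_boundary(passes, lo, hi, tol=1e-3):
    """Bisect a 1-D scenario parameter to locate a performance-mode boundary
    of a black-box system. `passes(x)` runs one simulated scenario and returns
    True/False; assumes a single pass/fail transition between lo and hi.
    Returns the boundary estimate and the number of simulations used."""
    assert passes(lo) and not passes(hi), "boundary must lie inside [lo, hi]"
    samples = 2
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        samples += 1
        if passes(mid):
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0, samples
```

Real mission configuration spaces are high-dimensional and non-monotone, which is precisely why the paper develops specialized adaptive algorithms rather than axis-by-axis bisection.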

2018 IEEE International Conference on Robotics and Automation (ICRA)

Jordan D. Appler; Paul G. Stankiewicz; Austin G. Dress; Galen E. Mullins

In this paper, we investigate the use of surrogate agents to accelerate test scenario generation for autonomous vehicles. Our goal is to train the surrogate to replicate the true performance modes of the system. We create these surrogates by utilizing imitation learning with deep neural networks. By using imitator surrogates in place of the true agent, we are capable of predicting mission performance more quickly, gaining greater throughput for simulation-based testing. We demonstrate that using on-line imitation learning with Dataset Aggregation (DAgger) can not only correctly encode a policy that executes a complex mission, but can also encode multiple different behavioral modes. To improve performance for the target vehicle and mission, we manipulate the training set during each iteration to remove samples which do not contribute to the final policy. We call this approach Quantile-DAgger (Q-DAgger) and demonstrate its ability to replicate the behaviors of an autonomous vehicle in a collision avoidance scenario.
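The abstract does not give the exact Q-DAgger rule, but its training-set manipulation step — removing samples that do not contribute to the final policy — can be sketched as quantile-based pruning of the aggregated imitation dataset. The loss values and cut fraction below are illustrative assumptions:

```python
def prune_by_quantile(dataset, losses, q=0.25):
    """Drop the fraction `q` of aggregated (state, expert_action) samples with
    the smallest imitation loss — those the current policy already handles —
    keeping the training set focused on informative samples. A Q-DAgger-style
    pruning step; the paper's exact rule may differ."""
    assert len(dataset) == len(losses)
    order = sorted(range(len(losses)), key=losses.__getitem__)
    cut = int(q * len(dataset))
    keep = sorted(order[cut:])          # preserve original sample order
    return [dataset[i] for i in keep]
```

In a DAgger loop this would run once per iteration, after aggregating the newly relabeled states into the dataset and before retraining the surrogate policy.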

Journal of Biomechanics

Andrew C. Merkle; Francesco V. Tenore

Biological tissue testing is inherently subject to wide specimen-to-specimen variability. A primary resource for encapsulating this range of variability is the biofidelity response corridor, or BRC. In the field of injury biomechanics, BRCs are often used for development and validation of both physical models, such as anthropomorphic test devices, and computational models. For the purpose of generating corridors, post-mortem human surrogates were tested across a range of loading conditions relevant to under-body blast events. To sufficiently cover the wide range of input conditions, a relatively small number of tests were performed across a large spread of conditions. The high volume of required testing called for leveraging the capabilities of multiple impact test facilities, all with slight variations in test devices. A method for assessing similitude of responses between test devices was created as a metric for inclusion of a response in the resulting BRC. The goal of this method was to supply a statistically sound, objective method to assess the similitude of an individual response against a set of responses to ensure that the BRC created from the set was affected primarily by biological variability, not anomalies or differences stemming from test devices.

2017 8th International IEEE/EMBS Conference on Neural Engineering (NER)

Kapil D. Katyal; Jason W. Harper; Paul E. Rosendall

The Argus II retinal prosthesis has a dissociation between the line of sight of the camera and that of the eye. The image-capturing camera is mounted on the glasses and therefore, eye movements do not influence the visual information sent to the implanted electrodes. We have demonstrated a closed-loop setup that shifts the visual information based on real-time eye position. In contrast to previous experiments, the setup does not require head restraints. The setup is based on a self-calibrating mobile eye tracker that allows free head movements. The patient was required to report the location of a white bar on a black background. An internal sensor was used to record the amount of head motion during the task. Results suggest that during combined eye-head scanning, head movement amplitude was significantly less than in the currently used head-only scanning. In the combined eye-head scanning, the patient first steers to the region of interest using eye movements followed by head movements as in sighted individuals. This is the first demonstration that eye movements can be used in combination with head movements to steer the line of sight of a camera-based retinal prosthesis.

Online J Public Health Inform.

Howard S. Burkom

Our objective was to compare the effectiveness of applying the historical limits method (HLM) to poison center (PC) call volumes with vs without stratifying by exposure type.
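The historical limits method referenced above is a classical aberration-detection technique: the current count is compared against counts from comparable past periods. A minimal sketch under the common mean-plus-two-standard-deviations formulation (the CDC version constructs its 15 baseline values from specific periods of the preceding five years, which is not reproduced here):

```python
from statistics import mean, stdev

def hlm_flag(current, baseline_counts):
    """Historical limits method, in its common form: flag the current count
    as aberrant when it exceeds the historical mean plus two standard
    deviations of counts from comparable past periods."""
    threshold = mean(baseline_counts) + 2 * stdev(baseline_counts)
    return current > threshold
```

Stratifying by exposure type, as the paper investigates, amounts to maintaining a separate baseline list per stratum and flagging each stratum's call volume independently.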

Computers in Biology and Medicine

Philippe M. Burlina; David E. Freund; Neil J. Joshi

Background: When left untreated, age-related macular degeneration (AMD) is the leading cause of vision loss in people over fifty in the US. Currently it is estimated that about eight million US individuals have the intermediate stage of AMD that is often asymptomatic with regard to visual deficit. These individuals are at high risk for progressing to the advanced stage where the often treatable choroidal neovascular form of AMD can occur. Careful monitoring to detect the onset and prompt treatment of the neovascular form as well as dietary supplementation can reduce the risk of vision loss from AMD, therefore, preferred practice patterns recommend identifying individuals with the intermediate stage in a timely manner. Methods: Past automated retinal image analysis (ARIA) methods applied on fundus imagery have relied on engineered and hand-designed visual features. We instead detail the novel application of a machine learning approach using deep learning for the problem of ARIA and AMD analysis. We use transfer learning and universal features derived from deep convolutional neural networks (DCNN). We address clinically relevant 4-class, 3-class, and 2-class AMD severity classification problems. Results: Using 5664 color fundus images from the NIH AREDS dataset and DCNN universal features, we obtain values for accuracy for the (4-, 3-, 2-) class classification problem of (79.4%, 81.5%, 93.4%) for machine vs. (75.8%, 85.0%, 95.2%) for physician grading. Discussion: This study demonstrates the efficacy of machine grading based on deep universal features/transfer learning when applied to ARIA and is a promising step in providing a pre-screener to identify individuals with intermediate AMD and also as a tool that can facilitate identifying such individuals for clinical studies aimed at developing improved therapies. It also demonstrates comparable performance between computer and physician grading.
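The transfer-learning recipe above — fixed "universal features" from a pretrained DCNN plus a lightweight classifier trained on the target task — can be shown in miniature. The feature vectors below are illustrative stand-ins for DCNN activations (the real extractor and the AREDS images are not reproduced), and a nearest-centroid classifier stands in for whichever shallow classifier sits on top:

```python
def nearest_centroid_fit(features, labels):
    """Fit per-class centroids over fixed feature vectors. In transfer
    learning, `features` would be activations from a frozen pretrained DCNN;
    only this lightweight classifier is trained on the target task."""
    centroids = {}
    for y in set(labels):
        rows = [f for f, l in zip(features, labels) if l == y]
        centroids[y] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def nearest_centroid_predict(centroids, x):
    """Assign x to the class whose centroid is closest in feature space."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(c, x))
    return min(centroids, key=lambda y: dist(centroids[y]))
```

The appeal of this recipe for ARIA is that no fundus-specific visual features need to be engineered: the pretrained network supplies the representation, and only the small classifier head is fit to the graded images.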

IEEE Robotics and Automation Letters

Mehran Armand; Ryan J. Murphy

This letter presents the development and evaluation of concurrent control of a robotic system for less-invasive treatment of osteolytic lesions behind an acetabular implant. This system implements safety constraints including a remote center of motion, virtual walls, and joint limits while operating through the screw holes of the acetabular implant. The formulated linear constrained optimization problem ensures these constraints are satisfied while maintaining precise control of the tip of a Continuum Dexterous Manipulator attached to a positioning robot. Experiments evaluated the performance of the tip control method within an acetabular cup. The controller reliably reached a series of goal points with a mean error of 0.42 mm and a worst-case deviation of 1.0 mm from the planned path.

International Journal of Computer Assisted Radiology and Surgery

Mehran Armand

Purpose: In minimally invasive interventions assisted by C-arm imaging, there is a demand to fuse the intra-interventional 2D C-arm image with pre-interventional 3D patient data to enable surgical guidance. The commonly used intensity-based 2D/3D registration has a limited capture range and is sensitive to initialization. We propose to utilize an opto/X-ray C-arm system which maintains the registration during the intervention by automating the re-initialization for the 2D/3D image registration. Consequently, the surgical workflow is not disrupted and the interaction time for manual initialization is eliminated. Methods: We utilize two distinct vision-based tracking techniques to estimate the relative poses between different C-arm arrangements: (1) global tracking using fused depth information and (2) an RGBD SLAM system for surgical scene tracking. A highly accurate multi-view calibration between RGBD and C-arm imaging devices is achieved using a custom-made multimodal calibration target. Results: Several in vitro studies are conducted on a pelvic-femur phantom that is encased in gelatin and covered with drapes to simulate a clinically realistic scenario. The mean target registration errors (mTRE) for re-initialization using depth-only and RGB + depth are 13.23 mm and 11.81 mm, respectively. 2D/3D registration yielded a 75% success rate using this automatic re-initialization, compared to a random initialization which yielded only a 23% success rate. Conclusion: The pose-aware C-arm contributes to the 2D/3D registration process by globally re-initializing the relationship of C-arm image and pre-interventional CT data. This system performs inside-out tracking, is self-contained, and does not require any external tracking devices.

Operative Neurosurgery

Mehran Armand

BACKGROUND: Neuromodulation devices have the potential to transform modern day treatments for patients with medicine-resistant neurological disease. For instance, the NeuroPace System (NeuroPace Inc, Mountain View, California) is a Food and Drug Administration (FDA)-approved device developed for closed-loop direct brain neurostimulation in the setting of drug-resistant focal epilepsy. However, current methods require placement either above or below the skull in nonanatomic locations. This type of positioning has several drawbacks including visible deformities and scalp pressure from underneath leading to eventual wound healing difficulties, micromotion of hardware with infection, and extrusion leading to premature explantation. OBJECTIVE: To introduce complete integration of a neuromodulation device within a customized cranial implant for biocompatibility optimization and prevention of visible deformity. METHODS: We report a patient with drug-resistant focal epilepsy despite previous seizure surgery and maximized medical therapy. Preoperative imaging demonstrated severe resorption of previous bone flap causing deformity and risk for injury. She underwent successful responsive neurostimulation device implantation via complete integration within a clear customized cranial implant. RESULTS: The patient has recovered well without complication and has been followed closely for 180 days. Device interrogation with electrocorticographic data transmission has been successfully performed through the clear implant material for the first time with no evidence of any wireless transmission interference. CONCLUSION: Cranial contour irregularities, implant site infection, and bone flap resorption/osteomyelitis are adverse events associated with implantable neurotechnology. This method represents a novel strategy to incorporate all future neuromodulation devices within the confines of a low-profile, computer-designed cranial implant and the newfound potential to eliminate contour irregularities, improve outcomes, and optimize patient satisfaction.

IEEE Robotics and Automation Letters

Mehran Armand; Ryan J. Murphy

In conventional core decompression of osteonecrosis, surgeons cannot successfully reach the whole area of the femoral head due to rigidity of the instruments currently used. To address this issue, we present design and fabrication of a novel steerable drill using a continuum dexterous manipulator (CDM) and two different flexible cutting tools passing through the lumen of the CDM. A set of experiments investigated functionality and efficiency of the curved-drilling approach and the flexible tools on simulated cancellous bone. Geometry of the cutter head, rotational and feed velocity of the tool, and pulling tension of the CDM cables have been identified as the effective curved-drilling parameters. Considering these parameters, we investigated drilling trajectory, contact force, and mass removal for various combinations of feed-velocities (0.05, 0.10, and 0.15 mm/s) and cable tensions (6, 10, 15, and 25 N) with constant rotational speed of 2250 r/min. Results show that: first, pulling tension of the cable is the most dominant parameter affecting the curved-drilling trajectory; and second, the proposed steerable drill is able to achieve 40° bend without buckling. Based on these results we developed a method for planning drill trajectories and successfully verified abilities for S-shape and multiple-branch drilling. The verification experiments were performed on both simulated and human cadaveric bones.


Neil J. Joshi; Philippe M. Burlina; Seth D. Billings

Objective: To evaluate the use of ultrasound coupled with machine learning (ML) and deep learning (DL) techniques for automated or semi-automated classification of myositis. Methods: Eighty subjects, comprising 19 with inclusion body myositis (IBM), 14 with polymyositis (PM), 14 with dermatomyositis (DM), and 33 normal (N) subjects, were included in this study, in which 3214 muscle ultrasound images of 7 muscles (observed bilaterally) were acquired. We considered three problems of classification including (A) normal vs. affected (DM, PM, IBM); (B) normal vs. IBM patients; and (C) IBM vs. other types of myositis (DM or PM). We studied the use of an automated DL method using deep convolutional neural networks (DL-DCNNs) for diagnostic classification and compared it with a semi-automated conventional ML method based on random forests (ML-RF) and “engineered” features. We used the known clinical diagnosis as the gold standard for evaluating performance of muscle classification. Results: The performance of the DL-DCNN method resulted in accuracies ± standard deviation of 76.2% ± 3.1% for problem (A), 86.6% ± 2.4% for (B) and 74.8% ± 3.9% for (C), while the ML-RF method led to accuracies of 72.3% ± 3.3% for problem (A), 84.3% ± 2.3% for (B) and 68.9% ± 2.5% for (C). Conclusions: This study demonstrates the application of machine learning methods for automatically or semi-automatically classifying inflammatory muscle disease using muscle ultrasound. Compared to the conventional random forest machine learning method used here, which has the drawback of requiring manual delineation of muscle/fat boundaries, DCNN-based classification by and large improved the accuracies in all classification problems while providing a fully automated approach to classification.

2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA)

Philippe M. Burlina; Erica L. Schwarz; Seth D. Billings; Neil J. Joshi

This study addresses the development of machine learning methods for reduced space ultrasound to perform automated prescreening of breast cancer. The use of ultrasound in low-resource settings is constrained by lack of trained personnel and equipment costs, which motivates the need for automated, low-cost diagnostic tools. We hypothesize that a solution to this problem is the use of 1D ultrasound (a single piezoelectric element). We leverage random forest classifiers to classify 1D samples from tissue phantoms simulating cancerous lesions, benign lesions, and non-cancerous tissue. In addition, we investigate the optimal ultrasound power and frequency parameters to maximize performance. We show preliminary results on 2-, 3- and 5-class classification problems for the ideal power/frequency combination. These results demonstrate promise towards the use of a single-element ultrasound device to screen for breast cancer.

Shock Waves

Peter M. Thielen; Alexander S. Iwaskiw; Julie E. Gleason; Thomas S. Mehoke; Jessica E. Dymond; Brock A. Wester; Andrew C. Merkle; Jeffrey M. Paulson

Biological response to blast overpressure is complex and results in various and potentially non-concomitant acute and long-term deficits to exposed individuals. Clinical links between blast severity and injury outcomes remain elusive and have yet to be fully described, resulting in a critical inability to develop associated protection and mitigation strategies. Further, experimental models frequently fail to reproduce observed physiological phenomena and/or introduce artifacts that confound analysis and reproducibility. New models are required that employ consistent mechanical inputs, scale with biological analogs and known clinical data, and permit high-throughput examination of biological responses for a range of environmental and battlefield-relevant exposures. Here we describe a novel, biofidelic headform capable of integrating complex biological samples for blast exposure studies. We additionally demonstrate its utility in detecting acute transcriptional responses in the model organism Caenorhabditis elegans after exposure to blast overpressure. This approach enables correlation between mechanical exposure and biological outcome, permitting both the enhancement of existing surrogate and computational models and the high-throughput biofidelic testing of current and future protection systems.

2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)

Philippe M. Burlina; I-Jeng Wang; Kapil D. Katyal

This work leverages Deep Reinforcement Learning (DRL) to make robotic control immune to changes in the robot manipulator or the environment and to perform reaching, collision avoidance, and grasping without explicit, prior, and fine knowledge of the human arm structure and kinematics, without careful hand-eye calibration, solely based on visual/retinal input, and in ways that are robust to environmental changes. We learn a manipulation policy which we show takes the first steps toward generalizing to changes in the environment and can scale and adapt to new manipulators. Experiments are aimed at (a) comparing different DCNN network architectures, (b) assessing the reward prediction for two radically different manipulators, and (c) performing a sensitivity analysis comparing a classical visual servoing formulation of the reaching task with the proposed DRL method.

2017 Fifteenth IAPR International Conference on Machine Vision Applications (MVA)

I-Jeng Wang; Philippe M. Burlina; Aurora C. Schmidt; Jared J. Markowitz

We examine hierarchical approaches to image classification problems that include categories for which we have no training examples. Building on prior work in hierarchical classification that optimizes the trade-off between depth in a tree and accuracy of placement, we compare the performance of multiple formulations of the problem on both previously seen (non-novel) and previously unseen (novel) classes. We use a subset of 150 object classes from the ImageNet ILSVRC2012 data set, for which we have 218 human-annotated semantic attribute labels and for which we compute deep convolutional features using the OVERFEAT network. We quantitatively evaluate several approaches, using input posteriors derived from distances to SVM classifier boundaries as well as input posteriors based on semantic attribute estimation. We find that the relative performances of the methods differ in non-novel and novel applications and achieve information gains in novel applications through the incorporation of attribute-based posteriors.

2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA)

Philippe M. Burlina; David E. Freund; Michael J. Pekala; Neil J. Joshi; Arnaldo Horta

This work investigates a hybrid method based on random forests and deep image features to combine non-visual side channel information with image data for classification. We apply this to automated retinal image analysis (ARIA) and the detection of age-related macular degeneration (AMD). For evaluation, we use a dataset collected by the National Institute of Health with over 4000 study participants. The non-visual side channel data includes information related to demographics (e.g. ethnicity), lifestyle (e.g. sunlight exposure), and prior conditions (e.g. cataracts). Our study, which compares the performance of different feature combinations, offers preliminary results that constitute a baseline for future investigations on joint deep visual and side channel feature exploitation for AMD detection. This approach could potentially be used for other medical image analysis problems.

Journal of Neuroscience Methods

Eric A. Pohlmeyer

Background: The common marmoset (Callithrix jacchus) has been proposed as a suitable bridge between rodents and larger primates. Marmosets have been used in several types of research, including auditory, vocal, visual, pharmacological, and genetics studies; however, they have been used far less for behavioral studies. New method: Here we present data from training 12 adult marmosets for behavioral neuroscience studies. We discuss husbandry, food preferences, handling, acclimation to laboratory environments, and neurosurgical techniques. We also present a custom-built “scoop” and a monkey chair suitable for training these animals. Results: The animals were trained on three tasks: a four-target center-out reaching task, reaching tasks that involved controlling robot actions, and a touch-screen task. All animals learned the center-out reaching task within 1–2 weeks, whereas learning the reaching tasks that controlled robot actions took several months of behavioral training, during which the monkeys learned to associate robot actions with food rewards. Comparison with existing methods: We propose the marmoset as a model for behavioral neuroscience research and an alternative to larger primate models, owing to its ease of handling, quick reproduction, available neuroanatomy, a sensorimotor system similar to that of larger primates and humans, and a lissencephalic brain that makes implantation of microelectrode arrays at various cortical locations easier than in larger primates. Conclusion: All animals learned the behavioral tasks well, and we present the marmoset as an alternative model for simple behavioral neuroscience tasks.

JAMA Ophthalmology

David E. Freund; Michael J. Pekala; Neil J. Joshi; Philippe M. Burlina

Importance: Age-related macular degeneration (AMD) affects millions of people throughout the world. The intermediate stage may go undetected, as it typically is asymptomatic. However, the preferred practice patterns for AMD recommend identifying individuals with this stage of the disease in order to educate them on how to monitor for the early detection of the choroidal neovascular stage before substantial vision loss has occurred, and to consider dietary supplements that might reduce the risk of the disease progressing from the intermediate to the advanced stage. Identification, though, can be time-intensive and requires expertly trained individuals. Objective: To develop methods for automatically detecting AMD from fundus images through a novel application of deep learning methods to the automated assessment of these images, leveraging advances in artificial intelligence. Design, Setting, and Participants: Deep convolutional neural networks explicitly trained to perform automated AMD grading were compared with an alternate deep learning method that used transfer learning and universal features, and with a trained clinical grader. Automated AMD detection was cast as a 2-class classification problem in which the task was to distinguish the disease-free/early stages from the referable intermediate/advanced stages. In several experiments entailing different data partitions, the machine algorithms and human graders evaluated over 130 000 images (deidentified with respect to age, sex, and race/ethnicity) from 4613 patients against a gold standard included in the National Institutes of Health Age-related Eye Disease Study data set. Main Outcomes and Measures: Accuracy, receiver operating characteristics and area under the curve, and kappa score.
Results: The deep convolutional neural network method yielded accuracy (SD) that ranged between 88.4% (0.5%) and 91.6% (0.1%), an area under the receiver operating characteristic curve between 0.94 and 0.96, and a kappa coefficient (SD) between 0.764 (0.010) and 0.829 (0.003), indicating substantial agreement with the gold standard Age-related Eye Disease Study data set. Conclusions and Relevance: Applying a deep learning–based automated assessment of AMD from fundus images can produce results that are similar to human performance levels. This study demonstrates that automated algorithms could play a role that is independent of expert human graders in the current management of AMD and could address the costs of screening or monitoring, access to health care, and the assessment of novel treatments that address the development or progression of AMD.
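
The kappa score reported above measures chance-corrected agreement between a grader (or algorithm) and the gold standard. A minimal sketch of Cohen's kappa, with made-up 2-class label sequences for illustration:

```python
def cohens_kappa(a, b):
    """Cohen's kappa between two equal-length label sequences."""
    assert len(a) == len(b)
    labels = sorted(set(a) | set(b))
    n = len(a)
    # Observed agreement: fraction of items both raters label identically.
    p_o = sum(x == y for x, y in zip(a, b)) / n
    # Expected chance agreement, from each rater's marginal label frequencies.
    p_e = sum((a.count(l) / n) * (b.count(l) / n) for l in labels)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical gradings: 0 = non-referable, 1 = referable AMD.
grader    = [0, 0, 1, 1, 1, 0, 1, 0, 1, 1]
algorithm = [0, 0, 1, 1, 0, 0, 1, 0, 1, 1]
kappa = cohens_kappa(grader, algorithm)  # 9/10 observed agreement -> 0.8
```

A kappa near 0.8, as in this toy example, falls in the "substantial agreement" band cited in the abstract.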

International Conference on Interactive Theorem Proving

Aurora C. Schmidt; Daniel I. Genin; Yanni A. Kouskoulas

We present the formally verified predicate and strategy used to independently evaluate the safety of the final version (Run 15) of the FAA's next-generation air-traffic collision avoidance system, ACAS X. This approach is a general one that can analyze simultaneous vertical and horizontal maneuvers issued by aircraft collision avoidance systems. The predicate is specialized to analyze sequences of vertical maneuvers, and in the horizontal dimension it is modular, allowing it to be safely composed with separately analyzed horizontal dynamics. Unlike previous efforts, this approach enables analysis of aircraft that are turning and accelerating non-deterministically. It can also analyze the safety of coordinated advisories and encounters with more than two aircraft. We provide results on the safety evaluation of ACAS X coordinated collision avoidance on a subset of the system state space. This approach can also be used to establish the safety of vertical collision avoidance maneuvers for other systems with complex dynamics.

Frontiers in Neuroinformatics

Christopher R. Ratto; Michael E. Wolmetz; Griffin W. Milsap; Matthew J. Roos; Carlos A. Caceres Garcia

Dimensionality poses a serious challenge when making predictions from human neuroimaging data. Across imaging modalities, large pools of potential neural features (e.g., responses from particular voxels, electrodes, and temporal windows) have to be related to typically limited sets of stimuli and samples. In recent years, zero-shot prediction models have been introduced for mapping between neural signals and semantic attributes, which allows for classification of stimulus classes not explicitly included in the training set. While choices about feature selection can have a substantial impact when closed-set accuracy, open-set robustness, and runtime are competing design objectives, no systematic study of feature selection for these models has been reported. Instead, a relatively straightforward feature stability approach has been adopted and successfully applied across models and imaging modalities. To characterize the tradeoffs in feature selection for zero-shot learning, we compared correlation-based stability to several other feature selection techniques on comparable data sets from two distinct imaging modalities: functional Magnetic Resonance Imaging and Electrocorticography. While most of the feature selection methods resulted in similar zero-shot prediction accuracies and spatial/spectral patterns of selected features, there was one exception: a novel feature/attribute correlation approach was able to achieve those accuracies with far fewer features, suggesting the potential for simpler prediction models that yield high zero-shot classification accuracy.
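
The correlation-based stability approach discussed above can be sketched as follows. The data here are synthetic, and the split into 50 reliable features and 450 noise features is our own toy assumption, not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(1)
n_stimuli, n_features = 60, 500

# Synthetic responses to the same 60 stimuli, recorded in two repetitions.
# The first 50 features carry a repeatable signal; the rest are pure noise.
signal = rng.standard_normal((n_stimuli, n_features))
signal[:, 50:] = 0.0
rep1 = signal + 0.5 * rng.standard_normal((n_stimuli, n_features))
rep2 = signal + 0.5 * rng.standard_normal((n_stimuli, n_features))

def stability_scores(r1, r2):
    """Pearson correlation of each feature's response profile across reps."""
    r1c, r2c = r1 - r1.mean(0), r2 - r2.mean(0)
    return (r1c * r2c).sum(0) / np.sqrt((r1c ** 2).sum(0) * (r2c ** 2).sum(0))

scores = stability_scores(rep1, rep2)
top = np.argsort(scores)[-50:]  # keep the 50 most stable features
```

Features whose responses repeat across presentations score high and are retained; unrepeatable features score near zero and are discarded before training the prediction model.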

Cognitive Science

Jonathon J. Kopecky

While some studies suggest cultural differences in visual processing, others do not, possibly because the complexity of their tasks draws upon high-level factors that could obscure such effects. To control for this, we examined cultural differences in visual search for geometric figures, a relatively simple task for which the underlying mechanisms are reasonably well known. We replicated earlier results showing that North Americans had a reliable search asymmetry for line length: Search for long among short lines was faster than vice versa. In contrast, Japanese participants showed no asymmetry. This difference did not appear to be affected by stimulus density. Other kinds of stimuli resulted in other patterns of asymmetry differences, suggesting that these are not due to factors such as analytic/holistic processing but are based instead on the target-detection process. In particular, our results indicate that at least some cultural differences reflect different ways of processing early-level features, possibly in response to environmental factors.

SPIE Defense + Security

David W. Blodgett; Mark A. Chevillet; Michael P. McLoughlin; Michael J. Fitch; Bruce A. Swett; Scott M. Hendrickson; Clara A. Scholl; Grace M. Hwang; Erich C. Walter

The development of portable non-invasive brain computer interface technologies with higher spatio-temporal resolution has been motivated by the tremendous success seen with implanted devices. This talk will discuss efforts to overcome several major obstacles to viability including approaches that promise to improve spatial and temporal resolution. Optical approaches in particular will be highlighted and the potential benefits of both Blood-Oxygen Level Dependent (BOLD) and Fast Optical Signal (FOS) will be discussed. Early-stage research into the correlations between neural activity and FOS will be explored.

Proceedings of the Human Factors and Ergonomics Society Annual Meeting

Kelly P. Sharer; Alexander G. Perrone; Kylie A. Molinaro; Nathan D. Bos; Ariel M. Greenberg

Open plan offices are both popular and controversial. We studied the response of a group moving from shared, but closed offices to an open plan office. The main data source reported here is a workplace satisfaction survey given pre-move, post-move, and to a lab baseline comparison group at the same organization, with some additional data from observations and interviews. Workers moving to the open plan office appreciated the flexible support for collaboration and the space’s appearance. There was lower satisfaction related to space for private concentrated work, temperature control, and ability to have private conversations. There were also some statistical interactions suggesting more positive responses by males and less positive responses by introverts; analysis was limited by small sample size. Observations and interviews gave further insight into open plan “neighborhoods” and the design of ad hoc spaces.

American Journal of Public Health

Howard S. Burkom

Public health institutions at local, regional, and national levels face evolving challenges with limited resources. Multiple forms of data are increasingly available, ranging from streaming statistical data to episodic reports of confirmed disease incidence. While technological tools for collecting and using these data proliferate, economic pressures often preclude growth of concomitant staff with required expertise. The intent here is to provide perspective on evolution of public health surveillance since the late 1990s, suggest how methodological approaches can be improved, and recommend areas of growth given mandates for evidence-based policy and practice. Remarks in this article stem from my last 18 years’ work on surveillance system development at the Johns Hopkins University Applied Physics Laboratory, as consultant to the US Centers for Disease Control and Prevention, and as board member and research committee chair of the International Society for Disease Surveillance. These efforts have been enriched by collaborations with US and international public health partners, both civilian and military.

Public Health Reports

Howard S. Burkom

Syndromic surveillance has expanded since 2001 in both scope and geographic reach and has benefited from research studies adapted from numerous disciplines. The practice of syndromic surveillance continues to evolve rapidly. The International Society for Disease Surveillance solicited input from its global surveillance network on key research questions, with the goal of improving syndromic surveillance practice. A workgroup of syndromic surveillance subject matter experts was convened from February to June 2016 to review and categorize the proposed topics. The workgroup identified 12 topic areas in 4 syndromic surveillance categories: informatics, analytics, systems research, and communications. This article details the context of each topic and its implications for public health. This research agenda can help catalyze the research that public health practitioners identified as most important.

Journal of Biomedical Informatics

Howard S. Burkom

To compare the performance of the standard Historical Limits Method (HLM) with a modified HLM (MHLM), the Farrington-like Method (FLM), and the Serfling-like Method (SLM) in detecting simulated outbreak signals, we used weekly time series data for 12 infectious diseases from the U.S. Centers for Disease Control and Prevention's National Notifiable Diseases Surveillance System (NNDSS). Data from 2006 to 2010 were used as the baseline, and data from 2011 to 2014 were used to test the four detection methods. MHLM outperformed HLM in terms of background alert rate, sensitivity, and alerting delay. On average, SLM and FLM had higher sensitivity than MHLM. Among the four methods, the FLM had the highest sensitivity and the lowest background alert rate and alerting delay. Revising or replacing the standard HLM may improve the performance of aberration detection for NNDSS standard weekly reports.
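
To convey the flavor of these detectors, here is a rough sketch of the Historical Limits idea (our simplified reading, not CDC code): the current count is flagged when it exceeds the mean of 15 historical baseline counts by more than two standard deviations. The counts below are invented.

```python
from statistics import mean, stdev

def hlm_alert(current, baseline):
    """Flag `current` if it exceeds mean + 2*SD of the historical counts."""
    return current > mean(baseline) + 2 * stdev(baseline)

# Invented counts from 15 comparable periods in prior years.
baseline = [12, 15, 11, 14, 13, 16, 12, 15, 13, 14, 11, 15, 12, 13, 14]
```

With this baseline (mean ≈ 13.3, SD ≈ 1.5), a count of 25 raises an alert while a count of 16 does not; the FLM and SLM replace the fixed two-SD threshold with regression-based expected counts.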

Geosci. Model Dev.

Grace M. Hwang

We describe the Bayesian user-friendly model for palaeo-environmental reconstruction (BUMPER), a Bayesian transfer function for inferring past climate and other environmental variables from microfossil assemblages. BUMPER is fully self-calibrating, straightforward to apply, and computationally fast, requiring ∼2 s to build a 100-taxon model from a 100-site training set on a standard personal computer. We apply the model's probabilistic framework to generate thousands of artificial training sets under ideal assumptions. We then use these to demonstrate the sensitivity of reconstructions to the characteristics of the training set, considering assemblage richness, taxon tolerances, and the number of training sites. We find that a useful guideline for the size of a training set is to provide, on average, at least 10 samples of each taxon. We demonstrate general applicability to real data, considering three different organism types (chironomids, diatoms, pollen) and different reconstructed variables. An identically configured model is used in each application, the only change being the input files that provide the training-set environment and taxon-count data. The performance of BUMPER is shown to be comparable with weighted average partial least squares (WAPLS) in each case. Additional artificial datasets are constructed with similar characteristics to the real data, and these are used to explore the reasons for the differing performances of the different training sets.


Michael E. Wolmetz; Christopher R. Ratto; Carlos A. Caceres Garcia; Griffin W. Milsap; Matthew J. Roos; Mark A. Chevillet

Non-invasive neuroimaging studies have shown that semantic category and attribute information are encoded in neural population activity. Electrocorticography (ECoG) offers several advantages over non-invasive approaches, but the degree to which semantic attribute information is encoded in ECoG responses is not known. We recorded ECoG while patients named objects from 12 semantic categories and then trained high-dimensional encoding models to map semantic attributes to spectral-temporal features of the task-related neural responses. Using these semantic attribute encoding models, untrained objects were decoded with accuracies comparable to whole-brain functional Magnetic Resonance Imaging (fMRI), and we observed that high-gamma activity (70–110 Hz) at basal occipitotemporal electrodes was associated with specific semantic dimensions (manmade-animate, canonically large-small, and places-tools). Individual patient results were in close agreement with reports from other imaging modalities on the time course and functional organization of semantic processing along the ventral visual pathway during object recognition. The semantic attribute encoding model approach is critical for decoding objects absent from a training set, as well as for studying complex semantic encodings without artificially restricting stimuli to a small number of semantic categories.

Giga Science

Rachel J. Vogelstein; William R. Gray Roncal; Dean M. Kleissas

Modern technologies are enabling scientists to collect extraordinary amounts of complex and sophisticated data across a huge range of scales like never before. With this onslaught of data, we can allow the focal point to shift from data collection to data analysis. Unfortunately, lack of standardized sharing mechanisms and practices often make reproducing or extending scientific results very difficult. With the creation of data organization structures and tools that drastically improve code portability, we now have the opportunity to design such a framework for communicating extensible scientific discoveries. Our proposed solution leverages these existing technologies and standards, and provides an accessible and extensible model for reproducible research, called ‘science in the cloud’ (SIC). Exploiting scientific containers, cloud computing, and cloud data services, we show the capability to compute in the cloud and run a web service that enables intimate interaction with the tools and data presented. We hope this model will inspire the community to produce reproducible and, importantly, extensible results that will enable us to collectively accelerate the rate at which scientific breakthroughs are discovered, replicated, and extended.

Games and Culture

Rebecca E. Rhodes; Jennifer A. McKneely; Nathan D. Bos; Jonathon J. Kopecky; Alexander G. Perrone; Jason A. Spitaletta

Game-based training may have different characteristics than other forms of instruction. The independent validation of the Intelligence Advanced Research Projects Activity (IARPA) Sirius program evaluated game-based cognitive bias training across several games with a common set of control groups. Control groups included a professionally produced video that taught the same cognitive biases and an unrelated video that did not teach any biases. Knowledge was tested immediately after training and after a delay. This article presents the results from the two phases of the Sirius program. Game-based training showed advantages in teaching bias mitigation skills (procedural knowledge) but had no advantage over video instruction in teaching people to answer explicit questions about biases (declarative knowledge). Overall, training effects persisted over time, and games performed as well as and in some cases better than the video-based instruction for knowledge retention. Our results suggest that serious games can be an effective training tool, particularly for teaching procedural knowledge.

IEEE Transactions on Computational Social Systems

Elizabeth P. Reilly; Alison C. Albin; Jonathan D. Cohen; Mykola Hayvanovych

A May 2011 Nature article by Liu, Slotine, and Barabasi laid a mathematical foundation for analyzing network controllability of self-organizing networks and how to identify the minimum number of nodes needed to control a network, or driver nodes. In this paper, we continue to explore this topic, beginning with a look at how Laplacian eigenvalues relate to the percentage of nodes required to control a network. Next, we define and analyze super driver nodes, or those driver nodes that survive graph randomization. Finally, we examine node properties to differentiate super driver nodes from other types of nodes in a graph.
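
The Liu, Slotine, and Barabasi result referenced above reduces finding a minimum driver-node set to a maximum matching problem: nodes left without a matched incoming edge must be driven directly. A minimal pure-Python sketch using Kuhn's augmenting-path matching (the graphs are toy examples, not the networks studied in the paper):

```python
def driver_nodes(n, edges):
    """Driver nodes of a directed graph on nodes 0..n-1: the nodes left
    unmatched (no matched incoming edge) by a maximum matching."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
    match_to = [-1] * n  # match_to[v] = u if edge u->v is in the matching

    def augment(u, seen):
        # Try to match u's out-edge, rerouting existing matches if needed.
        for v in adj[u]:
            if not seen[v]:
                seen[v] = True
                if match_to[v] == -1 or augment(match_to[v], seen):
                    match_to[v] = u
                    return True
        return False

    for u in range(n):
        augment(u, [False] * n)
    unmatched = [v for v in range(n) if match_to[v] == -1]
    return unmatched or [0]  # a perfectly matched graph still needs one driver

# A directed path 0 -> 1 -> 2 is controllable from its head alone.
path_drivers = driver_nodes(3, [(0, 1), (1, 2)])
```

A star 0 -> {1, 2, 3} needs three drivers (only one leaf can be matched), while a directed cycle has a perfect matching and needs only one, matching the N minus maximum-matching formula.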

IEEE Transactions on Robotics

Katie M. Popek

A variety of magnetic devices can be manipulated remotely using a single permanent “actuator” magnet positioned in space by a robotic manipulator. This paper describes the spherical-actuator-magnet manipulator (SAMM), which is designed to replace or augment the singularity-prone spherical wrist used by prior permanent-magnet manipulation systems. The SAMM uses three omniwheels to enable holonomic control of the heading of its magnet's dipole and to enable its magnet to be rotated continuously about any axis of rotation. The SAMM performs closed-loop control of its dipole's heading using field measurements obtained from Hall-effect sensors as feedback, combined with modeled dynamics, using an extended Kalman filter. We describe the operation and construction of the SAMM, develop and characterize controllers for the SAMM's spherical magnet, and demonstrate remote actuation of an untethered magnetic device in a lumen using the SAMM.

2017 IEEE International Conference on Robotics and Automation (ICRA)

Paul G. Stankiewicz; Galen E. Mullins

We propose a novel method for generating test scenarios for a black-box autonomous system that demonstrate critical transitions in its performance modes. In complex environments, it is possible for an autonomous system to fail at its assigned mission even if it complies with requirements for all subsystems and throws no faults. This is particularly true when the autonomous system may have to choose between multiple exclusive objectives. The standard approach to testing robustness through fault detection is to stimulate the system directly and detect violations of the system requirements. Our approach differs by instead running the autonomous system through full missions in a simulated environment and measuring performance against high-level mission criteria. The result is a method of searching for challenging scenarios for an autonomous system under test that exercise a variety of performance modes. We utilize adaptive sampling to intelligently search the state space for test scenarios that lie on the boundary between distinct performance modes. Additionally, using unsupervised clustering techniques, we can group scenarios by their performance modes and sort them by those which are most effective at diagnosing changes in the autonomous system's behavior.

International Journal on Software Tools for Technology Transfer

Ryan W. Gardner; Aurora C. Schmidt; Yanni A. Kouskoulas

The Next-Generation Airborne Collision Avoidance System (ACAS X) is intended to be installed on all large aircraft to give advice to pilots and prevent mid-air collisions with other aircraft. It is currently being developed by the Federal Aviation Administration (FAA). In this paper, we determine the geometric configurations under which the advice given by ACAS X is safe under a precise set of assumptions and formally verify these configurations using hybrid systems theorem proving techniques. We consider subsequent advisories and show how to adapt our formal verification to take them into account. We examine the current version of the real ACAS X system and discuss some cases where our safety theorem conflicts with the actual advisory given by that version, demonstrating how formal hybrid systems proving approaches are helping to ensure the safety of ACAS X. Our approach is general and could also be used to identify unsafe advice issued by other collision avoidance systems or confirm their safety.


William R. Gray Roncal

Methods for resolving the three-dimensional (3D) microstructure of the brain typically start by thinly slicing and staining the brain, followed by imaging numerous individual sections with visible light photons or electrons. In contrast, X-rays can be used to image thick samples, providing a rapid approach for producing large 3D brain maps without sectioning. Here we demonstrate the use of synchrotron X-ray microtomography (µCT) for producing mesoscale (∼1 µm³ resolution) brain maps from millimeter-scale volumes of mouse brain. We introduce a pipeline for µCT-based brain mapping that develops and integrates methods for sample preparation, imaging, and automated segmentation of cells, blood vessels, and myelinated axons, in addition to statistical analyses of these brain structures. Our results demonstrate that X-ray tomography achieves rapid quantification of large brain volumes, complementing other brain mapping and connectomics efforts.

2017 IEEE Biomedical Circuits and Systems Conference (BioCAS)

Matthew S. Fifer; Courtneyleigh W. Moran; Robert S. Armiger

In this work, we investigated the use of noninvasive, targeted transcutaneous electrical nerve stimulation (TENS) of peripheral nerves to provide sensory feedback to two amputees, one with targeted sensory reinnervation (TSR) and one without TSR. A major step in developing a closed-loop prosthesis is providing the sense of touch back to the amputee user. We investigated the effect of targeted nerve stimulation amplitude, pulse width, and frequency on stimulation perception. We discovered that both subjects were able to reliably detect stimulation patterns with pulses shorter than 1 ms. We used the psychophysical results to produce a subject-specific stimulation pattern, using a leaky integrate-and-fire (LIF) neuron model driven by force sensors on a prosthetic hand during a grasping task. For the first time, we show that TENS is able to provide graded sensory feedback at multiple sites in both TSR and non-TSR amputees, while using behavioral results to tune a neuromorphic stimulation pattern driven by the force sensor output of a prosthetic hand.
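
A leaky integrate-and-fire encoder of the kind described can be sketched as below. The time constant, threshold, and force values here are arbitrary illustrations, not the subject-specific parameters tuned in the study:

```python
def lif_spike_times(force, dt=0.001, tau=0.02, r=1.0, v_th=1.0):
    """Convert force samples into spike indices via an LIF neuron model."""
    v, spikes = 0.0, []
    for i, f in enumerate(force):
        v += (dt / tau) * (-v + r * f)  # leaky integration of the input
        if v >= v_th:
            spikes.append(i)  # emit a stimulation pulse ...
            v = 0.0           # ... and reset the membrane potential
    return spikes

# A firmer grasp charges the membrane faster, giving a higher pulse rate.
light = lif_spike_times([1.5] * 1000)
firm = lif_spike_times([4.0] * 1000)
```

Forces below the threshold-scaled level (here, below 1.0) never elicit a spike, so the encoder naturally suppresses sensor noise while grading pulse rate with grasp strength.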

SPIE Defense + Security, 2017

James D. Beaty; Denise M. D'Angelo; John B. Helder; Christopher J. Dohopolski; Matthew S. Johannes; Brock A. Wester; Johnathan A. Pino; Matthew J. Rich; Michael P. McLoughlin; Matthew S. Fifer; Eric A. Pohlmeyer; Francesco V. Tenore

Brain-computer interface (BCI) research has progressed rapidly, with BCIs shifting from animal tests to human demonstrations of controlling computer cursors and even advanced prosthetic limbs, the latter having been the goal of the Revolutionizing Prosthetics (RP) program. These achievements now include direct electrical intracortical microstimulation (ICMS) of the brain to provide human BCI users feedback information from the sensors of prosthetic limbs. These successes raise the question of how well people would be able to use BCIs to interact with systems that are not based directly on the body (e.g., prosthetic arms), and how well BCI users could interpret ICMS information from such devices. If paralyzed individuals could use BCIs to effectively interact with such non-anthropomorphic systems, it would offer them numerous new opportunities to control novel assistive devices. Here we explore how well a participant with tetraplegia can detect infrared (IR) sources in the environment using a prosthetic arm mounted camera that encodes IR information via ICMS. We also investigate how well a BCI user could transition from controlling a BCI based on prosthetic arm movements to controlling a flight simulator, a system with different physical dynamics than the arm. In that test, the BCI participant used environmental information encoded via ICMS to identify which of several upcoming flight routes was the best option. For both tasks, the BCI user was able to quickly learn how to interpret the ICMS-provided information to achieve the task goals.

2017 IEEE International Conference on Systems, Man, and Cybernetics (SMC)

Edward W. Tunstel Jr.; Max R. Basescu; Kelles D. Gordge; Bryanna Y. Yeh; Agata M. Ciesielski

A prototype conversational interface has been developed to facilitate spoken interaction between humans and robots. The focus is on end-users such as security officers including military personnel who would conduct missions with robot partners in close proximity. The focal group is familiar with Battle Management Language, a variation of natural language that is restricted to avoid ambiguity. This language is chosen, given its familiarity, to inspire natural spoken interaction with robots and to avoid the greater challenge of full natural language processing. To achieve this, an open-source framework for building applications with conversational interfaces, Dorset, was interfaced to a research mobile robot enabling spoken voice commands and responses. The completed prototype successfully demonstrated feasibility and potential for advancing interaction with robots from current head-down, hands-on approaches toward a more natural head-up, hands-off paradigm suitable for humans and robots working in close proximity to one another.


Ryan J. Murphy; Mehran Armand

Real-time large-deflection sensing of continuum dexterous manipulators (CDMs) is essential and challenging for many minimally invasive surgery (MIS) applications. To this end, the feasibility of using Fiber Bragg Grating (FBG) sensors to detect large CDM deflections was demonstrated. Previous studies by our group proposed attaching an FBG array along with two nitinol (NiTi) wires as substrates to form a triangular cross section capable of large-deflection detection for a 35 mm CDM. The strenuous fabrication procedure, however, relies on trial and error to ensure accurate attachment of components. In this paper, we propose a novel design for assembling large-deflection FBG sensors using a custom-designed three-lumen polycarbonate tube with a circular cross section. The proposed design eliminates fabrication challenges by embedding the FBG array and NiTi wires inside the tube in a more robust, repeatable, time-efficient, and cost-effective (compared to multicore fibers) manner. Calibration experiments of the sensor assembly, both alone and inside the CDM, indicate a consistent linear (R² ≈ 0.99) wavelength-curvature relationship. Experimental results show 3.3% error in curvature detection.
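
The linear wavelength-shift-versus-curvature calibration reported above amounts to an ordinary least-squares fit. The sample numbers below are made up to illustrate the fit and R² computation, not measured values from the paper:

```python
def linear_fit(x, y):
    """Least-squares slope/intercept for y ~ a*x + b, plus R^2."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    a = sxy / sxx
    b = my - a * mx
    ss_res = sum((yi - (a * xi + b)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return a, b, 1 - ss_res / ss_tot

curvature = [0.0, 5.0, 10.0, 15.0, 20.0, 25.0]    # 1/m (hypothetical)
shift_nm = [0.00, 0.51, 0.99, 1.52, 2.01, 2.49]   # Bragg shift (hypothetical)
slope, intercept, r2 = linear_fit(curvature, shift_nm)
```

Once the slope is calibrated, curvature along the CDM is recovered in real time by dividing each grating's measured wavelength shift by that slope.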

IEEE/ASME Transactions on Mechatronics

Ryan J. Murphy

Dexterous continuum manipulators (DCMs) have been widely adopted for minimally- and less-invasive surgery. During the operation, these DCMs interact with the surrounding anatomy actively or passively. The interaction force inevitably affects the tip position and shape of DCMs, leading to potentially inaccurate control near critical anatomy. In this paper, we developed a two-dimensional mechanical model for a tendon-actuated, notched DCM with compliant joints. The model accurately predicted deformation of the DCM in the presence of tendon force, friction force, and external force. A partition approach was proposed to describe the DCM as a series of interconnected rigid and flexible links. Beam mechanics, taking into consideration tendon interaction and external force on the tip and the body, was applied to obtain the deformation of each flexible link of the DCM. The model results were compared with experiments for free bending as well as bending in the presence of external forces acting at either the tip or body of the DCM. The overall mean error of tip position between model predictions and all of the experimental results was 0.62 ± 0.41 mm. The results suggest that the proposed model can effectively predict the shape of the DCM.
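The beam-mechanics treatment of each flexible link can be illustrated with the simplest Euler-Bernoulli case: tip deflection of a cantilever under a point tip load, delta = F L³ / (3 E I). The numbers below (link length, nitinol-like modulus, rectangular section) are hypothetical stand-ins, not the paper's parameters, and the full model additionally accounts for tendon and friction forces.

```python
# Illustrative Euler-Bernoulli cantilever check for one flexible link:
# tip deflection under a point tip load. All values are hypothetical.
F = 0.5                 # N, tip load
L = 0.010               # m, flexible link length
E = 75e9                # Pa, approximate nitinol elastic modulus
b, h = 0.004, 0.0005    # m, rectangular cross-section width and thickness

I = b * h**3 / 12           # second moment of area of the section
delta = F * L**3 / (3 * E * I)   # tip deflection in meters
```

Chaining such per-link deflections through the interconnected rigid and flexible segments is, in spirit, how a partitioned model assembles the overall DCM shape.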

2016 23rd International Conference on Pattern Recognition (ICPR)

Philippe M. Burlina; Seth D. Billings

This study focuses on using ultrasound (US) biomarkers for characterizing myopathies, and in particular myositis. US offers an opportunity to deliver diagnostics in clinical settings at a fraction of the cost and discomfort entailed in current workflows. US is also better suited for use in under-resourced environments. This paper studies the link between biomarkers related to absolute and relative echo intensity of muscle tissue and the presence and severity of myositis disease. We show that there is good correlation between these biomarkers and the severity of muscle disease rated by the Heckmatt criteria. A moderate correlation is also found between these biomarkers and muscles categorized by the healthy vs. diseased status of each patient. Experimental data involving 37 patients (9 polymyositis, 3 dermatomyositis, 9 inclusion body myositis, and 16 healthy patients) and seven muscle groups show correlations up to 0.91.
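The reported correlations between echo-intensity biomarkers and Heckmatt severity grades can be computed as below. The grades and intensity values here are synthetic placeholders for demonstration, not study data.

```python
import numpy as np

# Synthetic illustration: an echo-intensity biomarker for muscles
# graded by the Heckmatt criteria (1 = normal ... 4 = severe).
heckmatt_grade = np.array([1, 1, 2, 2, 3, 3, 4, 4])
echo_intensity = np.array([52., 55., 63., 60., 74., 71., 85., 88.])  # arbitrary units

# Pearson correlation between the biomarker and the severity grade
r = np.corrcoef(heckmatt_grade, echo_intensity)[0, 1]
```

With ordinal grades such as Heckmatt scores, a rank-based measure (e.g., Spearman's rho via `scipy.stats.spearmanr`) is a common alternative to the Pearson coefficient shown here.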

2016 23rd International Conference on Pattern Recognition (ICPR)

Philippe M. Burlina

Deep convolutional neural networks (DCNNs) perform on par with or better than humans for image classification. Hence, efforts have now shifted to more challenging tasks such as object detection and classification in images, video, or RGBD. Recently developed region CNNs (R-CNNs) such as Fast R-CNN [7] address this detection task for images. This paper is instead concerned with video, and also focuses on resource-limited systems. Newly proposed methods accelerate R-CNN by sharing convolutional layers for proposal generation, location regression, and labeling [12][13][19][25]. When applied to video, these approaches are stateless: they process each image individually. This suggests an alternate route: make R-CNN stateful and exploit temporal consistency. We extend Fast R-CNN by making it employ recursive Bayesian filtering and perform proposal propagation and reuse. We couple multi-target proposal/detection tracking (MTT) with R-CNN and perform detection-to-track association. We call this approach MRCNN, short for MTT + R-CNN. In MRCNN, region proposals that are vetted via classification and regression in R-CNN are treated as observations in MTT and propagated using assumed kinematics. Actual proposal generation (e.g., via Selective Search) need only be performed sporadically and/or periodically, and is replaced at all other times by MTT proposal predictions. Preliminary results show that MRCNN can economize on both proposal and classification computations, yielding a factor of 10 to 30 decrease in the number of proposals generated, about an order of magnitude savings in proposal computation time, and nearly an order of magnitude improvement in overall computation time, for comparable localization and classification performance. The method can additionally be beneficial for false alarm abatement.
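The MTT prediction step that replaces explicit proposal generation can be sketched as a constant-velocity state propagation of each tracked box between frames. This is an illustrative simplification of the filtering described above (no measurement update or covariance shown), with all state names and values chosen here for demonstration.

```python
import numpy as np

# Constant-velocity transition over one frame for a tracked proposal.
# State: [cx, cy, vx, vy] for the box center; width/height are carried
# along unchanged in this simplified sketch.
dt = 1.0  # one frame
F = np.array([[1, 0, dt, 0],
              [0, 1, 0, dt],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], dtype=float)

def predict_proposal(state, box_wh):
    """Predict the next-frame proposal box from the current track state."""
    next_state = F @ state
    cx, cy = next_state[0], next_state[1]
    w, h = box_wh
    # Predicted box as (x1, y1, x2, y2), ready to feed to the detector
    return next_state, (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)

# A track centered at (100, 50), moving 5 px/frame right and 2 px/frame down
state = np.array([100.0, 50.0, 5.0, 2.0])
state, box = predict_proposal(state, (40, 30))
```

In a full pipeline, boxes vetted by the R-CNN head would serve as the observations that correct these predicted states, and fresh Selective Search proposals would be injected only periodically.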

Science and Technology for the Built Environment

Grace M. Hwang

Nearly 600 articles were located in citation and keyword searches regarding the effects of humidity on comfort, health, and indoor environmental quality. Of these, around 70 articles reported the effects of low humidity (relative humidity ≤ 40%) and were analyzed in detail. Information in some categories was well chronicled, while other categories had significant knowledge gaps. Low humidity decreased house dust mite allergens. Due to different envelopes, generalizations could not be made for all bacteria and viruses. However, lower humidity increased virus survival for influenza. For comfort, low humidity had little effect on thermal comfort, but skin dryness, eye irritation, and static electricity increased as humidity decreased. For indoor environmental quality, low humidity had nonuniform effects on volatile organic compound emissions and perceived indoor air quality. Across many low humidity studies, ventilation rates and exposure times were noted as confounding variables. A majority of studies that used human subjects utilized exposure times of 3 h or less with adult subjects; few studies used children, adolescents, or elderly subjects.

ACCV 2016

Derek M. Rollend; Kapil D. Katyal; Philippe M. Burlina; Seth D. Billings; Kevin C. Wolfe; Paul E. Rosendall

We describe the recent development of assistive computer vision algorithms for use with the Argus II retinal prosthesis system. While users of the prosthetic system can learn and adapt to the limited stimulation resolution, there exists great potential for computer vision algorithms to augment the experience and significantly increase the utility of the system for the user. To this end, our recent work has focused on helping with two different challenges encountered by the visually impaired: face detection and object recognition. In this paper, we describe algorithm implementations in both of these areas that make use of the retinal prosthesis for visual feedback to the user, and discuss the unique challenges faced in this domain.

Experimental Neurology

Brock A. Wester; Matthew J. Rich; Michael P. McLoughlin; James D. Beaty; Brendan A. John; Eric A. Pohlmeyer

As Brain-Computer Interface (BCI) systems advance for uses such as robotic arm control, it is postulated that the control paradigms could apply to other scenarios, such as control of video games, wheelchair movement, or even flight. The purpose of this pilot study was to determine whether our BCI system, which involves decoding the signals of two 96-microelectrode arrays implanted into the motor cortex of a subject, could also be used to control an aircraft in a flight simulator environment. The study involved six sessions in which various parameters were modified in order to achieve the best flight control, including plane type, view, control paradigm, gains, and limits. Successful flight was determined qualitatively by evaluating the subject's ability to perform requested maneuvers, maintain flight paths, and avoid control losses such as dives, spins, and crashes. By the end of the study, it was found that the subject could successfully control an aircraft. The subject could use both the jet and propeller plane with different views, adopting an intuitive control paradigm. From the subject's perspective, this was one of the most exciting and entertaining experiments she had performed in two years of research. In conclusion, this study provides a proof of concept that traditional motor cortex signals combined with a decoding paradigm can be used to control systems besides the robotic arm for which the decoder was developed. Aside from possible functional benefits, it also shows the potential for a new recreational activity for individuals with disabilities who are able to master BCI control.

IEEE International Conference on Systems, Man, and Cybernetics

Edward W. Tunstel Jr.; Robert J. Bamberger Jr.; Colin J. Taylor; Jessica M. Hatch; Ryan J. Murphy; Matthew P. Para; Kapil D. Katyal; Matthew S. Johannes; Kevin C. Wolfe; Joseph L. Moore

This paper presents a nested marsupial robotic system and its execution of a notional disaster response task. Human supervised autonomy is facilitated by tightly-coupled, high-level user feedback enabling command and control of a bimanual mobile manipulator carrying a quadrotor unmanned aerial vehicle that carries a miniature ground robot. Each robot performs a portion of a mock hazardous chemical spill investigation and sampling task within a shipping container. This work offers an example application for a heterogeneous team of robots that could directly support first responder activities using complementary capabilities of autonomous dexterous manipulation and mobility, autonomous planning and control, and teleoperation. The task was successfully executed during multiple live trials at the DARPA Robotics Challenge Technology Expo in June 2015. A key contribution of the work is the application of a unified algorithmic approach to autonomous planning, control, and estimation supporting vision-based manipulation and non-GPS-based ground and aerial mobility, thus reducing algorithmic complexity across this capability set. The unified algorithmic approach is described along with the robot capabilities, hardware implementations, and human interface, followed by discussion of live demonstration execution and results.

2016 NIPS Workshop on Deep Learning for Action and Interaction

I-Jeng Wang; Edward W. Staley; Austin D. Reiter; Kapil D. Katyal; Matthew S. Johannes; Philippe M. Burlina

Deep learning (DL) has led to near-human or better-than-human performance in image classification and object/speech recognition. DL is now providing new tools to address autonomous robotic manipulation and navigation challenges. One of the fundamental capabilities necessary for robotic manipulation is the ability to reorient objects within the hand. In this paper, we describe an approach using Deep Reinforcement Learning (DRL) techniques to learn a policy for in-hand manipulation directly from raw image pixels. The paper presents an overview of the approach, a description of the algorithms, and a working prototype using the Modular Prosthetic Limb (MPL) in a Gazebo simulation.
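At its core, a pixels-to-action policy of the kind DRL learns is a function from a raw image to a distribution over discrete reorientation actions. The toy sketch below shows only that forward mapping; the random weights stand in for what training would produce, and the image size, action count, and linear form are all illustrative assumptions, not the paper's architecture.

```python
import numpy as np

# Toy pixels-to-action policy: flatten a small grayscale frame and map
# it to a softmax distribution over a few discrete in-hand reorientation
# actions. Weights are random stand-ins for trained parameters.
rng = np.random.default_rng(0)
n_actions = 4                       # e.g., rotate +/- about two axes
obs = rng.random((16, 16))          # stand-in for a raw camera frame
W = rng.normal(scale=0.01, size=(obs.size, n_actions))

logits = obs.ravel() @ W
probs = np.exp(logits - logits.max())
probs /= probs.sum()                # softmax over actions
action = int(np.argmax(probs))      # greedy action at evaluation time
```

During training, a DRL algorithm would instead sample actions from `probs` and adjust `W` (in practice, convolutional network weights) from reward signals generated in simulation.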

2014 6th Workshop on Hyperspectral Image and Signal Processing: Evolution in Remote Sensing (WHISPERS)

Saurabh Vyas; Philippe M. Burlina

Our work is focused on the development of non-invasive methods to estimate skin constitutive elements. Such methods can play an important clinical and scientific role in detecting the early onset of skin tumors. Given current statistics from the American Academy of Dermatology suggesting that more than 10 people die each hour worldwide due to skin-related conditions, this has potentially high impact on the delivery of skin cancer diagnostics and on patient mortality and morbidity. It can also serve as a valuable tool for research in cosmetology and pharmaceuticals in general. We combine a physics-based model of human skin with machine learning and hyperspectral imaging to non-invasively estimate physiological skin parameters, including melanosomes, collagen, oxygen saturation, blood volume, and skin thickness. While some prior work has been done in this regard, it has not been validated against ground truth. In this study, we develop a protocol to validate our methodology for estimating one of these skin parameters, skin thickness, using a dataset of 48 hyperspectral signatures obtained in vivo, and cross-validate our depth estimates against a gold standard obtained via ultrasound. Relative to this gold standard, we find promising mean absolute errors of less than 0.1 mm for skin thickness estimation.
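The reported error metric is a mean absolute error between model-estimated skin thickness and the ultrasound gold standard, computed as below. The values here are synthetic placeholders, not study data.

```python
import numpy as np

# Illustrative MAE computation: model-estimated skin thickness vs. an
# ultrasound gold standard, in millimeters. Values are invented.
estimated_mm = np.array([1.12, 0.98, 1.45, 1.30, 0.87])
ultrasound_mm = np.array([1.05, 1.02, 1.40, 1.38, 0.90])

mae = np.mean(np.abs(estimated_mm - ultrasound_mm))  # mean absolute error, mm
```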