
2021

Unsupervised Discovery, Control, and Disentanglement of Semantic Attributes with Applications to Anomaly Detection


Abstract

This work focuses on the ability to control semantic image attributes in generative models via latent space factors, and on the ability to discover mappings from factors to attributes in an unsupervised fashion. The discovery of controllable semantic attributes is of special importance, as it would facilitate higher-level tasks such as unsupervised representation learning to improve anomaly detection, or the controlled generation of novel data for domain shift and imbalanced datasets. The ability to control semantic attributes is related to the disentanglement of latent factors, which dictates that latent factors be "uncorrelated" in their effects. Unfortunately, despite past progress, the connection between control and disentanglement remains, at best, confused and entangled, requiring clarifications we hope to provide in this work. To this end, we study the design of algorithms for image generation that allow unsupervised discovery and control of semantic attributes. We make several contributions: a) We bring order to the concepts of control and disentanglement by providing an analytical derivation that connects mutual information maximization, which promotes attribute control, to total correlation minimization, which relates to disentanglement. b) We propose hybrid generative model architectures that use mutual information maximization with multi-scale style transfer. c) We introduce a novel metric to characterize the performance of semantic attribute control. We report experiments that appear to demonstrate, quantitatively and qualitatively, the ability of the proposed model to perform satisfactory control while still preserving competitive visual quality. We compare to other state-of-the-art methods (e.g., Fréchet inception distance (FID) = 9.90 on CelebA and 4.52 on EyePACS).
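To make the notion of total correlation mentioned above more concrete, the following is a minimal sketch for a toy discrete case. It assumes the standard definition, TC(z) = sum_i H(z_i) - H(z), which is zero exactly when the factors are independent. The function names and the two example distributions are illustrative assumptions; this is not the paper's estimator, loss function, or lemma.

```python
import math
from collections import defaultdict


def entropy(probs):
    """Shannon entropy (in nats) of an iterable of probabilities."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def total_correlation(joint):
    """Total correlation TC(z) = sum_i H(z_i) - H(z) of a discrete joint.

    `joint` maps outcome tuples (z_1, ..., z_n) to probabilities that
    sum to 1. This is a toy, exact computation (hypothetical helper),
    not an estimator suitable for continuous latent codes.
    """
    n = len(next(iter(joint)))
    # Accumulate the marginal distribution of each factor.
    marginals = [defaultdict(float) for _ in range(n)]
    for outcome, p in joint.items():
        for i, value in enumerate(outcome):
            marginals[i][value] += p
    return sum(entropy(m.values()) for m in marginals) - entropy(joint.values())


# Two independent binary factors: every combination is equally likely.
independent = {(a, b): 0.25 for a in (0, 1) for b in (0, 1)}
# Two fully coupled binary factors: they always take the same value.
coupled = {(0, 0): 0.5, (1, 1): 0.5}

tc_independent = total_correlation(independent)
tc_coupled = total_correlation(coupled)
print(tc_independent, tc_coupled)
```

In this toy setting the independent factors give a total correlation of zero, while the fully coupled factors give log 2 nats, the largest value possible for two binary variables. For two variables, total correlation coincides with their mutual information.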

Citation

article: 10.1162/neco_a_01359
author: Paul, William and Wang, I-Jeng and Alajaji, Fady and Burlina, Philippe
title: "Unsupervised Discovery, Control, and Disentanglement of Semantic Attributes With Applications to Anomaly Detection"
journal: Neural Computation
volume: 33
number: 3
pages: 802-826
year: 2021
month: 03
abstract: "Our work focuses on unsupervised and generative methods that address the following goals: (1) learning unsupervised generative representations that discover latent factors controlling image semantic attributes, (2) studying how this ability to control attributes formally relates to the issue of latent factor disentanglement, clarifying related but dissimilar concepts that had been confounded in the past, and (3) developing anomaly detection methods that leverage representations learned in the first goal. For goal 1, we propose a network architecture that exploits the combination of multiscale generative models with mutual information (MI) maximization. For goal 2, we derive an analytical result, lemma 1, that brings clarity to two related but distinct concepts: the ability of generative networks to control semantic attributes of images they generate, resulting from MI maximization, and the ability to disentangle latent space representations, obtained via total correlation minimization. More specifically, we demonstrate that maximizing semantic attribute control encourages disentanglement of latent factors. Using lemma 1 and adopting MI in our loss function, we then show empirically that, for image generation tasks, the proposed approach exhibits superior performance as measured in the quality and disentanglement of the generated images when compared to other state-of-the-art methods, with quality assessed via the Fréchet inception distance (FID) and disentanglement via mutual information gap. For goal 3, we design several systems for anomaly detection exploiting representations learned in goal 1 and demonstrate their performance benefits when compared to state-of-the-art generative and discriminative algorithms. Our contributions in representation learning have potential applications in addressing other important problems in computer vision, such as bias and privacy in AI."
issn: 0899-7667
doi: 10.1162/neco_a_01359
url: https://doi.org/10.1162/neco_a_01359
eprint: https://direct.mit.edu/neco/article-pdf/33/3/802/1889446/neco_a_01359.pdf
