
2018

Freezing Subnetworks to Analyze Domain Adaptation in Neural Machine Translation


Abstract

To better understand the effectiveness of continued training, we analyze the major components of a neural machine translation system (the encoder, decoder, and each embedding space) and consider each component's contribution to, and capacity for, domain adaptation. We find that freezing any single component during continued training has minimal impact on performance, and that performance is surprisingly good when a single component is adapted while holding the rest of the model fixed. We also find that continued training does not move the model very far from the out-of-domain model, compared to a sensitivity analysis metric, suggesting that the out-of-domain model can provide a good generic initialization for the new domain.
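The freezing procedure studied in the paper can be illustrated with a short sketch. The following is a minimal PyTorch illustration, not the authors' actual code: the toy model and the component names (src_embed, tgt_embed, encoder, decoder) are assumptions made for this example. It shows the basic recipe of starting from an out-of-domain model, setting requires_grad to False on one component, and continuing training on in-domain data so that only the remaining components are updated.

# Minimal sketch (assumed, not the authors' code): continued training of an
# NMT model on in-domain data while freezing one component (here, the encoder).
import torch
import torch.nn as nn

class ToySeq2Seq(nn.Module):
    """Stand-in for an NMT model with the components analyzed in the paper."""
    def __init__(self, src_vocab=1000, tgt_vocab=1000, dim=256):
        super().__init__()
        self.src_embed = nn.Embedding(src_vocab, dim)
        self.tgt_embed = nn.Embedding(tgt_vocab, dim)
        self.encoder = nn.LSTM(dim, dim, batch_first=True)
        self.decoder = nn.LSTM(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, tgt_vocab)

    def forward(self, src, tgt):
        enc_out, state = self.encoder(self.src_embed(src))
        dec_out, _ = self.decoder(self.tgt_embed(tgt), state)
        return self.out(dec_out)

model = ToySeq2Seq()
# model.load_state_dict(torch.load("out_of_domain.pt"))  # initialize from the out-of-domain model

# Freeze a single component so continued training leaves it unchanged.
for p in model.encoder.parameters():
    p.requires_grad = False

# Only the unfrozen parameters are updated on the in-domain data.
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4
)

criterion = nn.CrossEntropyLoss()
src = torch.randint(0, 1000, (8, 12))   # dummy in-domain batch
tgt = torch.randint(0, 1000, (8, 10))
logits = model(src, tgt[:, :-1])
loss = criterion(logits.reshape(-1, logits.size(-1)), tgt[:, 1:].reshape(-1))
loss.backward()
optimizer.step()

The same pattern applies to the inverse experiments in the paper, where a single component is adapted while the rest of the model is held fixed: freeze every component except the one of interest before building the optimizer.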

Citation

@inproceedings{thompson-etal-2018-freezing,
    title = "Freezing Subnetworks to Analyze Domain Adaptation in Neural Machine Translation",
    author = "Thompson, Brian and Khayrallah, Huda and Anastasopoulos, Antonios and McCarthy, Arya D. and Duh, Kevin and Marvin, Rebecca and McNamee, Paul and Gwinnup, Jeremy and Anderson, Tim and Koehn, Philipp",
    booktitle = "Proceedings of the Third Conference on Machine Translation: Research Papers",
    month = oct,
    year = "2018",
    address = "Brussels, Belgium",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/W18-6313",
    doi = "10.18653/v1/W18-6313",
    pages = "124--132",
    abstract = "To better understand the effectiveness of continued training, we analyze the major components of a neural machine translation system (the encoder, decoder, and each embedding space) and consider each component's contribution to, and capacity for, domain adaptation. We find that freezing any single component during continued training has minimal impact on performance, and that performance is surprisingly good when a single component is adapted while holding the rest of the model fixed. We also find that continued training does not move the model very far from the out-of-domain model, compared to a sensitivity analysis metric, suggesting that the out-of-domain model can provide a good generic initialization for the new domain."
}