Data and Parameter Scaling Laws for Neural Machine Translation
Mitchell A. Gordon, Jared Kaplan, and Kevin Duh
in Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, EMNLP 2021.

Compressing BERT: Studying the Effects of Weight Pruning on Transfer Learning
Mitchell A. Gordon, Kevin Duh, and Nicholas Andrews
in Proceedings of the 5th Workshop on Representation Learning for NLP, ACL 2020.

Distill, Adapt, Distill: Training Small, In-Domain Models for Neural Machine Translation
Mitchell A. Gordon and Kevin Duh
in Proceedings of the 4th Workshop on Neural Generation and Translation, ACL 2020.