Pre-trained, high-performing model whose outputs guide the training of a simpler student model, typically in the context of knowledge distillation.
Generality: 561
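
As a rough illustration of how a teacher can guide a student, the sketch below computes a common form of distillation loss: the KL divergence between the teacher's and the student's temperature-softened output distributions, scaled by the square of the temperature. The function names (`log_softmax`, `distillation_loss`), the example logits, and the default temperature of 2.0 are assumptions made for this example, not part of the definition above. In practice this term is usually combined with an ordinary loss on the ground-truth labels.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Return log-probabilities of a temperature-scaled softmax."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in scaled))
    return [z - log_sum for z in scaled]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    The temperature value is an illustrative assumption; higher values
    flatten both distributions so that small teacher probabilities
    contribute more to the loss.
    """
    log_p = log_softmax(teacher_logits, temperature)  # teacher soft targets
    log_q = log_softmax(student_logits, temperature)  # student predictions
    kl = sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    return temperature ** 2 * kl


if __name__ == "__main__":
    # Hypothetical logits for a single three-class example.
    teacher = [4.0, 1.0, 0.5]
    student = [2.5, 1.5, 1.0]
    print(distillation_loss(teacher, student))
```

The loss is zero when the student reproduces the teacher's softened distribution exactly and grows as the two diverge, which is what lets the teacher's outputs act as a training signal for the student.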