* equal contribution
This paper was formerly known as HAKD: Hardware Aware Knowledge Distillation.
A short version of this paper appeared at the CDNNRIA workshop, NeurIPS 2018.