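💭 I learned that if you use cross-entropy on a smoothed (soft) target, you should compute the target's own entropy and subtract it from the loss if you plan on adding it to other loss terms. Cross-entropy can never drop below that entropy, so without the subtraction the term carries a constant offset; with it, you get the KL divergence, which reaches 0 at a perfect fit and has the same gradients. 🎈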
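
A quick way to see this in numbers is the rough sketch below. It assumes uniform label smoothing with `eps=0.1` as the "smooth target", and every helper name (`smooth_one_hot`, `excess_cross_entropy`, and so on) is made up for illustration rather than taken from any library.

```python
import math

def smooth_one_hot(label, num_classes, eps=0.1):
    """Assumed smoothing scheme: (1 - eps) on the true class plus eps spread uniformly."""
    return [(1.0 - eps) * (1.0 if k == label else 0.0) + eps / num_classes
            for k in range(num_classes)]

def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def cross_entropy(target, logits):
    """H(p, q) = -sum_k p_k * log q_k, with q = softmax(logits)."""
    return -sum(p * lq for p, lq in zip(target, log_softmax(logits)))

def entropy(target):
    """H(p) = -sum_k p_k * log p_k; a constant with respect to the model."""
    return -sum(p * math.log(p) for p in target if p > 0.0)

def excess_cross_entropy(target, logits):
    """Cross-entropy minus the target's entropy, i.e. KL(p || q)."""
    return cross_entropy(target, logits) - entropy(target)

target = smooth_one_hot(label=2, num_classes=4, eps=0.1)

# Logits whose softmax reproduces the target exactly (the best possible fit).
perfect_logits = [math.log(p) for p in target]
uniform_logits = [0.0, 0.0, 0.0, 0.0]

print("plain CE at perfect fit: ", cross_entropy(target, perfect_logits))
print("excess CE at perfect fit:", excess_cross_entropy(target, perfect_logits))
print("excess CE, uniform guess:", excess_cross_entropy(target, uniform_logits))
```

Even at a perfect fit the plain cross-entropy stays at the target's entropy, while the corrected value sits at roughly zero. Since the subtracted entropy does not depend on the logits, the gradients are the same either way; only the reported magnitude changes, which is what matters when weighting this term against other losses.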