💭

  • really important: don't train on padded indices (mask them out of the loss)
  • you can't move ML-dependent data processing out of the training loop for performance, or it doesn't train, silly
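
A rough sketch of the padded-indices point, in plain Python so it runs anywhere. The names `PAD_ID` and `masked_cross_entropy` are made up for illustration, and the pad id of 0 is an assumption (use whatever your tokenizer defines). The idea is just that padded positions are skipped, so they add nothing to the loss and therefore nothing to the gradient.

```python
import math

PAD_ID = 0  # hypothetical padding token id (assumption, not from the notes)


def masked_cross_entropy(logits, targets, pad_id=PAD_ID):
    """Mean cross-entropy over non-padded positions only.

    logits:  per-position score lists, shape [seq_len][vocab_size]
    targets: target token ids, shape [seq_len]
    """
    total, count = 0.0, 0
    for scores, target in zip(logits, targets):
        if target == pad_id:
            continue  # padded position: no loss term, so no gradient either
        # numerically stable log-sum-exp over the vocabulary
        peak = max(scores)
        log_norm = peak + math.log(sum(math.exp(s - peak) for s in scores))
        total += log_norm - scores[target]
        count += 1
    # average over real tokens only; an all-padding sequence yields 0.0
    return total / count if count else 0.0


# the last position is padding, so its logits never affect the loss
logits = [[0.0, 2.0, 0.0], [0.0, 0.0, 3.0], [5.0, 0.0, 0.0]]
targets = [1, 2, PAD_ID]
loss = masked_cross_entropy(logits, targets)
```

Note the average divides by the number of real tokens, not the padded sequence length, otherwise heavily padded batches get a quietly smaller loss. In a real framework you would likely use the built-in option instead, for example the `ignore_index` argument of PyTorch's `CrossEntropyLoss`.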