💭 What if you don't discard the action vectors, but instead of trying to predict the full set of actions (there may be many, which complicates both the prediction and the loss, and many of them may be poor actions) you try to predict only the top n actions by visit count, maybe n = 3? Going to have to fill a lot of notepaper before coding anything, or I'll waste more time.
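
Rough sketch of what the top-n target could look like, just to have something concrete to scribble against. Everything here is an assumption: that the visit counts arrive as one flat list per state, that ties go to the lower action index, that the kept counts get renormalised into a distribution, and that the loss is a plain cross-entropy against raw logits. `top_n_target` and `cross_entropy` are made-up names, not from any existing code.

```python
import math


def top_n_target(visit_counts, n=3):
    """Keep the n most-visited actions and renormalise their counts to sum to 1."""
    n = min(n, len(visit_counts))
    # sorted() is stable, so ties in visit count go to the lower action index.
    order = sorted(range(len(visit_counts)), key=lambda a: -visit_counts[a])
    top = order[:n]
    total = sum(visit_counts[a] for a in top)
    target = [0.0] * len(visit_counts)
    if total > 0:
        for a in top:
            target[a] = visit_counts[a] / total
    return top, target


def cross_entropy(target, logits):
    """Cross-entropy between the sparse target and softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return -sum(t * (x - log_z) for t, x in zip(target, logits) if t > 0)


# Hypothetical visit counts for six actions in one state.
top, target = top_n_target([5, 40, 0, 30, 10, 15], n=3)
print(top)     # [1, 3, 5]
print(target)  # non-zero only at indices 1, 3 and 5
print(cross_entropy(target, [0.0] * 6))  # loss against a uniform prediction
```

Open question for the notepaper: whether to keep the relative weights of the top n like this, or flatten them to an equal-weight multi-hot target so the net only has to learn membership.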