💭 MuZero could work with a transformer network, using a seq2seq transformer and a hidden state made of tokens. Not sure how that'd work with heterogeneous tokens that have different meanings and uses. Maybe they're separate sequences.
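
Rough sketch of one way this might look, not a tested design: instead of separate sequences, keep everything in one sequence and add a per-kind type embedding so the attention layer can tell state, action, and reward tokens apart. Everything here is an assumption for illustration: the token kinds, the width `D`, the random (untrained) weights, the single attention layer standing in for a full transformer, and the helper names `dynamics` and `self_attention`.

```python
import math
import random

D = 8  # embedding width, arbitrary for illustration
rng = random.Random(0)


def rand_vec(d=D):
    return [rng.gauss(0.0, 0.1) for _ in range(d)]


def rand_mat(rows=D, cols=D):
    return [rand_vec(cols) for _ in range(rows)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(m, v):
    return [dot(row, v) for row in m]


def add(a, b):
    return [x + y for x, y in zip(a, b)]


def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]


# Hypothetical token kinds: one embedding per kind marks each token's role.
TYPE_EMB = {"state": rand_vec(), "action": rand_vec(), "reward": rand_vec()}
WQ, WK, WV = rand_mat(), rand_mat(), rand_mat()  # untrained attention weights
W_REWARD = rand_vec()  # linear readout for the reward token


def self_attention(tokens):
    """Single-head self-attention with a residual connection."""
    qs = [matvec(WQ, t) for t in tokens]
    ks = [matvec(WK, t) for t in tokens]
    vs = [matvec(WV, t) for t in tokens]
    out = []
    for q, t in zip(qs, tokens):
        weights = softmax([dot(q, k) / math.sqrt(D) for k in ks])
        mixed = [sum(w * v[i] for w, v in zip(weights, vs)) for i in range(D)]
        out.append(add(t, mixed))
    return out


def dynamics(hidden_tokens, action_vec):
    """MuZero-style dynamics step: (hidden state, action) -> (next hidden state, reward)."""
    seq = [add(t, TYPE_EMB["state"]) for t in hidden_tokens]
    seq.append(add(action_vec, TYPE_EMB["action"]))
    seq.append(list(TYPE_EMB["reward"]))  # query token whose output is read as the reward
    out = self_attention(seq)
    n = len(hidden_tokens)
    next_hidden = out[:n]  # outputs at the state positions form the next hidden state
    reward = dot(W_REWARD, out[-1])
    return next_hidden, reward


hidden = [rand_vec() for _ in range(4)]  # hidden state: 4 tokens
action = rand_vec()                      # stand-in for an embedded action
next_hidden, reward = dynamics(hidden, action)
```

The next hidden state has the same shape as the input, so it can be fed back into `dynamics` to unroll further steps the way MuZero's search does. The separate-sequences idea would presumably swap the type embeddings for distinct encoder inputs with cross-attention between them.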