💠when trying to improve a system, it may be useful to deliberately degrade that system to make it easier to test. Eg could test alphazero fast on a low sim mcts and an easy game. Eg when improving long term llm conversation could work with an llm weaker at long term conversation to see the benefits more easily. All these cases risk solving the wrong problem so they're additive rather than replacement.