09-07-2022, 06:44 AM
Hey Dennis,
Thank you very much for the detailed reply (I really do appreciate it!).
With yesterday's training run (your pre-built command with slight changes, e.g. --thinking-time 3 and --tournament-mode), I managed to reach a ~37% win rate. I might raise --thinking-time even further, but we'll see whether that makes training take too long.
In point 7 you say that with --num-games there is also a risk of decreasing performance. Does this imply that I can't just pick the last checkpoint of a training run and assume it is the strongest? If so, is there a good way to find the best checkpoint (other than trial and error)? Are the Elos and Pick Counts from the log at all meaningful for this? Also, just to clarify: these values refer to the trained agent (not UCT or MC-GRAVE) at each checkpoint, right?
I have two further questions:
1. Can you resume a training run after pausing or interrupting it? Perhaps via "--best-agents-data-dir" (and if so, what needs to be in this directory)?
2. Assuming the agents I'm training eventually surpass MC-GRAVE, is there some way to compare them against each other, aside from launching two instances and entering moves manually? I'm guessing this one is related to this thread.