08-15-2020, 01:42 PM
I see the behaviour you're describing with default settings, but I suspect it actually is just a case of the AIs not being good enough yet.
I assume you're using the default Ludii AI, which for this game falls back to the UCT algorithm, since we haven't got around to doing any training / AI analysis for this game yet?
Try turning on visualisations of what the AIs are thinking using View > Show AI Distribution, and ramping up the thinking time to something like 10 seconds per move. While the AI is thinking, you'll see circles appearing for all the moves. The size of a circle is a representation of "how much the AI is thinking about that move" (visit counts of MCTS, if you're familiar with that algorithm), and the colour is a representation of how good the AI believes that move to be (average value backpropagated for that move in MCTS); blue means the AI believes it to be a winning move, purple means neutral, red means a losing move.
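To make that mapping concrete, here is a minimal sketch of how per-move MCTS statistics could be turned into a circle size and colour. It is not Ludii's actual drawing code: the names `MoveStats` and `circle_for`, the linear size scaling, and the assumption that backpropagated values lie in [-1, 1] are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class MoveStats:
    visits: int         # how many MCTS iterations went through this move
    total_value: float  # sum of backpropagated values, each assumed in [-1, 1]


def circle_for(stats: MoveStats, max_visits: int):
    """Return (relative_size, (r, g, b)) for one move (illustrative only)."""
    # Size: share of visits relative to the most-visited move.
    size = stats.visits / max_visits if max_visits > 0 else 0.0
    # Colour: average backpropagated value; unvisited moves count as neutral.
    mean = stats.total_value / stats.visits if stats.visits > 0 else 0.0
    t = (mean + 1.0) / 2.0          # 0.0 = losing, 0.5 = neutral, 1.0 = winning
    red = round(255 * (1.0 - t))    # losing moves are red
    blue = round(255 * t)           # winning moves are blue
    return size, (red, 0, blue)     # equal red and blue gives purple


if __name__ == "__main__":
    moves = {"a1": MoveStats(400, 120.0), "b2": MoveStats(100, -90.0)}
    most = max(m.visits for m in moves.values())
    for name, m in moves.items():
        print(name, circle_for(m, most))
```

A heavily visited move with a mean value near zero comes out as a large purple circle, which is the picture described above for a position the AI still considers neutral.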
Looking at these visualisations, even on a much smaller board than the default, I see the AI going through drastic changes in how it feels about certain moves multiple times during a 10-second window. Some moves will have large blue or purple circles for multiple seconds before finally turning purple or red and then shrinking, with other moves growing in size again. The overall evaluation also changes a lot; even on the small Board Size of 5, all the moves stay purple-ish (neutral) for quite a long time, and they only all turn red towards the end of a 10-second search. This suggests that the AI really does spend a lot of time thinking it's in a neutral game state, and only towards the end finally starts realising that it's going to lose this game.
To be sure that the AIs do correctly reason about the swap rule, I'd suggest taking a look at Hex with extremely small board sizes, longer thinking times, and again turning on AI visualisations (or looking at the detailed reports in the Analysis tab). Yes, even here the first player tends to make a "strong" move that the opponent can counter by swapping, but the visualisations and the Analysis tab tell us that the AI also "knows" that it's going to lose by doing so; the problem simply is that it will lose regardless of what it does. The apparent preference of the AIs for losing to a swap over losing in other ways is interesting though. I guess it's just an artifact of the fact that these algorithms are based on random simulations. The moves that get countered by swapping have only one single counter-move; the opponent has to specifically pick the swap to counter them. Other moves might have multiple counters, meaning that they're more likely to get countered in random simulations. So, when the AIs know that they're going to lose to an optimal player anyway, they'll probably just prefer to pick a move that only has a single possible counter.
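That guess can be illustrated with a toy model, which is only a sketch and not how Ludii's playouts actually work. Assume the opponent replies uniformly at random, that any "counter" reply makes the playout a loss (value -1), and that any other reply leads to an even playout (value 0). The function names and the numbers (25 possible replies, 1 versus 5 counters) are invented for the example.

```python
import random


def expected_playout_value(num_counters, num_replies, value_if_not_countered=0.0):
    """Expected value of a move when the opponent replies uniformly at random."""
    p_counter = num_counters / num_replies
    return p_counter * -1.0 + (1.0 - p_counter) * value_if_not_countered


def simulate(num_counters, num_replies, playouts, rng):
    """Monte Carlo estimate of the same quantity (toy model, no search tree)."""
    total = 0.0
    for _ in range(playouts):
        reply = rng.randrange(num_replies)
        # Replies 0 .. num_counters-1 stand for the counter-moves.
        total += -1.0 if reply < num_counters else 0.0
    return total / playouts


if __name__ == "__main__":
    rng = random.Random(0)
    # Move A: only the swap counters it. Move B: five different replies counter it.
    print("swap-countered move:", simulate(1, 25, 20_000, rng))
    print("multi-countered move:", simulate(5, 25, 20_000, rng))
```

Under these assumptions the move with a single counter gets the higher average, even though both lose against an optimal opponent. The sketch ignores the search tree that MCTS grows over time, which is what eventually lets the AI notice that the swap refutes the move.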