12-01-2019, 07:04 PM
I'm not an expert in ML. Does "policy" refer to preferences among move choices, and "state evaluation" to evaluating a position?
I know that all the action is currently in MCTS, but I'm struck by how much better alpha-beta (AB) plays Brandubh. My impression is that MCTS works well at bootstrapping from zero knowledge of a game, but the ludeme project seems to be starting elsewhere.
Has any research been done on machine learning of positional evaluation in the context of AB search? For example, if a human could write an evaluator whose parameters are then tweaked by ML over the course of many plays, that might be the best of both worlds.
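To make concrete the sort of thing I mean, here's a minimal sketch (mine, not any particular published method): a hand-written linear evaluator whose weights get nudged toward observed game outcomes. The feature names and the simple error-driven update are just illustrative assumptions.

```python
def evaluate(features, weights):
    """Hand-crafted evaluator: weighted sum of positional features."""
    return sum(w * f for w, f in zip(weights, features))

def tune(weights, games, lr=0.01):
    """Nudge weights so evaluations move toward observed outcomes.

    games: list of (features, outcome) pairs, outcome in [-1, +1]
    (e.g. -1 = loss, +1 = win from the side to move's perspective).
    """
    for features, outcome in games:
        error = outcome - evaluate(features, weights)
        # Simple gradient step on squared error for a linear model.
        weights = [w + lr * error * f for w, f in zip(weights, features)]
    return weights

# Hypothetical example with two features (say, material balance and
# king mobility). Outcomes come from self-play or recorded games.
weights = [0.0, 0.0]
games = [([1.0, 0.5], 1.0), ([-1.0, -0.5], -1.0)] * 100
weights = tune(weights, games)
```

The tuned evaluator would then slot into the leaf evaluation of an ordinary AB search; the search machinery itself stays hand-written.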