
MCTSPlayer

com.phasmidsoftware.gambit.game.MCTSPlayer

class MCTSPlayer[P, S, M, Pl](me: Pl, iterations: Int = ..., explorationConstant: Double = ...)(using state: State[P, S], game: Game[S, M, Pl]) extends Player[S, M, Pl]

A generic Monte Carlo Tree Search (MCTS) player. For background on MCTS, see https://en.wikipedia.org/wiki/Monte_Carlo_tree_search

Implements the standard four-phase MCTS loop:

Selection -- walk the tree by UCB1 (Upper Confidence Bound) until a node with an untried move (or a terminal node) is reached.
Expansion -- add one new child for an untried move.
Simulation -- play randomly from the new child to a terminal state (rollout).
Backpropagation -- update visit/win counts along the path back to the root.
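The four phases can be sketched on a toy game. The following is a minimal illustration, not this class's implementation: the single-pile `Nim` game, the `Node` class, and the `iterate` and `bestMove` functions are all hypothetical names introduced here, and the sketch does not use the library's `State` or `Game` type classes.

```scala
import scala.collection.mutable.ArrayBuffer
import scala.util.Random

// Toy game (an assumption for illustration): one pile, each move takes 1 to 3
// stones, and whoever takes the last stone wins.
final case class Nim(stones: Int, toMove: Int):
  def moves: List[Int] = (1 to math.min(3, stones)).toList
  def play(take: Int): Nim = Nim(stones - take, 1 - toMove)
  def isTerminal: Boolean = stones == 0
  def winner: Int = 1 - toMove // the player who just moved took the last stone

// Hypothetical node type. `wins` counts wins for the player who moved INTO this node.
final class Node(val state: Nim, val parent: Option[Node], val move: Option[Int]):
  val children: ArrayBuffer[Node] = ArrayBuffer.empty
  var untried: List[Int] = state.moves
  var visits: Int = 0
  var wins: Double = 0.0

def ucb1(child: Node, parentVisits: Int, c: Double): Double =
  child.wins / child.visits + c * math.sqrt(math.log(parentVisits.toDouble) / child.visits)

def iterate(root: Node, c: Double, random: Random): Unit =
  // 1. Selection: descend while the node is fully expanded and has children.
  var node = root
  while node.untried.isEmpty && node.children.nonEmpty do
    val parentVisits = node.visits
    node = node.children.maxBy(ch => ucb1(ch, parentVisits, c))
  // 2. Expansion: add one child for an untried move, if there is one.
  node.untried match
    case move :: rest =>
      node.untried = rest
      val child = Node(node.state.play(move), Some(node), Some(move))
      node.children += child
      node = child
    case Nil => () // terminal node: nothing to expand
  // 3. Simulation: play uniformly random moves until the game ends.
  var s = node.state
  while !s.isTerminal do
    val ms = s.moves
    s = s.play(ms(random.nextInt(ms.length)))
  val winner = s.winner
  // 4. Backpropagation: update counts on the path back to the root.
  var cur: Option[Node] = Some(node)
  while cur.nonEmpty do
    val n = cur.get
    n.visits += 1
    if winner == 1 - n.state.toMove then n.wins += 1.0
    cur = n.parent

// Runs the loop and returns the move of the most-visited root child, if any.
def bestMove(start: Nim, iterations: Int, c: Double, random: Random): Option[Int] =
  if start.isTerminal then None
  else
    val root = Node(start, None, None)
    for _ <- 1 to iterations do iterate(root, c, random)
    root.children.maxByOption(_.visits).flatMap(_.move)
```

Choosing the most-visited child rather than the child with the highest win rate is one common convention; the source does not say which rule this class uses.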

The search tree is retained between calls to chooseMove. After each move the subtree rooted at the chosen child becomes the new root, carrying forward all accumulated visit and win counts. If the opponent plays an unexplored line the retained tree cannot be advanced and a fresh root is created instead.
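The re-rooting step described above can be sketched as follows. This is an illustration under assumptions: `TreeNode`, `advance`, and `Pos` are hypothetical names, and the sketch only searches the direct children of the retained root, which may differ from how this class locates the matching subtree.

```scala
// Hypothetical minimal tree node, generic in the state type S.
final class TreeNode[S](val state: S):
  var children: List[TreeNode[S]] = Nil
  var visits: Int = 0
  var wins: Double = 0.0

// Reuse the child whose state equals `current` (keeping its statistics);
// otherwise start again from a fresh, empty root.
def advance[S](root: TreeNode[S], current: S): TreeNode[S] =
  root.children.find(_.state == current).getOrElse(TreeNode[S](current))

// Usage with a toy state type (an assumption for illustration).
final case class Pos(n: Int)

val oldRoot = TreeNode(Pos(0))
val explored = TreeNode(Pos(1))
explored.visits = 7
oldRoot.children = List(explored)

val reused = advance(oldRoot, Pos(1)) // the explored child, with its counts intact
val fresh = advance(oldRoot, Pos(9))  // unexplored line: a new root with zero visits
```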

Tree matching uses == on the state type S, so callers must ensure that S has meaningful (value-based) equality. Case classes and case objects satisfy this, provided their fields also compare by value.
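A short illustration of why equality on S matters. The three board types below are hypothetical and exist only to contrast the cases; note that a case class holding an `Array` still compares that field by reference.

```scala
// Case class with immutable collection fields: compared by value.
final case class Board(cells: Vector[Int])

// Plain class without an `equals` override: compared by reference.
final class OpaqueBoard(val cells: Vector[Int])

// Case class, but its Array field is compared by reference.
final case class ArrayBoard(cells: Array[Int])

val sameByValue = Board(Vector(1, 2)) == Board(Vector(1, 2))
val sameByReference = new OpaqueBoard(Vector(1, 2)) == new OpaqueBoard(Vector(1, 2))
val arrayField = ArrayBoard(Array(1, 2)) == ArrayBoard(Array(1, 2))
```

Only the first comparison is true, so only a state type like `Board` would let a retained tree be matched against a newly supplied state.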

Future upgrades

Actor-based parallelism: move the mutable tree into an Akka/Pekko actor. Multiple rollout worker actors could then submit simulation results to the tree actor concurrently (root parallelization), so that rollout throughput could scale with the number of cores, subject to contention on the single tree actor.
Heuristic rollouts: replace pure random simulation with a heuristic-guided playout for stronger play.
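As one possible shape for a heuristic rollout, the sketch below uses an epsilon-greedy playout: with probability `epsilon` it plays a random move, otherwise the move with the highest heuristic score. Everything here is an assumption for illustration (`guidedPlayout`, its function-valued parameters, and the toy countdown game); it is not part of the library.

```scala
import scala.annotation.tailrec
import scala.util.Random

// Plays from `state` until no moves remain and returns the final state.
@tailrec
def guidedPlayout[S, M](
    state: S,
    moves: S => List[M],
    play: (S, M) => S,
    heuristic: (S, M) => Double,
    epsilon: Double,
    random: Random
): S =
  moves(state) match
    case Nil => state // terminal: no legal moves
    case ms =>
      val move =
        if random.nextDouble() < epsilon then ms(random.nextInt(ms.length)) // explore
        else ms.maxBy(m => heuristic(state, m))                              // exploit
      guidedPlayout(play(state, move), moves, play, heuristic, epsilon, random)

// Toy usage: state is (stones left, plies played); a move removes 1 to 3 stones.
// The heuristic simply prefers larger moves.
val finalState = guidedPlayout[(Int, Int), Int](
  state = (10, 0),
  moves = s => (1 to math.min(3, s._1)).toList,
  play = (s, m) => (s._1 - m, s._2 + 1),
  heuristic = (_, m) => m.toDouble,
  epsilon = 0.0, // fully greedy, so the playout is deterministic
  random = new Random(0)
)
```

Setting `epsilon` to 1.0 recovers the pure random rollout described in the Simulation phase.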

Type parameters

M: the move type.
P: the proto-state type.
Pl: the player identity type.
S: the state type.

Value parameters

explorationConstant: UCB1 exploration parameter C (default sqrt(2)).
game: implicit Game[S, M, Pl] for move application.
iterations: number of MCTS iterations per move (default 1000).
me: this player's identity.
state: implicit State[P, S] for goal detection.
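To illustrate what explorationConstant controls, the sketch below evaluates the standard UCB1 score for two hypothetical children of the same node. The function names and the visit/win figures are made up for the example; only the formula itself is standard.

```scala
// Standard UCB1: average reward plus an exploration bonus scaled by c.
def ucb1(wins: Double, visits: Int, parentVisits: Int, c: Double): Double =
  wins / visits + c * math.sqrt(math.log(parentVisits.toDouble) / visits)

// Two hypothetical children of a node that has been visited 103 times:
// "strong" has a good win rate and many visits, "rare" has barely been tried.
def pick(c: Double): String =
  val strong = ucb1(wins = 60.0, visits = 100, parentVisits = 103, c = c)
  val rare = ucb1(wins = 1.0, visits = 3, parentVisits = 103, c = c)
  if strong >= rare then "strong" else "rare"

val greedyChoice = pick(0.0)           // no exploration bonus
val defaultChoice = pick(math.sqrt(2)) // the documented default for C
```

With C = 0 the selection is purely greedy and keeps choosing the well-visited child; with the default C the rarely visited child receives the larger score and is tried again. Raising iterations gives the search more such decisions per move at a proportional cost in time.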

Attributes

Supertypes: trait Player[S, M, Pl], class Object, trait Matchable, class Any

Members list

Value members

Concrete methods

chooseMove: chooses a move from the given state. Returns None if no move is available (terminal position).

Value parameters

random: a Random instance.
s: the current state.

Attributes

Returns: Some(move) or None.
Definition Classes: Player

Called at the end of a game with the full result and this player's identity. Default implementation is a no-op. Override to implement learning or logging.

Value parameters

me: this player's identity, used to extract the relevant score.
result: the game result (all players' scores).

Attributes

Definition Classes: Player
