MCTSPlayer

com.phasmidsoftware.gambit.game.MCTSPlayer
class MCTSPlayer[P, S, M, Pl](me: Pl, iterations: Int = ..., explorationConstant: Double = ...)(using state: State[P, S], game: Game[S, M, Pl]) extends Player[S, M, Pl]

A generic Monte Carlo Tree Search player. For more information about MCTS, see https://en.wikipedia.org/wiki/Monte_Carlo_tree_search

Implements the standard four-phase MCTS loop:

  1. Selection -- descend from the root, choosing children by UCB1 (Upper Confidence Bound), until reaching a node that still has untried moves or is terminal.
  2. Expansion -- add one new child for an untried move.
  3. Simulation -- play random moves from the new child until a terminal state is reached (rollout).
  4. Backpropagation -- update visit/win counts along the path back to the root.
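
The four phases can be illustrated on a toy game. What follows is a minimal, self-contained sketch and not this class's implementation: Nim, Node, iterate and bestMove are hypothetical names, and the convention of crediting a win to the player who moved into a node is an assumption made for the example.

```scala
import scala.collection.mutable.ArrayBuffer
import scala.util.Random

object MctsSketch {
  // Toy game (an assumption, not part of the library): single-pile Nim.
  // A move removes 1, 2 or 3 stones; whoever takes the last stone wins.
  final case class Nim(stones: Int, toMove: Int) {
    def moves: List[Int] = (1 to math.min(3, stones)).toList
    def play(m: Int): Nim = Nim(stones - m, 1 - toMove)
    def isTerminal: Boolean = stones == 0
    // At a terminal state, the player who just moved took the last stone.
    def winner: Int = 1 - toMove
  }

  // Hypothetical tree node; wins are counted for the player who moved INTO the node.
  final class Node(val state: Nim, val parent: Option[Node], val move: Option[Int]) {
    val children: ArrayBuffer[Node] = ArrayBuffer.empty
    var untried: List[Int] = state.moves
    var visits: Int = 0
    var wins: Double = 0.0
  }

  // UCB1 score: average reward plus an exploration bonus scaled by c.
  def ucb1(child: Node, parentVisits: Int, c: Double): Double =
    child.wins / child.visits + c * math.sqrt(math.log(parentVisits.toDouble) / child.visits)

  def iterate(root: Node, c: Double, random: Random): Unit = {
    // 1. Selection: descend while the node is fully expanded and non-terminal.
    var node = root
    while (node.untried.isEmpty && node.children.nonEmpty)
      node = node.children.maxBy(ch => ucb1(ch, node.visits, c))
    // 2. Expansion: add one child for a randomly chosen untried move.
    if (node.untried.nonEmpty) {
      val m = node.untried(random.nextInt(node.untried.size))
      node.untried = node.untried.filterNot(_ == m)
      val child = new Node(node.state.play(m), Some(node), Some(m))
      node.children += child
      node = child
    }
    // 3. Simulation: play uniformly random moves to a terminal state.
    var s = node.state
    while (!s.isTerminal) {
      val ms = s.moves
      s = s.play(ms(random.nextInt(ms.size)))
    }
    val winner = s.winner
    // 4. Backpropagation: update counts on the path back to the root.
    var cur: Option[Node] = Some(node)
    while (cur.isDefined) {
      val n = cur.get
      n.visits += 1
      if (winner == 1 - n.state.toMove) n.wins += 1.0
      cur = n.parent
    }
  }

  // Run the loop and return the most-visited root move, or None at a terminal state.
  def bestMove(start: Nim, iterations: Int, c: Double, random: Random): Option[Int] = {
    val root = new Node(start, None, None)
    (1 to iterations).foreach(_ => iterate(root, c, random))
    if (root.children.isEmpty) None else root.children.maxBy(_.visits).move
  }
}
```

Calling MctsSketch.bestMove(MctsSketch.Nim(5, 0), 1000, math.sqrt(2), new Random()) returns some legal move wrapped in Some. The result is stochastic, so the sketch makes no claim about which move is returned.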

The search tree is retained between calls to chooseMove. After each move, the subtree rooted at the chosen child becomes the new root, carrying forward all accumulated visit and win counts. If the opponent plays an unexplored line, the retained tree cannot be advanced and a fresh root is created instead.
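
The re-rooting step described above might look roughly like the following sketch. Node and advance are hypothetical names chosen for illustration; the class's internal tree representation is not documented here and may well differ.

```scala
object TreeReuseSketch {
  // Hypothetical immutable node type, used only for this illustration.
  final case class Node[S](state: S, visits: Int, wins: Double, children: List[Node[S]])

  // Advance the retained root to the child whose state equals `current`
  // (compared with ==), keeping that child's accumulated statistics.
  // If there is no retained tree, or no child matches, start a fresh root.
  def advance[S](retained: Option[Node[S]], current: S): Node[S] =
    retained
      .flatMap(root => root.children.find(_.state == current))
      .getOrElse(Node(current, 0, 0.0, Nil))
}
```

The fallback branch corresponds to the case where the opponent plays a line the tree never explored: nothing can be carried forward, so the statistics start again from zero.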

Tree matching uses == on the state type S, so callers must ensure that S has meaningful (structural) equality. Case classes and case objects satisfy this, provided their fields also compare structurally.
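
A short illustration of why the equality requirement matters. ListBoard and ArrayBoard are made-up state types, not part of the library; the point is only that a case class whose field is an Array does not compare structurally, because Array uses reference equality.

```scala
object EqualitySketch {
  // Structural equality: two boards with equal fields compare equal.
  final case class ListBoard(cells: List[Int])

  // Pitfall: Array compares by reference, so the derived == is not structural.
  final case class ArrayBoard(cells: Array[Int])

  val listsMatch: Boolean = ListBoard(List(1, 2)) == ListBoard(List(1, 2))       // true
  val arraysMatch: Boolean = ArrayBoard(Array(1, 2)) == ArrayBoard(Array(1, 2)) // false
}
```

A state type like ArrayBoard would never match a retained child, so every call would presumably fall back to a fresh root.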

Future upgrades

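  • Actor-based parallelism: move the mutable tree into an Akka/Pekko actor. Multiple rollout worker actors could then submit simulation results to the tree actor concurrently (a shared-tree scheme; strictly, root parallelization instead gives each worker its own tree and merges the root statistics). The speedup could approach linear in the number of cores as long as rollouts, rather than tree updates, dominate the cost.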

  • Heuristic rollouts: replace pure random simulation with a heuristic-guided playout for stronger play.
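
One possible shape for a heuristic rollout policy is an epsilon-greedy choice, sketched below. This is an illustration of the idea only, not a planned design: pick, score and epsilon are hypothetical names, and the scoring function would have to be supplied per game.

```scala
import scala.util.Random

object RolloutPolicySketch {
  // With probability epsilon pick a uniformly random move (pure random rollout);
  // otherwise pick the move with the highest heuristic score.
  // Returns None when there are no moves to choose from.
  def pick[M](moves: Seq[M], score: M => Double, epsilon: Double, random: Random): Option[M] =
    if (moves.isEmpty) None
    else if (random.nextDouble() < epsilon) Some(moves(random.nextInt(moves.size)))
    else Some(moves.maxBy(score))
}
```

Setting epsilon to 1.0 recovers the current pure random simulation, while smaller values lean more heavily on the heuristic.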

Type parameters

M

the move type.

P

the proto-state type.

Pl

the player identity type.

S

the state type.

Value parameters

explorationConstant

UCB1 exploration parameter C (default sqrt(2)).

game

the given Game[S, M, Pl] instance (a using parameter), used for move application.

iterations

number of MCTS iterations per move (default 1000).

me

this player's identity.

state

the given State[P, S] instance (a using parameter), used for goal detection.
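
For reference, the explorationConstant parameter C enters the UCB1 score in its commonly stated form, shown below. The symbols w_i (wins recorded for child i), n_i (visits to child i) and N (visits to the parent) are notation introduced here; that this class computes exactly this expression is an assumption.

```latex
\mathrm{UCB1}(i) = \frac{w_i}{n_i} + C \sqrt{\frac{\ln N}{n_i}}
```

A larger C favours exploring rarely visited children, while a smaller C favours children with a high observed win rate.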

Attributes

Supertypes
trait Player[S, M, Pl]
class Object
trait Matchable
class Any

Members list

Value members

Concrete methods

override def chooseMove(s: S, random: Random): Option[M]

Choose a move from the given state. Returns None if no move is available (terminal position).

Value parameters

random

a Random instance.

s

the current state.

Attributes

Returns

Some(move) or None.
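
A caller would typically pattern-match on the returned Option. The sketch below uses a stand-in trait, SimplePlayer, reduced to the single method documented here; it is not the library's Player trait, and playOut and applyMove are hypothetical helpers.

```scala
import scala.util.Random

object ChooseMoveSketch {
  // Stand-in for the library's Player trait, reduced to the one method used here.
  trait SimplePlayer[S, M] {
    def chooseMove(s: S, random: Random): Option[M]
  }

  // Alternate between two players until the player to move returns None,
  // then return the final state.
  @scala.annotation.tailrec
  def playOut[S, M](s: S, toMove: SimplePlayer[S, M], other: SimplePlayer[S, M], random: Random)(
      applyMove: (S, M) => S
  ): S =
    toMove.chooseMove(s, random) match {
      case Some(m) => playOut(applyMove(s, m), other, toMove, random)(applyMove)
      case None    => s // terminal position: no move available
    }
}
```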

Definition Classes

override def gameOver(result: GameResult[Pl], me: Pl): Unit

Called at the end of a game with the full result and this player's identity. Default implementation is a no-op. Override to implement learning or logging.

Value parameters

me

this player's identity, used to extract the relevant score.

result

the game result (all players' scores).

Attributes

Definition Classes