fights.envs.puoribor

Puoribor is a variant of the classical Quoridor game. Coordinates are specified in the form (x, y), where (0, 0) is the top left corner. All coordinates and directions are absolute and do not change between agents.

Directions

  • Top: +y

  • Right: +x

  • Bottom: -y

  • Left: -x
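The direction convention above can be captured in a small lookup table. This is a minimal sketch and not part of the library: DIRECTION_DELTAS and neighbor are hypothetical names, and the helper only checks board bounds (it ignores walls and other pieces).

```python
# Direction offsets exactly as listed above: Top is +y, Right is +x,
# Bottom is -y, Left is -x. These names are illustrative, not library API.
DIRECTION_DELTAS = {
    "top": (0, 1),
    "right": (1, 0),
    "bottom": (0, -1),
    "left": (-1, 0),
}


def neighbor(x, y, direction, board_size=9):
    """Return the adjacent (x, y) cell in `direction`, or None if off-board."""
    dx, dy = DIRECTION_DELTAS[direction]
    nx, ny = x + dx, y + dy
    if 0 <= nx < board_size and 0 <= ny < board_size:
        return (nx, ny)
    return None
```

Because coordinates are absolute, the same table applies to both agents.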

Environment

class fights.envs.puoribor.PuoriborEnv
env_id = ('puoribor', 3)

Environment identifier in the form of (name, version).

board_size: int = 9

Size (width and height) of the board.

max_walls: int = 10

Maximum allowed walls per agent.

step(state: PuoriborState, agent_id: int, action: PuoriborAction, *, pre_step_fn: Callable[[PuoriborState, int, PuoriborAction], None] | None = None, post_step_fn: Callable[[PuoriborState, int, PuoriborAction], None] | None = None) → PuoriborState

Step through the game, calculating the next state given the current state and the action to take.

Parameters:
  • state – Current state of the environment.

  • agent_id – ID of the agent that takes the action. (0 or 1)

  • action – Agent action, encoded in the form described by PuoriborAction.

  • pre_step_fn – Callback to run before executing action. state, agent_id and action will be provided as arguments.

  • post_step_fn – Callback to run after executing action. The calculated state, agent_id and action will be provided as arguments.

Returns:

A new PuoriborState object holding the state after the action has been applied.
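Both callbacks receive (state, agent_id, action) and return nothing, so any callable of that shape can be passed. The following sketch records which agent took which action; ActionLogger is a hypothetical class and not part of the library.

```python
class ActionLogger:
    """Callable matching the pre_step_fn / post_step_fn signature."""

    def __init__(self):
        self.records = []

    def __call__(self, state, agent_id, action):
        # Store a plain-list copy so later mutation of `action` does not
        # change the log. `state` is ignored in this sketch.
        self.records.append((agent_id, [int(v) for v in action]))


logger = ActionLogger()
# With a real environment this would be passed as
#   env.step(state, 0, [0, 4, 1], post_step_fn=logger)
# Here the callback is invoked directly to show what it records.
logger(None, 0, [0, 4, 1])
print(logger.records)
```

A callable object is used rather than a bare function so the log can be inspected after the game.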

legal_actions(state: PuoriborState, agent_id: int) → NDArray[np.int_]

Find possible actions for the agent.

Parameters:
  • state – Current state of the environment.

  • agent_id – ID of the agent whose possible actions are computed.

Returns:

A NumPy array of shape (4, 9, 9) that is a one-hot encoding of the possible actions.
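An array of this shape can be turned back into a list of actions by scanning for non-zero entries. The sketch below assumes the array is indexed as [action_type, x, y], which the shape (4, 9, 9) suggests but this page does not state. decode_legal_actions is a hypothetical helper; it works on nested lists as well as NumPy arrays.

```python
def decode_legal_actions(mask):
    """List [action_type, x, y] for every non-zero entry of the mask.

    Assumes mask[action_type][x][y] indexing (an assumption, not confirmed
    by the documentation).
    """
    actions = []
    for action_type, plane in enumerate(mask):
        for x, column in enumerate(plane):
            for y, flag in enumerate(column):
                if flag:
                    actions.append([action_type, x, y])
    return actions


# Tiny hand-made mask: only "move to (4, 1)" and "rotate at (0, 0)" are set.
mask = [[[0] * 9 for _ in range(9)] for _ in range(4)]
mask[0][4][1] = 1
mask[3][0][0] = 1
print(decode_legal_actions(mask))
```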

initialize_state() → PuoriborState

Initialize a PuoriborState object with the correct environment parameters.

Returns:

Created initial state object.

Action

fights.envs.puoribor.PuoriborAction

Alias of ArrayLike to describe the action type. Encoded as an array of shape (3,), in the form [ action_type, coordinate_x, coordinate_y ].

action_type

  • 0 (move piece)

  • 1 (place wall horizontally)

  • 2 (place wall vertically)

  • 3 (rotate section)

coordinate_x, coordinate_y (meaning depends on action_type)

  • move piece: position to move the piece to

  • place wall: top or left position to place the wall

  • rotate section: top left position of the section to rotate

alias of Union[_SupportsArray[dtype[Any]], _NestedSequence[_SupportsArray[dtype[Any]]], bool, int, float, complex, str, bytes, _NestedSequence[Union[bool, int, float, complex, str, bytes]]]
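Since an action is just three integers, named constants can make call sites easier to read. This is a minimal sketch: the constant names and make_action are illustrative and not defined by the library, and the bounds check only enforces the default 9 by 9 board, not the game rules.

```python
# Hypothetical names for the four action_type codes listed above.
MOVE = 0
PLACE_WALL_HORIZONTAL = 1
PLACE_WALL_VERTICAL = 2
ROTATE_SECTION = 3

_ACTION_TYPES = (MOVE, PLACE_WALL_HORIZONTAL, PLACE_WALL_VERTICAL, ROTATE_SECTION)


def make_action(action_type, x, y, board_size=9):
    """Build [action_type, coordinate_x, coordinate_y] with basic checks."""
    if action_type not in _ACTION_TYPES:
        raise ValueError(f"unknown action_type: {action_type}")
    if not (0 <= x < board_size and 0 <= y < board_size):
        raise ValueError(f"coordinate out of bounds: ({x}, {y})")
    return [action_type, x, y]
```

A plain list is returned because PuoriborAction is an alias of ArrayLike, which accepts nested sequences of integers.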

State

class fights.envs.puoribor.PuoriborState(board: NDArray[np.int_], walls_remaining: NDArray[np.int_], done: bool = False)

PuoriborState represents the game state.

board: NDArray[np.int_]

Array of shape (C, W, H), where C is the channel index and W and H are the board width and height.

Channels

  • C = 0: one-hot encoded position of agent 0. (starts from top)

  • C = 1: one-hot encoded position of agent 1. (starts from bottom)

  • C = 2: label encoded positions of horizontal walls. (1 for wall placed by agent 0, 2 for agent 1)

  • C = 3: label encoded positions of vertical walls. (encoding is same as C = 2)

  • C = 4: one-hot encoded positions of horizontal walls’ midpoints.

  • C = 5: one-hot encoded positions of vertical walls’ midpoints.
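Because channels 0 and 1 are one-hot, an agent's coordinates can be read off by locating the single non-zero cell. The sketch below uses nested lists in place of a NumPy array and indexes the board as board[channel][x][y], following the (C, W, H) shape given above. agent_position is a hypothetical helper, and the piece placement in the example is arbitrary rather than the real starting layout.

```python
def agent_position(board, agent_id):
    """Return (x, y) of the agent's piece, or None if the channel is empty."""
    channel = board[agent_id]  # channel 0 is agent 0, channel 1 is agent 1
    for x, column in enumerate(channel):
        for y, cell in enumerate(column):
            if cell:
                return (x, y)
    return None


# Six empty 9x9 channels, then one piece per agent at arbitrary cells.
board = [[[0] * 9 for _ in range(9)] for _ in range(6)]
board[0][4][0] = 1
board[1][4][8] = 1
print(agent_position(board, 0), agent_position(board, 1))
```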

walls_remaining: NDArray[np.int_]

Array of shape (2,), in the form of [ agent0_remaining_walls, agent1_remaining_walls ].

done: bool = False

Boolean value indicating whether the game is done.

perspective(agent_id: int) → NDArray[np.int_]

Return the board as seen with the agent identified by agent_id on top.

Parameters:
  • agent_id – ID of the agent to use as the base.

Returns:

A rotated board array. Channel 0 contains the position of the agent with ID agent_id, and channel 1 contains the opponent’s position. In channels 2 and 3, walls labeled 1 were placed by the agent with ID agent_id, and the others were placed by the opponent.

to_dict() → Dict

Serialize the state object to a dict.

Returns:

A serialized dict.

static from_dict(serialized) → PuoriborState

Deserialize a state from a serialized dict.

Parameters:
  • serialized – A serialized dict.

Returns:

Deserialized PuoriborState object.

Examples

See examples for example usage.