Minesweeper
Minesweeper (Environment)
A JAX implementation of the minesweeper game.
- observation: `Observation`
    - board: jax array (int32) of shape (num_rows, num_cols): each cell contains -1 if not yet explored, or otherwise the number of mines in the 8 adjacent squares.
    - action_mask: jax array (bool) of shape (num_rows, num_cols): indicates which actions are valid (not yet explored squares).
    - num_mines: jax array (int32) of shape (): indicates the number of mines to locate.
    - step_count: jax array (int32) of shape (): specifies how many timesteps have elapsed since environment reset.
- action: multi discrete array containing the square to explore (row and col).
- reward: jax array (float32): configurable function of state and action. By default: 1 for every timestep where a valid action is chosen that doesn't reveal a mine, 0 for revealing a mine or selecting an already revealed square (either of which terminates the episode).
- episode termination: configurable function of state, next_state, and action. By default: stop the episode if a mine is explored, an invalid action is selected (exploring an already explored square), or the board is solved.
- state: `State`
    - board: jax array (int32) of shape (num_rows, num_cols): each cell contains -1 if not yet explored, or otherwise the number of mines in the 8 adjacent squares.
    - step_count: jax array (int32) of shape (): specifies how many timesteps have elapsed since environment reset.
    - flat_mine_locations: jax array (int32) of shape (num_rows * num_cols,): indicates the (flat) locations of all the mines on the board. Will be of length num_mines.
    - key: jax array (int32) of shape (2,) used for seeding the sampling of mine placement on reset.
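
The board encoding and action mask described above can be illustrated with a small sketch. It uses plain Python lists instead of JAX arrays so that it has no dependencies, and it is not Jumanji's implementation: the helper names `count_adjacent_mines` and `action_mask` and the `UNEXPLORED` constant are hypothetical.

```python
from typing import List, Set, Tuple

UNEXPLORED = -1  # value the board holds for squares not yet explored


def count_adjacent_mines(mines: Set[Tuple[int, int]], row: int, col: int) -> int:
    """Count the mines in the 8 squares surrounding (row, col)."""
    return sum(
        (row + d_row, col + d_col) in mines
        for d_row in (-1, 0, 1)
        for d_col in (-1, 0, 1)
        if (d_row, d_col) != (0, 0)
    )


def action_mask(board: List[List[int]]) -> List[List[bool]]:
    """Mark a square as a valid action only while it is unexplored."""
    return [[cell == UNEXPLORED for cell in row] for row in board]


# 3x3 example: mines in two opposite corners, centre square explored.
mines = {(0, 0), (2, 2)}
board = [[UNEXPLORED] * 3 for _ in range(3)]
board[1][1] = count_adjacent_mines(mines, 1, 1)
mask = action_mask(board)
```

In this example the centre square touches both mines, so it stores their count and drops out of the action mask, while every other square stays at -1 and remains selectable.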
observation_spec: jumanji.specs.Spec[jumanji.environments.logic.minesweeper.types.Observation] (cached, writable property)
Specifications of the observation of the Minesweeper environment.
Returns:
Type | Description |
---|---|
jumanji.specs.Spec[jumanji.environments.logic.minesweeper.types.Observation] | Spec for the `Observation`, whose fields are `board`, `action_mask`, `num_mines` and `step_count`. |
action_spec: MultiDiscreteArray (cached, writable property)
Returns the action spec. An action consists of the row (height) and column (width) indices of the square to be explored.
Returns:
Type | Description |
---|---|
MultiDiscreteArray | action_spec: the spec for the (row, col) action. |
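
Since actions are (row, col) pairs while `flat_mine_locations` stores flat indices, a conversion between the two is often handy. The sketch below assumes row-major flattening, which this page does not state explicitly; the function names are hypothetical and not part of Jumanji.

```python
from typing import Tuple


def to_flat_index(action: Tuple[int, int], num_cols: int) -> int:
    """Map a (row, col) action to a flat index, assuming row-major order."""
    row, col = action
    return row * num_cols + col


def from_flat_index(flat_index: int, num_cols: int) -> Tuple[int, int]:
    """Inverse of to_flat_index under the same row-major assumption."""
    return divmod(flat_index, num_cols)


def is_in_bounds(action: Tuple[int, int], num_rows: int, num_cols: int) -> bool:
    """Check that both components lie inside the board."""
    row, col = action
    return 0 <= row < num_rows and 0 <= col < num_cols
```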
__init__(self, generator: Optional[jumanji.environments.logic.minesweeper.generator.Generator] = None, reward_function: Optional[jumanji.environments.logic.minesweeper.reward.RewardFn] = None, done_function: Optional[jumanji.environments.logic.minesweeper.done.DoneFn] = None, viewer: Optional[jumanji.viewer.Viewer[jumanji.environments.logic.minesweeper.types.State]] = None)
Instantiate a Minesweeper environment.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
generator | Optional[jumanji.environments.logic.minesweeper.generator.Generator] | | None |
reward_function | Optional[jumanji.environments.logic.minesweeper.reward.RewardFn] | | None |
done_function | Optional[jumanji.environments.logic.minesweeper.done.DoneFn] | | None |
viewer | Optional[jumanji.viewer.Viewer[jumanji.environments.logic.minesweeper.types.State]] | | None |
reset(self, key: PRNGKeyArray) -> Tuple[jumanji.environments.logic.minesweeper.types.State, jumanji.types.TimeStep[jumanji.environments.logic.minesweeper.types.Observation]]
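Resets the environment.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
key | PRNGKeyArray | Needed for placing mines. | required |
Returns:
Type | Description |
---|---|
Tuple[jumanji.environments.logic.minesweeper.types.State, jumanji.types.TimeStep[jumanji.environments.logic.minesweeper.types.Observation]] | state, together with the corresponding timestep. |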
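
As a rough illustration of what a reset involves, the sketch below builds a fresh state with an unexplored board and seeded mine locations. It is not Jumanji's code: Python's `random.Random` stands in for the JAX PRNG key, the state is a plain dictionary, and sampling the mines uniformly without replacement is an assumption. The name `sketch_reset` is hypothetical.

```python
import random
from typing import Dict


def sketch_reset(seed: int, num_rows: int, num_cols: int, num_mines: int) -> Dict:
    """Return a fresh state: all squares unexplored, mines placed from the seed."""
    rng = random.Random(seed)  # stand-in for the JAX PRNG key
    # Assumption: mines occupy distinct squares, chosen uniformly at random.
    flat_mine_locations = rng.sample(range(num_rows * num_cols), num_mines)
    return {
        "board": [[-1] * num_cols for _ in range(num_rows)],
        "step_count": 0,
        "flat_mine_locations": flat_mine_locations,
    }


state = sketch_reset(seed=0, num_rows=4, num_cols=5, num_mines=3)
```

Calling `sketch_reset` twice with the same seed yields the same mine placement, which mirrors the role the key plays in `reset`.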
step(self, state: State, action: Union[jax.Array, numpy.ndarray, numpy.bool_, numpy.number]) -> Tuple[jumanji.environments.logic.minesweeper.types.State, jumanji.types.TimeStep[jumanji.environments.logic.minesweeper.types.Observation]]
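Run one timestep of the environment's dynamics.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
state | State | | required |
action | Union[jax.Array, numpy.ndarray, numpy.bool_, numpy.number] | | required |
Returns:
Type | Description |
---|---|
Tuple[jumanji.environments.logic.minesweeper.types.State, jumanji.types.TimeStep[jumanji.environments.logic.minesweeper.types.Observation]] | next_state, together with the corresponding timestep. |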
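
The default reward and termination rules from the class description can be sketched as a single step function. This is a simplified, dependency-free illustration, not Jumanji's implementation: the state is a plain dictionary, flat mine indices are assumed to be row-major, "solved" is assumed to mean that only mine squares remain unexplored, and the function returns `(next_state, reward, done)` where the real `step` returns a `TimeStep`. The name `sketch_step` is hypothetical.

```python
from typing import Dict, Tuple

UNEXPLORED = -1


def sketch_step(state: Dict, action: Tuple[int, int]) -> Tuple[Dict, float, bool]:
    """Apply one action under the default reward and termination rules."""
    board = state["board"]
    num_rows, num_cols = len(board), len(board[0])
    mines = set(state["flat_mine_locations"])  # row-major flat indices (assumed)
    row, col = action

    invalid = board[row][col] != UNEXPLORED  # square was already explored
    hit_mine = (row * num_cols + col) in mines

    # Copy the board so the input state is left untouched.
    next_board = [list(board_row) for board_row in board]
    if not invalid and not hit_mine:
        # Reveal the square: store the number of mines among its neighbours.
        next_board[row][col] = sum(
            (r * num_cols + c) in mines
            for r in range(max(row - 1, 0), min(row + 2, num_rows))
            for c in range(max(col - 1, 0), min(col + 2, num_cols))
            if (r, c) != (row, col)
        )

    unexplored = sum(cell == UNEXPLORED for board_row in next_board for cell in board_row)
    solved = not invalid and not hit_mine and unexplored == len(mines)

    reward = 0.0 if (invalid or hit_mine) else 1.0
    done = invalid or hit_mine or solved
    next_state = {**state, "board": next_board, "step_count": state["step_count"] + 1}
    return next_state, reward, done
```

Returning a new dictionary instead of mutating the input loosely mirrors the functional style of `step`, which takes a state and returns the next one.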