Minesweeper
Bases: Environment[State, MultiDiscreteArray, Observation]
A JAX implementation of the minesweeper game.
- observation: Observation
    - board: jax array (int32) of shape (num_rows, num_cols): each cell contains -1 if not yet explored, or otherwise the number of mines in the 8 adjacent squares.
    - action_mask: jax array (bool) of shape (num_rows, num_cols): indicates which actions are valid (not yet explored squares).
    - num_mines: jax array (int32) of shape (): indicates the number of mines to locate.
    - step_count: jax array (int32) of shape (): specifies how many timesteps have elapsed since environment reset.
- action: multi discrete array containing the square to explore (row and col).
- reward: jax array (float32): Configurable function of state and action. By default: 1 for every timestep where a valid action is chosen that does not reveal a mine; 0 for revealing a mine or selecting an already explored square (either of which also terminates the episode).
- episode termination: Configurable function of state, next_state, and action. By default: stop the episode if a mine is explored, an invalid action is selected (exploring an already explored square), or the board is solved.
- state: State
    - board: jax array (int32) of shape (num_rows, num_cols): each cell contains -1 if not yet explored, or otherwise the number of mines in the 8 adjacent squares.
    - step_count: jax array (int32) of shape (): specifies how many timesteps have elapsed since environment reset.
    - flat_mine_locations: jax array (int32) of shape (num_mines,): the flat indices, each in the range [0, num_rows * num_cols), of all the mines on the board.
    - key: jax array (uint32) of shape (2,) used for seeding the sampling of mine placement on reset.
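To make the relationship between flat_mine_locations and the numbers shown on the board concrete, here is a minimal plain-Python sketch. It is not the Jumanji implementation (which operates on JAX arrays); the function name `count_adjacent_mines` is hypothetical, and it assumes flat indices are row-major, i.e. index = row * num_cols + col.

```python
from typing import Sequence


def count_adjacent_mines(
    flat_mine_locations: Sequence[int],
    num_rows: int,
    num_cols: int,
    row: int,
    col: int,
) -> int:
    """Count mines in the (up to) 8 squares surrounding (row, col).

    Assumption: flat indices are row-major, so divmod(index, num_cols)
    recovers the (row, col) pair of each mine.
    """
    mines = {divmod(index, num_cols) for index in flat_mine_locations}
    count = 0
    for d_row in (-1, 0, 1):
        for d_col in (-1, 0, 1):
            if d_row == 0 and d_col == 0:
                continue  # skip the square itself
            r, c = row + d_row, col + d_col
            # Neighbours outside the board are ignored.
            if 0 <= r < num_rows and 0 <= c < num_cols and (r, c) in mines:
                count += 1
    return count


# A 3x3 board with mines at flat indices 0 and 4, i.e. squares (0, 0) and (1, 1).
example_mines = [0, 4]
top_middle = count_adjacent_mines(example_mines, 3, 3, row=0, col=1)
bottom_right = count_adjacent_mines(example_mines, 3, 3, row=2, col=2)
```

In this example the square (0, 1) touches both mines, while (2, 2) touches only the one at (1, 1).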
Instantiate a Minesweeper environment.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
generator | Optional[Generator] | | None |
reward_function | Optional[RewardFn] | | None |
done_function | Optional[DoneFn] | | None |
viewer | Optional[Viewer[State]] | | None |
Source code in jumanji/environments/logic/minesweeper/env.py
action_spec: specs.MultiDiscreteArray (cached property)
Returns the action spec. An action consists of the row and column of the square to be explored.
Returns:
Name | Type | Description |
---|---|---|
action_spec | MultiDiscreteArray | |
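As an illustration of what makes an action acceptable, the sketch below checks a (row, col) pair against a board in plain Python. It relies only on the convention stated in the class description that unexplored squares hold -1; the helper names `action_mask` and `is_valid_action` are hypothetical and are not part of the Jumanji API, which exposes the mask as a JAX array in the observation.

```python
from typing import List, Tuple

UNEXPLORED = -1  # board value for squares that have not been explored yet


def action_mask(board: List[List[int]]) -> List[List[bool]]:
    """True where a square may still be explored, False otherwise."""
    return [[cell == UNEXPLORED for cell in row] for row in board]


def is_valid_action(board: List[List[int]], action: Tuple[int, int]) -> bool:
    """An action is valid if it is on the board and targets an unexplored square."""
    row, col = action
    in_bounds = 0 <= row < len(board) and 0 <= col < len(board[0])
    return in_bounds and board[row][col] == UNEXPLORED


# A 2x3 board where the top-left square has been explored (1 adjacent mine).
example_board = [
    [1, -1, -1],
    [-1, -1, -1],
]
mask = action_mask(example_board)
```

Selecting (0, 0) on this board would be invalid because that square is already explored, whereas (1, 2) is still available.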
observation_spec: specs.Spec[Observation] (cached property)
Specifications of the observation of the Minesweeper environment.
Returns:
Type | Description |
---|---|
Spec[Observation] | Spec for the |
animate(states, interval=200, save_path=None)
Creates an animated gif of the board based on the sequence of states.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
states | Sequence[State] | a list of | required |
interval | int | the delay between frames in milliseconds; defaults to 200. | 200 |
save_path | Optional[str] | the path where the animation file should be saved. If it is None, the plot will not be saved. | None |
Returns:
Type | Description |
---|---|
FuncAnimation | animation.FuncAnimation: the animation object that was created. |
Source code in jumanji/environments/logic/minesweeper/env.py
close()
Perform any necessary cleanup.
Environments will automatically close() themselves when garbage collected or when the program exits.
Source code in jumanji/environments/logic/minesweeper/env.py
render(state)
Renders the current state of the board.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
state | State | the current state to be rendered. | required |
Source code in jumanji/environments/logic/minesweeper/env.py
reset(key)
Resets the environment.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
key | PRNGKey | needed for placing mines. | required |
Returns:
Name | Type | Description |
---|---|---|
state | State | |
timestep | TimeStep[Observation] | |
Source code in jumanji/environments/logic/minesweeper/env.py
step(state, action)
Run one timestep of the environment's dynamics.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
state | State | | required |
action | Array | | required |
Returns:
Name | Type | Description |
---|---|---|
next_state | State | |
next_timestep | TimeStep[Observation] | |
Source code in jumanji/environments/logic/minesweeper/env.py
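The default reward and termination rules given in the class description can be sketched in plain Python as follows. This is an illustrative reading of those rules, not the Jumanji source: the function name `default_reward_and_done` is hypothetical, flat mine indices are assumed to be row-major (row * num_cols + col), and "solved" is assumed to mean that the only unexplored squares left are the mines.

```python
from typing import List, Sequence, Tuple

UNEXPLORED = -1  # board value for squares that have not been explored yet


def default_reward_and_done(
    board: List[List[int]],
    flat_mine_locations: Sequence[int],
    action: Tuple[int, int],
) -> Tuple[float, bool]:
    """Return (reward, done) for exploring `action` on the pre-step `board`."""
    num_cols = len(board[0])
    row, col = action

    already_explored = board[row][col] != UNEXPLORED
    # Assumption: row-major flat index for the chosen square.
    hits_mine = (row * num_cols + col) in set(flat_mine_locations)
    if already_explored or hits_mine:
        # Invalid action or revealed mine: zero reward, episode ends.
        return 0.0, True

    # Valid, safe move: reward 1. The episode also ends if the board is
    # solved, taken here to mean only mine squares remain unexplored.
    unexplored_after = sum(cell == UNEXPLORED for r in board for cell in r) - 1
    solved = unexplored_after == len(flat_mine_locations)
    return 1.0, solved


# 2x2 board, nothing explored, one mine at flat index 3, i.e. square (1, 1).
fresh_board = [[-1, -1], [-1, -1]]
safe_move = default_reward_and_done(fresh_board, [3], (0, 0))
mine_move = default_reward_and_done(fresh_board, [3], (1, 1))
```

Because reward_function and done_function are configurable, an environment built with custom functions may behave differently from this sketch.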