Tetris
Tetris (Environment)
RL Environment for the game of Tetris. The environment has a grid where the player can place tetrominoes. The environment has the following characteristics:
- observation: Observation
    - grid: jax array (int32) of shape (num_rows, num_cols) representing the current state of the grid.
    - tetromino: jax array (int32) of shape (4, 4) representing the current tetromino sampled from the tetromino list.
    - action_mask: jax array (bool) of shape (4, num_cols). Mask of the joint action space: each row corresponds to one of the 4 rotations of the current tetromino, and an entry is True if the action (rotation index and x_position) is feasible for the current tetromino and grid state.
- action: multi discrete array of shape (2,)
    - rotation_index: index of the rotation applied to the tetromino: 0 corresponds to 0 degrees, 1 to 90 degrees, 2 to 180 degrees, and 3 to 270 degrees.
    - x_position: int between 0 and num_cols - 1 (inclusive).
- reward: 0 if no lines were cleared by the action, and a convex function of the number of cleared lines otherwise.
- episode termination: if the tetromino cannot be placed anymore (i.e., it hits the top of the grid).
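The reward rule can be illustrated with a small standalone sketch. It is not the library's implementation: it uses plain Python lists instead of jax arrays, assumes row 0 is the top of the grid, and uses a quadratic as a stand-in for the convex function, which this page does not specify. The names `clear_full_rows` and `hypothetical_reward` are made up for this example.

```python
def clear_full_rows(grid):
    """Drop every completely filled row and pad empty rows at the top.

    Returns the new grid and the number of rows that were cleared.
    Assumption: row 0 is the top of the grid.
    """
    num_cols = len(grid[0])
    kept = [row for row in grid if not all(row)]
    num_cleared = len(grid) - len(kept)
    empty = [[0] * num_cols for _ in range(num_cleared)]
    return empty + kept, num_cleared


def hypothetical_reward(num_cleared):
    """Zero when nothing is cleared, convex in the number of cleared lines.

    Assumption: the quadratic is only a placeholder for the real function.
    """
    return float(num_cleared ** 2)


grid = [
    [0, 0, 0, 0],
    [1, 0, 1, 1],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
]
new_grid, num_cleared = clear_full_rows(grid)
reward = hypothetical_reward(num_cleared)
print(num_cleared, reward)  # 2 4.0
```

With a convex reward, clearing several lines in one action pays more than clearing the same number of lines one at a time.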
observation_spec: jumanji.specs.Spec[jumanji.environments.packing.tetris.types.Observation]
cached property, writable
Specifications of the observation of the `Tetris` environment.
Returns:
Type | Description |
---|---|
jumanji.specs.Spec[jumanji.environments.packing.tetris.types.Observation] | Spec containing all the specifications for all the `Observation` fields. |
action_spec: MultiDiscreteArray
cached property, writable
Returns the action spec. An action consists of two pieces of information: the amount of rotation (number of 90-degree rotations) and the x-position of the leftmost part of the tetromino.
Returns:
Type | Description |
---|---|
MultiDiscreteArray | The action spec. |
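To make the (rotation, x-position) action and the (4, num_cols) action mask concrete, here is a simplified sketch in plain Python. It only checks that the rotated tetromino fits horizontally on an empty grid, so it is narrower than the feasibility check the environment performs against the current grid state. The clockwise rotation direction and the helper names (`rotate_90`, `occupied_width`, `bounds_mask`, `first_feasible_action`) are assumptions made for illustration.

```python
def rotate_90(shape):
    """Rotate a square 0/1 matrix by 90 degrees (clockwise is an assumption)."""
    return [list(row) for row in zip(*shape[::-1])]


def occupied_width(shape):
    """Number of columns spanned by the filled cells of the tetromino."""
    cols = [c for row in shape for c, cell in enumerate(row) if cell]
    return max(cols) - min(cols) + 1


def bounds_mask(tetromino, num_cols):
    """Build a 4 x num_cols boolean mask, one row per rotation index.

    Entry [r][x] is True when the piece rotated r times, with its leftmost
    filled cell at column x, stays inside the grid horizontally.
    """
    mask = []
    shape = tetromino
    for _ in range(4):
        width = occupied_width(shape)
        mask.append([x + width <= num_cols for x in range(num_cols)])
        shape = rotate_90(shape)
    return mask


def first_feasible_action(mask):
    """Return the first (rotation_index, x_position) marked True, else None."""
    for rotation_index, row in enumerate(mask):
        for x_position, feasible in enumerate(row):
            if feasible:
                return rotation_index, x_position
    return None


I_PIECE = [
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
mask = bounds_mask(I_PIECE, num_cols=10)
print([sum(row) for row in mask])   # [7, 10, 7, 10]
print(first_feasible_action(mask))  # (0, 0)
```

The horizontal I piece spans four columns, so only seven x-positions keep it in bounds on a ten-column grid, while its vertical rotations fit in every column.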
__init__(self, num_rows: int = 10, num_cols: int = 10, time_limit: int = 400, viewer: Optional[jumanji.viewer.Viewer[jumanji.environments.packing.tetris.types.State]] = None) -> None
special
Instantiates a `Tetris` environment.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
num_rows | int | number of rows of the 2D grid. Defaults to 10. | 10 |
num_cols | int | number of columns of the 2D grid. Defaults to 10. | 10 |
time_limit | int | time limit of an episode, i.e. the number of environment steps before the episode ends. Defaults to 400. | 400 |
viewer | Optional[jumanji.viewer.Viewer[jumanji.environments.packing.tetris.types.State]] | | None |
reset(self, key: PRNGKeyArray) -> Tuple[jumanji.environments.packing.tetris.types.State, jumanji.types.TimeStep[jumanji.environments.packing.tetris.types.Observation]]
Resets the environment.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
key | PRNGKeyArray | needed for generating new tetrominoes. | required |
Returns:
Type | Description |
---|---|
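Tuple[State, TimeStep[Observation]] | state: the new state of the environment, returned together with the first timestep of the episode. |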
step(self, state: State, action: Union[jax.Array, numpy.ndarray, numpy.bool_, numpy.number]) -> Tuple[jumanji.environments.packing.tetris.types.State, jumanji.types.TimeStep[jumanji.environments.packing.tetris.types.Observation]]
Run one timestep of the environment's dynamics.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
state | State | | required |
action | Union[jax.Array, numpy.ndarray, numpy.bool_, numpy.number] | | required |
Returns:
Type | Description |
---|---|
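Tuple[State, TimeStep[Observation]] | next_state: the state of the environment after the action is applied, returned together with the resulting timestep. |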