PacMan
PacMan (Environment)
A JAX implementation of the 'PacMan' game in which a single agent navigates a maze to collect pellets while avoiding 4 heuristic agents (ghosts). The game takes place on a 31x28 grid; the player can move in 4 directions (left, right, up, down) and gains points by collecting pellets. The goal is to collect every pellet on the board without colliding with a ghost. With the `AsciiGenerator`, the environment always generates the same maze as long as the same ASCII diagram is used.
- observation: `Observation`
    - player_locations: current 2D position of the agent.
    - grid: jax array (int) of the in-game maze with walls.
    - ghost_locations: jax array (int) of ghost positions.
    - power_up_locations: jax array (int) of power-pellet locations.
    - pellet_locations: jax array (int) of pellet locations.
    - action_mask: jax array (bool) marking which actions are currently valid.
    - score: (int32) total points acquired.
- action: jax array (int) of shape () specifying which action to take: [0,1,2,3,4] correspond to [up, right, down, left, no-op]. If an invalid action is taken, i.e. a wall blocks the move, then no action (no-op) is taken.
- reward: jax array (float32) of shape (): 10 per pellet collected, 20 for a power pellet and 200 for each unique ghost eaten.
- episode termination (if any):
    - the agent has collected all pellets.
    - the agent is killed by a ghost.
    - the timer has elapsed.
- state: `State`
    - key: jax array (uint32) of shape (2,).
    - grid: jax array (int) of shape (31,28) of the in-game maze with walls.
    - pellets: int tracking the number of pellets.
    - frightened_state_time: jax array (int) of shape () tracking the number of steps for the scatter state.
    - pellet_locations: jax array (int) of shape (316,2) of pellet locations.
    - power_up_locations: jax array (int) of shape (4,2) of power-pellet locations.
    - player_locations: current 2D position of the agent.
    - ghost_locations: jax array (int) of shape (4,2) of ghost positions.
    - initial_player_locations: starting 2D position of the agent.
    - initial_ghost_positions: jax array (int) of shape (4,2) of starting ghost positions.
    - ghost_init_targets: jax array (int) of ghost positions. Used to direct ghosts on respawn.
    - old_ghost_locations: jax array (int) of shape (4,2) of ghost positions from the last step. Used to prevent ghost backtracking.
    - ghost_init_steps: jax array (int) of shape (4,2) holding the number of initial ghost steps. Used to determine per-ghost initialisation.
    - ghost_actions: jax array (int) of shape (4,).
    - last_direction: int tracking the last direction of the player.
    - dead: bool used to track player death.
    - visited_index: jax array (int) of shape (320,2) of visited locations. Used to prevent repeated pellet points.
    - ghost_starts: jax array (int) of shape (4,2) used to reset ghost positions if eaten.
    - scatter_targets: jax array (int) of shape (4,2) of target locations for ghosts when scatter behavior is active.
    - step_count: (int32) total steps taken from reset until the current timestep.
    - ghost_eaten: jax array (bool) of shape (4,) tracking whether each ghost has been eaten before.
    - score: (int32) total points acquired.
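The reward rule listed above can be illustrated with a minimal sketch in plain Python (not JAX, and not the environment's own code). The function `step_reward` and its arguments are hypothetical, and it assumes that "unique ghost" means each ghost is rewarded only the first time it is eaten, using flags analogous to the `ghost_eaten` state field.

```python
# Reward values as stated in the environment description.
PELLET_REWARD = 10.0
POWER_PELLET_REWARD = 20.0
GHOST_REWARD = 200.0


def step_reward(pellet_eaten, power_pellet_eaten, ghosts_hit, ghost_eaten):
    """Return (reward, updated flags) for one step. Hypothetical helper.

    ghosts_hit:  4 bools, True where the player caught a ghost this step.
    ghost_eaten: 4 bools, True where that ghost was already eaten before.
    """
    reward = 0.0
    if pellet_eaten:
        reward += PELLET_REWARD
    if power_pellet_eaten:
        reward += POWER_PELLET_REWARD
    updated = list(ghost_eaten)
    for i, hit in enumerate(ghosts_hit):
        # Assumption: only the first capture of each ghost is rewarded.
        if hit and not updated[i]:
            reward += GHOST_REWARD
            updated[i] = True
    return reward, updated


reward, flags = step_reward(True, False, [True, False, False, False], [False] * 4)
print(reward, flags)
```

Whether the real environment clears these flags again (for example after a new power pellet) is not stated here, so the sketch leaves them set.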
observation_spec: jumanji.specs.Spec[jumanji.environments.routing.pac_man.types.Observation]
cached
property
writable
Specifications of the observation of the `PacMan` environment.
Returns:
Type | Description |
---|---|
jumanji.specs.Spec[jumanji.environments.routing.pac_man.types.Observation] | Spec containing all the specifications for all the `Observation` fields. |
action_spec: DiscreteArray
cached
property
writable
Returns the action spec.
5 actions: [0,1,2,3,4] -> [Up, Right, Down, Left, No-op].
Returns:
Type | Description |
---|---|
DiscreteArray | action_spec: a `DiscreteArray` spec. |
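To make the relationship between the 5 actions and the `action_mask` observation field concrete, here is a minimal plain-Python sketch (not the environment's JAX code). The `(row, col)` offsets, the wall encoding (1 = wall, 0 = free) and the helper `action_mask` are assumptions for illustration; the environment's own coordinate convention and any maze wrap-around are not covered.

```python
# Assumed (row, col) offsets for actions [Up, Right, Down, Left, No-op].
MOVES = [(-1, 0), (0, 1), (1, 0), (0, -1), (0, 0)]


def action_mask(grid, position):
    """Return 5 bools, True where the action does not lead into a wall.

    grid: list of rows with 1 = wall and 0 = free cell (illustrative encoding).
    """
    rows, cols = len(grid), len(grid[0])
    row, col = position
    mask = []
    for d_row, d_col in MOVES:
        r, c = row + d_row, col + d_col
        inside = 0 <= r < rows and 0 <= c < cols
        # Moves that leave the grid are treated as invalid in this sketch.
        mask.append(inside and grid[r][c] == 0)
    return mask


toy_grid = [
    [1, 1, 1, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
]
print(action_mask(toy_grid, (1, 1)))  # [False, True, False, False, True]
```

In this toy corridor only "Right" and "No-op" are valid from the left cell.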
__init__(self, generator: Optional[jumanji.environments.routing.pac_man.generator.Generator] = None, viewer: Optional[jumanji.viewer.Viewer[jumanji.environments.routing.pac_man.types.State]] = None, time_limit: Optional[int] = None) -> None
special
Instantiates a `PacMan` environment.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
generator | Optional[jumanji.environments.routing.pac_man.generator.Generator] | | None |
time_limit | Optional[int] | The time limit of an episode, i.e. the maximum number of environment steps before the episode terminates. Set to 1000 by default. | None |
viewer | Optional[jumanji.viewer.Viewer[jumanji.environments.routing.pac_man.types.State]] | | None |
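As a rough illustration of how `time_limit` interacts with the other termination conditions (all pellets collected, agent killed), the following plain-Python sketch combines the three checks. The helper `is_done` is hypothetical, and the use of `>=` for the step comparison is an assumption rather than the environment's confirmed behavior.

```python
DEFAULT_TIME_LIMIT = 1000  # default stated for `time_limit`


def is_done(pellets_left, dead, step_count, time_limit=None):
    """Return True if any of the three termination conditions holds."""
    limit = DEFAULT_TIME_LIMIT if time_limit is None else time_limit
    all_collected = pellets_left == 0
    # Assumption: the episode ends once step_count reaches the limit.
    timed_out = step_count >= limit
    return all_collected or dead or timed_out


print(is_done(pellets_left=12, dead=False, step_count=999))   # False
print(is_done(pellets_left=12, dead=False, step_count=1000))  # True
```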
reset(self, key: PRNGKeyArray) -> Tuple[jumanji.environments.routing.pac_man.types.State, jumanji.types.TimeStep[jumanji.environments.routing.pac_man.types.Observation]]
Resets the environment by calling the instance generator for a new instance.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
key | PRNGKeyArray | A PRNGKey to use for random number generation. | required |
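Returns:
Type | Description |
---|---|
Tuple[jumanji.environments.routing.pac_man.types.State, jumanji.types.TimeStep[jumanji.environments.routing.pac_man.types.Observation]] | state: the state of the environment after the reset, returned together with the first timestep. |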
step(self, state: State, action: Union[jax.Array, numpy.ndarray, numpy.bool_, numpy.number]) -> Tuple[jumanji.environments.routing.pac_man.types.State, jumanji.types.TimeStep[jumanji.environments.routing.pac_man.types.Observation]]
Run one timestep of the environment's dynamics.
If an action is invalid, the agent does not move, i.e. the episode does not automatically terminate.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
state | State | State object containing the dynamics of the environment. | required |
action | Union[jax.Array, numpy.ndarray, numpy.bool_, numpy.number] | (int32) specifying which action to take: [0,1,2,3,4] correspond to [Up, Right, Down, Left, No-op]. If an invalid action is taken, i.e. a wall blocks the move, then no action (no-op) is taken. | required |
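Returns:
Type | Description |
---|---|
Tuple[jumanji.environments.routing.pac_man.types.State, jumanji.types.TimeStep[jumanji.environments.routing.pac_man.types.Observation]] | state: the new state of the environment. timestep: the next timestep to be observed. |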
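The "invalid action becomes a no-op" rule can be sketched in plain Python as follows. This is not the environment's JAX implementation: `move_player`, the `(row, col)` offsets and the wall encoding (1 = wall) are assumptions chosen for illustration, and ghosts, pellets and rewards are left out.

```python
# Assumed (row, col) offsets for actions 0-4: Up, Right, Down, Left, No-op.
MOVES = {0: (-1, 0), 1: (0, 1), 2: (1, 0), 3: (0, -1), 4: (0, 0)}


def move_player(grid, position, action):
    """Return the player's next position; a blocked move keeps it in place."""
    d_row, d_col = MOVES[action]
    r, c = position[0] + d_row, position[1] + d_col
    inside = 0 <= r < len(grid) and 0 <= c < len(grid[0])
    if not inside or grid[r][c] == 1:
        return position  # invalid action: behaves like no-op
    return (r, c)


toy_grid = [
    [1, 1, 1, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
]
position = (1, 1)
for action in [0, 1, 1, 3]:  # Up (blocked), Right, Right (blocked), Left
    position = move_player(toy_grid, position, action)
    print(action, position)
```

The episode does not end on a blocked move; the position is simply unchanged for that step.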
render(self, state: State) -> Any
Render the given state of the environment.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
state | State | | required |