PacMan

PacMan (Environment)

A JAX implementation of the 'PacMan' game in which a single agent navigates a maze to collect pellets while avoiding 4 heuristic agents (ghosts). The game takes place on a 31x28 grid; the player can move in 4 directions (left, right, up, down) and collects pellets to gain points. The goal is to collect all of the pellets on the board without colliding with one of the heuristic agents. With the AsciiGenerator, the environment always generates the same maze as long as the same ASCII diagram is in use.

  • observation: Observation

    • player_locations: current 2D position of agent.
    • grid: jax array (int) of the in-game maze with walls.
    • ghost_locations: jax array (int) of ghost positions.
    • power_up_locations: jax array (int) of power-pellet locations.
    • pellet_locations: jax array (int) of pellet locations.
    • action_mask: jax array (bool) defining the currently valid actions.
    • score: (int32) total points acquired.
  • action: jax array (int) of shape () specifying which action to take: [0,1,2,3,4] corresponding to [up, right, down, left, no-op]. If an invalid action is taken, i.e. there is a wall blocking the action, then no action (no-op) is taken.

  • reward: jax array (float32) of shape (): 10 per pellet collected, 20 for a power pellet and 200 for each unique ghost eaten.

  • episode termination (if any):

    • agent has collected all pellets.
    • agent killed by ghost.
    • timer has elapsed.
  • state: State

    • key: jax array (uint32) of shape (2,).
    • grid: jax array (int) of shape (31,28) of the in-game maze with walls.
    • pellets: int tracking the number of pellets.
    • frightened_state_time: jax array (int) of shape () tracking the number of steps for the scatter state.
    • pellet_locations: jax array (int) of shape (316,2) of pellet locations.
    • power_up_locations: jax array (int) of shape (4,2) of power-pellet locations.
    • player_locations: current 2D position of agent.
    • ghost_locations: jax array (int) of shape (4,2) of ghost positions.
    • initial_player_locations: starting 2D position of agent.
    • initial_ghost_positions: jax array (int) of shape (4,2) of ghost positions.
    • ghost_init_targets: jax array (int) of ghost positions, used to direct ghosts on respawn.
    • old_ghost_locations: jax array (int) of shape (4,2) of ghost positions from the last step, used to prevent ghost backtracking.
    • ghost_init_steps: jax array (int) of shape (4,2) of the number of initial ghost steps, used to determine per-ghost initialisation.
    • ghost_actions: jax array (int) of shape (4,).
    • last_direction: int tracking the last direction of the player.
    • dead: bool used to track player death.
    • visited_index: jax array (int) of shape (320,2) of visited locations, used to prevent repeated pellet points.
    • ghost_starts: jax array (int) of shape (4,2) used to reset ghost positions if eaten.
    • scatter_targets: jax array (int) of shape (4,2) of target locations for ghosts when scatter behavior is active.
    • step_count: (int32) total steps taken from reset until the current timestep.
    • ghost_eaten: jax array (bool) of shape (4,) tracking whether each ghost has been eaten before.
    • score: (int32) total points acquired.
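import jax
from jumanji.environments import PacMan

# Create the environment with its default generator, viewer and time limit.
env = PacMan()
key = jax.random.PRNGKey(0)
# Reset to obtain the initial state and the first timestep.
state, timestep = jax.jit(env.reset)(key)
env.render(state)
# Generate a placeholder action that conforms to the action spec.
action = env.action_spec.generate_value()
# Advance the environment by one step.
state, timestep = jax.jit(env.step)(state, action)
env.render(state)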
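
The reward and termination rules listed above can be summarised in a few lines. What follows is a minimal plain-Python sketch, not the environment's JAX implementation: the helper name `step_outcome` and all of its arguments are hypothetical, the reward constants are taken from the reward description, and the use of `>=` for the time limit is an assumption.

```python
from typing import Tuple

# Reward constants taken from the reward description above.
PELLET_REWARD = 10.0
POWER_PELLET_REWARD = 20.0
GHOST_REWARD = 200.0


def step_outcome(
    ate_pellet: bool,
    ate_power_pellet: bool,
    new_ghosts_eaten: int,
    pellets_left: int,
    dead: bool,
    step_count: int,
    time_limit: int = 1000,
) -> Tuple[float, bool]:
    """Return (reward, done) for a single step (hypothetical helper)."""
    reward = 0.0
    if ate_pellet:
        reward += PELLET_REWARD
    if ate_power_pellet:
        reward += POWER_PELLET_REWARD
    # Only ghosts that were not eaten before are rewarded.
    reward += GHOST_REWARD * new_ghosts_eaten
    # Episode ends when all pellets are collected, the agent dies,
    # or the time limit is reached (assumed inclusive comparison).
    done = pellets_left == 0 or dead or step_count >= time_limit
    return reward, done
```

For example, eating a power pellet and two previously uneaten ghosts in one step would yield 20 + 2 * 200 = 420 under these assumptions.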

observation_spec: jumanji.specs.Spec[jumanji.environments.routing.pac_man.types.Observation] cached property writable

Specifications of the observation of the PacMan environment.

Returns:

Type Description
Spec containing all the specifications for all the `Observation` fields
  • player_locations: tree of BoundedArray (int32) of shape ().
  • grid: BoundedArray (int) of the in-game maze with walls.
  • ghost_locations: jax array (int) of ghost positions.
  • power_up_locations: jax array (int) of power-pellet locations.
  • pellet_locations: jax array (int) of pellet locations.
  • action_mask: jax array (bool) defining current actions.
  • frightened_state_time: int counting time remaining in scatter mode.
  • score: (int) of total score obtained by player.
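
The `action_mask` field can be used to restrict a policy to actions that are currently available. The sketch below is plain Python on a list of booleans rather than a jax array; it assumes that entry `i` of the mask refers to action `i` and that `True` means the action is allowed. The function name `sample_valid_action` is hypothetical.

```python
import random
from typing import Optional, Sequence

NO_OP = 4  # index of no-op in [Up, Right, Down, Left, No-op]


def sample_valid_action(
    action_mask: Sequence[bool], rng: Optional[random.Random] = None
) -> int:
    """Pick uniformly among the actions whose mask entry is True."""
    if rng is None:
        rng = random.Random()
    valid = [a for a, allowed in enumerate(action_mask) if allowed]
    # Fall back to no-op if the mask allows nothing.
    return rng.choice(valid) if valid else NO_OP
```

Passing a seeded `random.Random` makes the choice reproducible, which is convenient when debugging rollouts.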

action_spec: DiscreteArray cached property writable

Returns the action spec.

5 actions: [0,1,2,3,4] -> [Up, Right, Down, Left, No-op].

Returns:

Type Description
action_spec

a specs.DiscreteArray spec object.
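
For readability in user code, the five action indices can be given names. The enum below is a hypothetical convenience and is not part of the environment's API; only the index-to-direction mapping comes from the description above.

```python
from enum import IntEnum


class PacManAction(IntEnum):
    """Hypothetical names for the five discrete action indices."""

    UP = 0
    RIGHT = 1
    DOWN = 2
    LEFT = 3
    NO_OP = 4


# The number of members matches the 5 actions described above.
NUM_ACTIONS = len(PacManAction)
```

Members of an `IntEnum` compare equal to their integer values, so `int(PacManAction.LEFT)` gives the index 3.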

__init__(self, generator: Optional[jumanji.environments.routing.pac_man.generator.Generator] = None, viewer: Optional[jumanji.viewer.Viewer[jumanji.environments.routing.pac_man.types.State]] = None, time_limit: Optional[int] = None) -> None special

Instantiates a PacMan environment.

Parameters:

  • generator (Optional[jumanji.environments.routing.pac_man.generator.Generator], default None): Generator whose __call__ instantiates an environment instance. Implemented options are [AsciiGenerator].
  • time_limit (Optional[int], default None): the time limit of an episode, i.e. the maximum number of environment steps before the episode terminates. By default, set to 1000.
  • viewer (Optional[jumanji.viewer.Viewer[jumanji.environments.routing.pac_man.types.State]], default None): Viewer used for rendering. Defaults to PacManViewer.
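
Because the maze is read from a fixed ASCII diagram, generation is a pure function of that diagram: the same text always yields the same grid. The sketch below illustrates the idea in plain Python. The symbol convention ('#' for walls, '.' for pellets), the encoding of walls as 1, and the function name `parse_maze` are all assumptions made for illustration and are not necessarily what AsciiGenerator uses.

```python
from typing import List, Tuple

# Hypothetical symbol convention, NOT necessarily AsciiGenerator's.
WALL, PELLET = "#", "."


def parse_maze(
    diagram: List[str],
) -> Tuple[List[List[int]], List[Tuple[int, int]]]:
    """Turn an ASCII diagram into a wall grid (1 = wall) and pellet coordinates."""
    grid = [[1 if ch == WALL else 0 for ch in row] for row in diagram]
    pellets = [
        (r, c)
        for r, row in enumerate(diagram)
        for c, ch in enumerate(row)
        if ch == PELLET
    ]
    return grid, pellets


diagram = [
    "#####",
    "#. .#",
    "#####",
]
grid, pellets = parse_maze(diagram)
```

Calling `parse_maze(diagram)` a second time returns identical values, which is the determinism property described for the generator.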

reset(self, key: PRNGKeyArray) -> Tuple[jumanji.environments.routing.pac_man.types.State, jumanji.types.TimeStep[jumanji.environments.routing.pac_man.types.Observation]]

Resets the environment by calling the instance generator for a new instance.

Parameters:

  • key (PRNGKeyArray, required): A PRNGKey to use for random number generation.

Returns:

  • state: State object corresponding to the new state of the environment after a reset.
  • timestep: TimeStep object corresponding to the first timestep returned by the environment after a reset.

step(self, state: State, action: Union[jax.Array, numpy.ndarray, numpy.bool_, numpy.number]) -> Tuple[jumanji.environments.routing.pac_man.types.State, jumanji.types.TimeStep[jumanji.environments.routing.pac_man.types.Observation]]

Run one timestep of the environment's dynamics.

If an action is invalid, the agent does not move, i.e. the episode does not automatically terminate.

Parameters:

  • state (State, required): State object containing the dynamics of the environment.
  • action (Union[jax.Array, numpy.ndarray, numpy.bool_, numpy.number], required): (int32) specifying which action to take: [0,1,2,3,4] correspond to [Up, Right, Down, Left, No-op]. If an invalid action is taken, i.e. there is a wall blocking the action, then no action (no-op) is taken.

Returns:

  • state: the new state of the environment.
  • timestep: the next timestep to be observed.
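
The rule that a blocked move becomes a no-op, rather than ending the episode, can be illustrated on a tiny grid. This is a plain-Python sketch and not the environment's code: the (row, col) offsets with row 0 at the top, the encoding of walls as 1, and treating off-grid moves as blocked are all assumptions, and `move` is a hypothetical helper.

```python
from typing import List, Tuple

# Assumed (row, col) offsets for [Up, Right, Down, Left, No-op].
DELTAS = {0: (-1, 0), 1: (0, 1), 2: (1, 0), 3: (0, -1), 4: (0, 0)}


def move(
    grid: List[List[int]], pos: Tuple[int, int], action: int
) -> Tuple[int, int]:
    """Apply an action; a move into a wall (1) or off the grid is a no-op."""
    dr, dc = DELTAS[action]
    r, c = pos[0] + dr, pos[1] + dc
    in_bounds = 0 <= r < len(grid) and 0 <= c < len(grid[0])
    if not in_bounds or grid[r][c] == 1:
        return pos  # invalid action: stay in place, the episode continues
    return r, c


grid = [
    [1, 1, 1, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
]
```

In this corridor, moving right from (1, 1) succeeds, while moving up from (1, 1) hits a wall and leaves the position unchanged.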

render(self, state: State) -> Any

Render the given state of the environment.

Parameters:

  • state (State, required): State object containing the current environment state.

Last update: 2024-03-29