Base

Bases: ABC, Generic[State, ActionSpec, Observation]

Environment written in JAX whose API differs from the Gym API so that the step and reset functions are jittable. The state contains all the dynamics and data needed to step the environment; no computation is stored in attributes of self. The API is inspired by Brax.

Initialize environment.

Source code in jumanji/env.py
def __init__(self) -> None:
    """Initialize environment."""
    self.observation_spec  # noqa: B018
    self.action_spec  # noqa: B018
    self.reward_spec  # noqa: B018
    self.discount_spec  # noqa: B018
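
The bare attribute accesses in `__init__` appear to evaluate each cached spec property once at construction time, so later reads return the stored value. The sketch below illustrates that pattern with the standard library's `functools.cached_property`; the `SpecOwner` class, its `build_count` counter, and the dictionary spec are hypothetical stand-ins, not part of jumanji.

```python
from functools import cached_property


class SpecOwner:
    """Hypothetical stand-in for an environment that exposes a cached spec."""

    def __init__(self) -> None:
        self.build_count = 0
        # Bare attribute access: evaluates the cached property once and stores the result.
        self.observation_spec  # noqa: B018

    @cached_property
    def observation_spec(self) -> dict:
        # Counts how many times the spec is actually built.
        self.build_count += 1
        return {"shape": (4,), "dtype": "float32"}


owner = SpecOwner()
first = owner.observation_spec
second = owner.observation_spec  # served from the cache, no rebuild
```

Under this pattern, an error in a spec definition would surface when the environment is constructed rather than on first use.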

action_spec: ActionSpec abstractmethod cached property

Returns the action spec.

Returns:

action_spec (ActionSpec): a potentially nested Spec structure representing the action.

discount_spec: specs.BoundedArray cached property

Returns the discount spec. By default, this is assumed to be a float between 0 and 1.

Returns:

discount_spec (BoundedArray): a specs.BoundedArray spec.

observation_spec: specs.Spec[Observation] abstractmethod cached property

Returns the observation spec.

Returns:

observation_spec (Spec[Observation]): a potentially nested Spec structure representing the observation.

reward_spec: specs.Array cached property

Returns the reward spec. By default, this is assumed to be a single float.

Returns:

reward_spec (Array): a specs.Array spec.

__exit__(*args)

Calls close().

Source code in jumanji/env.py
def __exit__(self, *args: Any) -> None:
    """Calls :meth:`close()`."""
    self.close()

close()

Perform any necessary cleanup.

Source code in jumanji/env.py
def close(self) -> None:
    """Perform any necessary cleanup."""
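
Because `__exit__` delegates to `close()`, an environment can be used in a `with` block so that cleanup runs automatically. The sketch below assumes a matching `__enter__` that returns the environment itself, which is not shown in this section; `ToyEnv` and its `events` list are hypothetical and only record what happened.

```python
from typing import Any, List


class ToyEnv:
    """Hypothetical environment that records cleanup instead of releasing real resources."""

    def __init__(self) -> None:
        self.events: List[str] = []

    def __enter__(self) -> "ToyEnv":
        # Assumed counterpart to __exit__; not shown in this section.
        return self

    def __exit__(self, *args: Any) -> None:
        # Mirrors the base class: leaving the block triggers close().
        self.close()

    def close(self) -> None:
        self.events.append("closed")


with ToyEnv() as env:
    env.events.append("used")

events = env.events
```

A subclass that opens a viewer or a file would override `close()` to release it; the default implementation does nothing.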

render(state)

Render frames of the environment for a given state.

Parameters:

state (State, required): State object containing the current dynamics of the environment.

Source code in jumanji/env.py
def render(self, state: State) -> Any:
    """Render frames of the environment for a given state.

    Args:
        state: State object containing the current dynamics of the environment.
    """
    raise NotImplementedError("Render method not implemented for this environment.")
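
Since the default `render` raises `NotImplementedError`, callers that work with arbitrary environments may want to handle that case, and concrete environments are expected to override the method. The following is a minimal sketch under those assumptions; `BaseEnv`, `CounterEnv`, `CounterState`, and `try_render` are hypothetical names that only mimic the behaviour shown above.

```python
from typing import Any, NamedTuple


class BaseEnv:
    """Hypothetical base mirroring the default render behaviour."""

    def render(self, state: Any) -> Any:
        raise NotImplementedError("Render method not implemented for this environment.")


class CounterState(NamedTuple):
    count: int


class CounterEnv(BaseEnv):
    """Hypothetical environment that renders its state as text."""

    def render(self, state: CounterState) -> str:
        return f"count={state.count}"


def try_render(env: BaseEnv, state: Any) -> Any:
    """Return the rendered frame, or None when the environment has no renderer."""
    try:
        return env.render(state)
    except NotImplementedError:
        return None
```

Note that `render` takes the state as an argument rather than reading it from `self`, in line with the stateless design of the class.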

reset(key) abstractmethod

Resets the environment to an initial state.

Parameters:

key (PRNGKey, required): random key used to reset the environment.

Returns:

state (State): State object corresponding to the new state of the environment.

timestep (TimeStep[Observation]): TimeStep object corresponding to the first timestep returned by the environment.

Source code in jumanji/env.py
@abc.abstractmethod
def reset(self, key: chex.PRNGKey) -> Tuple[State, TimeStep[Observation]]:
    """Resets the environment to an initial state.

    Args:
        key: random key used to reset the environment.

    Returns:
        state: State object corresponding to the new state of the environment,
        timestep: TimeStep object corresponding to the first timestep returned by the environment,
    """

step(state, action) abstractmethod

Run one timestep of the environment's dynamics.

Parameters:

state (State, required): State object containing the dynamics of the environment.

action (Array, required): Array containing the action to take.

Returns:

state (State): State object corresponding to the next state of the environment.

timestep (TimeStep[Observation]): TimeStep object corresponding to the timestep returned by the environment.

Source code in jumanji/env.py
@abc.abstractmethod
def step(self, state: State, action: chex.Array) -> Tuple[State, TimeStep[Observation]]:
    """Run one timestep of the environment's dynamics.

    Args:
        state: State object containing the dynamics of the environment.
        action: Array containing the action to take.

    Returns:
        state: State object corresponding to the next state of the environment,
        timestep: TimeStep object corresponding to the timestep returned by the environment,
    """
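
To illustrate why `reset` and `step` take and return the state explicitly, here is a minimal sketch of a toy one-dimensional walk written as pure functions and compiled with `jax.jit`. It does not subclass the class documented here: the `State` and `TimeStep` named tuples, the five-step horizon, and the reward are all hypothetical simplifications (the library's own `TimeStep` is assumed to carry more fields).

```python
from typing import NamedTuple, Tuple

import jax
import jax.numpy as jnp


class State(NamedTuple):
    """Hypothetical state: everything needed to step lives here, not on self."""
    position: jax.Array
    step_count: jax.Array
    key: jax.Array


class TimeStep(NamedTuple):
    """Hypothetical, simplified timestep."""
    observation: jax.Array
    reward: jax.Array
    discount: jax.Array


def reset(key: jax.Array) -> Tuple[State, TimeStep]:
    # Split the key so the state keeps fresh randomness for later use.
    key, subkey = jax.random.split(key)
    position = jax.random.randint(subkey, (), -3, 4)
    state = State(position=position, step_count=jnp.int32(0), key=key)
    timestep = TimeStep(
        observation=position, reward=jnp.float32(0.0), discount=jnp.float32(1.0)
    )
    return state, timestep


def step(state: State, action: jax.Array) -> Tuple[State, TimeStep]:
    # Action 1 moves right; any other action moves left.
    position = state.position + jnp.where(action == 1, 1, -1)
    step_count = state.step_count + 1
    done = step_count >= 5  # assumed fixed horizon of five steps
    reward = -jnp.abs(position).astype(jnp.float32)
    discount = jnp.where(done, 0.0, 1.0).astype(jnp.float32)
    next_state = State(position=position, step_count=step_count, key=state.key)
    return next_state, TimeStep(observation=position, reward=reward, discount=discount)


# Both functions are pure, so they can be compiled directly.
jit_reset = jax.jit(reset)
jit_step = jax.jit(step)

state, timestep = jit_reset(jax.random.PRNGKey(0))
start = int(state.position)
for _ in range(5):
    state, timestep = jit_step(state, jnp.int32(1))
```

Because nothing is mutated in place, the same functions could also be passed to `jax.vmap` to run many environments in parallel, or to `jax.lax.scan` to roll out an episode inside a single compiled call.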