Knapsack
Bases: Environment[State, DiscreteArray, Observation]
Knapsack environment as described in [1].
-
observation: Observation
- weights: jax array (float) of shape (num_items,) the weights of the items.
- values: jax array (float) of shape (num_items,) the values of the items.
- packed_items: jax array (bool) of shape (num_items,) binary mask denoting which items are already packed into the knapsack.
- action_mask: jax array (bool) of shape (num_items,) binary mask denoting which items can be packed into the knapsack.
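The relationship between `packed_items` and `action_mask` can be illustrated with a small plain-Python sketch. It is not the library's implementation: the helper name `compute_action_mask` is hypothetical, lists stand in for JAX arrays, and "bag capacity" is assumed to mean the budget that currently remains.

```python
from typing import List


def compute_action_mask(
    weights: List[float], packed_items: List[bool], remaining_budget: float
) -> List[bool]:
    """Hypothetical helper: an item can be packed if it is not already
    packed and its weight does not exceed the remaining budget."""
    return [
        (not packed) and (weight <= remaining_budget)
        for weight, packed in zip(weights, packed_items)
    ]


weights = [0.5, 0.25, 0.75]
packed_items = [True, False, False]
mask = compute_action_mask(weights, packed_items, remaining_budget=0.5)
print(mask)  # [False, True, False]
```

Item 0 is masked out because it is already packed, and item 2 because it no longer fits.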
-
action: jax array (int32) of shape () [0, ..., num_items - 1] -> item to pack.
-
reward: jax array (float) of shape (), could be either:
- dense: the value of the item to pack at the current timestep.
- sparse: the sum of the values of the items packed in the bag at the end of the episode.
In both cases, the reward is 0 if the action is invalid, i.e. the chosen item is already packed or has a weight larger than the bag capacity.
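The two reward variants can be sketched in plain Python as follows. The function names are hypothetical, and the sketch assumes that the sparse reward is 0 on every step before the episode ends, which the description implies but does not state.

```python
from typing import List


def dense_reward(values: List[float], action: int, is_valid: bool) -> float:
    """Value of the item packed at this timestep, or 0 for an invalid action."""
    return values[action] if is_valid else 0.0


def sparse_reward(
    values: List[float], packed_items: List[bool], is_valid: bool, is_done: bool
) -> float:
    """Total value of the packed items, given only at the end of the episode.

    Assumption: intermediate steps and invalid actions yield 0.
    """
    if not is_valid or not is_done:
        return 0.0
    return sum(v for v, packed in zip(values, packed_items) if packed)


values = [2.0, 3.0, 5.0]
print(dense_reward(values, action=1, is_valid=True))   # 3.0
print(dense_reward(values, action=1, is_valid=False))  # 0.0
print(sparse_reward(values, [True, False, True], is_valid=True, is_done=True))  # 7.0
```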
-
episode termination:
- if no action can be performed, i.e. all items are packed or each remaining item's weight is larger than the bag capacity.
- if an invalid action is taken, i.e. the chosen item is already packed or has a weight larger than the bag capacity.
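The two termination conditions above can be sketched as a single predicate. This is a plain-Python illustration with a hypothetical function name, again assuming that "bag capacity" refers to the remaining budget.

```python
from typing import List


def is_terminal(
    weights: List[float],
    packed_items: List[bool],
    remaining_budget: float,
    action_was_valid: bool,
) -> bool:
    """Hypothetical helper combining both termination rules."""
    if not action_was_valid:
        # Second rule: an invalid action ends the episode.
        return True
    # First rule: the episode ends when no unpacked item still fits.
    any_feasible = any(
        (not packed) and (weight <= remaining_budget)
        for weight, packed in zip(weights, packed_items)
    )
    return not any_feasible


print(is_terminal([0.5, 0.25], [True, False], 0.5, True))   # False: item 1 still fits
print(is_terminal([0.5, 0.75], [True, False], 0.5, True))   # True: item 1 is too heavy
print(is_terminal([0.5, 0.25], [True, True], 0.5, True))    # True: everything is packed
print(is_terminal([0.5, 0.25], [True, False], 0.5, False))  # True: invalid action
```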
-
state: State
- weights: jax array (float) of shape (num_items,) the weights of the items.
- values: jax array (float) of shape (num_items,) the values of the items.
- packed_items: jax array (bool) of shape (num_items,) binary mask denoting which items are already packed into the knapsack.
- remaining_budget: jax array (float) the budget currently remaining.
[1] https://arxiv.org/abs/2010.16011
Instantiates a Knapsack environment.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
generator | Optional[Generator] |  | None |
reward_fn | Optional[RewardFn] |  | None |
viewer | Optional[Viewer[State]] |  | None |
Source code in jumanji/environments/packing/knapsack/env.py
action_spec: specs.DiscreteArray (cached property)
Returns the action spec.
Returns:
Name | Type | Description |
---|---|---|
action_spec | DiscreteArray | a |
observation_spec: specs.Spec[Observation] (cached property)
Returns the observation spec.
Returns:
Type | Description |
---|---|
Spec[Observation] | Spec for each field in the Observation: weights, values, packed_items, and action_mask. |
animate(states, interval=200, save_path=None)
Creates an animated gif of the Knapsack environment based on the sequence of states.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
states | Sequence[State] | sequence of environment states corresponding to consecutive timesteps. | required |
interval | int | delay between frames in milliseconds, defaults to 200. | 200 |
save_path | Optional[str] | the path where the animation file should be saved. If it is None, the animation will not be saved. | None |
Returns:
Type | Description |
---|---|
FuncAnimation | animation.FuncAnimation: the animation object that was created. |
close()
Perform any necessary cleanup.
Environments will automatically close() themselves when garbage collected or when the program exits.
render(state)
Render the environment state, displaying which items have been picked so far, their value, and the remaining budget.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
state | State | the environment state to be rendered. | required |
reset(key)
Resets the environment.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
key | PRNGKey | used to randomly generate the weights and values of the items. | required |
Returns:
Name | Type | Description |
---|---|---|
state | State | the new state of the environment. |
timestep | TimeStep[Observation] | the first timestep returned by the environment. |
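To illustrate what resetting amounts to, here is a minimal plain-Python sketch. It is not the library's code: `sketch_reset` is a hypothetical name, an integer seed stands in for the JAX `PRNGKey`, and the uniform distribution on [0, 1) as well as the `num_items` and `total_budget` parameters are assumptions made for the example.

```python
import random
from typing import List, Tuple


def sketch_reset(
    seed: int, num_items: int, total_budget: float
) -> Tuple[List[float], List[float], List[bool], float]:
    """Hypothetical reset: sample weights and values, pack nothing yet."""
    rng = random.Random(seed)  # stand-in for the PRNGKey
    weights = [rng.random() for _ in range(num_items)]  # assumed uniform in [0, 1)
    values = [rng.random() for _ in range(num_items)]   # assumed uniform in [0, 1)
    packed_items = [False] * num_items                  # the knapsack starts empty
    remaining_budget = total_budget                     # the full budget is available
    return weights, values, packed_items, remaining_budget


first = sketch_reset(seed=0, num_items=5, total_budget=2.0)
second = sketch_reset(seed=0, num_items=5, total_budget=2.0)
print(first == second)  # True: the same seed reproduces the same instance
```

As with a `PRNGKey`, all randomness is derived from the argument, so resetting twice with the same seed yields the same instance.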
step(state, action)
Run one timestep of the environment's dynamics.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
state | State | State object containing the dynamics of the environment. | required |
action | Numeric | index of next item to take. | required |
Returns:
Name | Type | Description |
---|---|---|
state | State | next state of the environment. |
timestep | TimeStep[Observation] | the timestep to be observed. |
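The state transition performed by a step can be sketched in plain Python as below. `SketchState` and `sketch_step` are hypothetical names that mirror the fields listed under `State`; tuples stand in for JAX arrays, and leaving the state unchanged after an invalid action is an assumption rather than documented behaviour.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class SketchState:
    """Hypothetical stand-in for State, using tuples instead of JAX arrays."""
    weights: Tuple[float, ...]
    values: Tuple[float, ...]
    packed_items: Tuple[bool, ...]
    remaining_budget: float


def sketch_step(state: SketchState, action: int) -> Tuple[SketchState, bool]:
    """Pack item `action` if that is valid; return the next state and validity."""
    is_valid = (not state.packed_items[action]) and (
        state.weights[action] <= state.remaining_budget
    )
    if not is_valid:
        # Assumption: an invalid action leaves the state untouched.
        return state, False
    packed = tuple(p or i == action for i, p in enumerate(state.packed_items))
    budget = state.remaining_budget - state.weights[action]
    return replace(state, packed_items=packed, remaining_budget=budget), True


state = SketchState(
    weights=(0.5, 0.25),
    values=(2.0, 3.0),
    packed_items=(False, False),
    remaining_budget=1.0,
)
state, valid = sketch_step(state, action=0)
print(state.packed_items, state.remaining_budget, valid)  # (True, False) 0.5 True
state, valid = sketch_step(state, action=0)  # item 0 is already packed
print(state.packed_items, state.remaining_budget, valid)  # (True, False) 0.5 False
```

Returning a new immutable state instead of mutating the old one follows the functional style of the `step(state, action)` signature, where the state is passed in and returned explicitly.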