GraphColoring
Bases: Environment[State, DiscreteArray, Observation]
Environment for the GraphColoring problem. The problem is a combinatorial optimization task where the goal is to assign a color to each vertex of a graph in such a way that no two adjacent vertices share the same color. The problem is usually formulated as minimizing the number of colors used.
-
observation:
Observation
- adj_matrix: jax array (bool) of shape (num_nodes, num_nodes), representing the adjacency matrix of the graph.
- colors: jax array (int32) of shape (num_nodes,), representing the current color assignments for the vertices.
- action_mask: jax array (bool) of shape (num_colors,), indicating which actions are valid in the current state of the environment.
- current_node_index: integer representing the current node being colored.
-
action: int, the color to be assigned to the current node (0 to num_nodes - 1)
-
reward: float, a sparse reward is provided at the end of the episode. Equals the negative of the number of unique colors used to color all vertices in the graph. If an invalid action is taken, the reward is the negative of the total number of colors.
-
episode termination:
- if all nodes have been assigned a color or if an invalid action is taken.
-
state:
State
- adj_matrix: jax array (bool) of shape (num_nodes, num_nodes), representing the adjacency matrix of the graph.
- colors: jax array (int32) of shape (num_nodes,), color assigned to each node, -1 if not assigned.
- current_node_index: jax array (int) with shape (), index of the current node.
- action_mask: jax array (bool) of shape (num_colors,), indicating which actions are valid in the current state of the environment.
- key: jax array (uint32) of shape (2,), random key used to generate random numbers at each step and for auto-reset.
1 2 3 4 5 6 7 8 |
|
Instantiate a GraphColoring
environment.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
generator
|
Optional[Generator]
|
callable to instantiate environment instances.
Defaults to |
None
|
viewer
|
Optional[Viewer[State]]
|
environment viewer for rendering.
Defaults to |
None
|
Source code in jumanji/environments/logic/graph_coloring/env.py
86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 |
|
action_spec: specs.DiscreteArray
cached
property
#
Specification of the action for the GraphColoring
environment.
Returns:
Name | Type | Description |
---|---|---|
action_spec |
DiscreteArray
|
specs.DiscreteArray object |
observation_spec: specs.Spec[Observation]
cached
property
#
Returns the observation spec.
Returns:
Type | Description |
---|---|
Spec[Observation]
|
Spec for the |
Spec[Observation]
|
|
Spec[Observation]
|
|
Spec[Observation]
|
|
Spec[Observation]
|
|
animate(states, interval=200, save_path=None)
#
Creates an animated gif of the GraphColoring
environment based on a sequence of states.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
states
|
Sequence[State]
|
is a list of |
required |
interval
|
int
|
the delay between frames in milliseconds, default to 200. |
200
|
save_path
|
Optional[str]
|
the path where the animation file should be saved. If it is None, the plot will not be stored. |
None
|
Returns:
Type | Description |
---|---|
FuncAnimation
|
animation.FuncAnimation: the animation object that was created. |
Source code in jumanji/environments/logic/graph_coloring/env.py
285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 |
|
close()
#
Perform any necessary cleanup.
Environments will automatically :meth:close()
themselves when
garbage collected or when the program exits.
Source code in jumanji/environments/logic/graph_coloring/env.py
304 305 306 307 308 309 310 |
|
render(state)
#
Renders the current state of the GraphColoring
environment.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
state
|
State
|
is the current game state to be rendered. |
required |
Source code in jumanji/environments/logic/graph_coloring/env.py
277 278 279 280 281 282 283 |
|
reset(key)
#
Resets the environment to an initial state.
Returns:
Type | Description |
---|---|
Tuple[State, TimeStep[Observation]]
|
The initial state and timestep. |
Source code in jumanji/environments/logic/graph_coloring/env.py
110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 |
|
step(state, action)
#
Updates the environment state after the agent takes an action.
Specifically, this function allows the agent to choose a color for the current node (based on the action taken) in a graph coloring problem. It then updates the state of the environment based on the color chosen and calculates the reward based on the validity of the action and the completion of the coloring task.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
state
|
State
|
the current state of the environment. |
required |
action
|
Array
|
the action taken by the agent. |
required |
Returns:
Name | Type | Description |
---|---|---|
state |
State
|
the new state of the environment. |
timestep |
TimeStep[Observation]
|
the next timestep. |
Source code in jumanji/environments/logic/graph_coloring/env.py
139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 |
|