JobShop

JobShop (Environment) #

The Job Shop Scheduling Problem, as described in [1], is one of the best-known combinatorial optimization problems. We are given num_jobs jobs, each consisting of at most max_num_ops operations (ops), which need to be processed on num_machines machines. Each op must be processed on a specific machine and has a duration of at most max_op_duration. The goal is to minimise the total length of the schedule, also known as the makespan.

[1] https://developers.google.com/optimization/scheduling/job_shop.
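To make the objective concrete, here is a minimal sketch in plain Python (it does not use the JobShop environment). The two-job instance, the start times, and the helper names `makespan` and `is_feasible` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical toy instance: ops[j][k] = (machine_id, duration) of the k-th op of job j.
ops: List[List[Tuple[int, int]]] = [
    [(0, 3), (1, 2)],  # job 0: machine 0 for 3 steps, then machine 1 for 2 steps
    [(1, 2), (0, 1)],  # job 1: machine 1 for 2 steps, then machine 0 for 1 step
]
# Hypothetical schedule: starts[j][k] = time step at which that op begins.
starts: List[List[int]] = [[0, 3], [0, 3]]


def makespan(ops, starts) -> int:
    """Return the time at which the last op finishes."""
    return max(
        start + duration
        for job_ops, job_starts in zip(ops, starts)
        for (_, duration), start in zip(job_ops, job_starts)
    )


def is_feasible(ops, starts) -> bool:
    """Check op ordering within each job and no overlap on any machine."""
    busy: Dict[int, List[Tuple[int, int]]] = {}  # machine_id -> (start, end) intervals
    for job_ops, job_starts in zip(ops, starts):
        prev_end = 0
        for (machine_id, duration), start in zip(job_ops, job_starts):
            if start < prev_end:  # an op cannot start before the previous op of its job ends
                return False
            prev_end = start + duration
            busy.setdefault(machine_id, []).append((start, prev_end))
    for intervals in busy.values():
        intervals.sort()
        for (_, end_a), (start_b, _) in zip(intervals, intervals[1:]):
            if start_b < end_a:  # two ops overlap on the same machine
                return False
    return True


print(makespan(ops, starts), is_feasible(ops, starts))
```

For this toy schedule the last op to finish is the second op of job 0, which starts at step 3 and runs for 2 steps.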

  • observation: Observation

    • ops_machine_ids: jax array (int32) of shape (num_jobs, max_num_ops) id of the machine each operation must be processed on.
    • ops_durations: jax array (int32) of shape (num_jobs, max_num_ops) processing time of each operation.
    • ops_mask: jax array (bool) of shape (num_jobs, max_num_ops) indicating which operations have yet to be scheduled.
    • machines_job_ids: jax array (int32) of shape (num_machines,) id of the job (or no-op) that each machine is processing.
    • machines_remaining_times: jax array (int32) of shape (num_machines,) specifying, for each machine, the number of time steps until available.
    • action_mask: jax array (bool) of shape (num_machines, num_jobs + 1) indicates which job(s) (or no-op) can legally be scheduled on each machine.
  • action: jax array (int32) of shape (num_machines,).

  • reward: jax array (float) of shape (). A reward of -1 is given each time step. If all machines are simultaneously idle or the agent selects an invalid action, the agent is given a large penalty of -num_jobs * max_num_ops * max_op_duration, whose magnitude is an upper bound on the makespan.

  • episode termination:

    • Finished schedule: all operations (and thus all jobs) have been processed.
    • Illegal action: the agent ignores the action mask and takes an illegal action.
    • Simultaneously idle: all machines are inactive at the same time.
  • state: State

    • ops_machine_ids: same as observation.
    • ops_durations: same as observation.
    • ops_mask: same as observation.
    • machines_job_ids: same as observation.
    • machines_remaining_times: same as observation.
    • action_mask: same as observation.
    • step_count: jax array (int32) of shape (), the number of time steps in the episode so far.
    • scheduled_times: jax array (int32) of shape (num_jobs, max_num_ops), specifying the timestep at which every op (scheduled so far) was scheduled.
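The reward and termination rules described above can be sketched in plain Python as follows. This mirrors the description only and is not the library's implementation; the function names `step_reward` and `is_terminal` are hypothetical, and the example values 20, 8 and 6 are the default sizes listed for the constructor.

```python
def step_reward(
    invalid_action: bool,
    all_machines_idle: bool,
    num_jobs: int,
    max_num_ops: int,
    max_op_duration: int,
) -> float:
    """Return -1 per time step, or the large penalty for the two failure cases."""
    # The penalty's magnitude bounds the makespan: at most num_jobs * max_num_ops ops,
    # each lasting at most max_op_duration steps, processed one after another.
    penalty = -float(num_jobs * max_num_ops * max_op_duration)
    return penalty if (invalid_action or all_machines_idle) else -1.0


def is_terminal(
    schedule_finished: bool, invalid_action: bool, all_machines_idle: bool
) -> bool:
    """An episode ends on any of the three conditions listed above."""
    return schedule_finished or invalid_action or all_machines_idle


# With the default sizes (20 jobs, up to 8 ops per job, max op duration 6):
print(step_reward(False, False, 20, 8, 6))  # ordinary step
print(step_reward(True, False, 20, 8, 6))   # illegal action
```

With those sizes the penalty is -(20 * 8 * 6) = -960, far below the return of any legal finished schedule.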
import jax
from jumanji.environments import JobShop

env = JobShop()
key = jax.random.PRNGKey(0)
state, timestep = jax.jit(env.reset)(key)
env.render(state)
# Placeholder action generated from the spec; it is not guaranteed to respect the action mask.
action = env.action_spec.generate_value()
state, timestep = jax.jit(env.step)(state, action)
env.render(state)

observation_spec: jumanji.specs.Spec[jumanji.environments.packing.job_shop.types.Observation] cached property writable #

Specifications of the observation of the JobShop environment.

Returns:

Type Description
Spec containing the specifications for all the `Observation` fields
  • ops_machine_ids: BoundedArray (int32) of shape (num_jobs, max_num_ops).
  • ops_durations: BoundedArray (int32) of shape (num_jobs, max_num_ops).
  • ops_mask: BoundedArray (bool) of shape (num_jobs, max_num_ops).
  • machines_job_ids: BoundedArray (int32) of shape (num_machines,).
  • machines_remaining_times: BoundedArray (int32) of shape (num_machines,).
  • action_mask: BoundedArray (bool) of shape (num_machines, num_jobs + 1).

action_spec: MultiDiscreteArray cached property writable #

Specifications of the action in the JobShop environment. The action gives each machine a job id in 0, 1, ..., num_jobs, where the last value (num_jobs) corresponds to a no-op.

Returns:

Type Description
action_spec

a specs.MultiDiscreteArray spec.
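As an illustration of this encoding, the sketch below builds an action from an action mask in plain Python. The helper `first_legal_action` is hypothetical and not part of the library. It assumes the mask has been converted to nested lists (for example with `timestep.observation.action_mask.tolist()`), and it does not check for conflicts between machines.

```python
from typing import List


def first_legal_action(action_mask: List[List[bool]]) -> List[int]:
    """For each machine, pick the lowest-indexed legal job, else the no-op."""
    num_jobs = len(action_mask[0]) - 1  # the last column is the no-op
    action = []
    for machine_row in action_mask:
        legal_jobs = [job for job in range(num_jobs) if machine_row[job]]
        # Fall back to the no-op id (num_jobs) when no job is legal for this machine.
        action.append(legal_jobs[0] if legal_jobs else num_jobs)
    return action


# Hypothetical mask for 2 machines and 2 jobs (shape (num_machines, num_jobs + 1)).
mask = [
    [False, True, True],   # machine 0: job 1 or the no-op
    [False, False, True],  # machine 1: only the no-op
]
print(first_legal_action(mask))
```

Preferring a job over the no-op whenever one is legal is only one possible policy; a learned agent would typically sample from the masked action distribution instead.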

__init__(self, generator: Optional[jumanji.environments.packing.job_shop.generator.Generator] = None, viewer: Optional[jumanji.viewer.Viewer[jumanji.environments.packing.job_shop.types.State]] = None) special #

Instantiate a JobShop environment.

Parameters:

Name Type Description Default
generator Optional[jumanji.environments.packing.job_shop.generator.Generator]

Generator whose __call__ instantiates an environment instance. Implemented options are ['ToyGenerator', 'RandomGenerator']. Defaults to RandomGenerator with 20 jobs, 10 machines, up to 8 ops for any given job, and a max operation duration of 6.

None
viewer Optional[jumanji.viewer.Viewer[jumanji.environments.packing.job_shop.types.State]]

Viewer used for rendering. Defaults to JobShopViewer.

None

reset(self, key: PRNGKeyArray) -> Tuple[jumanji.environments.packing.job_shop.types.State, jumanji.types.TimeStep[jumanji.environments.packing.job_shop.types.Observation]] #

Resets the environment by creating a new problem instance and initialising the state and timestep.

Parameters:

Name Type Description Default
key PRNGKeyArray

random key used to reset the environment.

required

Returns:

Type Description
state

the environment state after the reset.

timestep

the first timestep returned by the environment after the reset.

step(self, state: State, action: Union[jax.Array, numpy.ndarray, numpy.bool_, numpy.number]) -> Tuple[jumanji.environments.packing.job_shop.types.State, jumanji.types.TimeStep[jumanji.environments.packing.job_shop.types.Observation]] #

Updates the status of all machines, the status of the operations, and increments the time step. It updates the environment state and the timestep (which contains the new observation). It calculates the reward based on the three terminal conditions:

  • The action provided by the agent is invalid.
  • The schedule has finished.
  • All machines do a no-op that leads to all machines being simultaneously idle.

Parameters:

Name Type Description Default
state State

the environment state.

required
action Union[jax.Array, numpy.ndarray, numpy.bool_, numpy.number]

the action to take.

required

Returns:

Type Description
state

the updated environment state.

timestep

the updated timestep.


Last update: 2024-11-01