Recently I’ve been tasked with working on an infrastructure
system which comprises several computationally intensive tasks that can be
scheduled on a series of host machines. Some tasks have dependencies on other tasks,
and cannot be scheduled before its dependencies have finished. In a simplified
setup we assume that any task can run on any machine, provided that machine is
not busy with running another task.
The following picture illustrates the system
Host A runs task 1 and 3, while host B runs task 2 and 4.
Task 4 has a dependency on task 1 and task 3 has one on task 2. When all tasks
are completed the job is considered completed. It is evident from the picture
that a task (task 3) may be delayed in its scheduling even though its
dependencies are satisfied due to resource constraints.