DQ Software
The ability of Déjà vu to transparently checkpoint,
recover, and migrate applications on the fly enables preemptive
scheduling for applications that are traditionally batch queued. Current
cluster batch queuing systems from several vendors provide cooperative
scheduling algorithms. However, cooperative scheduling algorithms
have significant drawbacks that impact scheduling efficiency and
limit their capabilities for true priority queuing and weighted
fair queuing. To see why, assume the following scenario. A technical
computing cluster has 200 processors, with all processors used
up by currently executing jobs. If new jobs are submitted, they
are inserted in an execution queue managed by the queuing system.
Assume that the first job in the queue requests 64 processors.
If a currently executing 32 processor job terminates, the first
job in the queue cannot be started, since it requests more computational
resources than are available. In a cooperative scheduling system,
this situation presents a trade-off between scheduling efficiency
and process starvation concerns. If the scheduling system leaves
the 32 processors idle until 64 are free, efficiency suffers. If it
instead schedules jobs from lower down in the queue to use up the
32 available processors, it risks process starvation: the top entry
in the queue may never get a chance to run. A technique called "backfill"
is commonly used to achieve a better trade-off. However, it requires
users to provide accurate estimates of the computational time
their jobs need, and such estimates are not usually available.
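
The trade-off in this scenario can be illustrated with a minimal
Python sketch. It is not how any particular vendor's queuing system
works; the job names and the two 16-processor jobs are hypothetical,
chosen only to match the 32 free processors and the 64-processor
request described above.

```python
from collections import deque

def strict_fifo(free, queue):
    """Start jobs in queue order; stop at the first job that does not fit."""
    started = []
    pending = deque(queue)
    while pending and pending[0][1] <= free:
        name, procs = pending.popleft()
        free -= procs
        started.append(name)
    return started, free

def greedy_fill(free, queue):
    """Start any queued job that fits, skipping jobs that are too large."""
    started = []
    for name, procs in queue:
        if procs <= free:
            free -= procs
            started.append(name)
    return started, free

# 32 processors have just been freed on a fully loaded cluster.
# Queue entries are (name, processors requested); all values are illustrative.
queue = [("big", 64), ("small-1", 16), ("small-2", 16)]

fifo_started, fifo_idle = strict_fifo(32, queue)
fill_started, fill_idle = greedy_fill(32, queue)

print(fifo_started, fifo_idle)  # [] 32
print(fill_started, fill_idle)  # ['small-1', 'small-2'] 0
```

Strict ordering leaves all 32 processors idle, while greedy filling
keeps them busy but never starts the 64-processor job as long as
smaller jobs keep arriving.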
Operating systems on shared memory multiprocessors
use time-shared preemptive multitasking to achieve high scheduling
efficiency in the above scenario. Clusters, however, have had no
equivalent, because they lacked true preemptive multitasking.
Here, the ability of Déjà vu to recover from multiple
failures presents an interesting capability. Because Déjà
vu can recover from "all failures", where all nodes
involved in a coupled computation fail, failure recovery can be
used for preemption. Intuitively, the "all failures"
case is identical to suspending a process and transparently restarting
another in its place, which is preemptive time-shared multitasking.
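
The idea of preemption through checkpoint and restart can be sketched
as follows. This is an illustrative model only, not the Déjà vu or DQ
interface: the `Job` class, the `preempt_for` function, the priority
convention, and the job sizes are all assumptions, and setting a job's
state to "suspended" stands in for checkpointing every node of that job.

```python
from dataclasses import dataclass

@dataclass
class Job:
    name: str
    procs: int
    priority: int          # larger value = more important (assumed convention)
    state: str = "queued"  # queued | running | suspended

def preempt_for(new_job, running, free):
    """Suspend lower-priority running jobs until new_job fits.

    Returns (suspended_jobs, free_processors_after_start), or (None, free)
    if new_job cannot be placed even after suspending every eligible job.
    """
    victims = sorted((j for j in running if j.priority < new_job.priority),
                     key=lambda j: j.priority)
    chosen, reclaimed = [], 0
    for job in victims:
        if free + reclaimed >= new_job.procs:
            break
        chosen.append(job)
        reclaimed += job.procs
    if free + reclaimed < new_job.procs:
        return None, free
    for job in chosen:
        job.state = "suspended"  # stand-in for checkpointing all of its nodes
        running.remove(job)
    new_job.state = "running"
    running.append(new_job)
    return chosen, free + reclaimed - new_job.procs

# 200-processor cluster with 32 processors free (hypothetical job mix).
running = [Job("low", 40, 1, "running"), Job("mid", 128, 2, "running")]
urgent = Job("urgent", 64, 3)

suspended, free_after = preempt_for(urgent, running, free=32)
print([j.name for j in suspended], free_after)  # ['low'] 8
```

A suspended job would later be restarted from its checkpoint when
processors become available, so no completed work is lost.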
DQ is the first preemptive scheduler for distributed
memory clusters and comes integrated with systems management software.
DQ enables:
- True priority queuing: Jobs may have multiple
priority levels including priority levels that preempt currently
running jobs.
- Weighted Fair Queuing: Multiple "groups" that share
a technical computing resource can each be allocated
a guaranteed, proportional "share" of the system.
- On-Demand Computing: Priority queuing and
preemptive multitasking can be used to achieve guaranteed queuing
delays. This enables computational data centers to create priority
based performance guarantees and appropriate pricing strategies.
- Enhanced Systems Administration Control:
Administrators gain flexible control over system resources, including
the ability to preemptively control node availability.
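
As a rough illustration of weighted fair queuing, the sketch below
splits a cluster among groups in proportion to their shares and picks
the group that is currently furthest below its share. The source does
not describe DQ's actual algorithm; the function names, group names,
and share values here are hypothetical.

```python
def guaranteed_processors(shares, total):
    """Split `total` processors in proportion to each group's share."""
    weight_sum = sum(shares.values())
    return {group: total * weight // weight_sum
            for group, weight in shares.items()}

def most_underserved(shares, in_use):
    """Pick the group whose current usage is smallest relative to its share."""
    return min(shares, key=lambda group: in_use.get(group, 0) / shares[group])

# Hypothetical groups and share weights on a 200-processor cluster.
shares = {"physics": 2, "chemistry": 1, "biology": 1}
allocation = guaranteed_processors(shares, 200)
next_group = most_underserved(
    shares, {"physics": 120, "chemistry": 50, "biology": 30})

print(allocation)  # {'physics': 100, 'chemistry': 50, 'biology': 50}
print(next_group)  # biology
```

With preemption available, a scheduler could reclaim processors from a
group running above its share whenever an underserved group submits work.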
These capabilities enable the use of DQ in domains
ranging from enterprise technical computing resources to upcoming
"on-demand" cycle access data centers.