[farmshare-discuss] Barley sharing algorithm?
chekh at stanford.edu
Mon Mar 18 14:02:14 PDT 2013
On 03/17/2013 02:40 PM, Victor Liu wrote:
> every cluster I've used, users have to specify an allotted runtime in
> the queue submission script, and the nodes are completely dedicated to
> the job for the duration of that runtime, or until the job finishes. Is
> there such a policy for Barley?
On this cluster, we have 20 nodes, each with 24 CPUs.
Then we create "slots" in the resource manager, and those "slots" map
roughly to CPUs.
In practice we oversubscribe a bit; I think we allow 28 slots per machine.
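To make the numbers above concrete, here is a small sketch of the slot arithmetic. The 28-slots-per-machine figure is the approximate value quoted above, not a confirmed setting, and the function name is made up for illustration.

```python
# Figures taken from the message above; SLOTS_PER_NODE is the
# "I think" value, so treat it as approximate.
NODES = 20
CPUS_PER_NODE = 24
SLOTS_PER_NODE = 28


def cluster_totals(nodes, cpus_per_node, slots_per_node):
    """Return (total CPUs, total slots, slots per CPU) for the cluster."""
    total_cpus = nodes * cpus_per_node
    total_slots = nodes * slots_per_node
    ratio = slots_per_node / cpus_per_node
    return total_cpus, total_slots, ratio


total_cpus, total_slots, ratio = cluster_totals(
    NODES, CPUS_PER_NODE, SLOTS_PER_NODE
)
print(f"{total_cpus} CPUs, {total_slots} slots, {ratio:.2f} slots per CPU")
```

So even if every job used exactly one core per slot, a full machine would already be running about 17% more work than it has CPUs.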
Furthermore, we don't have any technical means to prevent CPU
oversubscription. E.g. you can submit a job that asks for one slot and
the scheduler grants you one slot, but then your process is actually a
multi-threaded process that uses multiple cores. This is a common issue
for e.g. MATLAB. So you'll often see a host load much higher than
the number of CPUs, e.g. a load of 100 on a 24-CPU machine.
Oversubscribing the CPU does slow down all the currently running
processes, since they share the CPUs.
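As a rough illustration of that slowdown, the sketch below estimates what fraction of one CPU each runnable thread gets when the load exceeds the CPU count. It assumes all threads share the CPUs equally, which is a simplification; on Linux the load average also counts processes waiting on I/O, so the result is only an estimate. The function names are made up for this example.

```python
import os


def cpu_share_per_thread(load, cpus):
    """Rough fraction of one CPU per runnable thread, assuming equal sharing."""
    if load <= cpus:
        # No oversubscription: each thread can have a full CPU.
        return 1.0
    return cpus / load


def current_host_share():
    """Same estimate for the local host, from the 1-minute load average.

    os.getloadavg() is only available on Unix-like systems.
    """
    load1, _, _ = os.getloadavg()
    cpus = os.cpu_count() or 1
    return cpu_share_per_thread(load1, cpus)


# The load-of-100-on-24-CPUs case mentioned above.
print(cpu_share_per_thread(100, 24))
```

With a load of 100 on 24 CPUs, each thread gets roughly a quarter of a CPU, so a job could take around four times longer than it would on an idle node.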
Maybe you can request more time for your jobs in the future?
Also please take care to only use the number of processes/threads that
matches how many "job slots" you've asked for.
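One way to do that from inside a Python job is sketched below. It assumes a Grid Engine style scheduler that exports the granted slot count in an NSLOTS environment variable; this message doesn't name the scheduler, so treat that variable name as an assumption and check what your job environment actually provides. The helper names are made up for this example.

```python
import os


def granted_slots(env=None, default=1):
    """Read the granted slot count from the environment.

    Assumes the scheduler exports it as NSLOTS (an assumption, see above);
    falls back to `default` if the variable is missing or not a number.
    """
    env = os.environ if env is None else env
    try:
        slots = int(env.get("NSLOTS", default))
    except ValueError:
        slots = default
    return max(1, slots)


def thread_limit_env(slots):
    """Environment settings that cap common threaded math libraries."""
    names = ("OMP_NUM_THREADS", "MKL_NUM_THREADS", "OPENBLAS_NUM_THREADS")
    return {name: str(slots) for name in names}


# Apply the limits before importing numerical libraries such as NumPy,
# since they typically read these variables when they are first loaded.
os.environ.update(thread_limit_env(granted_slots()))
```

For MATLAB specifically, starting it with the -singleCompThread option limits it to a single computational thread, which matches a one-slot request.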
We have a similar configuration and set of issues for memory use on this
cluster.
Alex Chekholko chekh at stanford.edu 347-401-4860