[farmshare-discuss] Barley sharing algorithm?

Alex Chekholko chekh at stanford.edu
Mon Mar 18 14:02:14 PDT 2013


On 03/17/2013 02:40 PM, Victor Liu wrote:
> In
> every cluster I've used, users have to specify an allotted runtime in
> the queue submission script, and the nodes are completely dedicated to
> the job for the duration of that runtime, or until the job finishes. Is
> there such a policy for Barley?

Hi Victor,

On this cluster, we have 20 nodes, each with 24 CPUs.

Then we create "slots" in the resource manager, and those "slots" map 
roughly to CPUs.

But we actually oversubscribe a bit; I think we allow 28 slots per machine.
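
As a rough sketch of the arithmetic, using only the figures mentioned above (the 28 slots per machine is approximate, and the function name is just illustrative):

```python
# Capacity arithmetic from the figures quoted in this message.
NODES = 20
CPUS_PER_NODE = 24
SLOTS_PER_NODE = 28  # approximate; the exact configured value may differ


def capacity(nodes, cpus_per_node, slots_per_node):
    """Return (total CPUs, total slots, slots per CPU) for the cluster."""
    total_cpus = nodes * cpus_per_node
    total_slots = nodes * slots_per_node
    return total_cpus, total_slots, slots_per_node / cpus_per_node


total_cpus, total_slots, ratio = capacity(NODES, CPUS_PER_NODE, SLOTS_PER_NODE)
print(f"{total_cpus} CPUs, {total_slots} slots, {ratio:.2f} slots per CPU")
```

So even if every job used exactly one core per slot, a full node would have slightly more runnable processes than CPUs.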

Furthermore, we don't have any technical means to prevent CPU 
oversubscription.  E.g. you can submit a job that asks for one slot and 
the scheduler grants you one slot, but your process is actually 
multi-threaded and uses multiple cores.  This is a common issue with 
e.g. MATLAB.  So you'll often see the host load be much higher than the 
number of CPUs, e.g. 100 vs. 24.
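
To check this on a node yourself, a minimal sketch along these lines compares the 1-minute load average to the CPU count. It assumes a Unix-like host (os.getloadavg is not available everywhere), and the function name is only illustrative:

```python
import os


def load_per_cpu(load, cpus):
    """Load average divided by CPU count.

    Values well above 1.0 suggest more runnable work than CPUs.
    """
    return load / cpus


if __name__ == "__main__":
    cpus = os.cpu_count() or 1
    load1, _, _ = os.getloadavg()  # 1-minute load average, Unix only
    print(f"load {load1:.1f} on {cpus} CPUs: {load_per_cpu(load1, cpus):.2f} per CPU")
```

With the numbers above, load_per_cpu(100, 24) is a bit over 4, i.e. roughly four runnable threads competing for each CPU.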

Oversubscribing the CPU does slow down all the currently running 
processes, since they share the CPUs.

Maybe you can request more time for your jobs in the future?

Also please take care to only use the number of processes/threads that 
matches how many "job slots" you've asked for.
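
One way to do that from inside a job is sketched below. It assumes the scheduler exports the granted slot count in an environment variable named NSLOTS (Grid Engine does this; other resource managers use different names), and that your code's threading is controlled by the usual OMP_NUM_THREADS-style variables. The helper names are hypothetical:

```python
import os


def slots_granted(env=os.environ, default=1):
    """Slots granted to this job, read from NSLOTS (an assumption about the scheduler)."""
    try:
        n = int(env.get("NSLOTS", default))
    except ValueError:
        n = default
    return max(1, n)


def limit_threads(env=os.environ):
    """Cap common threading libraries at the granted slot count."""
    n = slots_granted(env)
    # These variables are read by OpenMP, OpenBLAS and MKL at start-up,
    # so set them before importing or launching the numerical code.
    for var in ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS"):
        env[var] = str(n)
    return n


if __name__ == "__main__":
    print(f"limiting to {limit_threads()} thread(s)")
```

This only helps for programs that honor those variables; other software may need its own thread-count setting.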


We have a similar configuration and set of issues for memory use on this 
cluster.


Regards,
-- 
Alex Chekholko chekh at stanford.edu 347-401-4860


