[farmshare-discuss] Barley sharing algorithm?

Daniel Becker dub at stanford.edu
Fri Aug 10 11:11:26 PDT 2012


Hi all,

Implementing basic fairness takes literally a minute:

<http://www.gridengine.info/2006/01/17/easy-setup-of-equal-user-fairshare-policy/>
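
For reference, an equal-share functional policy in Grid Engine usually comes down to two small configuration edits. The lines below are only a sketch from memory, assuming the standard parameters `enforce_user`, `auto_user_fshare` and `weight_tickets_functional`. The values are illustrative and have not been checked against the FarmShare configuration.

```shell
# Sketch only: values are illustrative, not taken from the FarmShare setup.
#
# 1. In the global cluster configuration (opened with: qconf -mconf), set:
#      enforce_user        auto
#      auto_user_fshare    100
#
# 2. In the scheduler configuration (opened with: qconf -msconf), set:
#      weight_tickets_functional    10000
```

With settings along these lines, each user would be created automatically with the same number of functional shares, so pending jobs from lightly loaded users should rise in priority relative to those of a user who already holds many slots.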

Is there any reason this is not appropriate for farmshare?

Daniel


On Aug 10, 2012, at 10:31 AM, Michael Maxwell Murray <mmurray1 at stanford.edu> wrote:

> Hello Alex,
> 
> Another data point on why fairshare is needed. I know you have lots to 
> do, so I can accept that fairshare won't be implemented for a couple
> more weeks. However, I wanted to provide a succinct example of why
> fairshare is needed, so that the implementation is not delayed.
> After reading this, can you get some assurance from your management
> as to a firm date when fairshare can be implemented?
> 
> Right now, user rpinho has 348 jobs running.
> 
> corn05: qstat -u '*' | grep ' r ' | grep rpinho | wc
>    348    3132   38628
> 
> All other users combined are running 20 jobs:
> 
> corn05: qstat -u '*' | grep ' r ' | grep -v rpinho | wc
>     20     182    2212
> 
> I have jobs that I would like to run that have been queued for a day:
> From qstat:
> 
> 326846 0.75164 XX         mmurray1     qw    08/09/2012 10:20:14                                    1 1-196:1
> 
> Also, user tflanzer has 66 jobs that have been queued for 18 hours. Yet the most recent 
> jobs to start belong to rpinho:
> 
> 326816 0.75165 m3r05p2118 rpinho       r     08/10/2012 08:45:04 precise.q at barley17.stanford.ed     1        
> 326817 0.75165 m3r05p2119 rpinho       r     08/10/2012 09:21:34 precise.q at barley01.stanford.ed     1        
> 326818 0.75165 m3r05p2120 rpinho       r     08/10/2012 09:30:04 precise.q at barley06.stanford.ed     1        
> 326819 0.75165 m3r05p2121 rpinho       r     08/10/2012 09:49:34 precise.q at barley11.stanford.ed     1    
> 
> Note that only 5 jobs have started in the last hour and a half, i.e. the queue is clearing slowly.
> Furthermore, rpinho has 25 more jobs in the queue that look like they will start ahead of
> other users. (To be fair, at 4:00 p.m. this afternoon, rpinho's older jobs will start
> to hit the 2 day limit, and the queue will start to clear faster.) 
> 
> It seems to me that when a slot becomes available, SGE should be starting jobs belonging to
> other users besides rpinho. Also, I have no problem with rpinho wanting to run a lot of jobs.
> It's just that I would like to see the computing power allocated more evenly.
> 
> Thank you,
> Mike Murray
> Ph.D. Candidate
> Civil and Environmental Engineering
> 
> ----- Original Message -----
> From: "Alex Chekholko" <chekh at stanford.edu>
> To: farmshare-discuss at lists.stanford.edu
> Sent: Monday, August 6, 2012 10:38:08 AM
> Subject: Re: [farmshare-discuss] Barley sharing algorithm?
> 
> Hi all,
> 
> Thank you for your suggestions.  We will implement basic fairshare in a 
> couple of weeks and send an announcement.
> 
> Regards,
> Alex
> 
> On 8/3/12 9:36 AM, Tomas Babak wrote:
>> I agree, it would be great to implement fairshare that prioritizes job
>> starts by each user's CPU usage rather than just by submission time,
>> especially given the heavy usage of barley. This does not seem to be
>> happening yet, but I think it was planned?
>> 
>> Tomas
>> 
>> 
>> 
>> 
>> 
>> 
>> -----Original Message-----
>> From: farmshare-discuss-bounces at lists.stanford.edu
>> [mailto:farmshare-discuss-bounces at lists.stanford.edu] On Behalf Of Michael
>> Maxwell Murray
>> Sent: Friday, August 03, 2012 9:23 AM
>> To: Open discussion for users of FarmShare
>> Subject: [farmshare-discuss] Barley sharing algorithm?
>> 
>> Hello,
>> 
>> Can someone explain the algorithm that Barley uses to allocate
>> slots to users? Currently, there are 444 jobs running. 413
>> jobs belong to the user ocarja, including the most recently
>> launched jobs. Several other users have jobs that have been
>> queued for more than a day (e.g. blhuynh, rpinho, tflanzer).
>> 
>> Given that ocarja's jobs are consuming a large fraction of
>> the CPUs and that there are other users waiting for a significant
>> time, when a slot becomes available, why isn't a different user's
>> job started?
>> 
>> Thank you,
>> Mike Murray
>> Ph.D. Candidate
>> Civil and Environmental Engineering
>> _______________________________________________
>> farmshare-discuss mailing list
>> farmshare-discuss at lists.stanford.edu
>> https://mailman.stanford.edu/mailman/listinfo/farmshare-discuss
>> 
> 
> -- 
> Alex Chekholko chekh at stanford.edu 347-401-4860

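The per-user counts quoted above (`qstat -u '*' | grep ' r ' | grep rpinho | wc`) can be produced for every user in one pass. This is a minimal sketch, assuming the default qstat column layout seen in the quoted output (user in column 4, state in column 5). The function name `running_jobs_per_user` is made up for illustration.

```shell
# Tally running jobs per user from `qstat -u '*'` output read on stdin.
# Assumes the default column layout: job-ID, prior, name, user, state, ...
# Header and separator lines are skipped because their 5th field is not "r".
running_jobs_per_user() {
    awk '$5 == "r" { count[$4]++ }
         END { for (u in count) print count[u], u }' | sort -rn
}

# Usage on a host with Grid Engine installed:
#   qstat -u '*' | running_jobs_per_user
```

The output is one line per user, with the heaviest user first, which makes an imbalance like the one described in this thread visible at a glance.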