[farmshare-discuss] Barley sharing algorithm?

Matthew Kieber-Emmons mattke at stanford.edu
Fri Aug 10 11:14:37 PDT 2012


I second this email. For those not used to these things, here is basically how it works. Everyone is assigned a number of tickets. When your jobs run, your tickets are used up according to a formula, and they are slowly regenerated over the following week or two according to another formula. When a slot frees up, the grid engine goes to the wait list, filters for jobs that can run with the slots/consumables requested, sorts them by priority (which is calculated from the number of tickets each user currently has), and takes the top job. Don't worry about the exact nature of the formulas; just realize that they exist and work. This is a "solved problem" in general, and I am surprised it wasn't turned on when they opened up the system.
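
As a rough illustration of the ticket scheme described above, here is a minimal sketch in Python. It is not Grid Engine's actual formula: the names (Job, pick_next_job, charge, regenerate, simulate), the starting count of 100 tickets, the cost of 30 tickets per job start, and the regeneration rule are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Job:
    job_id: int      # lower id = submitted earlier
    user: str
    slots: int = 1   # slots the job requests


def pick_next_job(waiting, tickets, free_slots):
    """Pick the waiting job to start next, or None if nothing fits."""
    # Keep only jobs that fit in the slots that are currently free.
    runnable = [j for j in waiting if j.slots <= free_slots]
    if not runnable:
        return None
    # Most tickets first; ties go to the earlier submission.
    return min(runnable, key=lambda j: (-tickets[j.user], j.job_id))


def charge(tickets, user, cost):
    """Use up some of a user's tickets when one of their jobs starts."""
    tickets[user] = max(0.0, tickets[user] - cost)


def regenerate(tickets, full, rate):
    """Move every user's tickets a fraction `rate` of the way back to `full`."""
    for user in tickets:
        tickets[user] += rate * (full - tickets[user])


def simulate():
    """Start jobs one at a time through a single slot that keeps freeing up."""
    # Hypothetical workload: user "a" submits five jobs before "b" submits two.
    waiting = [Job(i, "a") for i in range(1, 6)] + [Job(6, "b"), Job(7, "b")]
    tickets = {"a": 100.0, "b": 100.0}
    order = []
    while waiting:
        job = pick_next_job(waiting, tickets, free_slots=1)
        waiting.remove(job)
        charge(tickets, job.user, cost=30.0)
        order.append((job.job_id, job.user))
    return order


if __name__ == "__main__":
    print(simulate())
```

With these made-up numbers, user b's two jobs start second and fourth instead of waiting behind all five of user a's jobs. A real scheduler would also call something like regenerate periodically, so that a heavy user's priority recovers over time.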

On Aug 10, 2012, at 11:11 AM, Daniel Becker <dub at stanford.edu> wrote:

> Hi all,
> 
> Implementing basic fairness takes literally a minute:
> 
> <http://www.gridengine.info/2006/01/17/easy-setup-of-equal-user-fairshare-policy/>
> 
> Is there any reason this is not appropriate for farmshare?
> 
> Daniel
> 
> 
> On Aug 10, 2012, at 10:31 AM, Michael Maxwell Murray <mmurray1 at stanford.edu> wrote:
> 
>> Hello Alex,
>> 
>> Another data point on why fairshare is needed. I know you have lots to 
>> do, so I can accept that fairshare won't be implemented for a couple
>> more weeks. However, I wanted to provide a succinct example of why
>> fairshare is needed, so that the implementation is not delayed.
>> After reading this, can you get some assurance from your management
>> as to a firm date when fairshare can be implemented?
>> 
>> Right now, user rpinho has 348 jobs running.
>> 
>> corn05: qstat -u '*' | grep ' r ' | grep rpinho | wc
>>   348    3132   38628
>> 
>> All other users combined are running 20 jobs:
>> 
>> corn05: qstat -u '*' | grep ' r ' | grep -v rpinho | wc
>>    20     182    2212
>> 
>> I have jobs that I would like to run that have been queued for a day:
>> From qstat:
>> 
>> 326846 0.75164 XX         mmurray1     qw    08/09/2012 10:20:14                                    1 1-196:1
>> 
>> Also, user tflanzer has 66 jobs that have been queued for 18 hours. Yet the most recent 
>> jobs to start belong to rpinho:
>> 
>> 326816 0.75165 m3r05p2118 rpinho       r     08/10/2012 08:45:04 precise.q at barley17.stanford.ed     1        
>> 326817 0.75165 m3r05p2119 rpinho       r     08/10/2012 09:21:34 precise.q at barley01.stanford.ed     1        
>> 326818 0.75165 m3r05p2120 rpinho       r     08/10/2012 09:30:04 precise.q at barley06.stanford.ed     1        
>> 326819 0.75165 m3r05p2121 rpinho       r     08/10/2012 09:49:34 precise.q at barley11.stanford.ed     1    
>> 
>> Note that only 5 jobs have started in the last hour and a half, i.e. the queue is clearing slowly.
>> Furthermore, rpinho has 25 more jobs in the queue that look like they will start ahead of
>> other users' jobs. (To be fair, at 4:00 p.m. today, rpinho's older jobs will start
>> to hit the 2 day limit, and the queue will start to clear faster.) 
>> 
>> It seems to me that when a slot becomes available, SGE should be starting jobs belonging to
>> other users besides rpinho. Also, I have no problem with rpinho wanting to run a lot of jobs.
>> It's just that I would like to see the computing power allocated more evenly.
>> 
>> Thank you,
>> Mike Murray
>> Ph.D. Candidate
>> Civil and Environmental Engineering
>> 
>> ----- Original Message -----
>> From: "Alex Chekholko" <chekh at stanford.edu>
>> To: farmshare-discuss at lists.stanford.edu
>> Sent: Monday, August 6, 2012 10:38:08 AM
>> Subject: Re: [farmshare-discuss] Barley sharing algorithm?
>> 
>> Hi all,
>> 
>> Thank you for your suggestions.  We will implement basic fairshare in a 
>> couple of weeks and send an announcement.
>> 
>> Regards,
>> Alex
>> 
>> On 8/3/12 9:36 AM, Tomas Babak wrote:
>>> I agree, it would be great to implement fairshare that prioritizes job
>>> starts based on each user's CPU usage rather than just on when the job was
>>> submitted, especially given the heavy usage of barley. This does not seem
>>> to be happening yet, but I think it was planned?
>>> 
>>> Tomas
>>> 
>>> -----Original Message-----
>>> From: farmshare-discuss-bounces at lists.stanford.edu
>>> [mailto:farmshare-discuss-bounces at lists.stanford.edu] On Behalf Of Michael
>>> Maxwell Murray
>>> Sent: Friday, August 03, 2012 9:23 AM
>>> To: Open discussion for users of FarmShare
>>> Subject: [farmshare-discuss] Barley sharing algorithm?
>>> 
>>> Hello,
>>> 
>>> Can someone explain the algorithm the Barleys use to allocate
>>> slots to users? Currently, there are 444 jobs running. 413
>>> jobs belong to the user ocarja, including the most recently
>>> launched jobs. Several other users have jobs that have been
>>> queued for more than a day (e.g. blhuynh, rpinho, tflanzer).
>>> 
>>> Given that ocarja's jobs are consuming a large fraction of
>>> the CPUs and that other users have been waiting for a significant
>>> time, why isn't a different user's job started when a slot
>>> becomes available?
>>> 
>>> Thank you,
>>> Mike Murray
>>> Ph.D. Candidate
>>> Civil and Environmental Engineering
>>> _______________________________________________
>>> farmshare-discuss mailing list
>>> farmshare-discuss at lists.stanford.edu
>>> https://mailman.stanford.edu/mailman/listinfo/farmshare-discuss
>>> 
>> 
>> -- 
>> Alex Chekholko chekh at stanford.edu 347-401-4860
> 

====================================
Matthew T Kieber-Emmons, PhD
Postdoc - Solomon Laboratory
Tel: (650) 723-9128 Fax: (650) 725-0259
Department of Chemistry, Stanford University
333 Campus Drive - Room 157 Mudd
Stanford, CA 94305-5080
====================================



