[farmshare-discuss] Fw: Jobs Aborting Early in Farmshare 2
William Patrick Marble
wpmarble at stanford.edu
Tue Jan 2 14:15:29 PST 2018
The default time limit for jobs on farmshare is 2 hours. You can request more time with the --time option on an #SBATCH line in your .sbatch file. Like this:
#time you think you need; default is 2 hours
#format could be dd-hh:mm:ss, hh:mm:ss, mm:ss, or mm
#request 8 hours
#SBATCH --time=8:00:00
(more at http://sherlock.stanford.edu/mediawiki/index.php/SLURMSubmit#Sample_Batch_Job)
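For context, here is a minimal sketch of a complete batch script with the time request in place. The job name, output file name, and the MATLAB script name are placeholders I made up for illustration, not anything specific to farmshare; only the --time line is what the advice above is about.

```shell
#!/bin/bash
# Hypothetical job name and output file; %j expands to the job ID.
#SBATCH --job-name=matlab_run
#SBATCH --output=matlab_run_%j.out
# Request 8 hours of wall time (hh:mm:ss) instead of the 2-hour default.
#SBATCH --time=08:00:00

# SLURM sets SLURM_JOB_ID inside a job; fall back to a label when run by hand.
job_label="${SLURM_JOB_ID:-not-under-slurm}"
echo "Job ${job_label} started"

# Replace with the real workload, e.g. (hypothetical script name):
# matlab -nodisplay -r "run('my_script.m'); exit"
```

You would submit this with sbatch, and once the job is queued, scontrol show job <jobid> should report the requested TimeLimit, which is a quick way to confirm the directive was picked up.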
Hope this helps.
Political Science Department
On Jan 2, 2018, at 5:05 PM, Rehman Ali <rali8 at stanford.edu> wrote:
From: Rehman Ali
Sent: Tuesday, January 2, 2018 2:04 PM
Subject: Jobs Aborting Early in Farmshare 2
On the old Farmshare (corn) I was able to perform parallel executions of some MATLAB code that takes roughly 8 hours per thread.
However, now in Farmshare 2, the new system keeps aborting the job roughly two hours into each thread. The message I get in my error files is this:
slurmstepd-wheat01: error: *** JOB 140407 ON wheat01 CANCELLED AT 2018-01-02T12:19:05 DUE TO TIME LIMIT ***
Does anyone know what could be causing this error? How short is the time limit?
Based on this (https://web.stanford.edu/group/farmshare/cgi-bin/wiki/index.php/User_Guide), the maximum runtime should be 2 days, so why do my jobs get canceled after only 2 hours?
National Defense Science and Engineering Graduate (NDSEG) Fellow
Electrical Engineering PhD Candidate | Stanford University
Computational and Mathematical Engineering M.S. Student | Stanford University
B.S. Biomedical Engineering, 2016 | Georgia Institute of Technology
Graduate Student Researcher in Jeremy Dahl Ultrasound Lab | Stanford University