[farmshare-discuss] Barley AFS dies in a job with large task array

Alex Chekholko chekh at stanford.edu
Tue Jan 10 10:03:00 PST 2012


Hi Long,

Yesterday, there was some AFS server congestion that affected barley18 
specifically, and caused AFS access to time out on that machine.

It may be easiest for you to just copy your data and software to 
/mnt/glusterfs so you don't have the dependency on AFS for those jobs 
(and explicitly specify your job input/output files).
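
A minimal sketch of that suggestion, in Python. It assumes a Grid Engine style `qsub` (this thread does not name the scheduler), and the paths and script name in the usage section are hypothetical placeholders, not real locations. It copies a directory onto the shared scratch filesystem and builds a submit command with explicit output and error locations, so the tasks never read from or write to AFS.

```python
import shutil
from pathlib import Path
from typing import List


def stage_to_scratch(src: Path, scratch_root: Path) -> Path:
    """Copy src (for example a directory under AFS) into scratch_root.

    Returns the path of the copy. Existing files are overwritten.
    """
    dest = scratch_root / src.name
    shutil.copytree(src, dest, dirs_exist_ok=True)  # requires Python 3.8+
    return dest


def build_qsub_command(script: Path, log_dir: Path, n_tasks: int) -> List[str]:
    """Build (but do not run) a submit command for a task array.

    Assumption: Grid Engine style flags, where -t sets the task range
    and -o / -e set the stdout / stderr locations.
    """
    return [
        "qsub",
        "-t", f"1-{n_tasks}",
        "-o", str(log_dir),
        "-e", str(log_dir),
        str(script),
    ]


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    scratch = Path("/mnt/glusterfs") / "your_username"
    project = stage_to_scratch(Path.home() / "myproject", scratch)
    logs = project / "logs"
    logs.mkdir(exist_ok=True)
    print(" ".join(build_qsub_command(project / "run_task.sh", logs, 50000)))
```

The command is only printed, not executed, so you can check the paths before submitting. The job script itself should also refer to its inputs by their /mnt/glusterfs paths rather than by paths under your home directory.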

I don't see any jobs under your name right now. Did your job error out?

Regards,
Alex

On 01/10/2012 09:48 AM, Long Ouyang wrote:
> Hi everyone,
>
> I'm submitting a job with ~50,000 tasks to Barley but 15,000 tasks
> through, it looks like AFS is breaking - even trying to cd to a folder
> in my home directory gives a "Connection timed out" error. Is there
> something I can do to fix this? Would just running aklog and kinit in my
> script fix things?
>
> -Long
>
>
> _______________________________________________
> farmshare-discuss mailing list
> farmshare-discuss at lists.stanford.edu
> https://mailman.stanford.edu/mailman/listinfo/farmshare-discuss


-- 
Alex Chekholko chekh at stanford.edu 347-401-4860


