Atlas:Analysis ST 2009 Errors

An article from lcgwiki.
Revision as of 12:18, 30 January 2009 by Chollet (talk | contribs) (ST 209 124-125 : Comments and Errors follow-up)


ST 209 124-125 : Comments and Errors follow-up

Note that ATLAS production was running on the FR-Cloud on January 29.

  • IN2P3-LPC_MCDISK: f(w) - Errors due to the load induced by MC production running at that time. The ST test jobs (2 x 50 jobs added) were then aborted, with the following reason logged by the WMS:

- Got a job held event, reason: Unspecified gridmanager error
- Job got an error while in the CondorG queue.
The submission to the batch system has failed because the maximum number of jobs accepted in queue by the site was reached
- The atlas queue has max_queuable = 200 in the batch system (attribute 'GlueCEPolicyMaxTotalJobs' on the queue).
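
The effect of the queue limit can be illustrated with a minimal sketch. It assumes the batch system refuses any submission once the number of jobs already in the queue reaches max_queuable. The function name and the number of jobs already queued are hypothetical, chosen only for illustration; only the limit of 200 and the 2 x 50 ST jobs come from the notes above.

```python
# Minimal sketch of a queue admission check (assumed behaviour,
# not the actual batch system code).

MAX_QUEUABLE = 200  # limit on the atlas queue, as quoted above


def accepted_submissions(jobs_in_queue: int, new_jobs: int,
                         max_queuable: int = MAX_QUEUABLE) -> int:
    """Return how many of `new_jobs` fit before the queue limit is hit."""
    free_slots = max(max_queuable - jobs_in_queue, 0)
    return min(new_jobs, free_slots)


# Hypothetical load: production jobs already fill the queue,
# so none of the 2 x 50 ST test jobs can be accepted.
print(accepted_submissions(jobs_in_queue=200, new_jobs=2 * 50))
```

With a partly filled queue the same function shows how many test jobs would still be accepted before rejections start.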

Jan 29 23:54:46 clrlcgce03 gridinfo: [25608-30993] Job 1233269583: lcgpbs:internal_ FAILED during submission to batch system lcgpbs
01/29/2009 23:55:07;0080;PBS_Server;Req;req_reject;Reject reply code=15046(Maximum number of jobs already in queue), aux=0..
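
When following up on such errors, the reject entries can be pulled out of the PBS server log with a short script. The sketch below is only an illustration: the function name `parse_reject` is hypothetical, the regular expression is derived solely from the log entry quoted above, and the sample is that entry joined onto a single line.

```python
import re

# Matches the reject message as it appears in the PBS server log entry above.
REJECT_RE = re.compile(r"Reject reply code=(\d+)\(([^)]*)\)")


def parse_reject(line: str):
    """Return (timestamp, code, reason) for a reject line, else None."""
    match = REJECT_RE.search(line)
    if match is None:
        return None
    # The timestamp is the first ';'-separated field of the log entry.
    timestamp = line.split(";", 1)[0]
    return timestamp, int(match.group(1)), match.group(2)


sample = ("01/29/2009 23:55:07;0080;PBS_Server;Req;req_reject;"
          "Reject reply code=15046(Maximum number of jobs already in queue), aux=0..")
print(parse_reject(sample))
```

Filtering on code 15046 separates rejections caused by the queue limit from other submission failures.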