Introduction to Installing and Configuring the IBM Platform LSF Family V1.0


QUEUE_NAME = verilog
DESCRIPTION = master queue definition cross-queue
PRIORITY = 50
FAIRSHARE = USER_SHARES[[user1,100] [default,1]]
FAIRSHARE_QUEUES = normal short
HOSTS = hostGroupC # resource contention
#RES_REQ = rusage[verilog = 1]
End Queue

Begin Queue
QUEUE_NAME = short
DESCRIPTION = short jobs
PRIORITY = 70 # highest
HOSTS = hostGroupC
RUNLIMIT = 5 10
End Queue

Begin Queue
QUEUE_NAME = normal
DESCRIPTION = default queue
PRIORITY = 40 # lowest
HOSTS = hostGroupC
End Queue

2.6.4 Enabling the configuration

badmin reconfig

Submit jobs, then watch how the users' dynamic priorities in the queue change:

bqueues -rl normal


2.7 Configuring the preemptive scheduling policy

Configure the most basic form of slot preemption:

Begin Queue
QUEUE_NAME = short
PRIORITY = 70
HOSTS = hostGroupC # potential conflict
PREEMPTION = PREEMPTIVE[normal]
End Queue

Begin Queue
QUEUE_NAME = normal
PRIORITY = 40
HOSTS = hostGroupC # potential conflict
PREEMPTION = PREEMPTABLE[short]
End Queue

Submit jobs to both queues, then check the pending reason of the jobs that were preempted.
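The test described above could be scripted roughly as follows. This is a minimal sketch, not a procedure taken from the original guide: the job names, the sleep commands, and the single low-priority job are assumptions, and in practice enough jobs must be submitted to the normal queue to occupy every slot in hostGroupC before preemption is triggered. The lsf helper is hypothetical; it prints each command instead of running it when the LSF commands are not on the PATH.

```shell
#!/bin/sh
# Sketch: occupy slots from the preemptable queue, submit to the preemptive
# queue, then inspect the low-priority job. Job names and commands are assumptions.

# Run an LSF command when it is installed; otherwise only print it.
lsf() {
    if command -v "$1" >/dev/null 2>&1; then
        "$@"
    else
        echo "would run: $*"
    fi
}

preempt_demo() {
    lsf bsub -q normal -J low_prio sleep 600   # job in the preemptable queue
    lsf bsub -q short -J high_prio sleep 60    # job in the preemptive queue
    lsf bjobs -u all -q normal                 # list jobs in the normal queue
    lsf bjobs -l -J low_prio                   # long listing includes the reason
}

preempt_demo
```

The long listing from bjobs -l is where the reason for the low-priority job's state is expected to appear.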

2.8 Configuring global limit policies

2.8.1 Limiting the number of jobs a user can run

Add the following to the lsb.users file:

Begin User
USER_NAME   MAX_JOBS   JL/P
user1       4          -
user2       2          1
user3       -          2
groupA      8          -
groupB@     1          1
Default     2          -
End User

2.8.2 Limiting the number of jobs a host can run

In the lsb.hosts file:

Begin Host
HOST_NAME   MXJ   JL/U
host1       4     2
host2       2     1
host3       !     -
End Host

2.8.3 Setting job limits on a queue

Add the following to lsb.queues:

Begin Queue
QUEUE_NAME = myQueue
HJOB_LIMIT = 2
PJOB_LIMIT = 1
UJOB_LIMIT = 4
HOSTS = hostGroupA
USERS = userGroupA
End Queue

2.8.4 Setting general limits

Examples of defining global general limits in the lsb.resources file:

Begin Limit
USERS   QUEUES   HOSTS   SLOTS   MEM   SWP
user1   -        hostB   -       -     20%
user2   normal   hostA   -       20    -
End Limit

Begin Limit
NAME = limit1
USERS = user1
PER_HOST = hostA hostC
TMP = 30%
SWP = 50%
MEM = 10%
End Limit

Begin Limit
PER_USER   QUEUES   HOSTS     SLOTS   MEM   SWP   TMP   JOBS
groupA     -        hgroup1   -       -     -     -     2
user2      normal   -         -       200   -     -     -
-          short    -         -       -     -     -     200
End Limit

2.8.5 Enabling the configuration

badmin reconfig
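Once the reconfiguration has completed, the limits from sections 2.8.1 to 2.8.4 can be inspected from the command line. The following is a sketch under the assumption that the example names used above (such as myQueue) exist in the cluster; the show_limits function is hypothetical and prints each command instead of running it when LSF is not installed.

```shell
#!/bin/sh
# Sketch: display the limits configured in section 2.8.
# Queue and user names follow the examples above and are assumptions.
show_limits() {
    for cmd in "busers all" "bhosts" "bqueues -l myQueue" "blimits -c"; do
        set -- $cmd                       # split the command string into words
        if command -v "$1" >/dev/null 2>&1; then
            "$@"                          # LSF is installed: run the command
        else
            echo "would run: $cmd"        # otherwise only show it
        fi
    done
}

show_limits
```

Here busers all is expected to report the per-user limits from lsb.users, bhosts the per-host limits from lsb.hosts, bqueues -l the queue-level job limits, and blimits -c the Limit sections configured in lsb.resources.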

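2.9 Configuring the esub submission control script

The global esub script is invoked when a job is submitted. It can be invoked automatically or explicitly, and it controls the behavior of user job submission.

Edit the esub.project file under $LSF_SERVERDIR (use chmod to make it executable):

#!/bin/sh
if [ \
    . $LSB_SUB_PARM_FILE
    if [ \
        echo \
        exit $LSB_SUB_ABORT_VALUE
    fi
fi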
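The test conditions and the message in the script above are incomplete in this copy of the document. The following is a sketch of what an esub of this shape might look like, assuming its purpose is to reject jobs submitted without a project name. LSB_SUB_PARM_FILE, LSB_SUB_PROJECT_NAME, and LSB_SUB_ABORT_VALUE are variables that LSF provides to an esub; the check_project function, the message text, and the fallback exit code of 1 are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical esub sketch: abort submissions that carry no project name.
# The original script's exact conditions are unknown; this check is an assumption.

check_project() {
    # $1: path to the job parameter file (normally $LSB_SUB_PARM_FILE)
    parm_file=$1
    # Nothing to check when no readable parameter file is given.
    [ -n "$parm_file" ] && [ -r "$parm_file" ] || return 0
    LSB_SUB_PROJECT_NAME=""
    . "$parm_file"     # the file holds NAME="value" lines for the bsub options
    if [ -z "$LSB_SUB_PROJECT_NAME" ]; then
        echo "Please specify a project name with bsub -P" >&2
        return 1
    fi
    return 0
}

# LSF sets LSB_SUB_PARM_FILE and LSB_SUB_ABORT_VALUE when it runs the esub.
# The fallback of 1 only matters when the sketch is run outside LSF.
if [ -n "${LSB_SUB_PARM_FILE:-}" ]; then
    check_project "$LSB_SUB_PARM_FILE" || exit "${LSB_SUB_ABORT_VALUE:-1}"
fi
```

Keeping the check in a function makes it possible to try the logic against a hand-written parameter file before installing the script under $LSF_SERVERDIR.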
