######################################################################
##
##  condor_config
##
##  This is the global configuration file for condor.  Any settings
##  made here may potentially be overridden in the local configuration
##  file.  KEEP THAT IN MIND!  To double-check that a variable is
##  getting set from the configuration file that you expect, use
##  condor_config_val -v <variable name>
##
##  The file is divided into four main parts:
##  Part 1:  Settings you likely want to customize
##  Part 2:  Settings you may want to customize
##  Part 3:  Settings that control the policy of when condor will
##           start and stop jobs on your machines
##  Part 4:  Settings you should probably leave alone (unless you
##           know what you're doing)
##
##  Please read the INSTALL file (or the Install chapter in the
##  Condor Administrator's Manual) for detailed explanations of the
##  various settings in here and possible ways to configure your
##  pool.
##
##  Unless otherwise specified, settings that are commented out show
##  the defaults that are used if you don't define a value.  Settings
##  that are defined here MUST BE DEFINED since they have no default
##  value.
##
##  Unless otherwise indicated, all settings which specify a time are
##  defined in seconds.
##
######################################################################

######################################################################
######################################################################
##
##  ######                                    #
##  #     #   ##   #####   #####             ##
##  #     #  #  #  #    #    #              # #
##  ######  #    # #    #    #                #
##  #       ###### #####     #                #
##  #       #    # #   #     #                #
##  #       #    # #    #    #              #####
##
##  Part 1:  Settings you likely want to customize:
######################################################################
######################################################################

## What machine is your central manager?
CONDOR_HOST = $(FULL_HOSTNAME)
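
## For example, if your central manager ran on a separate, dedicated
## machine (the hostname below is purely illustrative), you would set:
#CONDOR_HOST = central-manager.your.domain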

##--------------------------------------------------------------------
##  Pathnames:
##--------------------------------------------------------------------
## Where have you installed the bin, sbin and lib condor directories?
RELEASE_DIR = %%PREFIX%%

## Where is the local condor directory for each host?
## This is where the local config file(s), logs and
## spool/execute directories are located.  Should be a large
## partition if jobs may produce a lot of output.
LOCAL_DIR = $(TILDE)

## Where is the machine-specific local config file for each host?
LOCAL_CONFIG_FILE = $(RELEASE_DIR)/etc/condor_config.local

## Where are optional machine-specific local config files located?
## Config files are included in lexicographic order.
LOCAL_CONFIG_DIR = $(LOCAL_DIR)/config

## Blacklist for file processing in the LOCAL_CONFIG_DIR
## LOCAL_CONFIG_DIR_EXCLUDE_REGEXP = ^((\..*)|(.*~)|(#.*)|(.*\.rpmsave)|(.*\.rpmnew))$

## If the local config file is not present, is it an error?
## WARNING: This is a potential security issue.
## If not specified, the default is True.
#REQUIRE_LOCAL_CONFIG_FILE = TRUE

##--------------------------------------------------------------------
##  Mail parameters:
##--------------------------------------------------------------------
## When something goes wrong with condor at your site, who should get
## the email?
CONDOR_ADMIN = root@localhost
85 |
|
86 |
## Full path to a mail delivery program that understands that "-s" |
87 |
## means you want to specify a subject: |
88 |
MAIL = /usr/bin/mail |
89 |
|
90 |
##-------------------------------------------------------------------- |
91 |
## Network domain parameters: |
92 |
##-------------------------------------------------------------------- |
93 |
## Internet domain of machines sharing a common UID space. If your |
94 |
## machines don't share a common UID space, set it to |
95 |
## UID_DOMAIN = $(FULL_HOSTNAME) |
96 |
## to specify that each machine has its own UID space. |
97 |
UID_DOMAIN = $(FULL_HOSTNAME) |
98 |
|
99 |
## Internet domain of machines sharing a common file system. |
100 |
## If your machines don't use a network file system, set it to |
101 |
## FILESYSTEM_DOMAIN = $(FULL_HOSTNAME) |
102 |
## to specify that each machine has its own file system. |
103 |
FILESYSTEM_DOMAIN = $(FULL_HOSTNAME) |
104 |
|
105 |
## This macro is used to specify a short description of your pool. |
106 |
## It should be about 20 characters long. For example, the name of |
107 |
## the UW-Madison Computer Science Condor Pool is ``UW-Madison CS''. |
108 |
COLLECTOR_NAME = My Pool - $(CONDOR_HOST) |
109 |
|
110 |
###################################################################### |
111 |
###################################################################### |
112 |
## |
113 |
## ###### ##### |
114 |
## # # ## ##### ##### # # |
115 |
## # # # # # # # # |
116 |
## ###### # # # # # ##### |
117 |
## # ###### ##### # # |
118 |
## # # # # # # # |
119 |
## # # # # # # ####### |
120 |
## |
121 |
## Part 2: Settings you may want to customize: |
122 |
## (it is generally safe to leave these untouched) |
123 |
###################################################################### |
124 |
###################################################################### |

##
##  The user/group ID <uid>.<gid> of the "Condor" user.
##  (this can also be specified in the environment)
##  Note: the CONDOR_IDS setting is ignored on Win32 platforms
CONDOR_IDS=0.0

##--------------------------------------------------------------------
##  Flocking: Submitting jobs to more than one pool
##--------------------------------------------------------------------
## Flocking allows you to run your jobs in other pools, or lets
## others run jobs in your pool.
##
## To let others flock to you, define FLOCK_FROM.
##
## To flock to others, define FLOCK_TO.

## FLOCK_FROM defines the machines where you would like to grant
## people access to your pool via flocking.  (i.e. you are granting
## access to these machines to join your pool).
FLOCK_FROM =
## An example of this is:
#FLOCK_FROM = somehost.friendly.domain, anotherhost.friendly.domain

## FLOCK_TO defines the central managers of the pools that you want
## to flock to.  (i.e. you are specifying the machines that you
## want your jobs to be negotiated at -- thereby specifying the
## pools they will run in.)
FLOCK_TO =
## An example of this is:
#FLOCK_TO = central_manager.friendly.domain, condor.cs.wisc.edu

## FLOCK_COLLECTOR_HOSTS should almost always be the same as
## FLOCK_NEGOTIATOR_HOSTS (as shown below).  The only reason it would be
## different is if the collector and negotiator in the pool that you are
## flocking to are running on different machines (not recommended).
## The collectors must be specified in the same corresponding order as
## the FLOCK_NEGOTIATOR_HOSTS list.
FLOCK_NEGOTIATOR_HOSTS = $(FLOCK_TO)
FLOCK_COLLECTOR_HOSTS = $(FLOCK_TO)
## An example of having the negotiator and the collector on different
## machines is:
#FLOCK_NEGOTIATOR_HOSTS = condor.cs.wisc.edu, condor-negotiator.friendly.domain
#FLOCK_COLLECTOR_HOSTS = condor.cs.wisc.edu, condor-collector.friendly.domain
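
## As an illustrative sketch (the hostnames below are hypothetical),
## a two-way flocking arrangement with one friendly pool could be:
#FLOCK_TO = condor.friendly.domain
#FLOCK_FROM = *.friendly.domain
## With FLOCK_NEGOTIATOR_HOSTS and FLOCK_COLLECTOR_HOSTS left at
## $(FLOCK_TO) as above, no further flocking settings are needed.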

##--------------------------------------------------------------------
##  Host/IP access levels
##--------------------------------------------------------------------
## Please see the administrator's manual for details on these
## settings, what they're for, and how to use them.

## What machines have administrative rights for your pool?  This
## defaults to your central manager.  You should set it to the
## machine(s) where whoever is the condor administrator(s) works
## (assuming you trust all the users who log into that/those
## machine(s), since this is machine-wide access you're granting).

ALLOW_ADMINISTRATOR = $(CONDOR_HOST)

## If there are no machines that should have administrative access
## to your pool (for example, there's no machine where only trusted
## users have accounts), you can uncomment this setting.
## Unfortunately, this will mean that administering your pool will
## be more difficult.
#DENY_ADMINISTRATOR = *

## What machines should have "owner" access to your machines, meaning
## they can issue commands that a machine owner should be able to
## issue to their own machine (like condor_vacate).  This defaults to
## machines with administrator access, and the local machine.  This
## is probably what you want.
ALLOW_OWNER = $(FULL_HOSTNAME), $(ALLOW_ADMINISTRATOR)

## Read access.  Machines listed as allow (and/or not listed as deny)
## can view the status of your pool, but cannot join your pool
## or run jobs.
## NOTE: By default, without these entries customized, you
## are granting read access to the whole world.  You may want to
## restrict that to hosts in your domain.  If possible, please also
## grant read access to "*.cs.wisc.edu", so the Condor developers
## will be able to view the status of your pool and more easily help
## you install, configure or debug your Condor installation.
## It is important to have this defined.
ALLOW_READ = *.your.domain
#ALLOW_READ = *.your.domain, *.cs.wisc.edu
#DENY_READ = *.bad.subnet, bad-machine.your.domain, 144.77.88.*

## Write access.  Machines listed here can join your pool, submit
## jobs, etc.  Note: Any machine which has WRITE access must
## also be granted READ access.  Granting WRITE access below does
## not also automatically grant READ access; you must change
## ALLOW_READ above as well.
##
## You must set this to something else before Condor will run.
## The simplest option is:
##   ALLOW_WRITE = *
## but note that this will allow anyone to submit jobs or add
## machines to your pool and is a serious security risk.

ALLOW_WRITE = *.your.domain
#ALLOW_WRITE = $(FULL_HOSTNAME), $(IP_ADDRESS)
#ALLOW_WRITE = *.your.domain, your-friend's-machine.other.domain
#DENY_WRITE = bad-machine.your.domain

## Are you upgrading to a new version of Condor and confused about
## why the above ALLOW_WRITE setting is causing Condor to refuse to
## start up?  If you are upgrading from a configuration that uses
## HOSTALLOW/HOSTDENY instead of ALLOW/DENY, we recommend that you
## convert all uses of the former to the latter.  The syntax of the
## authorization settings is identical.  They both support
## unauthenticated IP-based authorization as well as authenticated
## user-based authorization.  To avoid confusion, the use of
## HOSTALLOW/HOSTDENY is discouraged.  Support for it may be removed
## in the future.

## Negotiator access.  Machines listed here are trusted central
## managers.  You should normally not have to change this.
ALLOW_NEGOTIATOR = $(CONDOR_HOST)
## Now, with flocking we need to let the SCHEDD trust the other
## negotiators we are flocking with as well.  You should normally
## not have to change this either.
ALLOW_NEGOTIATOR_SCHEDD = $(CONDOR_HOST), $(FLOCK_NEGOTIATOR_HOSTS)

## Config access.  Machines listed here can use the condor_config_val
## tool to modify all daemon configurations.  This level of host-wide
## access should only be granted with extreme caution.  By default,
## config access is denied from all hosts.
#ALLOW_CONFIG = trusted-host.your.domain

## Flocking Configs.  These are the real things that Condor looks at,
## but we set them from the FLOCK_FROM/TO macros above.  It is safe
## to leave these unchanged.
ALLOW_WRITE_COLLECTOR = $(ALLOW_WRITE), $(FLOCK_FROM)
ALLOW_WRITE_STARTD    = $(ALLOW_WRITE), $(FLOCK_FROM)
ALLOW_READ_COLLECTOR  = $(ALLOW_READ), $(FLOCK_FROM)
ALLOW_READ_STARTD     = $(ALLOW_READ), $(FLOCK_FROM)
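
## As an illustrative sketch (the subnet below is hypothetical), a
## pool confined to a single trusted subnet could use:
#ALLOW_READ = 192.168.10.*
#ALLOW_WRITE = 192.168.10.*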

##--------------------------------------------------------------------
##  Security parameters for setting configuration values remotely:
##--------------------------------------------------------------------
## These parameters define the list of attributes that can be set
## remotely with condor_config_val for the security access levels
## defined above (for example, WRITE, ADMINISTRATOR, CONFIG, etc).
## Please see the administrator's manual for further details on these
## settings, what they're for, and how to use them.  There are no
## default values for any of these settings.  If they are not
## defined, no attributes can be set with condor_config_val.

## Do you want to allow condor_config_val -rset to work at all?
## This feature is disabled by default, so to enable it, you must
## uncomment the following setting and change the value to "True".
## Note: changing this requires a restart, not just a reconfig.
#ENABLE_RUNTIME_CONFIG = False

## Do you want to allow condor_config_val -set to work at all?
## This feature is disabled by default, so to enable it, you must
## uncomment the following setting and change the value to "True".
## Note: changing this requires a restart, not just a reconfig.
#ENABLE_PERSISTENT_CONFIG = False

## Directory where daemons should write persistent config files (used
## to support condor_config_val -set).  This directory should *ONLY*
## be writable by root (or the user the Condor daemons are running as
## if non-root).  There is no default; administrators must define this.
## Note: changing this requires a restart, not just a reconfig.
#PERSISTENT_CONFIG_DIR = /full/path/to/root-only/local/directory

## Attributes that can be set by hosts with "CONFIG" permission (as
## defined with ALLOW_CONFIG and DENY_CONFIG above).
## The commented-out value here was the default behavior of Condor
## prior to version 6.3.3.  If you don't need this behavior, you
## should leave this commented out.
#SETTABLE_ATTRS_CONFIG = *

## Attributes that can be set by hosts with "ADMINISTRATOR"
## permission (as defined above)
#SETTABLE_ATTRS_ADMINISTRATOR = *_DEBUG, MAX_*_LOG

## Attributes that can be set by hosts with "OWNER" permission (as
## defined above).  NOTE: any Condor job running on a given host will
## have OWNER permission on that host by default.  If you grant this
## kind of access, Condor jobs will be able to modify any attributes
## you list below on the machine where they are running.  This has
## obvious security implications, so only grant this kind of
## permission for custom attributes that you define for your own use
## at your pool (custom attributes about your machines that are
## published with the STARTD_ATTRS setting, for example).
#SETTABLE_ATTRS_OWNER = your_custom_attribute, another_custom_attr

## You can also define daemon-specific versions of each of these
## settings.  For example, to define settings that can only be
## changed in the condor_startd's configuration by hosts with OWNER
## permission, you would use:
#STARTD_SETTABLE_ATTRS_OWNER = your_custom_attribute_name
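
## As an illustrative sketch (the directory path is hypothetical), a
## minimal persistent-config setup that only permits remote changes
## to the debug and log-size settings would be:
#ENABLE_PERSISTENT_CONFIG = True
#PERSISTENT_CONFIG_DIR = /var/lib/condor/persistent_config
#SETTABLE_ATTRS_ADMINISTRATOR = *_DEBUG, MAX_*_LOG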

##--------------------------------------------------------------------
##  Network filesystem parameters:
##--------------------------------------------------------------------
## Do you want to use NFS for file access instead of remote system
## calls?
#USE_NFS = False

## Do you want to use AFS for file access instead of remote system
## calls?
#USE_AFS = False

##--------------------------------------------------------------------
##  Checkpoint server:
##--------------------------------------------------------------------
## Do you want to use a checkpoint server if one is available?  If a
## checkpoint server isn't available or USE_CKPT_SERVER is set to
## False, checkpoints will be written to the local SPOOL directory on
## the submission machine.
#USE_CKPT_SERVER = True

## What's the hostname of this machine's nearest checkpoint server?
#CKPT_SERVER_HOST = checkpoint-server-hostname.your.domain

## Do you want the starter on the execute machine to choose the
## checkpoint server?  If False, the CKPT_SERVER_HOST set on
## the submit machine is used.  Otherwise, the CKPT_SERVER_HOST set
## on the execute machine is used.  The default is True.
#STARTER_CHOOSES_CKPT_SERVER = True

##--------------------------------------------------------------------
##  Miscellaneous:
##--------------------------------------------------------------------
## Try to save this much swap space by not starting new shadows.
## Specified in megabytes.
#RESERVED_SWAP = 0

## What's the maximum number of jobs you want a single submit machine
## to spawn shadows for?  The default is a function of $(DETECTED_MEMORY)
## and a guess at the number of ephemeral ports available.

## Example 1:
#MAX_JOBS_RUNNING = 10000

## Example 2:
## This is more complicated, but it produces the same limit as the default.
## First define some expressions to use in our calculation.
## Assume we can use up to 80% of memory and estimate shadow private data
## size of 800k.
#MAX_SHADOWS_MEM = ceiling($(DETECTED_MEMORY)*0.8*1024/800)
## Assume we can use ~21,000 ephemeral ports (avg ~2.1 per shadow).
## Under Linux, the range is set in /proc/sys/net/ipv4/ip_local_port_range.
#MAX_SHADOWS_PORTS = 10000
## Under Windows, things are currently much less scalable.
## Note that this can probably be safely increased a bit under 64-bit Windows.
#MAX_SHADOWS_OPSYS = ifThenElse(regexp("WIN.*","$(OPSYS)"),200,100000)
## Now build up the expression for MAX_JOBS_RUNNING.  This is complicated
## due to lack of a min() function.
#MAX_JOBS_RUNNING = $(MAX_SHADOWS_MEM)
#MAX_JOBS_RUNNING = \
#  ifThenElse( $(MAX_SHADOWS_PORTS) < $(MAX_JOBS_RUNNING), \
#              $(MAX_SHADOWS_PORTS), \
#              $(MAX_JOBS_RUNNING) )
#MAX_JOBS_RUNNING = \
#  ifThenElse( $(MAX_SHADOWS_OPSYS) < $(MAX_JOBS_RUNNING), \
#              $(MAX_SHADOWS_OPSYS), \
#              $(MAX_JOBS_RUNNING) )
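
## Worked example of the calculation above, assuming a Linux submit
## machine with 16 GB (16384 MB) of detected memory:
##   MAX_SHADOWS_MEM = ceiling(16384 * 0.8 * 1024 / 800) = 16778
## Since the port limit of 10000 is smaller than both 16778 and the
## non-Windows OPSYS limit of 100000, the effective MAX_JOBS_RUNNING
## on such a machine would be 10000.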

## Maximum number of simultaneous downloads of output files from
## execute machines to the submit machine (limit applied per schedd).
## The value 0 means unlimited.
#MAX_CONCURRENT_DOWNLOADS = 10

## Maximum number of simultaneous uploads of input files from the
## submit machine to execute machines (limit applied per schedd).
## The value 0 means unlimited.
#MAX_CONCURRENT_UPLOADS = 10

## Condor needs to create a few lock files to synchronize access to
## various log files.  Because of problems we've had with network
## filesystems and file locking over the years, we HIGHLY recommend
## that you put these lock files on a local partition on each
## machine.  If you don't have your LOCAL_DIR on a local partition,
## be sure to change this entry.  Whatever user (or group) condor is
## running as needs to have write access to this directory.  If
## you're not running as root, this is whatever user you started up
## the condor_master as.  If you are running as root, and there's a
## condor account, it's probably condor.  Otherwise, it's whatever
## you've set in the CONDOR_IDS environment variable.  See the Admin
## manual for details on this.
LOCK = $(LOG)
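
## For example, if $(LOCAL_DIR) (and therefore $(LOG)) lives on a
## network filesystem, you might instead point the lock files at a
## local path (the path below is illustrative):
#LOCK = /var/lock/condor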

## If you don't use a fully qualified name in your /etc/hosts file
## (or NIS, etc.) for either your official hostname or as an alias,
## Condor wouldn't normally be able to use fully qualified names in
## places that it'd like to.  You can set this parameter to the
## domain you'd like appended to your hostname, if changing your host
## information isn't a good option.  This parameter must be set in
## the global config file (not the LOCAL_CONFIG_FILE from above).
#DEFAULT_DOMAIN_NAME = your.domain.name

## If you don't have DNS set up, Condor will normally fail in many
## places because it can't resolve hostnames to IP addresses and
## vice-versa.  If you enable this option, Condor will use
## pseudo-hostnames constructed from a machine's IP address and the
## DEFAULT_DOMAIN_NAME.  Both NO_DNS and DEFAULT_DOMAIN_NAME must be
## set in your top-level config file for this mode of operation to
## work properly.
#NO_DNS = True
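
## An illustrative pairing of the two settings (the domain name below
## is hypothetical):
#NO_DNS = True
#DEFAULT_DOMAIN_NAME = cluster.local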

## Condor can be told whether or not you want the Condor daemons to
## create a core file if something really bad happens.  This just
## sets the resource limit for the size of a core file.  By default,
## we don't do anything, and leave in place whatever limit was in
## effect when you started the Condor daemons.  If this parameter is
## set and "True", we increase the limit to as large as it gets.  If
## it's set to "False", we set the limit at 0 (which means that no
## core files are even created).  Core files greatly help the Condor
## developers debug any problems you might be having.
#CREATE_CORE_FILES = True

## When Condor daemons detect a fatal internal exception, they
## normally log an error message and exit.  If you have turned on
## CREATE_CORE_FILES, in some cases you may also want to turn on
## ABORT_ON_EXCEPTION so that core files are generated when an
## exception occurs.  Set the following to True if that is what you
## want.
#ABORT_ON_EXCEPTION = False

## Condor Glidein downloads binaries from a remote server for the
## machines into which you're gliding.  This saves you from manually
## downloading and installing binaries for every architecture you
## might want to glidein to.  The default server is one maintained at
## The University of Wisconsin.  If you don't want to use the UW
## server, you can set up your own and change the following to
## point to it, instead.
GLIDEIN_SERVER_URLS = \
  http://www.cs.wisc.edu/condor/glidein/binaries

## List the sites you want to GlideIn to on the GLIDEIN_SITES.  For example,
## if you'd like to GlideIn to some Alliance GiB resources,
## uncomment the line below.
## Make sure that $(GLIDEIN_SITES) is included in ALLOW_READ and
## ALLOW_WRITE, or else your GlideIns won't be able to join your pool.
## This is _NOT_ done for you by default, because it is an even better
## idea to use a strong security method (such as GSI) rather than
## host-based security for authorizing glideins.
#GLIDEIN_SITES = *.ncsa.uiuc.edu, *.cs.wisc.edu, *.mcs.anl.gov
#GLIDEIN_SITES =

## If your site needs to use UID_DOMAIN settings (defined above) that
## are not real Internet domains that match the hostnames, you can
## tell Condor to trust whatever UID_DOMAIN a submit machine gives to
## the execute machine and just make sure the two strings match.  The
## default for this setting is False, since it is more secure this
## way.
#TRUST_UID_DOMAIN = False

## If you would like to be informed in near real-time via condor_q when
## a vanilla/standard/java job is in a suspension state, set this attribute
## to True.  However, this real-time update of the condor_schedd by the
## shadows could cause performance issues if there are thousands of
## concurrently running vanilla/standard/java jobs under a single
## condor_schedd and they are allowed to suspend and resume.
#REAL_TIME_JOB_SUSPEND_UPDATES = False

## A standard universe job can perform arbitrary shell calls via the
## libc 'system()' function.  This function call is routed back to the
## shadow, which performs the actual system() invocation in the initialdir
## of the running program and as the user who submitted the job.  However,
## since the user job can request ARBITRARY shell commands to be run by the
## shadow, this is a generally unsafe practice.  This should only be made
## available if it is actually needed.  If this attribute is not defined,
## then it is the same as it being defined to False.  Set it to True to
## allow the shadow to execute arbitrary shell code from the user job.
#SHADOW_ALLOW_UNSAFE_REMOTE_EXEC = False

## KEEP_OUTPUT_SANDBOX is an optional feature to tell Condor-G to not
## remove the job spool when the job leaves the queue.  To use it, just
## set it to True.  Since you will be operating Condor-G in this manner,
## you may want to put leave_in_queue = false in your job submit
## description files, to tell Condor-G to simply remove the job from
## the queue immediately when the job completes (since the output files
## will stick around no matter what).
#KEEP_OUTPUT_SANDBOX = False

## This setting tells the negotiator to ignore user priorities.  This
## avoids problems where jobs from different users won't run when using
## condor_advertise instead of a full-blown startd (some of the user
## priority system in Condor relies on information from the startd --
## we will remove this reliance when we support the user priority
## system for grid sites in the negotiator; for now, this setting will
## just disable it).
#NEGOTIATOR_IGNORE_USER_PRIORITIES = False

## This is a list of libraries containing ClassAd plug-in functions.
#CLASSAD_USER_LIBS =

## This setting tells Condor whether to delegate or copy GSI X509
## credentials when sending them over the wire between daemons.
## Delegation can take up to a second, which is very slow when
## submitting a large number of jobs.  Copying exposes the credential
## to third parties if Condor isn't set to encrypt communications.
## By default, Condor will delegate rather than copy.
#DELEGATE_JOB_GSI_CREDENTIALS = True

## This setting controls whether Condor delegates a full or limited
## X509 credential for jobs.  Currently, this only affects grid-type
## gt2 grid universe jobs.  The default is False.
#DELEGATE_FULL_JOB_GSI_CREDENTIALS = False

## This setting controls the default behaviour for the spooling of files
## into, or out of, the Condor system by such tools as condor_submit
## and condor_transfer_data.  Here is the list of valid settings for this
## parameter and what they mean:
##
##   stm_use_schedd_only
##     Ask the condor_schedd to solely store/retrieve the sandbox
##
##   stm_use_transferd
##     Ask the condor_schedd for the location of a condor_transferd,
##     then store/retrieve the sandbox from the transferd itself.
##
## The allowed values are case insensitive.
## The default of this parameter if not specified is: stm_use_schedd_only
#SANDBOX_TRANSFER_METHOD = stm_use_schedd_only

## This setting specifies an IP address that depends on the setting of
## BIND_ALL_INTERFACES.  If BIND_ALL_INTERFACES is True (the default), then
## this variable controls what IP address will be advertised as the public
## address of the daemon.  If BIND_ALL_INTERFACES is False, then this
## variable specifies which IP address to bind network sockets to.  If
## BIND_ALL_INTERFACES is False and NETWORK_INTERFACE is not defined, Condor
## chooses a network interface automatically.  It tries to choose a public
## interface if one is available.  If it cannot decide which of two
## interfaces to choose from, it will pick the first one.
#NETWORK_INTERFACE =
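
## For example, on a multi-homed machine you could pin Condor to one
## address (the address below is illustrative):
#NETWORK_INTERFACE = 192.168.10.5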

##--------------------------------------------------------------------
##  Settings that control the daemons' debugging output:
##--------------------------------------------------------------------

##
## The flags given in ALL_DEBUG are shared between all daemons.
##

ALL_DEBUG =

MAX_COLLECTOR_LOG = 1000000
COLLECTOR_DEBUG =

MAX_KBDD_LOG = 1000000
KBDD_DEBUG =

MAX_NEGOTIATOR_LOG = 1000000
NEGOTIATOR_DEBUG = D_MATCH
MAX_NEGOTIATOR_MATCH_LOG = 1000000

MAX_SCHEDD_LOG = 1000000
SCHEDD_DEBUG = D_PID

MAX_SHADOW_LOG = 1000000
SHADOW_DEBUG =

MAX_STARTD_LOG = 1000000
STARTD_DEBUG =

MAX_STARTER_LOG = 1000000

MAX_MASTER_LOG = 1000000
MASTER_DEBUG =
## When the master starts up, should it truncate its log file?
#TRUNC_MASTER_LOG_ON_OPEN = False

MAX_JOB_ROUTER_LOG = 1000000
JOB_ROUTER_DEBUG =

MAX_ROOSTER_LOG = 1000000
ROOSTER_DEBUG =

MAX_SHARED_PORT_LOG = 1000000
SHARED_PORT_DEBUG =

MAX_HDFS_LOG = 1000000
HDFS_DEBUG =

# High Availability Logs
MAX_HAD_LOG = 1000000
HAD_DEBUG =
MAX_REPLICATION_LOG = 1000000
REPLICATION_DEBUG =
MAX_TRANSFERER_LOG = 1000000
TRANSFERER_DEBUG =

## The daemons touch their log file periodically, even when they have
## nothing to write.  When a daemon starts up, it prints the last time
## the log file was modified.  This lets you estimate when a previous
## instance of a daemon stopped running.  This parameter controls how
## often the daemons touch the file (in seconds).
#TOUCH_LOG_INTERVAL = 60
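
## For example, to temporarily get maximum detail from every daemon
## in this section while debugging a problem, you could set:
#ALL_DEBUG = D_FULLDEBUG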

######################################################################
######################################################################
##
##  ######                                      #####
##  #     #    ##    #####   #####             #     #
##  #     #   #  #   #    #    #                     #
##  ######   #    #  #    #    #                #####
##  #        ######  #####     #                     #
##  #        #    #  #   #     #               #     #
##  #        #    #  #    #    #                #####
##
##  Part 3:  Settings that control the policy of when condor will
##  start, stop, and periodically checkpoint jobs:
######################################################################
######################################################################

## This section contains macros that help write legible
## expressions:
MINUTE = 60
HOUR = (60 * $(MINUTE))
StateTimer = (time() - EnteredCurrentState)
ActivityTimer = (time() - EnteredCurrentActivity)
ActivationTimer = ifThenElse(JobStart =!= UNDEFINED, (time() - JobStart), 0)
LastCkpt = (time() - LastPeriodicCheckpoint)

## The JobUniverse attribute is just an int. These macros can be
## used to specify the universe in a human-readable way:
STANDARD = 1
VANILLA = 5
MPI = 8
VM = 13
IsMPI = (TARGET.JobUniverse == $(MPI))
IsVanilla = (TARGET.JobUniverse == $(VANILLA))
IsStandard = (TARGET.JobUniverse == $(STANDARD))
IsVM = (TARGET.JobUniverse == $(VM))
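## For example, these universe macros could be used in any of the
## policy expressions below; a hedged illustration (not part of the
## default policy) that would make this machine accept only vanilla
## universe jobs:
#START = $(IsVanilla)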

NonCondorLoadAvg = (LoadAvg - CondorLoadAvg)
BackgroundLoad = 0.3
HighLoad = 0.5
StartIdleTime = 15 * $(MINUTE)
ContinueIdleTime = 5 * $(MINUTE)
MaxSuspendTime = 10 * $(MINUTE)
MaxVacateTime = 10 * $(MINUTE)

KeyboardBusy = (KeyboardIdle < $(MINUTE))
ConsoleBusy = (ConsoleIdle < $(MINUTE))
CPUIdle = ($(NonCondorLoadAvg) <= $(BackgroundLoad))
CPUBusy = ($(NonCondorLoadAvg) >= $(HighLoad))
KeyboardNotBusy = ($(KeyboardBusy) == False)

BigJob = (TARGET.ImageSize >= (50 * 1024))
MediumJob = (TARGET.ImageSize >= (15 * 1024) && TARGET.ImageSize < (50 * 1024))
SmallJob = (TARGET.ImageSize < (15 * 1024))

JustCPU = ($(CPUBusy) && ($(KeyboardBusy) == False))
MachineBusy = ($(CPUBusy) || $(KeyboardBusy))

## The RANK expression controls which jobs this machine prefers to
## run over others. Some examples from the manual include:
##   RANK = TARGET.ImageSize
##   RANK = (Owner == "coltrane") + (Owner == "tyner") \
##          + ((Owner == "garrison") * 10) + (Owner == "jones")
## By default, RANK is always 0, meaning that all jobs have an equal
## ranking.
#RANK = 0


#####################################################################
## This is where you choose the configuration that you would like to
## use. It has no defaults so it must be defined. We start this
## file off with the UWCS_* policy.
######################################################################

## Also here is what is referred to as the TESTINGMODE_*, which is
## a quick hardwired way to test Condor with a simple no-preemption policy.
## Replace UWCS_* with TESTINGMODE_* if you wish to do testing mode.
## For example:
##   WANT_SUSPEND = $(UWCS_WANT_SUSPEND)
## becomes
##   WANT_SUSPEND = $(TESTINGMODE_WANT_SUSPEND)

# When should we only consider SUSPEND instead of PREEMPT?
WANT_SUSPEND = $(UWCS_WANT_SUSPEND)

# When should we preempt gracefully instead of hard-killing?
WANT_VACATE = $(UWCS_WANT_VACATE)

## When is this machine willing to start a job?
START = $(UWCS_START)

## When should a local universe job be allowed to start?
#START_LOCAL_UNIVERSE = TotalLocalJobsRunning < 200

## When should a scheduler universe job be allowed to start?
#START_SCHEDULER_UNIVERSE = TotalSchedulerJobsRunning < 200

## When to suspend a job?
SUSPEND = $(UWCS_SUSPEND)

## When to resume a suspended job?
CONTINUE = $(UWCS_CONTINUE)

## When to nicely stop a job?
## (as opposed to killing it instantaneously)
PREEMPT = $(UWCS_PREEMPT)

## When to instantaneously kill a preempting job
## (e.g. if a job is in the preempting stage for too long)
KILL = $(UWCS_KILL)

PERIODIC_CHECKPOINT = $(UWCS_PERIODIC_CHECKPOINT)
PREEMPTION_REQUIREMENTS = $(UWCS_PREEMPTION_REQUIREMENTS)
PREEMPTION_RANK = $(UWCS_PREEMPTION_RANK)
NEGOTIATOR_PRE_JOB_RANK = $(UWCS_NEGOTIATOR_PRE_JOB_RANK)
NEGOTIATOR_POST_JOB_RANK = $(UWCS_NEGOTIATOR_POST_JOB_RANK)
MaxJobRetirementTime = $(UWCS_MaxJobRetirementTime)
CLAIM_WORKLIFE = $(UWCS_CLAIM_WORKLIFE)

#####################################################################
## This is the UWisc - CS Department Configuration.
#####################################################################

# When should we only consider SUSPEND instead of PREEMPT?
# Only when SUSPEND is True and one of the following is also true:
#   - the job is small
#   - the keyboard is idle
#   - it is a vanilla universe job
UWCS_WANT_SUSPEND = ( $(SmallJob) || $(KeyboardNotBusy) || $(IsVanilla) ) && \
                    ( $(SUSPEND) )

# When should we preempt gracefully instead of hard-killing?
UWCS_WANT_VACATE = ( $(ActivationTimer) > 10 * $(MINUTE) || $(IsVanilla) )

# Only start jobs if:
# 1) the keyboard has been idle long enough, AND
# 2) the load average is low enough OR the machine is currently
#    running a Condor job
# (NOTE: Condor will only run 1 job at a time on a given resource.
# The reasons Condor might consider running a different job while
# already running one are machine Rank (defined above), and user
# priorities.)
UWCS_START = ( (KeyboardIdle > $(StartIdleTime)) \
               && ( $(CPUIdle) || \
                    (State != "Unclaimed" && State != "Owner")) )

# Suspend jobs if:
# 1) the keyboard has been touched, OR
# 2a) The cpu has been busy for more than 2 minutes, AND
# 2b) the job has been running for more than 90 seconds
UWCS_SUSPEND = ( $(KeyboardBusy) || \
                 ( (CpuBusyTime > 2 * $(MINUTE)) \
                   && $(ActivationTimer) > 90 ) )

# Continue jobs if:
# 1) the cpu is idle, AND
# 2) we've been suspended more than 10 seconds, AND
# 3) the keyboard hasn't been touched in a while
UWCS_CONTINUE = ( $(CPUIdle) && ($(ActivityTimer) > 10) \
                  && (KeyboardIdle > $(ContinueIdleTime)) )

# Preempt jobs if:
# 1) The job is suspended and has been suspended longer than we want
# 2) OR, we don't want to suspend this job, but the conditions to
#    suspend jobs have been met (someone is using the machine)
UWCS_PREEMPT = ( ((Activity == "Suspended") && \
                  ($(ActivityTimer) > $(MaxSuspendTime))) \
                 || (SUSPEND && (WANT_SUSPEND == False)) )

# Maximum time (in seconds) to wait for a job to finish before kicking
# it off (due to PREEMPT, a higher priority claim, or the startd
# gracefully shutting down). This is computed from the time the job
# was started, minus any suspension time. Once the retirement time runs
# out, the usual preemption process will take place. The job may
# self-limit the retirement time to _less_ than what is given here.
# By default, nice user jobs and standard universe jobs set their
# MaxJobRetirementTime to 0, so they will not wait in retirement.

UWCS_MaxJobRetirementTime = 0
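## For example, to let running jobs finish up to two hours of work
## before the preemption process begins, you could instead set
## (an illustration; pick a value appropriate for your site):
#UWCS_MaxJobRetirementTime = 2 * $(HOUR)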

## If you completely disable preemption of claims to machines, you
## should consider limiting the timespan over which new jobs will be
## accepted on the same claim. See the manual section on disabling
## preemption for a comprehensive discussion. Since this example
## configuration does not disable preemption of claims, we leave
## CLAIM_WORKLIFE undefined (infinite).
#UWCS_CLAIM_WORKLIFE = 1200

# Kill jobs if they have taken too long to vacate gracefully
UWCS_KILL = $(ActivityTimer) > $(MaxVacateTime)

## Only define vanilla versions of these if you want to make them
## different from the above settings.
#SUSPEND_VANILLA = ( $(KeyboardBusy) || \
#       ((CpuBusyTime > 2 * $(MINUTE)) && $(ActivationTimer) > 90) )
#CONTINUE_VANILLA = ( $(CPUIdle) && ($(ActivityTimer) > 10) \
#       && (KeyboardIdle > $(ContinueIdleTime)) )
#PREEMPT_VANILLA = ( ((Activity == "Suspended") && \
#       ($(ActivityTimer) > $(MaxSuspendTime))) \
#       || (SUSPEND_VANILLA && (WANT_SUSPEND == False)) )
#KILL_VANILLA = $(ActivityTimer) > $(MaxVacateTime)

## Checkpoint every 3 hours on average, with a +-30 minute random
## factor to avoid having many jobs hit the checkpoint server at
## the same time.
UWCS_PERIODIC_CHECKPOINT = $(LastCkpt) > (3 * $(HOUR) + \
                           $RANDOM_INTEGER(-30,30,1) * $(MINUTE) )

## You might want to checkpoint a little less often. A good
## example of this is below. For jobs smaller than 60 megabytes, we
## periodically checkpoint every 6 hours. For larger jobs, we only
## checkpoint every 12 hours.
#UWCS_PERIODIC_CHECKPOINT = \
#       ( (TARGET.ImageSize < 60000) && \
#         ($(LastCkpt) > (6 * $(HOUR) + $RANDOM_INTEGER(-30,30,1))) ) || \
#       ( $(LastCkpt) > (12 * $(HOUR) + $RANDOM_INTEGER(-30,30,1)) )

## The rank expressions used by the negotiator are configured below.
## This is the order in which ranks are applied by the negotiator:
##   1. NEGOTIATOR_PRE_JOB_RANK
##   2. rank in job ClassAd
##   3. NEGOTIATOR_POST_JOB_RANK
##   4. cause of preemption (0=user priority,1=startd rank,2=no preemption)
##   5. PREEMPTION_RANK

## The NEGOTIATOR_PRE_JOB_RANK expression overrides all other ranks
## that are used to pick a match from the set of possibilities.
## The following expression matches jobs to unclaimed resources
## whenever possible, regardless of the job-supplied rank.
UWCS_NEGOTIATOR_PRE_JOB_RANK = RemoteOwner =?= UNDEFINED

## The NEGOTIATOR_POST_JOB_RANK expression chooses between
## resources that are equally preferred by the job.
## The following example expression steers jobs toward
## faster machines and tends to fill a cluster of multi-processors
## breadth-first instead of depth-first. It also prefers online
## machines over offline (hibernating) ones. In this example,
## the expression is chosen to have no effect when preemption
## would take place, allowing control to pass on to
## PREEMPTION_RANK.
UWCS_NEGOTIATOR_POST_JOB_RANK = \
 (RemoteOwner =?= UNDEFINED) * (KFlops - SlotID - 1.0e10*(Offline=?=True))

## The negotiator will not preempt a job running on a given machine
## unless the PREEMPTION_REQUIREMENTS expression evaluates to true
## and the owner of the idle job has a better priority than the owner
## of the running job. This expression defaults to true.
UWCS_PREEMPTION_REQUIREMENTS = ( $(StateTimer) > (1 * $(HOUR)) && \
 RemoteUserPrio > TARGET.SubmitterUserPrio * 1.2 ) || (MY.NiceUser == True)
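## For example, to disable preemption based on user priority
## entirely, you could instead set the expression to False (an
## illustration; read the manual section on disabling preemption,
## and see the CLAIM_WORKLIFE discussion above, before doing this):
#UWCS_PREEMPTION_REQUIREMENTS = False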

## The PREEMPTION_RANK expression is used in a case where preemption
## is the only option and all other negotiation ranks are equal. For
## example, if the job has no preference, it is usually preferable to
## preempt a job with a small ImageSize instead of a job with a large
## ImageSize. The default is to rank all preemptable matches the
## same. However, the negotiator will always prefer to match the job
## with an idle machine over a preemptable machine, if all other
## negotiation ranks are equal.
UWCS_PREEMPTION_RANK = (RemoteUserPrio * 1000000) - TARGET.ImageSize


#####################################################################
## This is a Configuration that will cause your Condor jobs to
## always run. This is intended for testing only.
######################################################################

## This mode will cause your jobs to start on a machine and will let
## them run to completion. Condor will ignore everything happening
## on the machine (load average, keyboard activity, etc.)

TESTINGMODE_WANT_SUSPEND = False
TESTINGMODE_WANT_VACATE = False
TESTINGMODE_START = True
TESTINGMODE_SUSPEND = False
TESTINGMODE_CONTINUE = True
TESTINGMODE_PREEMPT = False
TESTINGMODE_KILL = False
TESTINGMODE_PERIODIC_CHECKPOINT = False
TESTINGMODE_PREEMPTION_REQUIREMENTS = False
TESTINGMODE_PREEMPTION_RANK = 0

# Prevent machine claims from being reused indefinitely, since
# preemption of claims is disabled in the TESTINGMODE configuration.
TESTINGMODE_CLAIM_WORKLIFE = 1200


######################################################################
######################################################################
##
##  ######                                      #
##  #     #    ##    #####   #####      #    #
##  #     #   #  #   #    #    #        #    #
##  ######   #    #  #    #    #        #    #
##  #        ######  #####     #        #######
##  #        #    #  #   #     #             #
##  #        #    #  #    #    #             #
##
##  Part 4:  Settings you should probably leave alone:
##  (unless you know what you're doing)
######################################################################
######################################################################

######################################################################
## Daemon-wide settings:
######################################################################

## Pathnames
LOG = $(LOCAL_DIR)/log
SPOOL = $(LOCAL_DIR)/spool
EXECUTE = $(LOCAL_DIR)/execute
BIN = $(RELEASE_DIR)/bin
LIB = $(RELEASE_DIR)/lib
INCLUDE = $(RELEASE_DIR)/include
SBIN = $(RELEASE_DIR)/sbin
LIBEXEC = $(RELEASE_DIR)/libexec

## If you leave HISTORY undefined (comment it out), no history file
## will be created.
HISTORY = $(SPOOL)/history

## Log files
COLLECTOR_LOG = $(LOG)/CollectorLog
KBDD_LOG = $(LOG)/KbdLog
MASTER_LOG = $(LOG)/MasterLog
NEGOTIATOR_LOG = $(LOG)/NegotiatorLog
NEGOTIATOR_MATCH_LOG = $(LOG)/MatchLog
SCHEDD_LOG = $(LOG)/SchedLog
SHADOW_LOG = $(LOG)/ShadowLog
STARTD_LOG = $(LOG)/StartLog
STARTER_LOG = $(LOG)/StarterLog
JOB_ROUTER_LOG = $(LOG)/JobRouterLog
ROOSTER_LOG = $(LOG)/RoosterLog
SHARED_PORT_LOG = $(LOG)/SharedPortLog
# High Availability Logs
HAD_LOG = $(LOG)/HADLog
REPLICATION_LOG = $(LOG)/ReplicationLog
TRANSFERER_LOG = $(LOG)/TransfererLog
HDFS_LOG = $(LOG)/HDFSLog

## Lock files
SHADOW_LOCK = $(LOCK)/ShadowLock

## This setting controls how often any lock files currently in use have their
## timestamp updated. Updating the timestamp prevents administrative programs
## like 'tmpwatch' from deleting long lived lock files. The parameter is
## an integer in seconds with a minimum of 60 seconds. The default if not
## specified is 28800 seconds, or 8 hours.
## This attribute only takes effect on restart of the daemons or at the next
## update time.
# LOCK_FILE_UPDATE_INTERVAL = 28800

## This setting primarily allows you to change the port that the
## collector is listening on. By default, the collector uses port
## 9618, but you can set the port with a ":port", such as:
##   COLLECTOR_HOST = $(CONDOR_HOST):1234
COLLECTOR_HOST = $(CONDOR_HOST)

## The NEGOTIATOR_HOST parameter has been deprecated. The port where
## the negotiator is listening is now dynamically allocated and the IP
## and port are now obtained from the collector, just like all the
## other daemons. However, if your pool contains any machines that
## are running version 6.7.3 or earlier, you can uncomment this
## setting to go back to the old fixed-port (9614) for the negotiator.
#NEGOTIATOR_HOST = $(CONDOR_HOST)

## How long are you willing to let daemons try their graceful
## shutdown methods before they do a hard shutdown? (30 minutes)
#SHUTDOWN_GRACEFUL_TIMEOUT = 1800

## How much disk space would you like reserved from Condor? In
## places where Condor is computing the free disk space on various
## partitions, it reduces the amount it really finds by this
## many megabytes. (If undefined, defaults to 0).
RESERVED_DISK = 5

## If your machine is running AFS and the AFS cache lives on the same
## partition as the other Condor directories, and you want Condor to
## reserve the space that your AFS cache is configured to use, set
## this to true.
#RESERVE_AFS_CACHE = False

## By default, if a user does not specify "notify_user" in the submit
## description file, any email Condor sends about that job will go to
## "username@UID_DOMAIN". If your machines all share a common UID
## domain (so that you would set UID_DOMAIN to be the same across all
## machines in your pool), *BUT* email to user@UID_DOMAIN is *NOT*
## the right place for Condor to send email for your site, you can
## define the default domain to use for email. A common example
## would be to set EMAIL_DOMAIN to the fully qualified hostname of
## each machine in your pool, so users submitting jobs from a
## specific machine would get email sent to user@machine.your.domain,
## instead of user@your.domain. In general, you should leave this
## setting commented out unless two things are true: 1) UID_DOMAIN is
## set to your domain, not $(FULL_HOSTNAME), and 2) email to
## user@UID_DOMAIN won't work.
#EMAIL_DOMAIN = $(FULL_HOSTNAME)

## Should Condor daemons create a UDP command socket (for incoming
## UDP-based commands) in addition to the TCP command socket? By
## default, classified ad updates sent to the collector use UDP, in
## addition to some keep alive messages and other non-essential
## communication. However, in certain situations, it might be
## desirable to disable the UDP command port (for example, to reduce
## the number of ports represented by a GCB broker, etc). If not
## defined, the UDP command socket is enabled by default, and to
## modify this, you must restart your Condor daemons. Also, this
## setting must be defined machine-wide. For example, setting
## "STARTD.WANT_UDP_COMMAND_SOCKET = False" while the global setting
## is "True" will still result in the startd creating a UDP socket.
#WANT_UDP_COMMAND_SOCKET = True

## If your site needs to use TCP updates to the collector, instead of
## UDP, you can enable this feature. HOWEVER, WE DO NOT RECOMMEND
## THIS FOR MOST SITES! In general, the only sites that might want
## this feature are pools made up of machines connected via a
## wide-area network where UDP packets are frequently or always
## dropped. If you enable this feature, you *MUST* turn on the
## COLLECTOR_SOCKET_CACHE_SIZE setting at your collector, and each
## entry in the socket cache uses another file descriptor. If not
## defined, this feature is disabled by default.
#UPDATE_COLLECTOR_WITH_TCP = True

## HIGHPORT and LOWPORT let you set the range of ports that Condor
## will use. This may be useful if you are behind a firewall. By
## default, Condor uses port 9618 for the collector, 9614 for the
## negotiator, and system-assigned (apparently random) ports for
## everything else. HIGHPORT and LOWPORT only affect these
## system-assigned ports, but will restrict them to the range you
## specify here. If you want to change the well-known ports for the
## collector or negotiator, see COLLECTOR_HOST or NEGOTIATOR_HOST.
## Note that both LOWPORT and HIGHPORT must be at least 1024 if you
## are not starting your daemons as root. You may also specify
## different port ranges for incoming and outgoing connections by
## using IN_HIGHPORT/IN_LOWPORT and OUT_HIGHPORT/OUT_LOWPORT.
#HIGHPORT = 9700
#LOWPORT = 9600

## If a daemon doesn't respond for too long, do you want to generate
## a core file? This basically controls the type of the signal
## sent to the child process, and mostly affects the Condor Master.
#NOT_RESPONDING_WANT_CORE = False


######################################################################
## Daemon-specific settings:
######################################################################

##--------------------------------------------------------------------
## condor_master
##--------------------------------------------------------------------
## Daemons you want the master to keep running for you:
DAEMON_LIST = MASTER, STARTD, SCHEDD
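## For example, on the central manager of a pool you would typically
## also have the master run the collector and negotiator (an
## illustration; adjust to the role of this machine):
#DAEMON_LIST = MASTER, COLLECTOR, NEGOTIATOR, STARTD, SCHEDD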

## Which daemons use the Condor DaemonCore library (i.e., not the
## checkpoint server or custom user daemons)?
#DC_DAEMON_LIST = \
#MASTER, STARTD, SCHEDD, KBDD, COLLECTOR, NEGOTIATOR, EVENTD, \
#VIEW_SERVER, CONDOR_VIEW, VIEW_COLLECTOR, HAWKEYE, CREDD, HAD, \
#DBMSD, QUILL, JOB_ROUTER, ROOSTER, LEASEMANAGER, HDFS, SHARED_PORT


## Where are the binaries for these daemons?
MASTER = $(SBIN)/condor_master
STARTD = $(SBIN)/condor_startd
SCHEDD = $(SBIN)/condor_schedd
KBDD = $(SBIN)/condor_kbdd
NEGOTIATOR = $(SBIN)/condor_negotiator
COLLECTOR = $(SBIN)/condor_collector
STARTER_LOCAL = $(SBIN)/condor_starter
JOB_ROUTER = $(LIBEXEC)/condor_job_router
ROOSTER = $(LIBEXEC)/condor_rooster
HDFS = $(SBIN)/condor_hdfs
SHARED_PORT = $(LIBEXEC)/condor_shared_port
TRANSFERER = $(LIBEXEC)/condor_transferer

## When the master starts up, it can place its address (IP and port)
## into a file. This way, tools running on the local machine don't
## need to query the central manager to find the master. This
## feature can be turned off by commenting out this setting.
MASTER_ADDRESS_FILE = $(LOG)/.master_address

## Where should the master find the condor_preen binary? If you don't
## want preen to run at all, set it to nothing.
PREEN = $(SBIN)/condor_preen

## How do you want preen to behave? The "-m" means you want email
## about files preen finds that it thinks it should remove. The "-r"
## means you want preen to actually remove these files. If you don't
## want either of those things to happen, just remove the appropriate
## one from this setting.
PREEN_ARGS = -m -r

## How often should the master start up condor_preen? (once a day)
#PREEN_INTERVAL = 86400

## If a daemon dies an unnatural death, do you want email about it?
#PUBLISH_OBITUARIES = True

## If you're getting obituaries, how many lines of the end of that
## daemon's log file do you want included in the obituary?
#OBITUARY_LOG_LENGTH = 20

## Should the master run?
#START_MASTER = True

## Should the master start up the daemons you want it to?
#START_DAEMONS = True

## How often do you want the master to send an update to the central
## manager?
#MASTER_UPDATE_INTERVAL = 300

## How often do you want the master to check the timestamps of the
## daemons it's running? If any daemons have been modified, the
## master restarts them.
#MASTER_CHECK_NEW_EXEC_INTERVAL = 300

## Once you notice new binaries, how long should you wait before you
## try to execute them?
#MASTER_NEW_BINARY_DELAY = 120

## What's the maximum amount of time you're willing to give the
## daemons to quickly shutdown before you just kill them outright?
#SHUTDOWN_FAST_TIMEOUT = 120

######
## Exponential backoff settings:
######
## When a daemon keeps crashing, we use "exponential backoff" so we
## wait longer and longer before restarting it. This is the base of
## the exponent used to determine how long to wait before starting
## the daemon again:
#MASTER_BACKOFF_FACTOR = 2.0

## What's the maximum amount of time you want the master to wait
## between attempts to start a given daemon? (With 2.0 as the
## MASTER_BACKOFF_FACTOR, you'd hit 1 hour in 12 restarts...)
#MASTER_BACKOFF_CEILING = 3600

## How long should a daemon run without crashing before we consider
## it "recovered"? Once a daemon has recovered, we reset the number
## of restarts so the exponential backoff stuff goes back to normal.
#MASTER_RECOVER_FACTOR = 300


##--------------------------------------------------------------------
## condor_collector
##--------------------------------------------------------------------
## Address to which Condor will send a weekly e-mail with output of
## condor_status.
#CONDOR_DEVELOPERS = condor-admin@cs.wisc.edu

## Global Collector to periodically advertise basic information about
## your pool.
#CONDOR_DEVELOPERS_COLLECTOR = condor.cs.wisc.edu


##--------------------------------------------------------------------
## condor_negotiator
##--------------------------------------------------------------------
## Determine if the Negotiator will honor SlotWeight attributes, which
## may be used to give a slot greater weight when calculating usage.
#NEGOTIATOR_USE_SLOT_WEIGHTS = True


## How often the Negotiator starts a negotiation cycle, defined in
## seconds.
#NEGOTIATOR_INTERVAL = 60

## Should the Negotiator publish an update to the Collector after
## every negotiation cycle? It is useful to have this set to True
## to get immediate updates on LastNegotiationCycle statistics.
#NEGOTIATOR_UPDATE_AFTER_CYCLE = False


##--------------------------------------------------------------------
## condor_startd
##--------------------------------------------------------------------
## Where are the various condor_starter binaries installed?
STARTER_LIST = STARTER, STARTER_STANDARD
STARTER = $(SBIN)/condor_starter
STARTER_STANDARD = $(SBIN)/condor_starter.std
STARTER_LOCAL = $(SBIN)/condor_starter

## When the startd starts up, it can place its address (IP and port)
## into a file. This way, tools running on the local machine don't
## need to query the central manager to find the startd. This
## feature can be turned off by commenting out this setting.
STARTD_ADDRESS_FILE = $(LOG)/.startd_address

## When a machine is claimed, how often should we poll the state of
## the machine to see if we need to evict/suspend the job, etc?
#POLLING_INTERVAL = 5

## How often should the startd send updates to the central manager?
#UPDATE_INTERVAL = 300

## How long is the startd willing to stay in the "matched" state?
#MATCH_TIMEOUT = 300

## How long is the startd willing to stay in the preempting/killing
## state before it just kills the starter directly?
#KILLING_TIMEOUT = 30

## When a machine is unclaimed, when should it run benchmarks?
## LastBenchmark is initialized to 0, so this expression says as soon
## as we're unclaimed, run the benchmarks. Thereafter, if we're
## unclaimed and it's been at least 4 hours since we ran the last
## benchmarks, run them again. The startd keeps a weighted average
## of the benchmark results to provide more accurate values.
## Note, if you don't want any benchmarks run at all, either comment
## RunBenchmarks out, or set it to "False".
BenchmarkTimer = (time() - LastBenchmark)
RunBenchmarks : (LastBenchmark == 0 ) || ($(BenchmarkTimer) >= (4 * $(HOUR)))
#RunBenchmarks : False

## When the startd does benchmarks, which set of benchmarks should we
## run? The default is the same as pre-7.5.6: MIPS and KFLOPS.
benchmarks_joblist = mips kflops

## What's the max "load" of all running benchmarks? With the default
## (1.01), the startd will run the benchmarks serially.
benchmarks_max_job_load = 1.0

# MIPS (Dhrystone 2.1) benchmark: load 1.0
benchmarks_mips_executable = $(LIBEXEC)/condor_mips
benchmarks_mips_job_load = 1.0

# KFLOPS (clinpack) benchmark: load 1.0
benchmarks_kflops_executable = $(LIBEXEC)/condor_kflops
benchmarks_kflops_job_load = 1.0

1255 |
|
1256 |
## Normally, when the startd is computing the idle time of all the |
1257 |
## users of the machine (both local and remote), it checks the utmp |
1258 |
## file to find all the currently active ttys, and only checks access |
1259 |
## time of the devices associated with active logins. Unfortunately, |
1260 |
## on some systems, utmp is unreliable, and the startd might miss |
1261 |
## keyboard activity by doing this. So, if your utmp is unreliable, |
1262 |
## set this setting to True and the startd will check the access time |
1263 |
## on all tty and pty devices. |
1264 |
#STARTD_HAS_BAD_UTMP = False |
1265 |
|
1266 |
## This entry allows the startd to monitor console (keyboard and |
1267 |
## mouse) activity by checking the access times on special files in |
1268 |
## /dev. Activity on these files shows up as "ConsoleIdle" time in |
1269 |
## the startd's ClassAd. Just give a comma-separated list of the |
1270 |
## names of devices you want considered the console, without the |
1271 |
## "/dev/" portion of the pathname. |
1272 |
#CONSOLE_DEVICES = mouse, console |


## The STARTD_ATTRS (and legacy STARTD_EXPRS) entry allows you to
## have the startd advertise arbitrary attributes from the config
## file in its ClassAd.  Give the comma-separated list of entries
## from the config file you want in the startd ClassAd.
## NOTE: because of the different syntax of the config file and
## ClassAds, you might have to do a little extra work to get a given
## entry into the ClassAd.  In particular, ClassAds require double
## quotes (") around your strings.  Numeric values can go in
## directly, as can boolean expressions.  For example, if you wanted
## the startd to advertise its list of console devices, when it's
## configured to run benchmarks, and how often it sends updates to
## the central manager, you'd have to define the following helper
## macro:
#MY_CONSOLE_DEVICES = "$(CONSOLE_DEVICES)"
## Note: this must come before you define STARTD_ATTRS because macros
## must be defined before you use them in other macros or
## expressions.
## Then, you'd set the STARTD_ATTRS setting to this:
#STARTD_ATTRS = MY_CONSOLE_DEVICES, RunBenchmarks, UPDATE_INTERVAL
##
## STARTD_ATTRS can also be defined on a per-slot basis.  The startd
## builds the list of attributes to advertise by combining the lists
## in this order: STARTD_ATTRS, SLOTx_STARTD_ATTRS.  In the example
## below, the startd ad for slot1 will have the values for
## favorite_color, favorite_season, and favorite_movie, and slot2
## will have favorite_color, favorite_season, and favorite_song.
##
#STARTD_ATTRS = favorite_color, favorite_season
#SLOT1_STARTD_ATTRS = favorite_movie
#SLOT2_STARTD_ATTRS = favorite_song
##
## Attributes in the STARTD_ATTRS list can also be defined on a
## per-slot basis.  For example, the following configuration:
##
#favorite_color = "blue"
#favorite_season = "spring"
#SLOT2_favorite_color = "green"
#SLOT3_favorite_season = "summer"
#STARTD_ATTRS = favorite_color, favorite_season
##
## will result in the following attributes in the slot classified
## ads:
##
## slot1 - favorite_color = "blue"; favorite_season = "spring"
## slot2 - favorite_color = "green"; favorite_season = "spring"
## slot3 - favorite_color = "blue"; favorite_season = "summer"
##
## Finally, the recommended default value for this setting is to
## publish the COLLECTOR_HOST setting as a string.  This can be
## useful with the "$$(COLLECTOR_HOST)" syntax in the submit file,
## so jobs can know (for example, via their environment) what pool
## they're running in.
COLLECTOR_HOST_STRING = "$(COLLECTOR_HOST)"
STARTD_ATTRS = COLLECTOR_HOST_STRING
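
## For example, a submit description file could use this advertised
## attribute to pass the pool name into a job's environment (a sketch;
## the CONDOR_POOL variable name here is purely illustrative):
##
##   environment = CONDOR_POOL=$$(COLLECTOR_HOST_STRING)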

## When the startd is claimed by a remote user, it can also advertise
## arbitrary attributes from the ClassAd of the job it's working on.
## Just list the attribute names you want advertised.
## Note: since this is already a ClassAd, you don't have to do
## anything funny with strings, etc.  This feature can be turned off
## by commenting out this setting (there is no default).
STARTD_JOB_EXPRS = ImageSize, ExecutableSize, JobUniverse, NiceUser
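
## For example, to also advertise the owner of the claimed job, you
## could append the standard Owner job attribute to the list (shown
## commented out as a sketch, not a recommendation):
#STARTD_JOB_EXPRS = ImageSize, ExecutableSize, JobUniverse, NiceUser, Owner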

## If you want to "lie" to Condor about how many CPUs your machine
## has, you can use this setting to override Condor's automatic
## computation.  If you modify this, you must restart the startd for
## the change to take effect (a simple condor_reconfig will not do).
## Please read the section on "condor_startd Configuration File
## Macros" in the Condor Administrator's Manual for a further
## discussion of this setting.  Its use is not recommended.  This
## must be an integer ("N" isn't a valid setting, that's just used to
## represent the default).
#NUM_CPUS = N

## If you never want Condor to detect more than "N" CPUs, uncomment
## this line.  You must restart the startd for this setting to take
## effect.  If set to 0 or a negative number, it is ignored.
## By default, it is ignored.  Otherwise, it must be a positive
## integer ("N" isn't a valid setting, that's just used to
## represent the default).
#MAX_NUM_CPUS = N

## Normally, Condor will automatically detect the amount of physical
## memory available on your machine.  Define MEMORY to tell Condor
## how much physical memory (in MB) your machine has, overriding the
## value Condor computes automatically.  For example:
#MEMORY = 128

## How much memory would you like reserved from Condor?  By default,
## Condor considers all the physical memory of your machine as
## available to be used by Condor jobs.  If RESERVED_MEMORY is
## defined, Condor subtracts it from the amount of memory it
## advertises as available.
#RESERVED_MEMORY = 0
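
## For example, to keep 512 MB of RAM out of Condor's hands for the
## operating system and local services (the value is only an
## illustration; size it for your own machines):
#RESERVED_MEMORY = 512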

######
## SMP startd settings
##
## By default, Condor will evenly divide the resources in an SMP
## machine (such as RAM, swap space and disk space) among all the
## CPUs, and advertise each CPU as its own slot with an even share of
## the system resources.  If you want something other than this,
## there are a few options available to you.  Please read the section
## on "Configuring The Startd for SMP Machines" in the Condor
## Administrator's Manual for full details.  The various settings are
## only briefly listed and described here.
######

## The maximum number of different slot types.
#MAX_SLOT_TYPES = 10

## Use this setting to define your own slot types.  This
## allows you to divide system resources unevenly among your CPUs.
## You must use a different setting for each different type you
## define.  The "<N>" in the name of the macro listed below must be
## an integer from 1 to MAX_SLOT_TYPES (defined above),
## and you use this number to refer to your type.  There are many
## different formats these settings can take, so be sure to refer to
## the section on "Configuring The Startd for SMP Machines" in the
## Condor Administrator's Manual for full details.  In particular,
## read the section titled "Defining Slot Types" to help
## understand this setting.  If you modify any of these settings, you
## must restart the condor_startd for the change to take effect.
#SLOT_TYPE_<N> = 1/4
#SLOT_TYPE_<N> = cpus=1, ram=25%, swap=1/4, disk=1/4
# For example:
#SLOT_TYPE_1 = 1/8
#SLOT_TYPE_2 = 1/4

## If you define your own slot types, you must specify how
## many slots of each type you wish to advertise.  You do
## this with the setting below, replacing the "<N>" with the
## corresponding integer you used to define the type above.  You can
## change the number of a given type being advertised at run-time,
## with a simple condor_reconfig.
#NUM_SLOTS_TYPE_<N> = M
# For example:
#NUM_SLOTS_TYPE_1 = 6
#NUM_SLOTS_TYPE_2 = 1
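
## As another sketch, a machine could carve out one large-memory slot
## and split the remaining RAM between two smaller slots (the
## percentages and counts here are only illustrative; adjust them to
## your hardware):
#SLOT_TYPE_1 = cpus=1, ram=50%, swap=1/4, disk=1/4
#NUM_SLOTS_TYPE_1 = 1
#SLOT_TYPE_2 = cpus=1, ram=25%, swap=1/4, disk=1/4
#NUM_SLOTS_TYPE_2 = 2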

## The number of evenly-divided slots you want Condor to
## report to your pool (if less than the total number of CPUs).  This
## setting is only considered if the "type" settings described above
## are not in use.  By default, all CPUs are reported.  This setting
## must be an integer ("N" isn't a valid setting, that's just used to
## represent the default).
#NUM_SLOTS = N

## How many of the slots the startd is representing should
## be "connected" to the console (in other words, notice when there's
## console activity)?  This defaults to all slots (N in a
## machine with N CPUs).  This must be an integer ("N" isn't a valid
## setting, that's just used to represent the default).
#SLOTS_CONNECTED_TO_CONSOLE = N

## How many of the slots the startd is representing should
## be "connected" to the keyboard (for remote tty activity, as well
## as console activity).  Defaults to 1.
#SLOTS_CONNECTED_TO_KEYBOARD = 1

## If there are slots that aren't connected to the
## keyboard or the console (see the above two settings), the
## corresponding idle time reported will be the time since the startd
## was spawned, plus the value of this parameter.  It defaults to 20
## minutes.  We do this because, if the slot is configured
## not to care about keyboard activity, we want it to be available to
## Condor jobs as soon as the startd starts up, instead of having to
## wait for 15 minutes or more (which is the default time a machine
## must be idle before Condor will start a job).  If you don't want
## this boost, just set the value to 0.  If you change your START
## expression to require more than 15 minutes before a job starts,
## but you still want jobs to start right away on some of your SMP
## nodes, just increase this parameter.
#DISCONNECTED_KEYBOARD_IDLE_BOOST = 1200

######
## Settings for computing optional resource availability statistics:
######
## If STARTD_COMPUTE_AVAIL_STATS = True, the startd will compute
## statistics about resource availability to be included in the
## classad(s) sent to the collector describing the resource(s) the
## startd manages.  The following attributes will always be included
## in the resource classad(s) if STARTD_COMPUTE_AVAIL_STATS = True:
##   AvailTime = What proportion of the time (between 0.0 and 1.0)
##     has this resource been in a state other than "Owner"?
##   LastAvailInterval = What was the duration (in seconds) of the
##     last period between "Owner" states?
## The following attributes will also be included if the resource is
## not in the "Owner" state:
##   AvailSince = At what time did the resource last leave the
##     "Owner" state?  Measured in the number of seconds since the
##     epoch (00:00:00 UTC, Jan 1, 1970).
##   AvailTimeEstimate = Based on past history, this is an estimate
##     of how long the current period between "Owner" states will
##     last.
#STARTD_COMPUTE_AVAIL_STATS = False

## If STARTD_COMPUTE_AVAIL_STATS = True, STARTD_AVAIL_CONFIDENCE sets
## the confidence level of the AvailTimeEstimate.  By default, the
## estimate is based on the 80th percentile of past values.
#STARTD_AVAIL_CONFIDENCE = 0.8

## STARTD_MAX_AVAIL_PERIOD_SAMPLES limits the number of samples of
## past available intervals stored by the startd to limit memory and
## disk consumption.  Each sample requires 4 bytes of memory and
## approximately 10 bytes of disk space.
#STARTD_MAX_AVAIL_PERIOD_SAMPLES = 100

## CKPT_PROBE is the location of a program which computes aspects of the
## CheckpointPlatform classad attribute.  By default the location of this
## executable will be here: $(LIBEXEC)/condor_ckpt_probe
CKPT_PROBE = $(LIBEXEC)/condor_ckpt_probe

##--------------------------------------------------------------------
##  condor_schedd
##--------------------------------------------------------------------
## Where are the various shadow binaries installed?
SHADOW_LIST = SHADOW, SHADOW_STANDARD
SHADOW = $(SBIN)/condor_shadow
SHADOW_STANDARD = $(SBIN)/condor_shadow.std

## When the schedd starts up, it can place its address (IP and port)
## into a file.  This way, tools running on the local machine don't
## need to query the central manager to find the schedd.  This
## feature can be turned off by commenting out this setting.
SCHEDD_ADDRESS_FILE = $(SPOOL)/.schedd_address

## Additionally, a daemon may store its ClassAd on the local filesystem
## as well as sending it to the collector.  This way, tools that need
## information about a daemon do not have to contact the central manager
## to get information about a daemon on the same machine.
## This feature is necessary for Quill to work.
SCHEDD_DAEMON_AD_FILE = $(SPOOL)/.schedd_classad

## How often should the schedd send an update to the central manager?
#SCHEDD_INTERVAL = 300

## How long should the schedd wait between spawning each shadow?
#JOB_START_DELAY = 2

## How many concurrent sub-processes should the schedd spawn to handle
## queries?  (Unix only)
#SCHEDD_QUERY_WORKERS = 3

## How often should the schedd send a keep alive message to any
## startds it has claimed?  (5 minutes)
#ALIVE_INTERVAL = 300

## This setting controls the maximum number of times that a
## condor_shadow process can have a fatal error (exception) before
## the condor_schedd will simply relinquish the match associated with
## the dying shadow.
#MAX_SHADOW_EXCEPTIONS = 5

## Estimated virtual memory size of each condor_shadow process.
## Specified in kilobytes.
# SHADOW_SIZE_ESTIMATE = 800

## The condor_schedd can renice the condor_shadow processes on your
## submit machines.  How "nice" do you want the shadows? (1-19).
## The higher the number, the lower priority the shadows have.
# SHADOW_RENICE_INCREMENT = 0

## The condor_schedd can renice scheduler universe processes
## (e.g. DAGMan) on your submit machines.  How "nice" do you want the
## scheduler universe processes? (1-19).  The higher the number, the
## lower priority the processes have.
# SCHED_UNIV_RENICE_INCREMENT = 0

## By default, when the schedd fails to start an idle job, it will
## not try to start any other idle jobs in the same cluster during
## that negotiation cycle.  This makes negotiation much more
## efficient for large job clusters.  However, in some cases other
## jobs in the cluster can be started even though an earlier job
## can't.  For example, the jobs' requirements may differ, because of
## different disk space, memory, or operating system requirements.
## Or, machines may be willing to run only some jobs in the cluster,
## because their requirements reference the jobs' virtual memory size
## or other attribute.  Setting NEGOTIATE_ALL_JOBS_IN_CLUSTER to True
## will force the schedd to try to start all idle jobs in each
## negotiation cycle.  This will make negotiation cycles last longer,
## but it will ensure that all jobs that can be started will be
## started.
#NEGOTIATE_ALL_JOBS_IN_CLUSTER = False

## This setting controls how often, in seconds, the schedd considers
## periodic job actions given by the user in the submit file.
## (Currently, these are periodic_hold, periodic_release, and periodic_remove.)
#PERIODIC_EXPR_INTERVAL = 60
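
## For example, a user's submit description file might remove jobs
## that have been held for more than an hour (a sketch of the
## periodic_remove syntax; JobStatus 5 means "held"):
##
##   periodic_remove = (JobStatus == 5) && (time() - EnteredCurrentStatus > 3600)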

######
## Queue management settings:
######
## How often should the schedd truncate its job queue transaction
## log?  (Specified in seconds, once a day is the default.)
#QUEUE_CLEAN_INTERVAL = 86400

## How often should the schedd commit "wall clock" run time for jobs
## to the queue, so run time statistics remain accurate when the
## schedd crashes?  (Specified in seconds, once per hour is the
## default.  Set to 0 to disable.)
#WALL_CLOCK_CKPT_INTERVAL = 3600

## What users do you want to grant super user access to this job
## queue?  (These users will be able to remove other users' jobs).
## By default, this only includes root.
QUEUE_SUPER_USERS = root, condor
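
## For example, to also let a site administrator account manage the
## queue (the "condoradmin" username is purely illustrative):
#QUEUE_SUPER_USERS = root, condor, condoradmin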


##--------------------------------------------------------------------
##  condor_shadow
##--------------------------------------------------------------------
## If the shadow is unable to read a checkpoint file from the
## checkpoint server, it keeps trying only if the job has accumulated
## more than MAX_DISCARDED_RUN_TIME seconds of CPU usage.  Otherwise,
## the job is started from scratch.  Defaults to 1 hour.  This
## setting is only used if USE_CKPT_SERVER (from above) is True.
#MAX_DISCARDED_RUN_TIME = 3600

## Should periodic checkpoints be compressed?
#COMPRESS_PERIODIC_CKPT = False

## Should vacate checkpoints be compressed?
#COMPRESS_VACATE_CKPT = False

## Should we commit the application's dirty memory pages to swap
## space during a periodic checkpoint?
#PERIODIC_MEMORY_SYNC = False

## Should we write vacate checkpoints slowly?  If nonzero, this
## parameter specifies the speed at which vacate checkpoints should
## be written, in kilobytes per second.
#SLOW_CKPT_SPEED = 0

## How often should the shadow update the job queue with job
## attributes that periodically change?  Specified in seconds.
#SHADOW_QUEUE_UPDATE_INTERVAL = 15 * 60

## Should the shadow wait to update certain job attributes for the
## next periodic update, or should it update these attributes
## immediately as they change?  Due to performance concerns of
## aggressive updates to a busy condor_schedd, the default is True.
#SHADOW_LAZY_QUEUE_UPDATE = TRUE


##--------------------------------------------------------------------
##  condor_starter
##--------------------------------------------------------------------
## The condor_starter can renice the processes of Condor
## jobs on your execute machines.  If you want this, uncomment the
## following entry and set it to how "nice" you want the user
## jobs. (1-19)  The larger the number, the lower priority the
## process gets on your machines.
## Note on Win32 platforms, this number needs to be greater than
## zero (i.e. the job must be reniced) or the mechanism that
## monitors CPU load on Win32 systems will give erratic results.
#JOB_RENICE_INCREMENT = 10

## Should the starter do local logging to its own log file, or send
## debug information back to the condor_shadow where it will end up
## in the ShadowLog?
#STARTER_LOCAL_LOGGING = TRUE

## If the UID_DOMAIN settings match on both the execute and submit
## machines, but the UID of the user who submitted the job isn't in
## the passwd file of the execute machine, the starter will normally
## exit with an error.  Do you want the starter to just start up the
## job with the specified UID, even if it's not in the passwd file?
#SOFT_UID_DOMAIN = FALSE

## Honor the run_as_owner option from the condor submit file.
##
#STARTER_ALLOW_RUNAS_OWNER = TRUE

## Tell the Starter/Startd what program to use to remove a directory.
## condor_rmdir.exe is a Windows-only command that does a better job
## than the built-in rmdir command when it is run with elevated
## privileges, such as when Condor is running as a service.
##   /s means delete subdirectories
##   /c means continue on error
WINDOWS_RMDIR = $(SBIN)\condor_rmdir.exe
#WINDOWS_RMDIR_OPTIONS = /s /c

##--------------------------------------------------------------------
##  condor_procd
##--------------------------------------------------------------------
##
# the path to the procd binary
#
PROCD = $(SBIN)/condor_procd

# the path to the procd "address"
#   - on UNIX this will be a named pipe; we'll put it in the
#     $(LOCK) directory by default (note that multiple named pipes
#     will be created in this directory for when the procd responds
#     to its clients)
#   - on Windows, this will be a named pipe as well (but named pipes on
#     Windows are not even close to the same thing as named pipes on
#     UNIX); the name will be something like:
#         \\.\pipe\condor_procd
#
PROCD_ADDRESS = $(LOCK)/procd_pipe

# The procd currently uses a very simplistic logging system.  Since this
# log will not be rotated like other Condor logs, it is only recommended
# to set PROCD_LOG when attempting to debug a problem.  In other Condor
# daemons, turning on D_PROCFAMILY will result in that daemon logging
# all of its interactions with the ProcD.
#
#PROCD_LOG = $(LOG)/ProcLog

# This is the maximum period that the procd will use for taking
# snapshots (the actual period may be lower if a condor daemon registers
# a family for which it wants more frequent snapshots)
#
PROCD_MAX_SNAPSHOT_INTERVAL = 60

# On Windows, we send a process a "soft kill" via a WM_CLOSE message.
# This binary is used by the ProcD (and other Condor daemons if PRIVSEP
# is not enabled) to help when sending soft kills.
WINDOWS_SOFTKILL = $(SBIN)/condor_softkill

##--------------------------------------------------------------------
##  condor_submit
##--------------------------------------------------------------------
## If you want condor_submit to automatically append an expression to
## the Requirements expression or Rank expression of jobs at your
## site, uncomment these entries.
#APPEND_REQUIREMENTS = (expression to append job requirements)
#APPEND_RANK = (expression to append job rank)
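
## For example, to require that jobs only match machines that have
## advertised a custom attribute (the HasScratchDisk name here is
## purely illustrative, not a standard machine attribute):
#APPEND_REQUIREMENTS = (TARGET.HasScratchDisk =?= True)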

## If you want expressions only appended for either standard or
## vanilla universe jobs, you can uncomment these entries.  If any of
## them are defined, they are used for the given universe, instead of
## the generic entries above.
#APPEND_REQ_VANILLA = (expression to append to vanilla job requirements)
#APPEND_REQ_STANDARD = (expression to append to standard job requirements)
#APPEND_RANK_STANDARD = (expression to append to standard job rank)
#APPEND_RANK_VANILLA = (expression to append to vanilla job rank)

## This can be used to define a default value for the rank expression
## if one is not specified in the submit file.
#DEFAULT_RANK = (default rank expression for all jobs)
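
## For example, to prefer faster machines by default (Mips is a
## machine attribute published when benchmarks are enabled; this line
## is only a sketch):
#DEFAULT_RANK = TARGET.Mips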

## If you want universe-specific defaults, you can use the following
## entries:
#DEFAULT_RANK_VANILLA = (default rank expression for vanilla jobs)
#DEFAULT_RANK_STANDARD = (default rank expression for standard jobs)

## If you want condor_submit to automatically append expressions to
## the job ClassAds it creates, you can uncomment and define the
## SUBMIT_EXPRS setting.  It works just like the STARTD_EXPRS
## described above with respect to ClassAd vs. config file syntax,
## strings, etc.  One common use would be to have the full hostname
## of the machine where a job was submitted placed in the job
## ClassAd.  You would do this by uncommenting the following lines:
#MACHINE = "$(FULL_HOSTNAME)"
#SUBMIT_EXPRS = MACHINE

## Condor keeps a buffer of recently-used data for each file an
## application opens.  This macro specifies the default maximum number
## of bytes to be buffered for each open file at the executing
## machine.
#DEFAULT_IO_BUFFER_SIZE = 524288

## Condor will attempt to consolidate small read and write operations
## into large blocks.  This macro specifies the default block size
## Condor will use.
#DEFAULT_IO_BUFFER_BLOCK_SIZE = 32768

##--------------------------------------------------------------------
##  condor_preen
##--------------------------------------------------------------------
## Who should condor_preen send email to?
#PREEN_ADMIN = $(CONDOR_ADMIN)

## What files should condor_preen leave in the spool directory?
VALID_SPOOL_FILES = job_queue.log, job_queue.log.tmp, history, \
                    Accountant.log, Accountantnew.log, \
                    local_univ_execute, .quillwritepassword, \
                    .pgpass, \
                    .schedd_address, .schedd_classad

## What files should condor_preen remove from the log directory?
INVALID_LOG_FILES = core

##--------------------------------------------------------------------
##  Java parameters:
##--------------------------------------------------------------------
## If you would like this machine to be able to run Java jobs,
## then set JAVA to the path of your JVM binary.  If you are not
## interested in Java, there is no harm in leaving this entry
## empty or incorrect.

JAVA = /usr/bin/java

## JAVA_CLASSPATH_DEFAULT gives the default set of paths in which
## Java classes are to be found.  Each path is separated by spaces.
## If your JVM needs to be informed of additional directories, add
## them here.  However, do not remove the existing entries, as Condor
## needs them.

JAVA_CLASSPATH_DEFAULT = $(LIB) $(LIB)/scimark2lib.jar .

## JAVA_CLASSPATH_ARGUMENT describes the command-line parameter
## used to introduce a new classpath:

JAVA_CLASSPATH_ARGUMENT = -classpath

## JAVA_CLASSPATH_SEPARATOR describes the character used to mark
## one path element from another:

JAVA_CLASSPATH_SEPARATOR = :

## JAVA_BENCHMARK_TIME describes the number of seconds for which
## to run Java benchmarks.  A longer time yields a more accurate
## benchmark, but consumes more otherwise useful CPU time.
## If this time is zero or undefined, no Java benchmarks will be run.

JAVA_BENCHMARK_TIME = 2

## If your JVM requires any special arguments not mentioned in
## the options above, then give them here.

JAVA_EXTRA_ARGUMENTS =

##
##--------------------------------------------------------------------
##  Condor-G settings
##--------------------------------------------------------------------
## Where is the GridManager binary installed?

GRIDMANAGER = $(SBIN)/condor_gridmanager
GT2_GAHP = $(SBIN)/gahp_server
GRID_MONITOR = $(SBIN)/grid_monitor.sh

##--------------------------------------------------------------------
##  Settings that control the daemon's debugging output:
##--------------------------------------------------------------------
##
## Note that the Gridmanager runs as the User, not a Condor daemon, so
## all users must have write permission to the directory that the
## Gridmanager will use for its logfile.  Our suggestion is to create a
## directory called GridLogs in $(LOG) with UNIX permissions 1777
## (just like /tmp ).
## Another option is to use /tmp as the location of the GridManager log.
##

MAX_GRIDMANAGER_LOG = 1000000
GRIDMANAGER_DEBUG =

GRIDMANAGER_LOG = $(LOG)/GridmanagerLog.$(USERNAME)
GRIDMANAGER_LOCK = $(LOCK)/GridmanagerLock.$(USERNAME)

##--------------------------------------------------------------------
##  Various other settings that Condor-G can use.
##--------------------------------------------------------------------

## For grid-type gt2 jobs (pre-WS GRAM), limit the number of jobmanager
## processes the gridmanager will let run on the headnode.  Letting too
## many jobmanagers run causes severe load on the headnode.
GRIDMANAGER_MAX_JOBMANAGERS_PER_RESOURCE = 10

## If we're talking to a Globus 2.0 resource, Condor-G will use the new
## version of the GRAM protocol.  The first option is how often to check the
## proxy on the submit side of things.  If the GridManager discovers a new
## proxy, it will restart itself and use the new proxy for all future
## jobs launched.  In seconds, and defaults to 10 minutes.
#GRIDMANAGER_CHECKPROXY_INTERVAL = 600

## The GridManager will shut things down 3 minutes before losing contact
## because of an expired proxy.
## In seconds, and defaults to 3 minutes.
#GRIDMANAGER_MINIMUM_PROXY_TIME = 180

## Condor requires that each submitted job be designated to run under a
## particular "universe".
##
## If no universe is specified in the submit file, Condor must pick one
## for the job to use.  By default, it chooses the "vanilla" universe.
## The default can be overridden in the config file with the DEFAULT_UNIVERSE
## setting, which is a string to insert into a job submit description if the
## job does not define its own universe.
##
#DEFAULT_UNIVERSE = vanilla

#
# CRED_MIN_TIME_LEFT is a first pass at making sure that Condor-G
# does not submit your job without it having enough time left for the
# job to finish.  For example, if you have a job that runs for 20 minutes, and
# you might spend 40 minutes in the queue, it's a bad idea to submit with less
# than an hour left before your proxy expires.
# 2 hours seemed like a reasonable default.
#
CRED_MIN_TIME_LEFT = 120
1870 |
|
1871 |
|
1872 |
## |
1873 |
## The GridMonitor allows you to submit many more jobs to a GT2 GRAM server |
1874 |
## than is normally possible. |
1875 |
#ENABLE_GRID_MONITOR = TRUE |
1876 |
|
1877 |
## |
1878 |
## When an error occurs with the GridMonitor, how long should the |
1879 |
## gridmanager wait before trying to submit a new GridMonitor job? |
1880 |
## The default is 1 hour (3600 seconds). |
1881 |
#GRID_MONITOR_DISABLE_TIME = 3600 |

##
## The location of the wrapper for invoking the
## Condor GAHP server
##
CONDOR_GAHP = $(SBIN)/condor_c-gahp
CONDOR_GAHP_WORKER = $(SBIN)/condor_c-gahp_worker_thread

##
## The Condor GAHP server has its own log. Like the Gridmanager, the
## GAHP server is run as the user, not a Condor daemon, so all users must
## have write permission to the directory used for the logfile. Our
## suggestion is to create a directory called GridLogs in $(LOG) with
## UNIX permissions 1777 (just like /tmp ).
## Another option is to use /tmp as the location of the CGAHP log.
##
MAX_C_GAHP_LOG = 1000000

#C_GAHP_LOG = $(LOG)/GridLogs/CGAHPLog.$(USERNAME)
C_GAHP_LOG = /tmp/CGAHPLog.$(USERNAME)
C_GAHP_LOCK = /tmp/CGAHPLock.$(USERNAME)
C_GAHP_WORKER_THREAD_LOG = /tmp/CGAHPWorkerLog.$(USERNAME)
C_GAHP_WORKER_THREAD_LOCK = /tmp/CGAHPWorkerLock.$(USERNAME)

##
## The location of the wrapper for invoking the
## GT4 GAHP server
##
GT4_GAHP = $(SBIN)/gt4_gahp

##
## The location of GT4 files. This should normally be lib/gt4
##
GT4_LOCATION = $(LIB)/gt4

##
## The location of the wrapper for invoking the
## GT4.2 GAHP server
##
GT42_GAHP = $(SBIN)/gt42_gahp

##
## The location of GT4.2 files. This should normally be lib/gt42
##
GT42_LOCATION = $(LIB)/gt42

##
## GT4 GRAM requires a gridftp server to perform file transfers.
## If GRIDFTP_URL_BASE is set, then Condor assumes there is a gridftp
## server set up at that URL suitable for its use. Otherwise, Condor
## will start its own gridftp servers as needed, using the binary
## pointed at by GRIDFTP_SERVER. GRIDFTP_SERVER_WRAPPER points to a
## wrapper script needed to properly set the path to the gridmap file.
##
#GRIDFTP_URL_BASE = gsiftp://$(FULL_HOSTNAME)
GRIDFTP_SERVER = $(LIBEXEC)/globus-gridftp-server
GRIDFTP_SERVER_WRAPPER = $(LIBEXEC)/gridftp_wrapper.sh

##
## Location of the PBS/LSF gahp and its associated binaries
##
GLITE_LOCATION = $(LIBEXEC)/glite
PBS_GAHP = $(GLITE_LOCATION)/bin/batch_gahp
LSF_GAHP = $(GLITE_LOCATION)/bin/batch_gahp

##
## The location of the wrapper for invoking the Unicore GAHP server
##
UNICORE_GAHP = $(SBIN)/unicore_gahp

##
## The location of the wrapper for invoking the NorduGrid GAHP server
##
NORDUGRID_GAHP = $(SBIN)/nordugrid_gahp

## The location of the CREAM GAHP server
CREAM_GAHP = $(SBIN)/cream_gahp

## Condor-G and CredD can use MyProxy to refresh GSI proxies which are
## about to expire.
#MYPROXY_GET_DELEGATION = /path/to/myproxy-get-delegation

## The location of the Deltacloud GAHP server
DELTACLOUD_GAHP = $(SBIN)/deltacloud_gahp

##
## EC2: Universe = Grid, Grid_Resource = Amazon
##

## The location of the amazon_gahp program, required
AMAZON_GAHP = $(SBIN)/amazon_gahp

## Location of log files, useful for debugging, must be in
## a directory writable by any user, such as /tmp
#AMAZON_GAHP_DEBUG = D_FULLDEBUG
AMAZON_GAHP_LOG = /tmp/AmazonGahpLog.$(USERNAME)

## The number of seconds between status update requests to EC2. You can
## make this short (5 seconds) if you want Condor to respond quickly to
## instances as they terminate, or you can make it long (300 seconds = 5
## minutes) if you know your instances will run for a while and don't mind
## a delay between when they stop and when Condor responds to them
## stopping.
GRIDMANAGER_JOB_PROBE_INTERVAL = 300

## As of this writing Amazon EC2 has a hard limit of 20 concurrently
## running instances, so a limit of 20 is imposed so the GridManager
## does not waste its time sending requests that will be rejected.
GRIDMANAGER_MAX_SUBMITTED_JOBS_PER_RESOURCE_AMAZON = 20

##
##--------------------------------------------------------------------
## condor_credd credential management daemon
##--------------------------------------------------------------------
## Where is the CredD binary installed?
CREDD = $(SBIN)/condor_credd

## When the credd starts up, it can place its address (IP and port)
## into a file. This way, tools running on the local machine don't
## need an additional "-n host:port" command line option. This
## feature can be turned off by commenting out this setting.
CREDD_ADDRESS_FILE = $(LOG)/.credd_address

## Specify a remote credd server here.
#CREDD_HOST = $(CONDOR_HOST):$(CREDD_PORT)

## CredD startup arguments
## Start the CredD on a well-known port. Uncomment to simplify
## connecting to a remote CredD. Note that this interface may change
## in a future release.
CREDD_PORT = 9620
CREDD_ARGS = -p $(CREDD_PORT) -f

## CredD daemon debugging log
CREDD_LOG = $(LOG)/CredLog
CREDD_DEBUG = D_FULLDEBUG
MAX_CREDD_LOG = 4000000

## The credential owner submits the credential. This list specifies
## other users who are also permitted to see all credentials. Defaults
## to root on Unix systems, and Administrator on Windows systems.
#CRED_SUPER_USERS =
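## As a sketch, extra credential super users are given as a list of
## user names (the names below are hypothetical):
#CRED_SUPER_USERS = gridadmin, operator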

## Credential storage location. This directory must exist
## prior to starting condor_credd. It is highly recommended to
## restrict access permissions to _only_ the directory owner.
CRED_STORE_DIR = $(LOCAL_DIR)/cred_dir

## Index file path of saved credentials.
## This file will be automatically created if it does not exist.
#CRED_INDEX_FILE = $(CRED_STORE_DIR)/cred-index

## condor_credd will attempt to refresh credentials when their
## remaining lifespan is less than this value. Units = seconds.
#DEFAULT_CRED_EXPIRE_THRESHOLD = 3600

## condor_credd periodically checks the remaining lifespan of stored
## credentials, at this interval.
#CRED_CHECK_INTERVAL = 60

##
##--------------------------------------------------------------------
## Stork data placement server
##--------------------------------------------------------------------
## Where is the Stork binary installed?
STORK = $(SBIN)/stork_server

## When Stork starts up, it can place its address (IP and port)
## into a file. This way, tools running on the local machine don't
## need an additional "-n host:port" command line option. This
## feature can be turned off by commenting out this setting.
STORK_ADDRESS_FILE = $(LOG)/.stork_address

## Specify a remote Stork server here.
#STORK_HOST = $(CONDOR_HOST):$(STORK_PORT)

## STORK_LOG_BASE specifies the basename for heritage Stork log files.
## Stork uses this macro to create the following output log files:
## $(STORK_LOG_BASE): Stork server job queue classad collection
## journal file.
## $(STORK_LOG_BASE).history: Used to track completed jobs.
## $(STORK_LOG_BASE).user_log: User level log, also used by DAGMan.
STORK_LOG_BASE = $(LOG)/Stork

## Modern Condor DaemonCore logging feature.
STORK_LOG = $(LOG)/StorkLog
STORK_DEBUG = D_FULLDEBUG
MAX_STORK_LOG = 4000000

## Stork startup arguments
## Start Stork on a well-known port. Uncomment to simplify
## connecting to a remote Stork. Note that this interface may change
## in a future release.
#STORK_PORT = 34048
STORK_PORT = 9621
STORK_ARGS = -p $(STORK_PORT) -f -Serverlog $(STORK_LOG_BASE)

## Stork environment. Stork modules may require external programs and
## shared object libraries. These are located using the PATH and
## LD_LIBRARY_PATH environment variables. Further, some modules may require
## additional module-specific environment variables. By default, Stork
## inherits a full environment when invoked from condor_master or the
## shell. If the default environment is not adequate for all Stork modules,
## specify a replacement environment here. This environment will be set by
## condor_master before starting Stork, but does not apply if Stork is
## started directly from the command line.
#STORK_ENVIRONMENT = TMP=/tmp;CONDOR_CONFIG=/special/config;PATH=/lib

## Limits the number of concurrent data placements handled by Stork.
#STORK_MAX_NUM_JOBS = 5

## Limits the number of retries for a failed data placement.
#STORK_MAX_RETRY = 5

## Limits the run time for a data placement job, after which the
## placement is considered failed.
#STORK_MAXDELAY_INMINUTES = 10

## Temporary credential storage directory used by Stork.
#STORK_TMP_CRED_DIR = /tmp

## Directory containing Stork modules.
#STORK_MODULE_DIR = $(LIBEXEC)

##
##--------------------------------------------------------------------
## Quill Job Queue Mirroring Server
##--------------------------------------------------------------------
## Where is the Quill binary installed and what arguments should be passed?
QUILL = $(SBIN)/condor_quill
#QUILL_ARGS =

# Where is the log file for the quill daemon?
QUILL_LOG = $(LOG)/QuillLog

# The identification and location of the quill daemon for local clients.
QUILL_ADDRESS_FILE = $(LOG)/.quill_address

# If this is set to true, then the rest of the QUILL arguments must be defined
# for quill to function. If it is False or left undefined, then quill will not
# be consulted by either the scheduler or the tools. However, in the case of a
# remote quill query where the local client has quill turned off, but the
# remote client has quill turned on, things will still function normally.
#QUILL_ENABLED = TRUE

#
# If Quill is enabled, by default it will only mirror the current job
# queue into the database. For historical jobs, and classads from other
# sources, the SQL Log must be enabled.
#QUILL_USE_SQL_LOG=FALSE

#
# The SQL Log can be enabled on a per-daemon basis. For example, to collect
# historical job information, but store no information about execute machines,
# uncomment these two lines
#QUILL_USE_SQL_LOG = FALSE
#SCHEDD.QUILL_USE_SQL_LOG = TRUE

# This will be the name of a quill daemon using this config file. This name
# should not conflict with any other quill name--or schedd name.
#QUILL_NAME = quill@postgresql-server.machine.com

# The PostgreSQL server requires usernames that can manipulate tables. This
# will be the username associated with this instance of the quill daemon
# mirroring a schedd's job queue. Each quill daemon must have a unique
# username associated with it; otherwise, multiple quill daemons will
# corrupt the data held under an identical user name.
#QUILL_DB_NAME = name_of_db

# The required password for the DB user which quill will use to read
# information from the database about the queue.
#QUILL_DB_QUERY_PASSWORD = foobar

# What kind of database server is this?
# For now, only PGSQL is supported
#QUILL_DB_TYPE = PGSQL

# The machine and port of the postgres server.
# Although this says IP Addr, it can be a DNS name.
# It must match whatever format you used for the .pgpass file, however.
#QUILL_DB_IP_ADDR = machine.domain.com:5432

# The login to use to attach to the database for updating information.
# There should be an entry in file $SPOOL/.pgpass that gives the password
# for this login id.
#QUILL_DB_USER = quillwriter

# Polling period, in seconds, for when quill reads transactions out of the
# schedd's job queue log file and puts them into the database.
#QUILL_POLLING_PERIOD = 10
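# Pulling the settings above together, a minimal sketch of an enabled
# Quill setup might look like the following (the host, database name,
# and password are illustrative placeholders, not defaults):
#QUILL_ENABLED = TRUE
#QUILL_NAME = quill@dbhost.example.com
#QUILL_DB_NAME = quill_db
#QUILL_DB_IP_ADDR = dbhost.example.com:5432
#QUILL_DB_USER = quillwriter
#QUILL_DB_QUERY_PASSWORD = changeme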

# Allows or disallows a remote query to the quill daemon and database
# which is reading this log file. Defaults to true.
#QUILL_IS_REMOTELY_QUERYABLE = TRUE

# Add debugging flags here if you need to debug quill for some reason.
#QUILL_DEBUG = D_FULLDEBUG

# Number of seconds the master should wait for the Quill daemon to respond
# before killing it. This number might need to be increased for very
# large logfiles.
# The default is 3600 (one hour), but kicking it up to a few hours won't hurt.
#QUILL_NOT_RESPONDING_TIMEOUT = 3600

# Should Quill hold open a database connection to the DBMSD?
# Each open connection consumes resources at the server, so large pools
# (100 or more machines) should set this variable to FALSE. Note the
# default is TRUE.
#QUILL_MAINTAIN_DB_CONN = TRUE

##
##--------------------------------------------------------------------
## Database Management Daemon settings
##--------------------------------------------------------------------
## Where is the DBMSd binary installed and what arguments should be passed?
DBMSD = $(SBIN)/condor_dbmsd
DBMSD_ARGS = -f

# Where is the log file for the DBMSd daemon?
DBMSD_LOG = $(LOG)/DbmsdLog

# Interval between consecutive purging calls (in seconds)
#DATABASE_PURGE_INTERVAL = 86400

# Interval between consecutive database reindexing operations
# This is only used when dbtype = PGSQL
#DATABASE_REINDEX_INTERVAL = 86400

# Number of days before purging resource classad history
# This includes things like machine ads, daemon ads, submitters
#QUILL_RESOURCE_HISTORY_DURATION = 7

# Number of days before purging job run information
# This includes job events, file transfers, matchmaker matches, etc.
# This does NOT include the final job ad. condor_history does not need
# any of this information to work.
#QUILL_RUN_HISTORY_DURATION = 7

# Number of days before purging job classad history
# This is the information needed to run condor_history
#QUILL_JOB_HISTORY_DURATION = 3650

# DB size threshold for warning the condor administrator. This is checked
# after every purge. The size is given in gigabytes.
#QUILL_DBSIZE_LIMIT = 20

# Number of seconds the master should wait for the DBMSD to respond before
# killing it. This number might need to be increased for very large databases.
# The default is 3600 (one hour).
#DBMSD_NOT_RESPONDING_TIMEOUT = 3600

##
##--------------------------------------------------------------------
## VM Universe Parameters
##--------------------------------------------------------------------
## Where is the Condor VM-GAHP installed? (Required)
VM_GAHP_SERVER = $(SBIN)/condor_vm-gahp

## If the VM-GAHP is to have its own log, define
## the location of the log file.
##
## Optionally, if you do NOT define VM_GAHP_LOG, logs of the VM-GAHP will
## be stored in the starter's log file.
## However, on Windows machines you must always define VM_GAHP_LOG.
#
VM_GAHP_LOG = $(LOG)/VMGahpLog
MAX_VM_GAHP_LOG = 1000000
#VM_GAHP_DEBUG = D_FULLDEBUG

## What kind of virtual machine program will be used for
## the VM universe?
## The two options are vmware and xen. (Required)
#VM_TYPE = vmware

## How much memory can be used for the VM universe? (Required)
## This value is the maximum amount of memory that can be used by the
## virtual machine program.
#VM_MEMORY = 128

## Want to support networking for the VM universe?
## Default value is FALSE
#VM_NETWORKING = FALSE

## What kind of networking types are supported?
##
## If you set VM_NETWORKING to TRUE, you must define this parameter.
## VM_NETWORKING_TYPE = nat
## VM_NETWORKING_TYPE = bridge
## VM_NETWORKING_TYPE = nat, bridge
##
## If multiple networking types are defined, you may define
## VM_NETWORKING_DEFAULT_TYPE as the default networking type.
## Otherwise, nat is used as the default networking type.
## VM_NETWORKING_DEFAULT_TYPE = nat
#VM_NETWORKING_DEFAULT_TYPE = nat
#VM_NETWORKING_TYPE = nat
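## As a sketch, a machine offering both networking types with bridge as
## the default would uncomment and set (one possible choice, not the
## defaults):
#VM_NETWORKING = TRUE
#VM_NETWORKING_TYPE = nat, bridge
#VM_NETWORKING_DEFAULT_TYPE = bridge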

## By default, the number of possible virtual machines is the same as
## NUM_CPUS.
## Since too many virtual machines can cause the system to be too slow
## and lead to unexpected problems, limit the number of running
## virtual machines on this machine with
#VM_MAX_NUMBER = 2

## When a VM universe job is started, a status command is sent
## to the VM-GAHP to see if the job is finished.
## If the interval between checks is too short, it will consume
## too much of the CPU. If the VM-GAHP fails to get status 5 times in a row,
## an error will be reported to the startd, and then the startd will check
## the availability of the VM universe.
## Default value is 60 seconds and minimum value is 30 seconds
#VM_STATUS_INTERVAL = 60

## How long will we wait for a request sent to the VM-GAHP to be completed?
## If a request is not completed within the timeout, an error will be reported
## to the startd, and then the startd will check
## the availability of the VM universe. Default value is 5 mins.
#VM_GAHP_REQ_TIMEOUT = 300

## When VMware or Xen causes an error, the startd will disable the
## VM universe. However, because some errors are just transient,
## we will test once more whether the VM universe is still unavailable
## after some time.
## By default, the startd will recheck the VM universe after 10 minutes.
## If the test also fails, the VM universe will be disabled.
#VM_RECHECK_INTERVAL = 600

## Usually, when we suspend a VM, the memory being used by the VM
## will be saved into a file and then freed.
## However, when we use soft suspend, neither saving nor memory freeing
## will occur.
## For VMware, we send SIGSTOP to the VM's process in order to
## stop the VM temporarily and send SIGCONT to resume the VM.
## For Xen, we pause the CPU. Pausing the CPU doesn't save the memory of
## the VM into a file. It only stops the execution of the VM temporarily.
#VM_SOFT_SUSPEND = TRUE

## If Condor runs as root and a job comes from a different UID domain,
## Condor generally uses "nobody", unless SLOTx_USER is defined.
## If "VM_UNIV_NOBODY_USER" is defined, a VM universe job will run
## as the user defined in "VM_UNIV_NOBODY_USER" instead of "nobody".
##
## Notice: In the VMware VM universe, "nobody" cannot create a VMware VM.
## So we need to define "VM_UNIV_NOBODY_USER" with a regular user.
## For VMware, the user defined in "VM_UNIV_NOBODY_USER" must have a home
## directory. So SOFT_UID_DOMAIN doesn't work for VMware VM universe jobs.
## If neither "VM_UNIV_NOBODY_USER" nor "SLOTx_VMUSER"/"SLOTx_USER" is defined,
## a VMware VM universe job will run as "condor" instead of "nobody".
## As a result, the order of preference of local users for a VMware VM
## universe job which comes from a different UID domain is
## "VM_UNIV_NOBODY_USER" -> "SLOTx_VMUSER" -> "SLOTx_USER" -> "condor".
#VM_UNIV_NOBODY_USER = login name of a user who has a home directory

## If Condor runs as root and "ALWAYS_VM_UNIV_USE_NOBODY" is set to TRUE,
## all VM universe jobs will run as the user defined in "VM_UNIV_NOBODY_USER".
#ALWAYS_VM_UNIV_USE_NOBODY = FALSE
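## As a sketch (the account name "vmuser" is hypothetical; it must be a
## regular user with a home directory on the execute machine):
#VM_UNIV_NOBODY_USER = vmuser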

##--------------------------------------------------------------------
## VM Universe Parameters Specific to VMware
##--------------------------------------------------------------------

## Where is the perl program? (Required)
VMWARE_PERL = perl

## Where is the Condor script program to control VMware? (Required)
VMWARE_SCRIPT = $(SBIN)/condor_vm_vmware.pl

## Networking parameters for VMware
##
## What kind of VMware networking is used?
##
## If multiple networking types are defined, you may specify different
## parameters for each networking type.
##
## Examples
## (e.g.) VMWARE_NAT_NETWORKING_TYPE = nat
## (e.g.) VMWARE_BRIDGE_NETWORKING_TYPE = bridged
##
## If there is no parameter for a specific networking type, VMWARE_NETWORKING_TYPE is used.
##
#VMWARE_NAT_NETWORKING_TYPE = nat
#VMWARE_BRIDGE_NETWORKING_TYPE = bridged
VMWARE_NETWORKING_TYPE = nat

## The contents of this file will be inserted into the .vmx file of
## the VMware virtual machine before Condor starts it.
#VMWARE_LOCAL_SETTINGS_FILE = /path/to/file
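## As a sketch, such a file holds plain .vmx lines; for example, a file
## containing the single line
##    memsize = "256"
## would be merged into the VM's .vmx description (the value shown is
## illustrative, not a default).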

##--------------------------------------------------------------------
## VM Universe Parameters common to libvirt controlled vm's (xen & kvm)
##--------------------------------------------------------------------

## Networking parameters for Xen & KVM
##
## This is the path to the XML helper command; the libvirt_simple_script.awk
## script just reproduces what Condor already does for the kvm/xen VM
## universe
LIBVIRT_XML_SCRIPT = $(LIBEXEC)/libvirt_simple_script.awk

## This is the optional debugging output file for the xml helper
## script. Scripts that need to output debugging messages should
## write them to the file specified by this argument, which will be
## passed as the second command line argument when the script is
## executed

#LIBVIRT_XML_SCRIPT_ARGS = /dev/stderr

##--------------------------------------------------------------------
## VM Universe Parameters Specific to Xen
##--------------------------------------------------------------------

## Where is the bootloader for the Xen domainU? (Required)
##
## The bootloader will be used in the case that a kernel image includes
## a disk image
#XEN_BOOTLOADER = /usr/bin/pygrub

## The contents of this file will be added to the Xen virtual machine
## description that Condor writes.
#XEN_LOCAL_SETTINGS_FILE = /path/to/file

##
##--------------------------------------------------------------------
## condor_lease_manager lease manager daemon
##--------------------------------------------------------------------
## Where is the LeaseManager binary installed?
LeaseManager = $(SBIN)/condor_lease_manager

# Turn on the lease manager
#DAEMON_LIST = $(DAEMON_LIST), LeaseManager

# The identification and location of the lease manager for local clients.
LeaseManager_ADDRESS_FILE = $(LOG)/.lease_manager_address

## LeaseManager startup arguments
#LeaseManager_ARGS = -local-name generic

## LeaseManager daemon debugging log
LeaseManager_LOG = $(LOG)/LeaseManagerLog
LeaseManager_DEBUG = D_FULLDEBUG
MAX_LeaseManager_LOG = 1000000

# Basic parameters
LeaseManager.GETADS_INTERVAL = 60
LeaseManager.UPDATE_INTERVAL = 300
LeaseManager.PRUNE_INTERVAL = 60
LeaseManager.DEBUG_ADS = False

LeaseManager.CLASSAD_LOG = $(SPOOL)/LeaseManagerState
#LeaseManager.QUERY_ADTYPE = Any
#LeaseManager.QUERY_CONSTRAINTS = MyType == "SomeType"
#LeaseManager.QUERY_CONSTRAINTS = TargetType == "SomeType"

##
##--------------------------------------------------------------------
## KBDD - keyboard activity detection daemon
##--------------------------------------------------------------------
## When the KBDD starts up, it can place its address (IP and port)
## into a file. This way, tools running on the local machine don't
## need an additional "-n host:port" command line option. This
## feature can be turned off by commenting out this setting.
KBDD_ADDRESS_FILE = $(LOG)/.kbdd_address

##
##--------------------------------------------------------------------
## condor_ssh_to_job
##--------------------------------------------------------------------
# NOTE: condor_ssh_to_job is not supported under Windows.

# Tell the starter (execute side) whether to allow the job owner or
# queue super user on the schedd from which the job was submitted to
# use condor_ssh_to_job to access the job interactively (e.g. for
# debugging). TARGET is the job; MY is the machine.
#ENABLE_SSH_TO_JOB = true

# Tell the schedd (submit side) whether to allow the job owner or
# queue super user to use condor_ssh_to_job to access the job
# interactively (e.g. for debugging). MY is the job; TARGET is not
# defined.
#SCHEDD_ENABLE_SSH_TO_JOB = true
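# Since these settings are evaluated with the job and machine ads in
# scope, access can be narrowed rather than just toggled. A sketch
# (vanilla universe is JobUniverse 5; the expression is illustrative):
#ENABLE_SSH_TO_JOB = (TARGET.JobUniverse == 5)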

# Command condor_ssh_to_job should use to invoke the ssh client.
# %h --> remote host
# %i --> ssh key file
# %k --> known hosts file
# %u --> remote user
# %x --> proxy command
# %% --> %
#SSH_TO_JOB_SSH_CMD = ssh -oUser=%u -oIdentityFile=%i -oStrictHostKeyChecking=yes -oUserKnownHostsFile=%k -oGlobalKnownHostsFile=%k -oProxyCommand=%x %h

# Additional ssh clients may be configured. They all have the same
# default as ssh, except for scp, which omits the %h:
#SSH_TO_JOB_SCP_CMD = scp -oUser=%u -oIdentityFile=%i -oStrictHostKeyChecking=yes -oUserKnownHostsFile=%k -oGlobalKnownHostsFile=%k -oProxyCommand=%x

# Path to sshd
#SSH_TO_JOB_SSHD = /usr/sbin/sshd

# Arguments the starter should use to invoke sshd in inetd mode.
# %f --> sshd config file
# %% --> %
#SSH_TO_JOB_SSHD_ARGS = "-i -e -f %f"

# sshd configuration template used by condor_ssh_to_job_sshd_setup.
#SSH_TO_JOB_SSHD_CONFIG_TEMPLATE = $(LIB)/condor_ssh_to_job_sshd_config_template

# Path to ssh-keygen
#SSH_TO_JOB_SSH_KEYGEN = /usr/bin/ssh-keygen

# Arguments to ssh-keygen
# %f --> key file to generate
# %% --> %
#SSH_TO_JOB_SSH_KEYGEN_ARGS = "-N '' -C '' -q -f %f -t rsa"
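# As a sketch, extra ssh options can be folded into the command
# template; adding -X to the default command line (shown here purely as
# an illustration) would request X11 forwarding:
#SSH_TO_JOB_SSH_CMD = ssh -X -oUser=%u -oIdentityFile=%i -oStrictHostKeyChecking=yes -oUserKnownHostsFile=%k -oGlobalKnownHostsFile=%k -oProxyCommand=%x %h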

######################################################################
##
## Condor HDFS
##
## This is the default local configuration file for configuring the
## Condor daemon responsible for running services related to the Hadoop
## distributed storage system. You should copy this file to the
## appropriate location and customize it for your needs.
##
## Unless otherwise specified, settings that are commented out show
## the defaults that are used if you don't define a value. Settings
## that are defined here MUST BE DEFINED since they have no default
## value.
##
######################################################################

######################################################################
## FOLLOWING MUST BE CHANGED
######################################################################

## The location of the hadoop installation directory. The default location
## is under the 'libexec' directory. The directory pointed to by HDFS_HOME
## should contain a lib folder that contains all the required jars necessary
## to run HDFS name and data nodes.
#HDFS_HOME = $(RELEASE_DIR)/libexec/hdfs

## The host and port for hadoop's name node. If this machine is the
## name node (see HDFS_NODETYPE) then the specified port will be used
## to run the name node.
HDFS_NAMENODE = hdfs://example.com:9000
HDFS_NAMENODE_WEB = example.com:8000

HDFS_BACKUPNODE = hdfs://example.com:50100
HDFS_BACKUPNODE_WEB = example.com:50105

## You need to pick one machine as the name node by setting this parameter
## to HDFS_NAMENODE. The remaining machines in a storage cluster will
## act as data nodes (HDFS_DATANODE).
HDFS_NODETYPE = HDFS_DATANODE
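## For example, on the one machine chosen as the name node you would
## instead set:
#HDFS_NODETYPE = HDFS_NAMENODE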

## If the machine is selected to be a NameNode, then a role should be
## defined. If it is selected to be a DataNode, this parameter is ignored.
## Available options:
## ACTIVE: Active NameNode role (default value)
## BACKUP: Always synchronized with the active NameNode state, thus
## creating a backup of the namespace. Currently the NameNode
## supports one Backup node at a time.
## CHECKPOINT: Periodically creates checkpoints of the namespace.
HDFS_NAMENODE_ROLE = ACTIVE

## The two sets of directories that are required by HDFS are for the name
## node (HDFS_NAMENODE_DIR) and the data node (HDFS_DATANODE_DIR). The
## directory for the name node is only required for a machine running
## the name node service and is used to store critical meta data for
## files. The data node needs its directory to store file blocks and
## their replicas.
HDFS_NAMENODE_DIR = /tmp/hadoop_name
HDFS_DATANODE_DIR = /scratch/tmp/hadoop_data

## Unlike the name node address setting (HDFS_NAMENODE), which needs to be
## well known across the storage cluster, the data node can run on any
## arbitrary port of a given host.
#HDFS_DATANODE_ADDRESS = 0.0.0.0:0

######################################################################
## OPTIONAL
######################################################################

## Sets the log4j debug level. All the emitted debug output from HDFS
## will go in 'hdfs.log' under the $(LOG) directory.
#HDFS_LOG4J=DEBUG

## Access to the HDFS services, both name node and data node, can be
## restricted by specifying IP/host based filters. By default, settings
## from ALLOW_READ/ALLOW_WRITE and DENY_READ/DENY_WRITE
## are used to specify the allow and deny lists. The two parameters below
## can be used to override these settings. Read the Condor manual for the
## specification of these filters.
## WARN: HDFS doesn't make any distinction between read or write based connections.
#HDFS_ALLOW=*
#HDFS_DENY=*

# Fully qualified names for the NameNode, DataNode, and DFSAdmin classes.
#HDFS_NAMENODE_CLASS=org.apache.hadoop.hdfs.server.namenode.NameNode
#HDFS_DATANODE_CLASS=org.apache.hadoop.hdfs.server.datanode.DataNode
#HDFS_DFSADMIN_CLASS=org.apache.hadoop.hdfs.tools.DFSAdmin

## In case an old name for hdfs configuration files is required.
#HDFS_SITE_FILE = hdfs-site.xml