HTCondor mini user guide
Description
Deploy HTCondor mini, a technology preview of an all-in-one (“minicondor”) HTCondor. This type of install is useful for testing and experimentation.
Prerequisites
The user must be registered in the IAM system for INFN-Cloud (https://iam.cloud.infn.it/). Only registered users can log in to the INFN-Cloud dashboard (https://my.cloud.infn.it).
- For more details about registration, please see Getting Started
User responsibilities
Important
The solution described in this guide instantiates Virtual Machines on the INFN-Cloud infrastructure. Instantiating a VM comes with the responsibility of maintaining it and all the services it hosts.
Please read the INFN Cloud AUP in order to understand the responsibilities you have in managing this service.
Deployment of the service
After logging in to the INFN-Cloud dashboard, select the “HTCondor mini” button from the list of available solutions:
Enter a Deployment description in the corresponding field and choose a flavour to set the number of vCPUs and the memory size of the Virtual Machine, as shown in the image below:
Once the deployment is ready, it will be possible to access the VM via SSH.
Submit a simple job to HTCondor mini
As a first step, switch to the submituser user by issuing the condor command:
~$ condor
[submituser@c9c00e2e28c8 ~]$
This command opens a user shell inside the Docker container. From there you can proceed to job submission.
Create a submit file like this:
[submituser@c9c00e2e28c8 ~]$ cat submit.sub
executable = /bin/hostname
output = output.txt
error = error.txt
log = log.txt
queue 1
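The same submit file can also be created non-interactively with a shell here-document, which may be convenient when scripting tests. The sketch below writes the file shown above; the `#` comment lines describing each command are an addition for illustration and are not part of the original file.

```shell
# Write the submit description file from the example above.
# Lines starting with '#' inside the file are comments added for illustration.
cat > submit.sub <<'EOF'
# Program to run on the execute slot
executable = /bin/hostname
# Files that receive the job's standard output and standard error
output = output.txt
error = error.txt
# HTCondor's event log for this job (submission, execution, termination)
log = log.txt
# Queue one instance of the job
queue 1
EOF

# Print the file to verify its content
cat submit.sub
```

The relative file names mean that `output.txt`, `error.txt` and `log.txt` are written to the directory from which the job is submitted.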
Then submit the job and check its status to verify that it is running correctly:
[submituser@c9c00e2e28c8 ~]$ condor_submit submit.sub
Submitting job(s).
1 job(s) submitted to cluster 2.
[submituser@c9c00e2e28c8 ~]$
[submituser@c9c00e2e28c8 ~]$ condor_q

-- Schedd: c9c00e2e28c8 : <127.0.0.1:9618?... @ 04/26/23 14:25:51
OWNER      BATCH_NAME    SUBMITTED   DONE   RUN    IDLE  TOTAL JOB_IDS
submituser ID: 2        4/26 14:25      _      1      _      1 2.0

Total for query: 1 jobs; 0 completed, 0 removed, 0 idle, 1 running, 0 held, 0 suspended
Total for submituser: 1 jobs; 0 completed, 0 removed, 0 idle, 1 running, 0 held, 0 suspended
Total for all users: 1 jobs; 0 completed, 0 removed, 0 idle, 1 running, 0 held, 0 suspended
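Once the job has left the queue, its result is in the files named in the submit file. The following is a minimal sketch for waiting on the job and printing what it wrote. It assumes that the `condor_wait` tool is available in the container (this guide does not show it) and that the function is called from the directory containing the job files; `show_job_result` is a hypothetical helper name.

```shell
# Hypothetical helper: wait for the job(s) recorded in an event log,
# then print the file that captured the job's standard output.
show_job_result() {
    log_file="$1"
    output_file="$2"

    # condor_wait blocks until the jobs in the event log have completed.
    # The wait is skipped when the tool or the log file is not present.
    if command -v condor_wait >/dev/null 2>&1 && [ -f "$log_file" ]; then
        condor_wait "$log_file"
    fi

    if [ -f "$output_file" ]; then
        cat "$output_file"
    else
        echo "no output file found: $output_file" >&2
        return 1
    fi
}
```

For the job above it would be called as `show_job_result log.txt output.txt`. Since the executable is `/bin/hostname`, the output file should contain the hostname of the machine on which the job ran.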
It’s also possible to check the status of the HTCondor pool, i.e. the available execution slots:
[submituser@c9c00e2e28c8 ~]$ condor_status
Name               OpSys      Arch   State     Activity LoadAv Mem   ActvtyTime

slot1@c9c00e2e28c8 LINUX      X86_64 Unclaimed Idle      0.000 1983  0+02:38:52

               Total Owner Claimed Unclaimed Matched Preempting Backfill  Drain

  X86_64/LINUX     1     0       0         1       0          0        0      0

         Total     1     0       0         1       0          0        0      0
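For scripted checks, the tabular output above is awkward to parse. A minimal sketch, assuming the `-af` (autoformat) option of `condor_status` is available in the installed HTCondor version: print only the `State` attribute of each slot and count the unclaimed ones. `count_unclaimed` is a hypothetical helper name.

```shell
# Hypothetical helper: count slots in the "Unclaimed" state.
# Reads one slot state per line from standard input.
count_unclaimed() {
    # "n + 0" makes awk print 0 when no line matches
    awk '$1 == "Unclaimed" { n++ } END { print n + 0 }'
}

# Query live data only when HTCondor is installed on this machine.
if command -v condor_status >/dev/null 2>&1; then
    condor_status -af State | count_unclaimed
fi
```

For a pool like the one shown above, with a single unclaimed slot, this would be expected to print 1.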