Software update: HTCondor 23.0.2
The HTCondor Team at the University of Wisconsin-Madison has released a new long-term support version of its workload management system, HTCondor. The version number is now 23.0.2. HTCondor manages compute-intensive jobs and can distribute them across the nodes connected to it. A user submits a job to HTCondor, which handles it according to the configured policies and the availability of connected resources, and finally returns the results to the user. HTCondor can, for example, drive a dedicated Beowulf cluster, but it can also use ordinary desktops that happen to be idle. At SC16, Google, Fermilab and the HTCondor Team demonstrated a 160k-core cloud-based elastic compute cluster, and in 2020 the National Science Foundation chose HTCondor as part of its Partnership to Advance Throughput Computing. The condensed list of changes in this release is as follows:
Version 23.0.2
- Fixed a bug where job submission could hang and then fail when HashiCorp Vault is configured to issue data transfer tokens (which is not the default)
- Improved sandbox and ssh-agent cleanup for batch grid universe jobs
- Fix bug where daemons with a private network address couldn’t communicate
- Fix cgroup v2 memory enforcement for custom configurations
- Add DISABLE_SWAP_FOR_JOB support on cgroup v2 systems
- Fix log rotation for OAuth and Vault credmon daemons

Version 23.0.1
- Add HTCondor Python wheel in PyPI for Python 3.12
- Update to apptainer version 1.2.4 in the HTCondor tarballs
- Fix 10.6.0 bug that broke PID namespaces
- Fix Debian and Ubuntu install bug when ‘condor’ user was in LDAP
- Fix bug where execution times for ARC CE jobs were 60 times too large
- Fix bug where a failed ‘Service’ node would crash DAGMan
- Condor-C and Job Router jobs now get resources provisioned updates
- Update Windows binaries to address curl CVE-2023-38545

Version 23.0.0
- Absent slot configuration, execution points will use a partitionable slot
- Linux cgroups enforce maximum memory utilization by default
- Can now define DAGMan save points to be able to rerun DAGs from there
- Much better control over environment variables when using DAGMan
- Administrators can enable and disable job submission for a specific user
- Can set a minimum number of CPUs allocated to a user
- condor_status -gpus shows nodes with GPUs and the GPU properties
- condor_status -compact shows a row for each slot type
- Container images may now be transferred via a file transfer plugin
- Support for Enterprise Linux 9, Amazon Linux 2023, and Debian 12
- Can write job information in AP history file for every execution attempt
- Can run defrag daemons with different policies on distinct sets of nodes
- Add condor_test_token tool to generate a short lived SciToken for testing
- The job’s executable is no longer renamed to ‘condor_exec.exe’
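
For readers unfamiliar with how a job reaches HTCondor, the sketch below is a rough illustration of the submission step described in the introduction. It only builds the text of a minimal submit description. The helper name `render_submit_description`, its parameters, and the file-name patterns are assumptions made for this example and do not come from the release notes; the submit commands themselves (`executable`, `arguments`, `request_cpus`, `request_memory`, `output`, `error`, `log`, `queue`) are standard HTCondor submit-file commands.

```python
def render_submit_description(executable, arguments="", request_cpus=1,
                              request_memory_mb=1024, count=1):
    """Return the text of a minimal HTCondor submit description.

    Hypothetical helper for illustration only: it formats text and
    does not contact any HTCondor daemon.
    """
    lines = [
        f"executable = {executable}",
        f"arguments = {arguments}",
        # Resource requests are matched against what the slots offer.
        f"request_cpus = {request_cpus}",
        f"request_memory = {request_memory_mb}MB",
        # $(Cluster) and $(Process) are expanded by HTCondor per job.
        "output = job.$(Cluster).$(Process).out",
        "error = job.$(Cluster).$(Process).err",
        "log = job.$(Cluster).log",
        # Queue `count` instances of the job.
        f"queue {count}",
    ]
    return "\n".join(lines) + "\n"


text = render_submit_description("/bin/echo", "hello", count=3)
print(text)
```

The resulting text could be saved to a file and handed to `condor_submit`; what happens after that depends on the policies configured in the pool.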
Source:
Tweakers.net