Can I install a CLC Genomics Workbench on a compute cluster?


In theory, yes, you could install a copy of the QIAGEN CLC Genomics Workbench on a machine or machines that happen to be part of a compute cluster. However, the Workbench software is designed for desktop use, or for use as a client for the QIAGEN CLC Genomics Server.

A better fit for working with a cluster of computers would be to purchase a QIAGEN CLC Genomics Server and, when a cluster of compute nodes is available for running jobs, most likely some number of node licenses. The number of node licenses determines the maximum number of CLC Server jobs that can be run simultaneously.

The QIAGEN CLC Genomics Server product page (https://digitalinsights.qiagen.com/products-overview/discovery-insights-portfolio/enterprise-ngs-solutions/qiagen-clc-genomics-server/) provides a more detailed overview.

The full QIAGEN CLC Genomics Server manual can be found here: https://resources.qiagenbioinformatics.com/manuals/clcserver/current/admin/index.php?manual=Introduction.html

For those thinking about running the CLC Server with compute nodes, the chapter on Job Distribution is likely to be of particular interest.

What happens if you do install the Genomics Workbench on a compute cluster node?
Installing the QIAGEN CLC Genomics Workbench on a machine or machines on a compute cluster does not mean that you can submit jobs to run on other nodes of the cluster. Analyses started with the QIAGEN CLC Genomics Workbench will run on the machine the software is executed on. In this sense, it is just like running the software on any remotely accessible system.

Four important considerations in this situation:

The Genomics Workbench was not designed for multiple users working with the same copy of the software at the same time, so it has no built-in queuing system. People therefore need to take care when submitting computationally intensive jobs if others might be working with the Genomics Workbench on the same machine at the same time. Jobs in the Genomics Workbench are launched immediately, so it is easy to exceed the available computer resources if many computationally intensive jobs are started at once.
The CLC Genomics Workbench algorithms are optimized to run on a slim hardware footprint. Hence, depending on the data analysis to be performed, improved hardware may not speed up the analysis. For example, tools that take advantage of multiple cores do not scale linearly with high numbers of cores (multithreading may have an overhead too), and only the memory required, not all the memory available, will be used. For guidance, you can find the system requirements here: https://digitalinsights.qiagen.com/technical-support/system-requirements/
The Genomics Workbench installation is configured to use an amount of memory matched to the machine on which the Workbench was installed. Depending on the memory available on the machines in the compute cluster, this setting may need to be modified to match the cluster machines. Please see the related FAQ page: How do I change the memory limit for the CLC Workbench or Server java process?
Please note that you will almost always need to have access to CLC network licenses if you choose to install and run a QIAGEN CLC Genomics Workbench on a node or nodes of a compute cluster. This is because:

Remote access is not supported by our static license conditions, and compute cluster nodes are usually machines accessed by users remotely.
If the compute nodes have more than 64 cores, then network licenses are required for CLC Workbenches.
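
As a rough illustration of the two licensing conditions above, the following Python sketch checks a node against them. It is not a QIAGEN tool: the function names are hypothetical, and it assumes that the logical CPU count reported by `os.cpu_count()` is a reasonable stand-in for the "cores" referred to in the license conditions, which may not match how cores are actually counted for licensing purposes.

```python
import os
from typing import Optional

# Threshold taken from this FAQ: more than 64 cores requires network licenses.
CORE_LIMIT = 64


def needs_network_license(core_count: int, accessed_remotely: bool) -> bool:
    """Return True if either condition described in this FAQ applies.

    Hypothetical helper: encodes only the two rules stated above.
    """
    return accessed_remotely or core_count > CORE_LIMIT


def describe_node(accessed_remotely: bool, core_count: Optional[int] = None) -> str:
    """Summarize which license type the rules above point to for a node."""
    if core_count is None:
        # Assumption: logical CPUs approximate the licensed core count.
        core_count = os.cpu_count() or 1
    if needs_network_license(core_count, accessed_remotely):
        return f"{core_count} cores: network license needed"
    return f"{core_count} cores: static license may be sufficient"


if __name__ == "__main__":
    # Compute cluster nodes are usually accessed remotely.
    print(describe_node(accessed_remotely=True))
```

Because compute cluster nodes are usually accessed remotely, the first condition alone will typically apply, regardless of the core count.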
