1.1. How can I run a batch job with multiple libraries for each sample?
Starting with CLC Genomics Workbench 20, batch units can be defined using metadata when running workflows in batch. Information on how to do this can be found on the following manual page:
Using metadata is usually the fastest approach. It is still possible to define batch units based on folder structure, as described in this FAQ, but this is recommended only if you have very few samples or are running individual tools one at a time.
To define batch units based on folder structure in the Navigation Area, create one folder for each batch unit, plus a top folder for your experiment.
In the example below, we have three samples called A, B, and C. For each sample, three libraries have been sequenced, e.g. A-1, A-2, and A-3 (Figure 1).
Figure 1: Folder structure with a top folder for the experiment and one folder for each sample.
When you start a batch analysis, each folder inside the top folder you choose is treated as a batch unit, and all the elements in that folder are analyzed together. In Figure 1, the three folders "Sample A", "Sample B" and "Sample C" are the batch units. So, for example, everything within the "Sample A" folder will be used in a single analysis run.
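The folder-based grouping can be pictured with a small Python sketch. It is not Workbench code and does not use any CLC API. It only models the layout of Figure 1 with plain Python data. The names `experiment` and `batch_units` are hypothetical, and the library names for samples B and C are assumed to follow the same pattern as A-1, A-2, and A-3.

```python
# Illustrative model of the Navigation Area layout in Figure 1.
# Plain Python data only; this is not a CLC API.
# Library names for samples B and C are assumed for illustration.
experiment = {
    "Sample A": ["A-1", "A-2", "A-3"],
    "Sample B": ["B-1", "B-2", "B-3"],
    "Sample C": ["C-1", "C-2", "C-3"],
}


def batch_units(top_folder):
    """Return one (unit name, elements) pair per subfolder of the top folder."""
    return [(name, list(elements)) for name, elements in sorted(top_folder.items())]


for name, elements in batch_units(experiment):
    # Each subfolder corresponds to one analysis run that uses all of its elements.
    print(f"{name}: {', '.join(elements)}")
```

Selecting the top folder therefore corresponds to three runs, one per sample folder, each taking all three libraries of that sample as input.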
You can, of course, set restrictions on the data to be used as input from the batch folders. This is described in our manual here:
To analyze the samples in batch, please follow the steps below:
- Open the wizard for the tool or workflow you wish to run and check the Batch option.
- Select the top folder holding all the sample folders (Figure 2).
- Follow the wizard as usual to set the parameters.
- In the Result handling step you have two options when running in batch. These are: Save in input folder and Save in specified location. For the latter option there is an additional option to Create subfolders per batch unit.
- Select the option which you find most suitable for your procedures.
Figure 2: The Batch option has been checked, after which the top folder can be selected.
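As a rough illustration of how the result handling options differ, the sketch below maps each option to a destination folder for one batch unit. It is an assumption-laden model, not Workbench behavior taken from the manual: it assumes that Save in input folder stores results in the batch unit's own folder, and that Create subfolders per batch unit names each subfolder after the batch unit folder. The function `output_folder`, its parameters, and the folder names `Experiment` and `Results` are hypothetical.

```python
from pathlib import PurePosixPath


def output_folder(unit_folder, mode, target=None, per_unit_subfolders=False):
    """Return the folder where results for one batch unit would be stored.

    mode is "input" (Save in input folder) or "specified"
    (Save in specified location). This models assumed behavior only.
    """
    unit_folder = PurePosixPath(unit_folder)
    if mode == "input":
        # Assumed: results are placed next to the inputs of the batch unit.
        return unit_folder
    if mode == "specified":
        if target is None:
            raise ValueError("a target folder is required for mode 'specified'")
        target = PurePosixPath(target)
        # Assumed: the optional subfolder is named after the batch unit folder.
        return target / unit_folder.name if per_unit_subfolders else target
    raise ValueError(f"unknown mode: {mode!r}")


for sample in ("Sample A", "Sample B", "Sample C"):
    unit = f"Experiment/{sample}"
    print(output_folder(unit, "input"),
          output_folder(unit, "specified", "Results", per_unit_subfolders=True))
```

Under these assumptions, saving to a specified location without subfolders collects the results of all batch units in a single folder, which can make it harder to tell which result belongs to which sample.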
In the example shown in Figure 3, the option Save in input folder was selected and a Reads Track was produced for each sample. Note that each Reads Track is named after the first library, but the History tab of the Reads Track shows that all three libraries were included in the analysis.
Figure 3: The History tab of the output file shows the files included in the analysis.