The Trim Reads tool can automatically trim the sequencing adapters from paired reads. If your single reads include adapters that you wish to trim away, this can be done using the Trim Reads tool by including a Trim Adapter List. Likewise, primers can be trimmed away using a Trim Adapter List. To create a Trim Adapter List, you need to know the location (5' or 3' end) and orientation (forward or reverse complement) of the adapter/primer. If you do not know the location and orientation of the adapter/primer, you can find them using the steps described in this FAQ.
Quick approach using Find in the side panel
A simple and quick approach is to use the Find option in the side panel:
- Open the Read list in the sequence view
- Open the Find tab in the side panel
- Enter the primer or adapter sequence
- Click Find
This option is easy to use if the adapter/primer is present at high frequency with perfect matches. By default, the Find option searches both the positive and the negative strand. The adapter/primer should be entered in the Trim Adapter List as it is seen in the reads.
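If you want to cross-check the result outside the Workbench, the same idea (a perfect-match search on both strands) can be sketched in a few lines of Python. This is only an illustration and not part of CLC Genomics Workbench: the function names are hypothetical, and the rule used to call a hit "5' end" or "3' end" (which half of the read the match falls in) is a simplifying assumption.

```python
from collections import Counter

# Translation table for complementing DNA bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def locate_adapter(read, adapter):
    """Find the first perfect match of the adapter in a read.

    The adapter is searched as entered first, then as its reverse
    complement. Returns (orientation, end) or None if there is no match.
    Assumption: a match centred in the first half of the read is reported
    as "5' end", otherwise as "3' end".
    """
    read = read.upper()
    adapter = adapter.upper()
    candidates = (
        ("as entered", adapter),
        ("reverse complement", reverse_complement(adapter)),
    )
    for orientation, query in candidates:
        pos = read.find(query)
        if pos != -1:
            midpoint = pos + len(query) / 2
            end = "5' end" if midpoint <= len(read) / 2 else "3' end"
            return orientation, end
    return None


def summarise(reads, adapter):
    """Count how often each (orientation, end) combination occurs."""
    return Counter(locate_adapter(read, adapter) for read in reads)
```

Calling `summarise(reads, adapter)` on a list of read sequences gives a count per orientation and read end; reads without a perfect match are counted under `None`.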
Thorough approach using motif search
If you wish to do a more thorough search allowing mismatches in the adapter/primer sequence and annotating the presence of the adapter/primer, then you can use the motif search tool.
1) First create a subset of reads (minimum 100 reads, maximum 100,000). Creating a small subset of reads is necessary, as the tools used in the next step were not designed for large datasets.
You can create a subset using the Subsample Sequence List tool:
Toolbox | Utility Tools | Sequence Lists | Subsample Sequence List
In the wizard, select the option to Sample an absolute number and set the Sample size to the number X of reads you wish to include. Analysis is faster on a smaller dataset, but if the adapter/primer is only present at a low frequency, a larger subset is needed.
The output from the Subsample Sequence List tool is a new Sequence List only including X reads from the original Read List.
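For readers who prefer to prepare the subset before importing the data, the subsampling step can be approximated with a short Python sketch. It is not how the Subsample Sequence List tool is implemented; it is a generic reservoir-sampling example, and it assumes an uncompressed FASTQ file with exactly four lines per record. The function names are hypothetical.

```python
import random


def read_fastq(handle):
    """Yield (header, sequence, quality) from a 4-line-per-record FASTQ stream.

    Reading stops at end of file or at the first empty header line.
    """
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        quality = handle.readline().rstrip("\n")
        yield header, sequence, quality


def subsample(records, sample_size, seed=0):
    """Return a uniform random sample of at most sample_size records.

    Uses reservoir sampling, so the input is read once and never held
    in memory as a whole. A fixed seed keeps the result reproducible.
    """
    rng = random.Random(seed)
    reservoir = []
    for index, record in enumerate(records):
        if index < sample_size:
            reservoir.append(record)
        else:
            # Replace an existing entry with probability sample_size / (index + 1).
            slot = rng.randint(0, index)
            if slot < sample_size:
                reservoir[slot] = record
    return reservoir
```

With an open file handle `fh`, `subsample(read_fastq(fh), 10000)` would return 10,000 randomly chosen records, or all records if the file holds fewer.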
2) Next, look for the adapter/primer in your reads by running a motif search. This can be done using either the Motif Search tool from the Toolbox or the dynamic motif search. The advantage of the Motif Search tool from the Toolbox is that it allows you to account for sequencing errors in the adapter/primer and provides a table overview of the identified motifs (adapters/primers). The dynamic motif search, on the other hand, allows you to quickly add new motifs (adapters/primers) and to save the view settings, so that the same motif (adapter/primer) search can be applied to other Read Lists.
Both options are described below:
2a) Motif search from the Toolbox
- First create a Motif List with the adapter sequences:
File | New | Create Motif List
In the wizard you can import a FASTA file with all the adapter/primer sequences in the 5' - 3' orientation. If you do not have a FASTA file with your adapters/primers, you can add each adapter sequence manually by clicking the Add button.
- After creating the Motif List, run the Motif Search tool on the Read List with the 100 to 100,000 reads:
Toolbox | Classical Sequence Analysis | General Sequence Analysis | Motif Search
Use the following parameter settings:
- Motif search type: Motif List
- Choose the newly created Motif List
- Set the Accuracy (%) to somewhere between 50 and 100% depending on the accuracy you expect of your sequencing data.
- Select Include Negative Strand
- Make sure that the option to Add annotations to sequences is selected
- Finish the wizard. The output of the Motif search from the Toolbox is a motif table and the input Read List updated with motif annotations.
- View the motif annotations on the reads. To do this, make sure the Motif annotations are selected to be shown in the side panel. If only a few motifs (adapters/primers) are found, it can be helpful to view the Read List in a split view with the motif table. In the split view you can select a row in the motif table, and the view of the Read List will then jump to this position.
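To make the effect of the Accuracy (%) setting more concrete, here is a minimal Python sketch of a mismatch-tolerant search on both strands. It is not the Motif Search tool's algorithm: it assumes that accuracy means the percentage of identical positions in an ungapped comparison, which may differ from how the Workbench scores matches, and the function names are hypothetical.

```python
# Translation table for complementing DNA bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_motif(read, motif, accuracy=80.0):
    """Return (start, strand, identity_percent) for every sufficiently good hit.

    Each window of the read is compared position by position (no gaps)
    with the motif ("+") and with its reverse complement ("-"). A window
    is reported when the percentage of identical positions is at least
    the requested accuracy. A palindromic motif is reported on both strands.
    """
    read = read.upper()
    motif = motif.upper()
    if not motif:
        return []
    hits = []
    for strand, query in (("+", motif), ("-", reverse_complement(motif))):
        for start in range(len(read) - len(query) + 1):
            window = read[start:start + len(query)]
            matches = sum(a == b for a, b in zip(window, query))
            identity = 100.0 * matches / len(query)
            if identity >= accuracy:
                hits.append((start, strand, identity))
    return hits
```

A hit on the "+" strand means the motif appears in the read as entered; a hit on the "-" strand means the read contains its reverse complement. Lowering the accuracy increases sensitivity but also the number of spurious hits, especially for short motifs.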
Please see the example screen shot below:
2b) Dynamic motif search
- A dynamic motif can be added either by clicking the Add Motif button and then pasting the adapter/primer name and its sequence into the window that opens, or by clicking the Manage Motifs button, after which you can select a Motif List.
- Select to Include reverse motif and to show the added adapter/primer in the side panel.
Please see the example screen shot below:
After identifying the location and orientation of the adapter/primer in the reads, you can create a Trim Adapter List as described in the manual page linked below. From CLC Genomics Workbench 11 onwards, the adapter/primer sequence should be added in the orientation seen in the reads, regardless of whether you trim on the 5' or 3' end.
Creating a new Trim adapter list