10.3. How to trim adapters from miRNA data sequenced on an Illumina machine?
To trim the small RNA adapter from Illumina microRNA (miRNA) reads, create a Trim Adapter List as follows:
- Go to: File | New | Trim Adapter List.
- Click the Add row button.
- Type or copy/paste the name of the adapter.
- Type or copy/paste the sequence of the adapter from Illumina's FAQ page: https://support.illumina.com/bulletins/2016/12/what-sequences-do-i-use-for-adapter-trimming.html or from the Illumina adapter sequences document: https://support.illumina.com/downloads/illumina-adapter-sequences-document-1000000002694.html, depending on the adapter being used.
- Choose to trim "All Reads".
- For the action when an adapter is found, choose "Remove the adapter and following sequence (3' trim)".
- For reads without adapters, choose "Discard the read".
- Click Next.
- Leave the default options for Alignment score costs, but adjust the Match thresholds for Internal or End matches according to the sequenced read length and how specific you want adapter recognition to be. Two examples are included in Figure 1. Details on the Alignment score costs and Match thresholds can be found on the manual page "Creating a new Trim adapter list".
- Click Finish.
Why should the Trim adapter list be created this way?
For miRNA data, you will normally sequence through the miRNA and into the attached adapter sequence. If the read includes the full adapter or remnants of it, this indicates that the read is indeed a miRNA and not mRNA, rRNA or DNA that was not completely removed from the sample. Therefore, this kind of data is trimmed using the trimming action "Discard when not found". This option removes the adapter when it is found and discards the reads in which no adapter is found.
Since the read is sequenced from the 5' end through the miRNA sequence and into the adapter sequence, it is the 3' end which should be trimmed away.
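As a rough illustration of the behaviour described above (remove the adapter and everything after it, discard reads without an adapter), here is a minimal Python sketch. It is not how the Workbench implements trimming: it uses exact string matching only, with no mismatches or gaps. The function name trim_3prime_adapter and the parameter min_end_match are hypothetical, the 21 nt insert is a made-up sequence, and the adapter sequence is assumed to be the TruSeq Small RNA 3' adapter and should be checked against Illumina's document.

```python
from typing import Optional


def trim_3prime_adapter(read: str, adapter: str, min_end_match: int = 6) -> Optional[str]:
    """Return the read without the adapter and any following sequence.

    Returns None when no adapter is found, meaning the read is discarded.
    Exact matching only; this is a simplification for illustration.
    """
    # Internal match: the complete adapter occurs somewhere in the read.
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]
    # End match: the read stops partway through the adapter, so look for the
    # longest adapter prefix that is also a suffix of the read.
    longest = min(len(adapter) - 1, len(read))
    for k in range(longest, min_end_match - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:len(read) - k]
    # No adapter found: discard the read.
    return None


# Assumed adapter sequence (verify against Illumina's adapter document).
TRUSEQ_SMALL_RNA = "TGGAATTCTCGGGTGCCAAGG"   # 21 nt
# Made-up 21 nt insert standing in for a miRNA.
TOY_INSERT = "ACGTTGCATCGATCGGATCAA"

read_internal = TOY_INSERT + TRUSEQ_SMALL_RNA + "AAAAAAAA"   # 50 nt, adapter inside the read
read_end = TOY_INSERT + TRUSEQ_SMALL_RNA[:15]                # 36 nt, adapter runs off the end
read_no_adapter = "ACGTACGTACGTACGTACGTACGTACGTACGTACGT"     # no adapter present

print(trim_3prime_adapter(read_internal, TRUSEQ_SMALL_RNA))
print(trim_3prime_adapter(read_end, TRUSEQ_SMALL_RNA))
print(trim_3prime_adapter(read_no_adapter, TRUSEQ_SMALL_RNA))
```

The first two calls return the 21 nt insert and the third returns None, which corresponds to discarding the read.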
Figure 1: A 36 nucleotide (nt) long read including the Small RNA v1.5 3′ Adapter (example read from SRR038853, downloaded from SRA) and a 50 nt long read including the TruSeq Small RNA Adapter (example read from SRX1818566, downloaded from SRA).
If your reads are as in the 36 nt long example, the first nucleotides are the miRNA (21 nt in this example), followed by the adapter (24 nt in this example, the Small RNA v1.5 3′ Adapter), which extends beyond the end of the read, so that only 15 nt of the adapter are within the read. Hence, you need to select the option "Allow end trimming". Set the minimum end score according to how specific you want adapter recognition to be. If you set it to 6, for example, as done in our tutorial, you allow up to three mismatches or two gaps in cases where the miRNA is 21 nt long.
If your reads are longer, say 50 nt, the adapter sequence will be found in the middle of the read. In this example, the first 21 nt are the miRNA, followed by the 21 nt adapter (TruSeq Small RNA) and 8 nt of sequence after the adapter. In this case, you need to select the option "Allow internal matches". Set the minimum internal score according to how specific you want adapter recognition to be. If you leave the default of 10, up to three mismatches or two gaps are allowed for the 21 nt adapter.
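The mismatch and gap limits quoted in the two examples can be checked with a little arithmetic. The Python sketch below assumes that a match scores +1, a mismatch costs 2 and a gap costs 3; these values are an assumption about the default Alignment score costs and should be verified in the dialog. It also simplifies by letting each gap take up one aligned adapter position. The function name adapter_match_score is hypothetical.

```python
def adapter_match_score(aligned_len: int, mismatches: int = 0, gaps: int = 0,
                        mismatch_cost: int = 2, gap_cost: int = 3) -> int:
    """Score an adapter match under the ASSUMED costs (+1 match, -2 mismatch, -3 gap)."""
    matches = aligned_len - mismatches - gaps
    return matches - mismatch_cost * mismatches - gap_cost * gaps


# End match: 36 nt read with a 21 nt miRNA leaves 15 nt of adapter in the read.
end_len = 36 - 21
min_end_score = 6
for mm in (3, 4):
    score = adapter_match_score(end_len, mismatches=mm)
    print(f"end match, {mm} mismatches: score {score}, accepted: {score >= min_end_score}")
for g in (2, 3):
    score = adapter_match_score(end_len, gaps=g)
    print(f"end match, {g} gaps: score {score}, accepted: {score >= min_end_score}")

# Internal match: the full 21 nt adapter lies inside the 50 nt read.
internal_len = 21
min_internal_score = 10
for mm in (3, 4):
    score = adapter_match_score(internal_len, mismatches=mm)
    print(f"internal match, {mm} mismatches: score {score}, accepted: {score >= min_internal_score}")
for g in (2, 3):
    score = adapter_match_score(internal_len, gaps=g)
    print(f"internal match, {g} gaps: score {score}, accepted: {score >= min_internal_score}")
```

Under these assumed costs, three mismatches or two gaps still reach the threshold in both examples, while four mismatches or three gaps do not.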
Our tutorial using the 36 nt long reads can be found on our webpage as follows: