2.20. RNA-Seq Analysis does not count some reads covering start-end position of a circular chromosome
A problem with read counts has been identified in the RNA-Seq Analysis tool when using circular reference sequences. It affects all single reads and some paired-end reads that map across the start/end position of a circular reference where a gene annotation spans this region. Affected reads are mistakenly treated as mapping to an intergenic region.
For affected analyses, all count values for genes crossing the start/end position of the circular reference, as well as any values derived from these counts, will be incorrect.
This issue does not affect RNA-Seq analyses where the "One reference sequence per transcript" option has been selected. It also does not affect analyses using the older legacy RNA-Seq Analysis tool.
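To judge whether a given reference is likely to be affected, it can help to list the genes whose annotation spans the start/end position. The following is a minimal sketch only, not part of the CLC software: the `Gene` class, the example coordinates, and the convention that a wrapping gene is stored with `end < start` are all assumptions made for illustration. Annotations exported from your own data may represent such genes differently, for example as joined regions.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    """Hypothetical gene record with 1-based, inclusive coordinates."""
    name: str
    start: int
    end: int  # assumption: end < start means the gene wraps past the origin


def crosses_origin(gene: Gene) -> bool:
    """True if the gene spans the start/end position of the circular reference."""
    return gene.end < gene.start


def affected_genes(genes: list[Gene]) -> list[str]:
    """Names of genes whose counts could be affected by the issue."""
    return [g.name for g in genes if crosses_origin(g)]


# Made-up example annotations on a circular reference of length 5000.
genes = [
    Gene("geneA", 100, 900),
    Gene("geneB", 4900, 150),   # wraps across the start/end position
    Gene("geneC", 2000, 2600),
]

print(affected_genes(genes))
```

If the list is empty, no gene annotation crosses the start/end position under this convention, and the counts for that reference should not be affected by this particular issue.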
If your analyses are affected by this problem and you have not yet upgraded to a version of the software where it has been addressed, the workaround involves redefining the start point of circular references to lie outside a gene region and re-running the analyses with the new references. These are the standard steps that would be involved:
- Create a standard, annotated sequence or sequence list from the circular reference using the Convert from Tracks tool, selecting the sequence track and all annotation tracks that you want to use for this or any other analysis involving these references.
- View the reference sequence to be adjusted in circular view and move the start position to somewhere not covered by a gene annotation. How to do this is described in the manual.
- Use the Convert to Tracks tool on the annotated reference sequence(s), choosing to generate a sequence track and all the annotation track types you selected in the first step.
Please ensure the new references and annotation tracks are easily distinguished from the old ones. Mixing new reference sequences with old annotation tracks, or vice versa, would produce incorrect results, because the annotations would be at the wrong positions along the reference sequence.
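The effect of moving the start position, and the reason old and new tracks must not be mixed, can be illustrated with a small sketch. This is not how the Workbench implements the operation. The function names, the toy sequence, and the `end < start` convention for a wrapping gene are assumptions chosen for illustration.

```python
def rotate_sequence(seq: str, new_start: int) -> str:
    """Rotate a circular sequence so that 1-based position new_start becomes position 1."""
    offset = new_start - 1
    return seq[offset:] + seq[:offset]


def shift_position(pos: int, new_start: int, length: int) -> int:
    """Map a 1-based position on the old reference to the rotated reference."""
    return (pos - new_start) % length + 1


def extract(seq: str, start: int, end: int) -> str:
    """Return the 1-based, inclusive region; end < start is read as wrapping past the origin."""
    if start <= end:
        return seq[start - 1:end]
    return seq[start - 1:] + seq[:end]


old_ref = "AAACCCGGGTTT"          # toy circular reference, length 12
gene_start, gene_end = 11, 2      # gene wraps across the start/end position
new_start = 6                     # assumed to lie outside any gene

new_ref = rotate_sequence(old_ref, new_start)
new_gene = (
    shift_position(gene_start, new_start, len(old_ref)),
    shift_position(gene_end, new_start, len(old_ref)),
)

# New reference with new annotation: same gene sequence, no longer wrapping.
print(extract(old_ref, gene_start, gene_end), extract(new_ref, *new_gene))

# New reference with OLD annotation: coordinates point at the wrong bases.
print(extract(new_ref, gene_start, gene_end))
```

After rotation, the gene occupies a contiguous region on the new reference, and the same bases are recovered only when the annotation coordinates were shifted together with the sequence.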
Affected software:
- CLC Genomics Workbench 7.x, 8.x, 9.x, 10.x and 11.x
- All versions of Biomedical Genomics Workbench
- CLC Genomics Server 6.x, 7.x, 8.x, 9.x and 10.x