
5.1. How can I view the bases for mapped reads that extend beyond the end of a reference sequence?

This information is relevant for

  • stand-alone mappings with reference and consensus sequences
  • stand-alone mappings with just a contig sequence, such as produced by mapping reads back to de novo assembled contigs. For such data, please just replace the word "reference" with "contig" in the instructions below.

This information does not pertain to track-based objects.

 

For reads that map at the end of a reference sequence, any section of the read extending beyond the end of the reference sequence will not be visible. The existence of read information extending beyond the end of the reference sequence is indicated by an arrow (>). This is shown in figure 1 of the pdf attached to this page.

The steps you need to take to be able to view the bases in such unaligned read ends are:

  1. Edit the reference sequence, adding a reasonably long stretch of N characters to the end. How long depends on how long your reads are and how long the unaligned ends you wish to view are.

  2. Re-run the mapping using this edited reference.
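As a rough guide to the length chosen in step 1: an unaligned end can never be longer than the read it belongs to, so a stretch of Ns as long as your longest read is always enough. The sketch below only illustrates that arithmetic. The function name `suggest_pad_length` and its `margin` parameter are hypothetical and are not part of the Workbench.

```python
def suggest_pad_length(read_lengths, margin=0):
    """Return a number of Ns at least as long as the longest read.

    read_lengths: iterable of read lengths in bases (assumed known,
                  for example from your sequencing run details).
    margin:       optional extra Ns to add on top (hypothetical knob).
    """
    lengths = list(read_lengths)
    if not lengths:
        raise ValueError("need at least one read length")
    if margin < 0:
        raise ValueError("margin must be non-negative")
    # The overhang of a mapped read is always shorter than the read itself,
    # so the longest read length is a safe upper bound.
    return max(lengths) + margin
```

For example, with reads of 75, 100 and 150 bases, this suggests adding 150 Ns.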

 

Editing your reference sequence

To add a series of Ns to the start of a reference sequence:

  • Open the reference sequences object and highlight the first nucleotide.
  • Right click the highlighted nucleotide and select "Edit Selection" in the menu that appears. This is shown in figure 4 of the attachment.
  • Type in at least as many Ns as your reads are long and click the button labeled "OK" to save this change.

To add Ns at the other end of the reference, just select the last base of the reference and repeat the above actions.
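If you prefer to prepare the edited reference outside the Workbench, for example on a FASTA export that you then import again, the same padding can be scripted. The following is a minimal sketch in plain Python, not a Workbench feature. The function names `pad_sequence` and `pad_fasta`, and the 60-character line width, are assumptions made for illustration.

```python
def pad_sequence(seq, pad_len, pad_start=False, pad_end=True):
    """Return seq with pad_len N characters added to the chosen end(s)."""
    if pad_len < 0:
        raise ValueError("pad_len must be non-negative")
    prefix = "N" * pad_len if pad_start else ""
    suffix = "N" * pad_len if pad_end else ""
    return prefix + seq + suffix


def pad_fasta(fasta_text, pad_len, pad_start=False, pad_end=True, width=60):
    """Pad every record in FASTA-formatted text and re-wrap the sequence lines."""
    records = []
    header, chunks = None, []
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line, []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))

    out = []
    for header, seq in records:
        padded = pad_sequence(seq, pad_len, pad_start, pad_end)
        out.append(header)
        # Wrap the padded sequence at a fixed line width.
        out.extend(padded[i:i + width] for i in range(0, len(padded), width))
    return "\n".join(out) + "\n"
```

Calling `pad_fasta(text, 150, pad_start=True)` would add 150 Ns to both ends of every sequence in `text`; with the defaults, only the end of each sequence is padded.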

 

 

Working with mappings with extended reference sequences

After mapping to your edited reference, the ends of any reads extending into the area with the Ns will be considered non-matching, and will therefore appear in faded colours.

If no un-mapped ends appear then please make sure that you have selected to show the sequence ends in the side panel. This is illustrated in figure 2 of the pdf attached to this page.

To get the consensus sequence for such regions, you need to manually drag the end point for each read to be considered. To do this:

  • Make sure you are working with the Selection cursor. To do so, click on the button marked with an arrow and labelled Selection in the top toolbar of the Workbench.
  • Ensure the compactness setting for the mapping is set to "Not compact".
  • Put the selection cursor on the grey vertical line between the faded and non-faded parts and press and hold the left mouse button.
  • Keeping the mouse button held down, for each relevant read, drag to the left to extend the 5' end, or to the right to extend the 3' end.

All read ends that you want to include in the consensus (or contig) calculation should be dragged to include them in the mapping in the above manner.

The consensus sequence will appear after you have released the mouse button after dragging.

 

 

Additional information about viewing options for mappings

If you select the compactness view called Packed in the side panel and then choose to Show mismatches, the nucleotides are colored in a way that may make it easier for you to get an idea of how well the consensus represents your reads.

This is illustrated in figure 4 of the attachment.

Alternatively, if you want to include all reads, but without viewing the quality scores, you may wish to try the compactness view called Low.

You can find further details about viewing options for read mappings in our manual here:

http://resources.qiagenbioinformatics.com/manuals/clcgenomicsworkbench/current/index.php?manual=View_settings_in_Side_Panel.html
