
6.4. What do N characters represent in the output of my de novo assembly?

When considering submission of your genomic assembly to a public repository such as NCBI, it is important to know what the N characters in the assembly stand for. 

N characters in de novo assembly outputs can represent two things, depending on the de novo assembly parameters that were used. 

 

If the de novo assembly was performed with the option "Perform Scaffolding" turned OFF, then the N characters can represent:

  1. Positions where all the input sequencing reads themselves contained Ns.

 

If the de novo assembly was performed with the option "Perform Scaffolding" turned ON, then the N characters can represent:

  1. Positions where all the input sequencing reads themselves contained Ns.
  2. Regions between scaffolded contigs. Here, the number of Ns represents the approximate distance between contigs in the reported scaffolding.

The first case, Ns carried over from the input reads, should not occur often. To determine which case applies, check whether a scaffold annotation is associated with a given tract of Ns in the assembly output; a tract introduced by scaffolding is expected to have one. One way to do this is illustrated in the following figure:
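Outside the Workbench, a complementary check is to list where tracts of Ns occur in an assembly that has been exported to FASTA. The following is a minimal Python sketch, not part of any CLC tool; the function names `read_fasta` and `find_n_runs` are hypothetical helpers introduced here for illustration. It only reports the position and length of each tract. It does not read scaffold annotations, so on its own it cannot tell which of the two cases a tract belongs to.

```python
import re
from typing import Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first word of the header as the sequence name.
            name, chunks = (line[1:].split() or [""])[0], []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def find_n_runs(sequence: str, min_length: int = 1) -> List[Tuple[int, int, int]]:
    """Return (start, end, length) for each run of N or n.

    Coordinates are 1-based and inclusive. Runs shorter than
    min_length are skipped.
    """
    runs = []
    for match in re.finditer(r"[Nn]+", sequence):
        length = match.end() - match.start()
        if length >= min_length:
            runs.append((match.start() + 1, match.end(), length))
    return runs
```

To use it, open the exported FASTA file, pass the file handle to `read_fasta`, and call `find_n_runs` on each sequence. The reported coordinates can then be compared with the scaffold annotations shown in the Workbench.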

 

For more information regarding how scaffolding can be used to optimize the graph using paired reads, please refer to the following manual section:

http://resources.qiagenbioinformatics.com/manuals/clcgenomicsworkbench/current/index.php?manual=Optimization_graph_using_paired_reads.html#sec:scaffolding

 

Direct export to AGP format, suitable for submission to NCBI, is available in CLC Genomics Workbench. It is described in the following manual section:

http://resources.qiagenbioinformatics.com/manuals/clcgenomicsworkbench/current/index.php?manual=AGP_export.html

 

To close the gaps represented by Ns, we recommend using the CLC Genome Finishing Module. More information about this module can be found on the QIAGEN Bioinformatics webpage:

https://www.qiagenbioinformatics.com/products/clc-genome-finishing-module/
