
1.1. In which order should I use CLC Genome Finishing Module tools?

This page summarizes the tools delivered by the CLC Genome Finishing Module, presented in the order in which they might typically be used in practice.


Getting started (create an assembly and map the reads back to the contigs)

To create an initial assembly, the De Novo Assembly tool of the CLC Genomics Workbench can be used. In addition, PacBio reads can be assembled using the PacBio De Novo Assembly Pipeline delivered by the CLC Genome Finishing Module, and PacBio and Oxford Nanopore reads can be assembled using the De Novo Assemble Long Reads tool delivered by the Long Read Support (beta) plugin.

Most of the tools provided by the CLC Genome Finishing Module take stand-alone read mappings as input. In the Navigation Area, stand-alone read mappings are shown with their own icons.

If you do not already have such a mapping, you will need to take your unfinished genome contigs and map the reads to these using the Map Reads to Contigs tool. In the Map Reads to Contigs wizard, select the output option Create stand-alone read mappings (not the default). 


Improving Contiguity (automated tools)

Join Contigs

The number of contigs in the assembly can be reduced by using the Join Contigs tool to join contigs that are likely to represent one contiguous sequence. Automatic contig joining can be done using long reads, a closely related reference, or paired read data, or by aligning contigs to each other.

Further details can be found in the manual:



Improving Completeness (automated tools)

Align Contigs

The Align Contigs tool can be used to map your contigs to a related reference, or, if no such reference is available, to the contigs themselves. This helps identify areas of the genome not covered by the assembled contigs. It also determines the orientation and location of the contigs relative to the reference, which allows the identification of possible misassemblies, repeats, and overlaps between contigs.

The Align Contigs tool is the gateway to downstream visual inspection and manual editing of the contigs.

From the output of the Align Contigs tool, one can join, split and edit contig sequences, view the reads mapped to a given contig, remap all mapped reads to one or more contigs, and replace all mapped reads with reads from one or more datasets.

Further information can be found in the manual:



Improving Correctness (manual editing tools)

The tools listed below are generally used in combination with the Align Contigs tool, described above.

Analyze Contigs

Detect misassemblies by analyzing reads mapped to contigs using the Analyze Contigs tool.

It identifies and annotates problematic regions in the de novo assembly. These annotations can then be used to pinpoint misassembled regions during visual inspection of the alignments.

You can select an area of a contig and perform a range of actions, as described in the manual:



Reassemble regions

Misassembled regions and other problematic areas can be reassembled using the Reassemble regions tool. This adjusts the read mapping and makes changes in the consensus sequence based on the reads in the selected region.

To reassemble all annotated regions, the tool can be run from the Workbench Toolbox.

To reassemble a specific region, right-click on that region in the read mapping and choose the tool from the menu.

Further details can be found in the manual:



Extend Contigs

The Extend Contigs tool extends contigs using information from reads that continue beyond the ends of the contigs.

Extending contigs can create regions of overlap between contigs. Running the Align Contigs tool on the extended contigs afterwards can help to visualize overlapping contigs that could be joined.
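
To make the idea of overlapping contig ends concrete, the following is a minimal Python sketch that looks for an exact overlap between the end of one contig and the start of another. It is an illustration only and not how the CLC tools work internally: the function name, the min_len parameter and the example sequences are all hypothetical, and exact string matching is a simplification, since contig ends are normally compared by alignment, which can tolerate mismatches.

```python
def suffix_prefix_overlap(left: str, right: str, min_len: int = 20) -> int:
    """Return the length of the longest exact overlap between the end of
    `left` and the start of `right`, or 0 if it is shorter than `min_len`.

    Hypothetical helper for illustration; exact matching only.
    """
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    longest_possible = min(len(left), len(right))
    # Try the longest candidate first so the first hit is the best one.
    for k in range(longest_possible, min_len - 1, -1):
        if left[-k:] == right[:k]:
            return k
    return 0


# Toy sequences (made up): the last 6 bases of contig_a match the
# first 6 bases of contig_b.
contig_a = "ACGTACGTTTGACC"
contig_b = "TTGACCGGATA"
overlap = suffix_prefix_overlap(contig_a, contig_b, min_len=4)
print(overlap)
```

An overlap found this way only suggests that two contigs might be joined; the decision itself is made in the Workbench after inspecting the aligned contigs.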

Further details can be found in the manual:



Collect Paired Read Statistics

The Collect Paired Read Statistics tool looks for broken paired reads, where the two reads of a pair map to different contigs. The output from this tool can be used to learn how contigs are positioned relative to one another, and about potential overlaps and gaps between pairs of contigs.
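
The underlying idea, counting read pairs whose two reads land on different contigs, can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the tool's implementation or output format: the PairMapping record, the function name and the example data are hypothetical, and real statistics would also take read positions and orientations into account.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class PairMapping(NamedTuple):
    """Contigs to which the two reads of one pair mapped (hypothetical record)."""
    pair_id: str
    contig_read1: str
    contig_read2: str


def count_cross_contig_pairs(pairs: Iterable[PairMapping]) -> Counter:
    """Count pairs whose reads map to two different contigs.

    Keys are unordered contig pairs, stored as sorted tuples so that
    (A, B) and (B, A) are counted together.
    """
    links = Counter()
    for pair in pairs:
        if pair.contig_read1 != pair.contig_read2:
            key = tuple(sorted((pair.contig_read1, pair.contig_read2)))
            links[key] += 1
    return links


# Made-up example data: p3 maps within a single contig and is ignored.
example = [
    PairMapping("p1", "contig_1", "contig_2"),
    PairMapping("p2", "contig_2", "contig_1"),
    PairMapping("p3", "contig_1", "contig_1"),
    PairMapping("p4", "contig_2", "contig_3"),
]
links = count_cross_contig_pairs(example)
for (contig_a, contig_b), n_pairs in links.most_common():
    print(contig_a, contig_b, n_pairs)
```

Contig pairs supported by many such read pairs are candidates for being neighbors in the genome.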

How to use the Collect Paired Read Statistics tool is described in the manual:



Resequencing unresolved regions

Create Amplicons

The Create Amplicons tool can subdivide a problematic region, for example a region without any coverage or with low coverage, into amplicons of suitable sizes and annotate these accordingly. Subsequently, these annotations can be used as targets for the Create Primers tool, described below. 
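
As a rough illustration of what subdividing a region into amplicons involves, the Python sketch below tiles a region with overlapping windows of bounded length. The function, its max_len and overlap parameters, and the coordinates are assumptions made for this example; the actual options and sizing rules of the Create Amplicons tool are described in the manual and may differ.

```python
def tile_region(start: int, end: int, max_len: int, overlap: int) -> list[tuple[int, int]]:
    """Split the half-open region [start, end) into windows of at most
    `max_len` bases, with neighboring windows sharing `overlap` bases.

    Hypothetical helper for illustration only.
    """
    if overlap < 0 or max_len <= overlap:
        raise ValueError("need 0 <= overlap < max_len")
    if end <= start:
        return []
    step = max_len - overlap
    windows = []
    pos = start
    while True:
        window_end = min(pos + max_len, end)
        windows.append((pos, window_end))
        if window_end == end:
            break
        pos += step
    return windows


# A made-up 1200 bp low-coverage region, tiled with amplicons of up to
# 500 bp that overlap their neighbors by 50 bp.
amplicons = tile_region(1000, 2200, max_len=500, overlap=50)
for amplicon_start, amplicon_end in amplicons:
    print(amplicon_start, amplicon_end)
```

The overlap between neighboring windows is one way to make sure the resequenced fragments can later be stitched together without gaps.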

How to use the Create Amplicons tool is described in the manual:



Create Primers

The Create Primers tool is for automated primer design to support resequencing of regions with poor read quality, repeats, or low coverage. 

How to use this tool is described in detail in the manual:



Add Reads to Contigs

Additional reads, such as those obtained when resequencing problematic regions, can be added to an existing contig using the Add Reads to Contigs tool.

Further details about this tool can be found in the manual:



Annotating the genome

Annotate from Reference

If you have a closely related reference which is annotated, you can use the Annotate from Reference tool to transfer the annotations from this reference to a set of contigs.

Further details about this tool can be found in the manual:

