6.7. How can I do a Hybrid Assembly of Long and Short Reads?

QIAGEN CLC software supports two ways of doing a hybrid assembly of long reads (e.g. PacBio or Oxford Nanopore) and short reads (e.g. Illumina):

  1. Assemble long reads using the De Novo Assemble Long Reads tool, followed by polishing using short reads with the Polish with Reads tool. Both tools come with the free Long Reads Support (beta) plugin.
  2. Assemble the short reads using the De Novo Assembly tool built into the Genomics Workbench, then join the resulting contigs using the long reads with the Join Contigs tool, which comes with the commercial Genome Finishing Module.

The small benchmark below shows that option 1 is generally the better approach. However, if option 1 does not give good results on your data, we recommend trying option 2 instead.


Benchmark comparing options for hybrid assembly using QIAGEN CLC software:

AP = Alignment percentage

ANI = Average Nucleotide Identity 

Note: The Join Contigs tool cannot use reads longer than 99,999 base pairs.
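
If your long-read data contains reads above this limit, one possible workaround is to set those reads aside before importing the data for option 2. The following is only a minimal sketch and not part of the QIAGEN CLC software: it assumes the long reads are stored as uncompressed, four-line FASTQ records, and the names parse_fastq, filter_reads and filter_file are hypothetical helpers made up for this illustration. Whether discarding the longest reads is acceptable depends on your data.

```python
from typing import Iterable, Iterator, Tuple

# Read-length limit taken from the note above.
MAX_READ_LENGTH = 99_999

# (header, sequence, separator, quality)
FastqRecord = Tuple[str, str, str, str]


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield records from four-line FASTQ text (assumed format, no line wrapping)."""
    record = []
    for line in lines:
        record.append(line.rstrip("\r\n"))
        if len(record) == 4:
            yield (record[0], record[1], record[2], record[3])
            record = []
    # Anything left over that is not just blank lines is an incomplete record.
    if any(record):
        raise ValueError("truncated FASTQ record at end of input")


def filter_reads(lines: Iterable[str],
                 max_length: int = MAX_READ_LENGTH) -> Iterator[FastqRecord]:
    """Yield only the records whose sequence has at most max_length bases."""
    for rec in parse_fastq(lines):
        if len(rec[1]) <= max_length:
            yield rec


def filter_file(in_path: str, out_path: str,
                max_length: int = MAX_READ_LENGTH) -> Tuple[int, int]:
    """Write reads within the limit to out_path; return (kept, dropped) counts."""
    kept = dropped = 0
    with open(in_path) as src, open(out_path, "w") as dst:
        for rec in parse_fastq(src):
            if len(rec[1]) <= max_length:
                dst.write("\n".join(rec) + "\n")
                kept += 1
            else:
                dropped += 1
    return kept, dropped
```

For example, filter_file("long_reads.fastq", "long_reads.filtered.fastq") (both file names are placeholders) would write the reads of at most 99,999 bases to the second file and return how many reads were kept and dropped.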


The images below show example workflows for the two options:


Option 1: De Novo Assemble Long Reads and Polish with Reads Workflow


Option 2: De Novo Assembly Short Reads and Join Contigs using Long Reads


