
8.6. How to add additional information from the annotation tracks to my RNA-Seq results?

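This FAQ includes two sections: one on adding information to Expression Tracks (GE or TE), and one on adding information to the Expression Browser.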

-----------------------

How to add additional information from the Annotation Tracks to Expression Tracks (GE or TE)

The feature information that is carried over from the Annotation Tracks (Gene and mRNA/RNA) to the Expression Tracks (GE and TE) depends on what is available in the Annotation Track. The list below shows the maximum set of information columns that can be carried over:

  • Name
  • Chromosome
  • Region
  • GeneID/Transcript ID
  • Hyperlink to available database, e.g. ENSEMBL, RefSeq, etc.
  • Biotype

In some cases the annotation track contains more relevant information that you wish to add to your RNA-Seq results. Examples are the "Description" column, which is often available when the annotation comes from a GFF3 file, and the "Product" information, which may be available if the reference originates from a GenBank file.

Additional information that is available in the Annotation Track can be added to the Expression Track (GE and/or TE), as well as to the Statistical Comparison Track, by using the Annotate with Overlap Information tool. The tool is found under the following Genomics Workbench toolbox menu:

Track Tools | Annotate and Filter | Annotate with Overlap Information

The tool is described on the following manual page:

http://resources.qiagenbioinformatics.com/manuals/clcgenomicsworkbench/current/index.php?manual=Annotate_with_overlap_information.html

After annotating the Expression Track or Statistical Comparison Track with information from the Annotation Track, the added columns are available on the right-hand side of the table view.

 

How to add additional information from the Annotation Tracks to the Expression Browser

The feature information that is carried over from the annotation track to the Expression Browser includes:

  • Name
  • Chromosome
  • Region
  • Identifier

In some cases the annotation track contains more relevant information that you wish to add to your Expression Browser. Examples are the "Description" column, which is often available when the annotation comes from a GFF3 file, and the "Product" information, which may be available if the reference originates from a GenBank file. It is currently not possible to add more information from the annotation tracks directly to the Expression Browser. However, you can create a Generic Annotation File from the Annotation Track, and then add that file to the Expression Browser.

To create a Generic Annotation File from the Annotation Track, follow the procedure below:

  1. Click the Export button in the toolbar.
  2. Type csv in the search field of the export wizard.
  3. Choose the option Table CSV and click Select (Figure 1).
  4. Select the Annotation Track of interest in the export wizard and click Next.
  5. Make sure to uncheck the "Export all columns" option in the Set parameters step of the wizard and click Next (Figure 2).
  6. Select the Name and Description columns and any other columns that you would like to add to the experiment (Figure 3). The Name column is needed to match each description to the right feature, so it should always be included. Click Next.
  7. Choose where to save the file and click Finish.
  8. After export, open the .csv file in Notepad++ or another text editor (Figure 4).
  9. Change the Name header to Feature ID (Figure 5), and change the other selected column names to valid headers as listed in the manual section "Generic annotation file for expression data format". Description is already one of the valid headers, so it does not need to be edited.
  10. Save the edits.
  11. Import the .csv file into the CLC Genomics Workbench using the Standard import.
  12. After import, the icon of the imported annotation file should look as shown in Figure 6. If the icon does not look like this, something has gone wrong with the formatting, or one of the headers is not valid.
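
If you prefer not to edit the header by hand, steps 8 to 10 can be scripted. The following is a minimal Python sketch, not part of the Workbench: it assumes the exported file is comma-separated with a single header row, and the function name, the mapping and the file names in the comments are hypothetical. Only the Name to Feature ID rename is taken from the procedure above; any other headers you map must be checked against the manual.

```python
import csv
import io

# Assumed mapping from exported column names to generic annotation headers.
# Only "Name" -> "Feature ID" comes from the procedure above; extend as needed.
HEADER_MAP = {"Name": "Feature ID"}


def rename_headers(src, dst, header_map=HEADER_MAP, delimiter=","):
    """Copy CSV rows from src to dst, renaming header cells via header_map."""
    reader = csv.reader(src, delimiter=delimiter)
    writer = csv.writer(dst, delimiter=delimiter, lineterminator="\n")
    header = next(reader)  # first row is assumed to hold the column names
    writer.writerow([header_map.get(col.strip(), col.strip()) for col in header])
    for row in reader:  # data rows are copied unchanged
        writer.writerow(row)


# Small in-memory example standing in for an exported annotation table.
exported = io.StringIO('Name,Description\ngeneA,"kinase, putative"\n')
converted = io.StringIO()
rename_headers(exported, converted)
print(converted.getvalue())

# With real files (hypothetical names), the call would look like:
# with open("annotation_export.csv", newline="") as src, \
#         open("generic_annotation.csv", "w", newline="") as dst:
#     rename_headers(src, dst)
```

Headers that are not in the mapping, such as Description, are left as they are. If your export uses a different delimiter or encoding, adjust the delimiter argument and the open() calls accordingly.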

 

Figure 1: Choose to export to Table CSV.

 

Figure 2: Uncheck the option "Export all columns".

 

Figure 3: Select the columns of interest. In this example "Name" and "Description".

 

Figure 4: The exported file opened in Notepad++.

 

Figure 5: Edit the headers so that they are valid for the Generic annotation file format.

 

Figure 6: The Generic annotation file imports as an annotation table.

 

Finally, you can create an Expression Browser, choosing the newly imported Generic Annotation file as an annotation resource. How to create an Expression Browser is described on the following manual page:

http://resources.qiagenbioinformatics.com/manuals/clcgenomicsworkbench/current/index.php?manual=Create_Expression_Browser.html

 

Figure 7: Expression Browser view showing the "Description" as an annotation.

 

 

 

 

 

 

 
