
8.7. Why do I get p-values of zero from Differential Expression Analysis?

The p-value is the probability, under the null hypothesis, of observing data at least as extreme as the data actually observed. In Differential Expression Analysis, the null hypothesis is that there is no differential expression.

The less likely the observation is under the assumption of no differential expression, the smaller the p-value becomes. In theory, a statistical test can return a p-value of exactly zero if the observation is simply impossible under the null hypothesis. In practice, this is extremely rare.

The most likely reason for p-values of zero in the Statistical comparison track is so-called "arithmetic underflow": the true value is a very small positive number that the computer cannot represent, so it is stored as zero. This should not be an issue, because the "true" numbers are so small that, for practical purposes, they might as well be 0.0.
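
The effect described above can be reproduced in any language that uses double-precision numbers. The following is a minimal Python sketch, not code from CLC Genomics Workbench; the variable names are chosen for illustration only. It multiplies two small positive numbers whose true product (1E-400) is positive but too small to be stored, so the result is exactly 0.0.

```python
import sys

# Smallest positive "normal" double-precision value.
smallest_normal = sys.float_info.min                        # 2.2250738585072014e-308

# Smallest positive value of any kind (a subnormal), equal to 2**-1074.
smallest_subnormal = smallest_normal * sys.float_info.epsilon   # 5e-324

# The true product is 1e-400: positive, but below the representable
# range, so the stored result underflows to exactly zero.
tiny = 1e-200
underflowed = tiny * tiny

print(smallest_normal)
print(smallest_subnormal)
print(underflowed)           # 0.0
print(underflowed == 0.0)    # True
```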

CLC Genomics Workbench uses double-precision floating-point variables (Double) for the P-value column in the Statistical comparison track. Floating-point variables can hold a very wide range of values (for Double, magnitudes up to about 1.798E+308), but values are stored with limited precision because a fixed number of bits (64 for Double) is used.

Double variables carry roughly 15 to 17 significant decimal digits. This means that there is a limit on the smallest value x > 0 that a calculation involving Double can return, and that limit depends on the input values. As a result, values in result tables, such as those in the P-value column, may be automatically rounded to the nearest representable value.

The "real" P-values in the zero-value cells are therefore unknown but bounded: 1.1102230246251565E-16 > P > 0. The upper bound equals 2^-53, which is half the machine epsilon of Double.
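
One way such a bound can arise is when a small probability is obtained by subtracting a number very close to 1 from 1. The sketch below illustrates this in Python. It assumes, purely for illustration, that a p-value is computed as 1 minus a cumulative probability; the source does not state how CLC Genomics Workbench computes its p-values, and `upper_tail` is a hypothetical helper, not part of the software.

```python
import sys

# 2**-53 is half the machine epsilon of a double (epsilon is 2**-52).
UNIT_ROUNDOFF = 2.0 ** -53


def upper_tail(cdf_value):
    """Hypothetical helper: a p-value computed as 1 - CDF (an assumption)."""
    return 1.0 - cdf_value


# The largest double below 1.0 is exactly 1 - 2**-53, so the smallest
# nonzero result this subtraction can produce is 2**-53.
largest_below_one = 1.0 - UNIT_ROUNDOFF
smallest_nonzero = upper_tail(largest_below_one)

# A true tail probability of 1e-17 is lost: 1.0 - 1e-17 rounds to 1.0,
# so the computed p-value is exactly zero.
lost = upper_tail(1.0 - 1e-17)

print(smallest_nonzero)   # 1.1102230246251565e-16
print(lost)               # 0.0
```

Under this assumption, any true p-value small enough to be lost in the subtraction is reported as zero, which is consistent with the bound quoted above.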
 
Since it is not possible to plot a p-value of zero on a logarithmic scale, such values are rounded up to a small positive value for the volcano plot. The default value is 1E-16.
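
The same substitution is needed when exported p-values are plotted outside the Workbench. The following Python sketch shows one way to do it. The function name `neg_log10_p` and the constant `P_VALUE_FLOOR` are hypothetical, and the choice to replace only exact zeros (rather than every value below the floor) is an assumption based on the description above.

```python
import math

# Default replacement value quoted in the text for p-values of zero.
P_VALUE_FLOOR = 1e-16


def neg_log10_p(p, floor=P_VALUE_FLOOR):
    """Return -log10(p) for a volcano plot, replacing a zero p-value by `floor`."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p-value must lie in [0, 1]")
    # log10(0) is undefined, so substitute the floor for exact zeros only.
    adjusted = floor if p == 0.0 else p
    return -math.log10(adjusted)


p_values = [0.0, 1e-20, 0.01, 1.0]
y_values = [neg_log10_p(p) for p in p_values]
print(y_values)
```

With the default floor, a p-value of zero is plotted at a height of about 16 on the -log10 axis.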

The image below shows an example of a Statistical comparison track and volcano plot in a split view. The p-values displayed as 0.000 in the table are rounded to 1E-16, plotted at -log10(1E-16) = 16, and highlighted in the volcano plot:

 
