Monday, February 24, 2014

Calculate Average Value - Watch Out NULL Values

Calculating the average or mean is one of the most common tasks. Average or mean of a variable is simply the sum of values divided by the number of data points. However if we do not understand how null values are handled by a database, mistakes may happen. Taking the following table as an example. It has 6 records and two of them have null in column value.

SQL> select * from tbl_test2;

        ID      VALUE
---------- ----------
         1          2
         2          3
         3          4
         4          5
         5
         6

6 rows selected.
We calculate the average value using avg() function.
SQL> select avg(value) from tbl_test2;

AVG(VALUE)
----------
       3.5
The result is 3.5. If we use the following query to calculate the total number of records (6), the sum of value(14) and the average as the sum of value divided by number of records (14/6), the average value is 2.33.
SQL> select  count(*) tot_num, sum(value) sum_value, sum(value)/count(*) as avg_2 
from tbl_test2;

   TOT_NUM  SUM_VALUE      AVG_2
---------- ---------- ----------
         6         14 2.33333333
We can see the average value calcuated by avg(value) and sum(value)/count(*) is very different. Why does this happen? The reason is that when Oracle calculates function avg(value), the records with null value are excluded from consideration. We can add a condition to the second query to specifiy only records with non-null value are taking into consideration as the following.
SQL> select  count(*) tot_num, sum(value) sum_value, 
            sum(value)/count(*) as avg_2  
from tbl_test2 where value is not null;

   TOT_NUM  SUM_VALUE      AVG_2
---------- ---------- ----------
         4         14        3.5
Now, the sum(value)/count(*) returns the same value as avg(value). As we can see, it is important to understand how NULL values are handled in a database.

No comments: