What is Using aggregate functions with NULL

It is important to be aware of the possible presence of NULLs in your data, and of how NULL interacts with T-SQL query components, including aggregate function. There are a few considerations to be aware of:

  • With the exception of COUNT used with the (*) option, T-SQL aggregate functions ignore NULLs. For example, a SUM function will add only non-NULL values. NULLs don’t evaluate to zero. COUNT(*) counts all rows, regardless of value or non-value in any column.
  • The presence of NULLs in a column may lead to inaccurate computations for AVG, which will sum only populated rows and divide that sum by the number of non-NULL rows. There may be a difference in results between AVG(<column>) and (SUM(<column>)/COUNT(*)).

For example, consider the following table named t1:

C1

C2

1

NULL

2

10

3

20

4

30

5

40

6

50

This query illustrates the difference between how AVG handles NULL and how you might calculate an average with a SUM/COUNT(*) computed column:

SQLCopy

SELECT SUM(c2) AS sum_nonnulls, 
    COUNT(*) AS count_all_rows, 
    COUNT(c2) AS count_nonnulls, 
    AVG(c2) AS average, 
    (SUM(c2)/COUNT(*)) AS arith_average
FROM t1;

The result would be:

sum_nonnulls

count_all_rows

count_nonnulls

average

arith_average

150

6

5

30

25

In this resultset, the column named average is the aggregate that internally gets the sum of 150 and divides by the count of non-null values in column c2. The calculation would be 150/5, or 30. The column called arith_average explicitly divides the sum by the count of all rows, so the calculation is 150/6, or 25.

If you need to summarize all rows, whether NULL or not, consider replacing the NULLs with another value that will not be ignored by your aggregate function. You can use the COALESCE function for this purpose.

Leave a Reply

Your email address will not be published. Required fields are marked *