Deep Data Mining Blog: Some Observations on NULL Value Handling in Oracle SQL

Sunday, September 22, 2013

Some Observations on NULL Value Handling in Oracle SQL

We need to be aware of how NULL/missing values are handled in SQL query so that we will not be surprised by query results that appear "wrong". This is descried using the following simple table as an example. The fifth record has a NULL value.

SQL> select id, value from tbl_data order by id;

ID	VALUE
1	-1
2	0
3	1
4	2
5

If we calculate the total number of records in the table, number of records with values>=0 and values <0, they are 5, 3 and 1, respectively, as shown below. As we can see, the number of records for values >=0 (3) plus that <0 (1) is less than the total number of records (5). This is because NULL values appear in the SQL where clause exclude records from the consideration.

SQL> select count(*) from tbl_data;

COUNT(1)
5

SQL> select count(*) from tbl_data where value>=0;

COUNT(1)
3

SQL> select count(*) from tbl_data where value<0;

COUNT(1)
1

A better way to calcluate this kind of statisitcs is to use "case when" instead of where clause as shown below.

SQL> select count(*) total, sum(case when value>=0 then 1 else 0 end) non_negative, sum(case when value<0 then 1 else 0 end) negative, sum(case whe n value is null then 1 else 0 end) n_missing from tbl_data;

TOTAL	NON_NEGATIVE	NEGATIVE	N_MISSING
5	3	1	1

Or even better, we combine "case when" with "group by" to calculate the statistics. "Group by" is one of my favorites as it gives the complete picture (including NULL values) about the data.

SQL> select v, count(*) from (select case when value>=0 then 'non-negative' when value <0 then 'negative' else 'missing' end v from tbl_data) group by v;

V	COUNT(1)
negative	1
non-negative	3
missing	1

In summary, if we are aware of how NULL values are handles in the database, we will not be surprised by query results that appear "wrong".

10 Most Influential People	Text Files and Oracle DB	Predictive Model vs Rule	Build Predictive Model	About Predictive Model Variable	Logistic Regression
Recency Frequency Monetary Analysis	Unique Identifier in Oracle	Materialized View	Database Link	Calculate Percentage Using SQL	Handle NULL Value
Calculate Cumulative Perentage	Find Score Cutoff Value	Remove Duplicates	Calculate Correlation Coefficients	Oracle vs SQL Server	Random Sampling
Table Insert	Read Only Table	Clustering	Ranking	Find Most Frequent	Median Value
Oracle Source Code	Debug PL/SQL	Hide PL/SQL Scripts	Repair Views	Dump Schema	Move Big Files to Amazon

Popular Topics

Popular Topics

Sunday, September 22, 2013

Some Observations on NULL Value Handling in Oracle SQL

No comments: