Thursday, November 14, 2013

How to Find Out the Cutoff Value for Certain Percentile

Problem

We want to find out what answers to questions like:
What is the salary that top 1 percent of people are making?
What is the credit score that the lowest 15% have?
What is the cutoff predictive risky score that will raise alerts on the riskiest 0.1% of credit card transaction?

Solution

This problem can be solved using the same approach as described in Find the cutoff value for the top n records. We can always convert the percentile to the top-n-records (or bottom-n-records) by multiplying the size of the table and the percentile. However, there is a simpler way to do this. We can use Oracle function percentile_disc() to return the cutoff value given a certain percentile and ordering. Let's use a table that contains 100 random numbers as an example.
SQL> select * from TBL_100_RND where rownum <10;

NUM
-.58413539
1.31513578
2.16901993
.596910933
.208190221
.179899195
-.63028597
.525443855
-.52158424

The following query returns the cutoff value (v) for the top 10%.
SQL> select percentile_disc(0.1) within group(order by num desc) as cutoff from TBL_100_RND;

cutoff
1.13239998

The following query verifies that the cutoff value 1.13239998 indeed produces 10% of the records.
SQL> select count(1) from TBL_100_RND where num>=1.13239998;

COUNT(1)
10

No comments: