It is a very common task to find the cutoff value to get the top N records . The following are some of the examples: 1. Find the cutoff salary for the top 10 employees in a company. 2. Find a cutoff score for a risk model that generates alerts for the top 100 riskiest transactions.
We use the following table that has 20 records as an example.
SQL> select id, num from TBL_20 order by id;
| ID | NUM |
|---|---|
| 1 | -.650222 |
| 2 | -1.465297 |
| 3 | -.689485 |
| 4 | -1.547403 |
| 5 | -1.791099 |
| 6 | -1.270857 |
| 7 | .988116 |
| 8 | 1.246141 |
| 9 | .643606 |
| 10 | -.515888 |
| 11 | -.713859 |
| 12 | -.587674 |
| 13 | -1.634403 |
| 14 | 1.285847 |
| 15 | -.08049 |
| 16 | .231295 |
| 17 | -.66065 |
| 18 | .422664 |
| 19 | -.134565 |
| 20 | -1.773186 |
20 rows selected.
If we want to find out the cutoff value for the largest 5th column "num", we first use function row_number() to generate rank and then select the num that has a rank of 5.
SQL> with tbl as (select a.*, row_number() over(order by num desc) rnk from tbl_20 a) select num from tbl where rnk=5;
| NUM |
|---|
| .422664 |
To verify that .422664 is indeed the cutoff value for the top 5 records, we run the following query.
SQL> select * from tbl_20 where num>=.422664 order by num desc;
| NUM | ID |
|---|---|
| 1.285847 | 14 |
| 1.246141 | 8 |
| .988116 | 7 |
| .643606 | 9 |
| .422664 | 18 |
It is a good practice to always verify our results using another query. That way, the chance of making mistakes is greatly reduced.
No comments:
Post a Comment