Monday, November 18, 2013

More on Finding Records with Highest/Lowest Values by Category

Problem

In the earlier post Find Records with Highest/Lowest Values by Category, we use row_number() to generate the unique rank for each record. When there are ties in records, those records will receive different ranks randomly. For example, both Ford and Scott have the highest salary of 3000. However, row_number() generates different ranks for them. How do we find all the records with the highest salary by job including ties?

SQL> select EMPNO, ENAME, JOB, SAL, 
row_number() over(partition by job order by sal desc) sal_rank 
from emp order by job, sal desc;

     EMPNO ENAME      JOB              SAL   SAL_RANK
---------- ---------- --------- ---------- ----------
      7902 FORD       ANALYST         3000          1
      7788 SCOTT      ANALYST         3000          2
      7934 MILLER     CLERK           1300          1
      7876 ADAMS      CLERK           1100          2
      7900 JAMES      CLERK            950          3
      7369 SMITH      CLERK            800          4
      7566 JONES      MANAGER         2975          1
      7698 BLAKE      MANAGER         2850          2
      7782 CLARK      MANAGER         2450          3
      7839 KING       PRESIDENT       5000          1
      7499 ALLEN      SALESMAN        1600          1
      7844 TURNER     SALESMAN        1500          2
      7521 WARD       SALESMAN        1250          3
      7654 MARTIN     SALESMAN        1250          4

14 rows selected.

Solution

Two approaches are described here. The first one uses Oracle analytic function and the second one does not.
Approach 1. In stead of using Oracle analytic function row_number, we use dense_rank. If there are ties, dense_rank will produce the same rank for them.

SQL> select EMPNO, ENAME, JOB, SAL, 
dense_rank() over(partition by job order by sal desc) sal_rank 
from emp order by job, sal desc;

     EMPNO ENAME      JOB              SAL   SAL_RANK
---------- ---------- --------- ---------- ----------
      7902 FORD       ANALYST         3000          1
      7788 SCOTT      ANALYST         3000          1
      7934 MILLER     CLERK           1300          1
      7876 ADAMS      CLERK           1100          2
      7900 JAMES      CLERK            950          3
      7369 SMITH      CLERK            800          4
      7566 JONES      MANAGER         2975          1
      7698 BLAKE      MANAGER         2850          2
      7782 CLARK      MANAGER         2450          3
      7839 KING       PRESIDENT       5000          1
      7499 ALLEN      SALESMAN        1600          1
      7844 TURNER     SALESMAN        1500          2
      7521 WARD       SALESMAN        1250          3
      7654 MARTIN     SALESMAN        1250          3

14 rows selected.
Our final selection will be the following.

SQL> select * from (select EMPNO, ENAME, JOB, SAL,
 dense_rank() over(partition by job order by sal desc) sal_rank 
from emp order by job, sal desc)   where sal_rank=1 order by job;

     EMPNO ENAME      JOB              SAL   SAL_RANK
---------- ---------- --------- ---------- ----------
      7788 SCOTT      ANALYST         3000          1
      7902 FORD       ANALYST         3000          1
      7934 MILLER     CLERK           1300          1
      7566 JONES      MANAGER         2975          1
      7839 KING       PRESIDENT       5000          1
      7499 ALLEN      SALESMAN        1600          1

6 rows selected.
Approach 2. We calculate a temporary table containing the highest salary by job and then perform inner joining of the temporary table with the original table. As a result of the inner joining, only those employees with the highest salaries the jobs will be selected.
SQL> with
  2  tbl_temp as
  3  ( select job, max(sal) max_sal from emp group by job)
  4  select a.EMPNO, a.ENAME, a.JOB, a.SAL from
  5  emp a, tbl_temp b
  6  where a.job=b.job and a.sal=b.max_sal
  7  order by a.job;

     EMPNO ENAME      JOB              SAL
---------- ---------- --------- ----------
      7788 SCOTT      ANALYST         3000
      7902 FORD       ANALYST         3000
      7934 MILLER     CLERK           1300
      7566 JONES      MANAGER         2975
      7839 KING       PRESIDENT       5000
      7499 ALLEN      SALESMAN        1600

6 rows selected.
As we see, the results produced by the two approaches are the same.

No comments: