## Sunday, November 17, 2013

### Find Records with Highest/Lowest Values by Category

#### Problem

We use the emp table as an example. How to we find the records for the highest paid employees by job titles?

```SQL> select EMPNO, ENAME, JOB, SAL from emp    order by job, sal desc;

EMPNO ENAME      JOB              SAL
---------- ---------- --------- ----------
7902 FORD       ANALYST         3000
7788 SCOTT      ANALYST         3000
7934 MILLER     CLERK           1300
7900 JAMES      CLERK            950
7369 SMITH      CLERK            800
7566 JONES      MANAGER         2975
7698 BLAKE      MANAGER         2850
7782 CLARK      MANAGER         2450
7839 KING       PRESIDENT       5000
7499 ALLEN      SALESMAN        1600
7844 TURNER     SALESMAN        1500
7521 WARD       SALESMAN        1250
7654 MARTIN     SALESMAN        1250

14 rows selected.
```

#### Solution

We use Oracle analytic function row_number() which generates unique ranks for records. The rank will be generated independently by job ("partition by job") and will be based on the descending order of salary ("order by sal desc"). So the record with the highest salary in each job will receive a rank of 1 as shown below.

```SQL> select EMPNO, ENAME, JOB, SAL, row_number() over(partition by job    order by sal desc) sal_rank from emp order by job, sal desc;

EMPNO ENAME      JOB              SAL   SAL_RANK
---------- ---------- --------- ---------- ----------
7902 FORD       ANALYST         3000          1
7788 SCOTT      ANALYST         3000          2
7934 MILLER     CLERK           1300          1
7900 JAMES      CLERK            950          3
7369 SMITH      CLERK            800          4
7566 JONES      MANAGER         2975          1
7698 BLAKE      MANAGER         2850          2
7782 CLARK      MANAGER         2450          3
7839 KING       PRESIDENT       5000          1
7499 ALLEN      SALESMAN        1600          1
7844 TURNER     SALESMAN        1500          2
7521 WARD       SALESMAN        1250          3
7654 MARTIN     SALESMAN        1250          4

14 rows selected.
```
To show only those records having highest salaries by jobs, we add a condition that only records with the rank of salary equal one will be selected.
```SQL> select * from (select EMPNO, ENAME, JOB, SAL, row_number() over(      partition by job order by sal   desc) sal_rank from emp order by job, sal desc) where sal_rank=1;

EMPNO ENAME      JOB              SAL   SAL_RANK
---------- ---------- --------- ---------- ----------
7788 SCOTT      ANALYST         3000          1
7934 MILLER     CLERK           1300          1
7566 JONES      MANAGER         2975          1
7839 KING       PRESIDENT       5000          1
7499 ALLEN      SALESMAN        1600          1
```
The above method using row_number to generates rank will select one record for each category. If more then one records have the highest salaries in their group such as Analyst Scott and Ford, only one of them will be selected. In this case, it is Scott. In the post More on Finding Records with Highest/Lowest Values by Category, I will describe how to select all records with the highest value when there are ties.