Deep Data Mining Blog: Remove the Rows with Lowest or Highest Values

Thursday, July 30, 2015

Remove the Rows with Lowest or Highest Values

In a project, I need to remove rows with lowest or highest values within groups. I used the following table to illustrate the problem. For grp 10, rows with val 1 and 4 need to be removed and for grp 11, rows with val 5.5 and 9.6.

SQL> select * from tbl_x2 order by 1,2;

       GRP        VAL
---------- ----------
        10          1
        10          2
        10          3
        10          4
        11        5.5
        11        7.3
        11        9.6

7 rows selected.

I used the following SQL statement to do it.

delete from tbl_x2 a
where a.rowid in
(
select rid from
(
with tbl as (
select b.rowid rid,
row_number() over(partition by grp order by val) rnk,
row_number() over(partition by grp order by val desc) rnkr
from tbl_x2 b)
select rid from tbl where rnk=1 or rnkr=1
)
);

The following query shows that rows with lowest or highest values within groups are removed.

SQL> select * from tbl_x2 order by 1,2;

       GRP        VAL
---------- ----------
        10          2
        10          3
        11        7.3

The trick is to use row_number() function to generate ranks or reverse ranks based on val for each grp. For example, the following query show all rows in the original table and the ranks (rnk) and reversed ranks (rnkr). The lowest val has a rank of 1 and the highest val has a reversed rank of 1.

select b.*,
row_number() over(partition by grp order by val) rnk,
row_number() over(partition by grp order by val desc) rnkr
from tbl_x3 b

       GRP        VAL        RNK       RNKR
---------- ---------- ---------- ----------
        10          1          1          4
        10          2          2          3
        10          3          3          2
        10          4          4          1
        11        5.5          1          3
        11        7.3          2          2
        11        9.6          3          1

10 Most Influential People	Text Files and Oracle DB	Predictive Model vs Rule	Build Predictive Model	About Predictive Model Variable	Logistic Regression
Recency Frequency Monetary Analysis	Unique Identifier in Oracle	Materialized View	Database Link	Calculate Percentage Using SQL	Handle NULL Value
Calculate Cumulative Perentage	Find Score Cutoff Value	Remove Duplicates	Calculate Correlation Coefficients	Oracle vs SQL Server	Random Sampling
Table Insert	Read Only Table	Clustering	Ranking	Find Most Frequent	Median Value
Oracle Source Code	Debug PL/SQL	Hide PL/SQL Scripts	Repair Views	Dump Schema	Move Big Files to Amazon

Popular Topics

Popular Topics

Thursday, July 30, 2015

Remove the Rows with Lowest or Highest Values

No comments: