Thursday, September 13, 2012

Recency, Frequency, Monetary (RFM) Analysis: Part 5. RFM Code

To calculate monetary variable, we can use analytic function sum (amount) over(). Thus, we can combine the quintiles for recency, frequency and monetary to create a RFM code for each customer. This can be done using the following query. RFM code 555 is the best group of customers.


   USER_ID    CAT   STATE_ID CAMPAIGN_DATE CAMPAIGN_ID  RESPONSE     AMOUNT
---------- -- -------- --------- -------- -------- ----------
      1001 A  MA       29-MAY-12 A        N
      1001 A  MA       22-MAY-12 A        N
      1001 A  MA       15-MAY-12 A        Y                50
      1001 A  MA       08-MAY-12 A        N
      1001 A  MA       01-MAY-12 A        N
      1001 B  CT       06-JUN-12 A        N
      1001 A  MA       06-JUN-12 A        N
      1002 B  CT       08-MAY-12 A        N
      1002 B  CT       15-MAY-12 A        Y                50
      1002 B  CT       22-MAY-12 A        N
      1002 B  CT       29-MAY-12 A        Y                50
      1002 B  CT       01-MAY-12 A        N
with temp_tbl as
(
select 
user_id, 
min(sysdate-campaign_date) as recency,
count(1) frequency, 
sum(amount) as sum_amount
  from TBL_DIRECT_MAILING a where RESPONSE='Y'
group by user_id
),
temp_tbl2 as
(
select 
a.user_id,  
recency,
frequency, 
sum_amount from 
(select distinct user_id from TBL_DIRECT_MAILING) a,  
temp_tbl b
where a.user_id=b.user_id(+)
),
temp_tbl3 as
(
select user_id,
ceil(cume_dist() over(order by recency desc nulls first)*5) recency_quintile,
ceil(cume_dist() over(order by frequency nulls first)*5) frequency_quintile,
ceil(cume_dist() over(order by sum_amount  nulls first)*5) amt_quintile
from temp_tbl2)
select user_id, recency_quintile*100+frequency_quintile*10+amt_quintile as RFM_Code from
temp_tbl3 order by user_id;

The result may look like the following.

   USER_ID   RFM_CODE
---------- ----------
      1001        333
      1002        555
      1003        233

3 comments:

Unknown said...

I have try your code on Microsoft SQL Server 2012 Express version.
It seem have some problem. Below are my code, anyone help to justify what problem appear?

with temp_tbl as
(
select [CreditCardNumber],

min(SYSDATETIME-[TranscationDate]) as recency,
count(1) frequency,
sum([Amount]) as sum_amount
from [dbo].[Transcation] a
group by [CreditCardNumber]
),
temp_tbl2 as
(
select
a.[CreditCardNumber]
recency,
frequency,
sum_amount from
(select distinct CreditCardNumber from Transcation) a,
temp_tbl b
where a.CreditCardNumber=b.CreditCardNumber
)
temp_tbl3 as
(
select[CreditCardNumber] ,
ceiling(cume_dist() over(order by recency desc nulls first)*5) recency_quintile,
ceiling(cume_dist() over(order by frequency nulls first)*5)frequency_quintile,
ceiling(cume_dist() over(order by sum_amount nulls first)*5) amt_quintile
from temp_tbl2),
select CreditCardNumber, recency_quintile*100+frequency_quintile*10+amt_quintile as RFM_Code
from temp_tbl3 Order by CreditCardNumber;

Jay Zhou, PhD. said...

They are Oracle SQL scripts. SQL server may have different syntax.You may need to do some "translation". Good luck!

Unknown said...

The SQL function NTILE is designed to support n-tile analysis (quartile, quintile, percentile, etc.) The following example shows how to create RFM cell codes of equal size as recommended by Arthur Hughes. Classic RFM that just concatenates R, F and M quintiles can place a large percentage of the population in an individual cell. This approach divides the customer population into 125 equally sized RFM cells.

SELECT
MEMBERSHIP_NBR,
MAX_TRANSACTION_DATE,
VISIT_COUNT,
AVG_VISIT_AMT,
10*RF +
NTILE(5) OVER (PARTITION BY RF ORDER BY AVG_VISIT_AMT) RFM
INTO ES78700.dbo.MEMBERSHIP_RFM_CELL
FROM
(
SELECT
MEMBERSHIP_NBR,
MAX_TRANSACTION_DATE,
VISIT_COUNT,
AVG_VISIT_AMT,
10*R +
NTILE(5) OVER (PARTITION BY R ORDER BY VISIT_COUNT) RF
FROM
(
SELECT
MEMBERSHIP_NBR,
MAX_TRANSACTION_DATE,
VISIT_COUNT,
AVG_VISIT_AMT,
NTILE(5) OVER (ORDER BY MAX_TRANSACTION_DATE) AS R
FROM
(
SELECT
MEMBERSHIP_NBR,
MAX(TRANSACTION_DATE) MAX_TRANSACTION_DATE,
COUNT(*) VISIT_COUNT,
AVG(TOTAL_VISIT_AMT) AVG_VISIT_AMT
FROM STORE_VISITS S
GROUP BY MEMBERSHIP_NBR
) AS MEMBERSHIP_RFM
) AS MEMBERSHIP_R
) AS MEMBERSHIP_RF
ORDER BY RFM
;