Deep Data Mining Blog: Recency, Frequency, Monetary (RFM) Analysis: Part 1

Thursday, September 13, 2012

Recency, Frequency, Monetary (RFM) Analysis: Part 1

RFM (Recency, Frequency, Monetary) analysis captures the context information about an account or customer. The result of RFM analysis is a set of variables that can be used for ad hoc queries, segmentation, or predictive modeling to identify valuable customers, detect fraud, improve direct mailing response rates, etc.

RFM analysis is usually performed based on a "transaction" data set. Each record represents a transaction at specific point in time. For example, the following are a sequence of credit card transactions.

We talk about recency factor first. In the following table, transaction 4 as the highlighted is most suspicious. Not only it is an expensive electronics purchase but also the immediate transaction before it is a $1 gas station purchase that happens 11 minutes ago. Fraudsters often use gas station to test a credit cards to make sure they work before they make big purchase at a store.
CARD_NUMBER TXN_ID TXN_DATE TXN_TYPE TXN_AMT
----------- ---------- ---------- ------------ ----------
123457 1 0304:11:00 Grocery 85.35
123457 2 0304:17:30 Electronic 2100.75
123457 3 0305:08:25 Gas Station 1
123457 4 0305:08:36 Electronic 1435.2
123457 5 0305:11:23 Grocery 55.3
Thus, the recency is an important consideration. It provides the context information about a transaction. When we build predictive models or write ad hoc fraud detection rules, including recency factors can greatly improve detection accuracy.

Another example is to predict if a prospect responds to a direct mail offering. The following table shows a prospect's responses in 5 weeks to 5 products. The reason that she responds to offering of product C in week 3 could be related to the offering of B one week before. For example, a person will more likely accept a credit card with 6.99% APR if the she is offered 17.99% APR earlier. Again, the recency factors are important.
User_ID User_Category State Week Product_ID Response
1001 A MA 1 A No
1001 A MA 2 B No
1001 A MA 3 C Yes
1001 A MA 4 D No
1001 A MA 5 E No
One of the convenient ways to calculate recency factors is SQL analytic function lag. For example, the following query calculates three recency factors for credit transactions, i.e., time since recent transaction, recent transaction type and recent transaction amount. The "Partition by card_number" clause means the calculation is done independently for each card number.
select
a.*,
( txn_date-lag(txn_date ) over(partition by card_number order by txn_date)) time_since_recent_txn,
lag(txn_type) over(partition by card_number order by txn_date) last_recent_type ,
lag(txn_amt) over(partition by card_number order by txn_date) last_recent_amt
from TBL_CARD_TXN a order by txn_date;

When there are time/date variables in the data, it is extremely important to take advantage of them by using RFM analysis to create derived variables. Those variables can then be used as inputs to predictive models. Please see another post Build Best Direct Marketing Predictive Models.

2 comments:

Unknown said...: Informatics Outsourcing is an Offshore Data Management service company. Data Management Service includes all types of Data Conversion, File Conversion, XML Conversion, HTML Conversion,SGML Conversion, Document Conversion,Data Entry, Data Extraction and Validation,OCR and ICR Services with affordable price. Our team to give the solution quickly and given requirements.; 4:35 AM
Unknown said...: Since such information are private and gets stored in the data bank of a particular industry, one should have a credible document shredding services to ensure that information stored are not hacked as well as secured.

David Huffman; 8:53 AM

10 Most Influential People	Text Files and Oracle DB	Predictive Model vs Rule	Build Predictive Model	About Predictive Model Variable	Logistic Regression
Recency Frequency Monetary Analysis	Unique Identifier in Oracle	Materialized View	Database Link	Calculate Percentage Using SQL	Handle NULL Value
Calculate Cumulative Perentage	Find Score Cutoff Value	Remove Duplicates	Calculate Correlation Coefficients	Oracle vs SQL Server	Random Sampling
Table Insert	Read Only Table	Clustering	Ranking	Find Most Frequent	Median Value
Oracle Source Code	Debug PL/SQL	Hide PL/SQL Scripts	Repair Views	Dump Schema	Move Big Files to Amazon

Popular Topics

Popular Topics

Thursday, September 13, 2012

Recency, Frequency, Monetary (RFM) Analysis: Part 1

2 comments: