Thursday, September 20, 2012

Comparison of Business Rules and Predictive Models

In many organizations, there are two ways to support decision making: intuitive business rules and predictive models. Business analysts build intuitive rules. Statisticians and data modelers use data-driven methods, mainly predictive models. The following is a comparison of the two from 13 different aspects. You may want to check out an earlier post.
Consideration: Examples
Business Rules: Example 1. A medical fraud rule: a treatment for pregnancy discomfort billed for a male patient is fraud. Example 2. A direct sales rule: customers who purchase product A within 3 months will likely buy product B. Example 3. A mobile phone customer acquisition rule: if a new customer has a bankruptcy in the past, do not offer him a free phone.
Predictive Models: Example 1. A medical fraud detection model: an unsupervised model that detects anomalies in medical claims. Example 2. A credit card fraud detection model: a neural net that generates a fraud score for credit card transactions. Example 3. A customer acquisition model: a logistic regression model that predicts whether a new customer will pay her cell phone bill.

Consideration: Owners
Business Rules: Business analysts who have experience in particular fields such as bank risk management, healthcare, etc. In their communications, they use terms that business people understand. They believe that there are explainable cause-effect relationships in predicting fraud, etc. Their job is to identify those rules.
Predictive Models: Statisticians and data miners, who see predictive models as solutions to a wide range of business problems. They believe that in reality the cause-effect relationships are too complex for human beings to understand, and that a black-box, data-driven predictive model is the way to make the most accurate predictions.

Consideration: Understandability
Business Rules: High. Rules are normally intuitive and easy to understand.
Predictive Models: Low. Most predictive models are black boxes.

Consideration: Forms of Existence
Business Rules: SQL scripts, SAS scripts, Excel spreadsheets, Word documents, undocumented knowledge in people's heads, etc.
Predictive Models: SQL scripts, SAS scripts, Excel spreadsheets, R objects, Oracle mining objects, etc.

Consideration: Initial Investment
Business Rules: Low.
Predictive Models: High. Need to build a data warehouse that archives historical data. May need to purchase data mining software such as SAS, Statistica, etc.

Consideration: Effort to Build
Business Rules: Low. Analysts with domain knowledge can write rules without doing heavy data analysis work.
Predictive Models: High. Data miners have to go through the model-building process systematically.

Consideration: Effort to Deploy into Production
Business Rules: Low. Business rules are usually simple scripts, such as SQL.
Predictive Models: High. Predictive models, including the calculation of their input variables, can be very sophisticated. Deploying a model into production takes a significant amount of effort.

Consideration: Quantity
Business Rules: Many. For example, many banks or insurance companies have hundreds or thousands of business rules.
Predictive Models: Few. The number of predictive models is usually very small. For example, even some large banks or insurance companies have only 2 or 3 predictive models to detect fraud.

Consideration: Perception of Productivity and Effectiveness
Business Rules: High, for two reasons: 1. there are hundreds or thousands of understandable rules; 2. new rules can be created on a daily basis. Together these give management the impression that the group building the rules is productive and effective.
Predictive Models: Medium. It may take many days or weeks to build a predictive model before one can see any result. Because a predictive model is mostly a black box, it is hard to explain to management how it works. Thus, management may think the model building group is not productive.

Consideration: Actual Productivity and Effectiveness
Business Rules: Low to medium. See an earlier post. Since rules are usually created individually, there are redundancies among them. The rules as a whole are not optimized.
Predictive Models: High. It is common to see a single predictive model outperform hundreds or thousands of intuitive business rules combined.

Consideration: Tools
Business Rules: Excel spreadsheets, SQL, special scripting languages, and others.
Predictive Models: SAS, SPSS, R, Oracle data mining, and others.

Consideration: Adaptability
Business Rules: Low. Since rules are created manually by people based on their experience, they are harder to adapt to changing situations.
Predictive Models: High. Predictive models are refreshed once new data become available. The newly refreshed model reflects the changes automatically.

Consideration: Maintenance
Business Rules: Difficult. It is hard to maintain hundreds or thousands of rules. It is helpful to rank them based on their effectiveness.
Predictive Models: Easier. Since the number of predictive models is small, it is easier to maintain them.

Consideration: Longevity
Business Rules: Low. Business rules usually lose their effectiveness more quickly after their creation than predictive models do.
Predictive Models: High to medium. Predictive models maintain their effectiveness longer than business rules for two reasons: 1. they are built on a large quantity of historical data; 2. they have the ability to generalize. The ability to generalize means they do not simply memorize historical data. They are structured to perform well on future data.
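
The comparison above notes that rules created individually tend to be redundant, and that it helps to rank rules by their effectiveness. The following is a minimal Python sketch of one way this might be done, not a description of any particular organization's process. Everything in it is an assumption made for illustration: the claim records, the field names, the three rules, and the choice of precision and recall as the measure of "effectiveness".

```python
# Hypothetical labeled claims. All fields and values are invented for illustration.
CLAIMS = [
    {"id": 1, "sex": "M", "treatment": "pregnancy", "amount": 900,  "fraud": True},
    {"id": 2, "sex": "F", "treatment": "pregnancy", "amount": 1200, "fraud": False},
    {"id": 3, "sex": "M", "treatment": "dental",    "amount": 7000, "fraud": True},
    {"id": 4, "sex": "F", "treatment": "dental",    "amount": 300,  "fraud": False},
    {"id": 5, "sex": "M", "treatment": "pregnancy", "amount": 6000, "fraud": True},
    {"id": 6, "sex": "F", "treatment": "cardiac",   "amount": 8000, "fraud": False},
    {"id": 7, "sex": "M", "treatment": "cardiac",   "amount": 400,  "fraud": False},
    {"id": 8, "sex": "F", "treatment": "dental",    "amount": 9500, "fraud": True},
]

# Hypothetical business rules: each takes one claim and returns True if it flags it.
RULES = {
    "male_pregnancy": lambda c: c["sex"] == "M" and c["treatment"] == "pregnancy",
    "large_amount": lambda c: c["amount"] > 5000,
    # Written independently by another analyst, but flags the same claims
    # as "male_pregnancy" on this data.
    "male_obstetric": lambda c: c["treatment"] == "pregnancy" and c["sex"] != "F",
}


def evaluate(rule, claims):
    """Return (precision, recall) of a single rule against labeled claims."""
    flagged = [c for c in claims if rule(c)]
    hits = sum(c["fraud"] for c in flagged)
    total_fraud = sum(c["fraud"] for c in claims)
    precision = hits / len(flagged) if flagged else 0.0
    recall = hits / total_fraud if total_fraud else 0.0
    return precision, recall


def rank_rules(rules, claims):
    """Rank rules by precision, then recall, best first."""
    scored = [(name, *evaluate(rule, claims)) for name, rule in rules.items()]
    return sorted(scored, key=lambda r: (r[1], r[2]), reverse=True)


def redundant_pairs(rules, claims):
    """Return pairs of rules that flag exactly the same non-empty set of claims."""
    fired = {
        name: frozenset(c["id"] for c in claims if rule(c))
        for name, rule in rules.items()
    }
    names = list(fired)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if fired[a] and fired[a] == fired[b]
    ]


if __name__ == "__main__":
    for name, precision, recall in rank_rules(RULES, CLAIMS):
        print(f"{name}: precision={precision:.2f} recall={recall:.2f}")
    print("redundant:", redundant_pairs(RULES, CLAIMS))
```

On this toy data, the two pregnancy rules flag the same claims, so one of them could be retired without losing coverage. With hundreds or thousands of real rules, the same kind of check would need historical data with known outcomes, which is the same data a predictive model would be built from.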
