Saturday, September 01, 2012

In-Database Data Mining

In a previous post, "Data Mining Components", we talked about four components, or layers, involved in using data mining to solve business problems, e.g., fraud prevention, new customer lifetime value prediction, online ad click-through rate prediction, etc. Two abstract components are business problems and data mining algorithms. By abstract, we mean that their existence is largely mental: they do not have physical forms and cannot be purchased. Two physical components are data mining software and data management tools. Data mining software is the implementation of data mining algorithms. Data management tools could be relational databases or plain files.

It is important to realize that items within each layer are sometimes interchangeable. We can select the combination of data mining models, tools and databases that suits our needs. Over the years, we have found that performing data mining within the database has unique advantages. In this in-database data mining scenario, the two physical components, i.e., data mining algorithm implementation and data management, are collapsed into a single component, the Oracle database. For example, we have used Oracle Data Mining to build debit card and check deposit fraud models.

The in-database solution provides substantially increased security, productivity, manageability and scalability. This should not be a surprise, because a database is designed to handle large data in an enterprise environment, using SQL as a standard language. If we consider data mining as just more sophisticated SQL queries, nothing more, nothing less, a lot of the difficulties in data mining practice disappear naturally.
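To make the "data mining as SQL queries" idea concrete, here is a minimal sketch in Python. It is not Oracle Data Mining and does not use its API: SQLite (from Python's standard library) stands in for the database, and the table names `txn_features` and `model_weights`, the feature names, and the weights are all hypothetical, made up for illustration. The point is only that once a model's coefficients live in a table, scoring becomes a join and an aggregate that runs where the data is stored.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with hypothetical features and weights."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE txn_features (txn_id INTEGER, feature TEXT, value REAL);
        CREATE TABLE model_weights (feature TEXT PRIMARY KEY, weight REAL);
        """
    )
    # Illustrative linear-model weights; not taken from any real fraud model.
    conn.executemany(
        "INSERT INTO model_weights VALUES (?, ?)",
        [("amount_over_limit", 2.0), ("new_device", 1.5), ("foreign_atm", 1.0)],
    )
    # Two made-up transactions, one row per (transaction, feature).
    conn.executemany(
        "INSERT INTO txn_features VALUES (?, ?, ?)",
        [
            (1, "amount_over_limit", 1.0),
            (1, "new_device", 1.0),
            (1, "foreign_atm", 0.0),
            (2, "amount_over_limit", 0.0),
            (2, "new_device", 0.0),
            (2, "foreign_atm", 1.0),
        ],
    )
    conn.commit()
    return conn


def score_transactions(conn):
    """Score every transaction with one SQL query: sum of value * weight."""
    query = """
        SELECT f.txn_id, ROUND(SUM(f.value * w.weight), 4) AS score
        FROM txn_features AS f
        JOIN model_weights AS w ON w.feature = f.feature
        GROUP BY f.txn_id
        ORDER BY score DESC
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for txn_id, score in score_transactions(build_demo_db()):
        print(txn_id, score)
```

Because the scoring step is an ordinary query, the data never leaves the database, and access to it is governed by the same permissions as any other table.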


