Tuesday, November 06, 2012

Major Data Mining Steps

Normally, we need to go through the following steps to build a predictive modeling solution.

  1. Data Gathering 
  2. Data Validation 
  3. Data Preparation 
  4. Feature Variable Calculation (creating more salient variables that are more predictive of the target)
  5. Predictive Model Building and Testing 
  6. Model Deployment

In our opinion, the main challenges are Data Preparation, Feature Variable Calculation, and Model Deployment. Probably 90% of the time is spent on these three areas.
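
To make the six steps concrete, here is a minimal Python sketch that walks through them end to end. Everything in it is an assumption made for illustration: the toy customer records, the field names, the derived `spend_per_order` variable, and the one-threshold "model" are all hypothetical. A real project would read from a database and use a proper modeling tool.

```python
# Toy records, invented for illustration only:
# (customer_id, total_spend, num_orders, churned)
RAW = [
    ("c1", "60.0", "2", "0"),
    ("c2", "15.0", "1", "1"),
    ("c3", "", "3", "0"),        # missing spend, dropped in validation
    ("c4", "300.0", "10", "0"),
    ("c5", "20.0", "2", "1"),
    ("c6", "90.0", "6", "1"),
    ("c7", "200.0", "5", "0"),
    ("c8", "24.0", "2", "1"),
]

def gather():
    """Step 1: Data Gathering (here, just copy the in-memory records)."""
    return list(RAW)

def validate(rows):
    """Step 2: Data Validation (keep rows with no empty fields)."""
    return [r for r in rows if all(field != "" for field in r)]

def prepare(rows):
    """Step 3: Data Preparation (convert strings to typed fields)."""
    return [
        {"id": r[0], "spend": float(r[1]), "orders": int(r[2]), "target": int(r[3])}
        for r in rows
    ]

def add_features(rows):
    """Step 4: Feature Variable Calculation (derive a ratio variable)."""
    for r in rows:
        r["spend_per_order"] = r["spend"] / r["orders"]
    return rows

def fit_threshold(rows, feature):
    """Step 5a: build a one-rule model that predicts 1 when value <= threshold.

    Tries each observed value as the threshold and keeps the one with the
    fewest training errors.
    """
    best_threshold, best_errors = None, None
    for t in sorted({r[feature] for r in rows}):
        errors = sum((r[feature] <= t) != bool(r["target"]) for r in rows)
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = t, errors
    return best_threshold

def accuracy(rows, feature, threshold):
    """Step 5b: test the model on rows it was not fitted on."""
    hits = sum(int(r[feature] <= threshold) == r["target"] for r in rows)
    return hits / len(rows)

def deploy(feature, threshold):
    """Step 6: Model Deployment (return a scoring function for new records)."""
    def score(record):
        return int(record[feature] <= threshold)
    return score

def run_pipeline():
    rows = add_features(prepare(validate(gather())))
    train, test = rows[:5], rows[5:]          # simple holdout split
    threshold = fit_threshold(train, "spend_per_order")
    return threshold, accuracy(test, "spend_per_order", threshold), deploy(
        "spend_per_order", threshold
    )

if __name__ == "__main__":
    threshold, test_accuracy, score = run_pipeline()
    print(threshold, test_accuracy, score({"spend_per_order": 12.0}))
```

In this toy data the raw `spend` value does not separate the two classes with a single cutoff, while the derived ratio does, which is the point of step 4. Note also that the modeling functions are only a small part of the code; most of it deals with getting the data into shape and making the result usable.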

1 comment:

Fabio said...

Hello, I'm a DBA and developer interested in using models to help me predict or understand database performance. There are views such as v$sesstat that I'd like to sample every minute or so. I would then classify the samples as the database (or sessions) running OK or the database (or sessions) facing issues. I'd also like to know whether a specific statistic# or set of statistics (features) stands out in a way that helps me understand what is going on. Would you mind pointing me to the data mining / Oracle Data Mining docs I should read? Your help is really appreciated.