A log file that records the number of impressions and number of clicks for advertisements are constantly updated (growing). The file looks like the following:ADS_ID|DEVICE_OS|NUM_IMPRESSION|NUM_CONVERSION 32| Android| 1| 0 32| Android| 1| 0 32| Android| 1| 0 32| Android| 2| 0 32| Android| 1| 0 32| Android| 2| 0 32| Android| 1| 0 32| Android| 3| 0 32| Android| 2| 0
We want to generate real time reports about the summarized performance of advertisements, such as the click through rate by ads_id, click through rate by operation system and ads_id, etc. The reports should reflect the real time changes in the log file.
One of the best solutions is to use an Oracle external table and view. There are two tasks to be performed.
Task 1. Define an external table on the log file. An external table is just a pointer to the location of file and definition of its format. The file itself is not loaded into the database as a permanent table. Once the external table is defined, we can query it using SQL just like regular table. The data is read by Oracle on the fly. Thus any changes in the file will be reflected on the query result.
Task 2. Define views to summarize the external table and produce reports. Since views just store the process logic and only produce the output when we query it, the content of views always reflects the latest information in the external table which in turn captures the changes in the log file.
Task 1.Define an external table.
Step 1. If not yet, we need to assign "create any directory" privilege to current user.
Log in as the system user and run the following command under sqlplus.
SQL> grant create any directory to current_user;
Step 2. We create directory. It is assumed that the Oracle database server has the access to the directory where the log file is located.
Log in as the current user and create an directory.
SQL> create directory dir_files as '/home/log/data';
Step 3. We define the external table that points to the log file.create table tbl_data1_ext ( ads_id number, operation_sys varchar2(32), num_impression number, num_click number ) organization external ( type oracle_loader default directory dir_files access parameters ( records delimited by newline skip 1 fields terminated by '|' missing field values are null ) location('ads_log.txt') ); Once it is done, we can verify if the file is define correctly. SQL> select count(1) from TBL_DATA1_EXT;
We use Linux command wc to count the number of lines in the file. The text file has one more line which is the header. The header was skipped when we define the external table.$ wc -l ads_log.txt 165202 ads_log.txt
Task 2. Create views. Once we have the external table, we can create views to summarized it.SQL> create view v_ctr_for_ads as select ads_id, sum(NUM_IMPRESSION) num_impression, sum(NUM_CLICK) NUM_CLICK, sum(NUM_CLICK)/sum(NUM_IMPRESSION) clr from TBL_DATA1_EXT group by ads_id;
SQL> create view v_ctr_for_ads_os as select ads_id, OPERATION_SYS, sum(NUM_IMPRESSION) num_impression, sum(NUM_CLICK) NUM_CLICK, sum(NUM_CLICK)/su m(NUM_IMPRESSION) clr from TBL_DATA1_EXT group by OPERATION_SYS, ads_id;
The user can look at views the get the summary information about the log file in real time. SQL> select * from v_ctr_for_ads where num_impression>100 order by clr desc;
SQL> select * from v_ctr_for_ads_os where num_impression>100 order by clr desc;
With Oracle external tables that are just pointers to the location of text files and definition of file formats, we can perform SQL queries against text files without loading them into the database as physical database tables. It is a great solution when we want to repeatedly analyze text files that are constantly changing. Combining external tables and views, we can get query results that reflect the most recent content of text files. This solution is also very "clean" since there is no permanent database tables created.