Showing posts with label Oracle external table. Show all posts
Showing posts with label Oracle external table. Show all posts

Tuesday, November 12, 2013

Five Ways of Loading Text Files Into Oracle Database

Problem

Often we need to load text files, such as comma delimited files, into Oracle databases as tables.

Solution

The following are five ways to do it, i.e., external table, SQL loader, SQL insert, Oracle SQL Developer Import Data function and Oracle Apex Load Data. Actually, the real options are only the first three, external table, SQL loader, and SQL insert. SQL Developer and Apex use of the three options to import files. Of course, there are more than five ways to do it. For example, we can also use third party tools such ETL utilities, Microsoft Access, R etc. to load data into the database through ODBC connections.

External Table

We have described how to define external table in the database that points to text files in post Analyze Text Files in Real Time Using SQL Without Loading Them into Database. Once a external table is defined, we can simply define a permanent in database table using CTAS "create table as select". For example, we create a database table for the external TBL_DATA1_EXT as the following.

SQL> create table tbl_data1_real as select * from TBL_DATA1_EXT;

The above SQL statement creates a table of the same format as the external table and physically load the data into it. This is my favorite way of loading data. Using this approach, I am able to perform some analyze on the external tables and make sure them look right before I load them.

SQL Loader

We follow the three steps to use SQL loader to import data: 1.create a table; 2. Write a control file; and 3. run sqlldr to load the file.

Step 1. Create table using SQL.
create table tbl_data1_real
(
ads_id number,
device_os varchar2(32),
num_impression number,
num_click number
);
Step 2. Compile a control file like the following.
load data
infile 'c:\\projects\\log\\ads_log.csv'
append
into table TBL_DATA1_REAL
fields terminated by '|'
OPTIONALLY ENCLOSED BY '"' AND '"'
trailing nullcols
( ADS_ID ,
DEVICE_OS,
NUM_IMPRESSION,
NUM_CLICK
)
Step 3. Load the file into the table.
$ sqlldr user/password@localhost:1521/xe CONTROL=ads_log2.ctl skip=1
SQL*Loader: Release 11.2.0.1.0 - Production on Tue Nov 12 07:04:25 2013
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
Commit point reached - logical record count 64
Commit point reached - logical record count 128
Commit point reached - logical record count 192
Commit point reached - logical record count 256
Commit point reached - logical record count 320
Commit point reached - logical record count 384
......................

SQL Insert

Please see my post A Quick Way to Import Spreadsheet Into a Relational Database. We simply generate SQL insert statements for our data and run them. It is a great way o quickly import text files of small size.

Oracle SQL Developer Import Data

Oracle SQL Developer is a free development tool. We first create the destination table using SQL statement mentioned in the above section SQL Loader. Within SQL Developer, we select the table can right click to select Import Data.
There are three options to load the data, Insert Scripts, Staging External Table, and SQL Loader Utility as shown below. We can pick the one we like.

Oracle Apex Text File Import

With Apex 4.2, we open the web browser and log onto the admin account for the workspace. We then go to SQL Workshop/Utilities/Data Workshp. From there, we can use the data load function.

Conclusions

We described five ways of importing text files into Oracle database tables. My favorite way is Oracle external table because I can run SQL queries against the files to validate them before they are physically imported. Of course, there are more than five ways to do it. For example, we can also use third party tools such ETL utilities, Microsoft Access, R etc. to load data into the database through ODBC connections.

Saturday, November 09, 2013

Analyze Text Files in Real Time Using SQL Without Loading Them into Database

Problems

A log file that records the number of impressions and number of clicks for advertisements are constantly updated (growing). The file looks like the following:

ADS_ID|DEVICE_OS|NUM_IMPRESSION|NUM_CONVERSION
32| Android| 1| 0
32| Android| 1| 0
32| Android| 1| 0
32| Android| 2| 0
32| Android| 1| 0
32| Android| 2| 0
32| Android| 1| 0
32| Android| 3| 0
32| Android| 2| 0

We want to generate real time reports about the summarized performance of advertisements, such as the click through rate by ads_id, click through rate by operation system and ads_id, etc. The reports should reflect the real time changes in the log file.

Solutions

One of the best solutions is to use an Oracle external table and view. There are two tasks to be performed.

Task 1. Define an external table on the log file. An external table is just a pointer to the location of file and definition of its format. The file itself is not loaded into the database as a permanent table. Once the external table is defined, we can query it using SQL just like regular table. The data is read by Oracle on the fly. Thus any changes in the file will be reflected on the query result.

Task 2. Define views to summarize the external table and produce reports. Since views just store the process logic and only produce the output when we query it, the content of views always reflects the latest information in the external table which in turn captures the changes in the log file.

Task 1.Define an external table.

Step 1. If not yet, we need to assign "create any directory" privilege to current user.

Log in as the system user and run the following command under sqlplus.

SQL> grant create any directory to current_user;

Step 2. We create directory. It is assumed that the Oracle database server has the access to the directory where the log file is located.

Log in as the current user and create an directory.

SQL> create directory dir_files as '/home/log/data';

Step 3. We define the external table that points to the log file.

create table tbl_data1_ext
(
ads_id number,
operation_sys varchar2(32),
num_impression number,
num_click number
)
organization external
( type oracle_loader
default directory dir_files
access parameters
( records delimited by newline
skip 1
fields terminated by '|'
missing field values are null
)
location('ads_log.txt')
);
Once it is done, we can verify if the file is define correctly.
SQL> select count(1) from TBL_DATA1_EXT;

COUNT(1)
165201

We use Linux command wc to count the number of lines in the file. The text file has one more line which is the header. The header was skipped when we define the external table.

$ wc -l ads_log.txt
165202 ads_log.txt

Task 2. Create views. Once we have the external table, we can create views to summarized it.

SQL> create view v_ctr_for_ads as select ads_id, sum(NUM_IMPRESSION) num_impression, sum(NUM_CLICK) NUM_CLICK, sum(NUM_CLICK)/sum(NUM_IMPRESSION) clr from TBL_DATA1_EXT group by ads_id;
View created.

SQL> create view v_ctr_for_ads_os as select ads_id, OPERATION_SYS, sum(NUM_IMPRESSION) num_impression, sum(NUM_CLICK) NUM_CLICK, sum(NUM_CLICK)/su m(NUM_IMPRESSION) clr from TBL_DATA1_EXT group by OPERATION_SYS, ads_id;
View created.

The user can look at views the get the summary information about the log file in real time.
SQL> select * from v_ctr_for_ads where num_impression>100 order by clr desc;

ADS_ID NUM_IMPRESSION NUM_CLICK CLR
32 4116 5 .001215

SQL> select * from v_ctr_for_ads_os where num_impression>100 order by clr desc;

ADS_ID OPERATION_SYS NUM_IMPRESSION NUM_CLICK CLR
32 Android 3360 5 .001488
32   756 0 0

Conclusions

With Oracle external tables that are just pointers to the location of text files and definition of file formats, we can perform SQL queries against text files without loading them into the database as physical database tables. It is a great solution when we want to repeatedly analyze text files that are constantly changing. Combining external tables and views, we can get query results that reflect the most recent content of text files. This solution is also very "clean" since there is no permanent database tables created.

Saturday, May 26, 2012

More On Oracle External Table

Personally, I prefer external table over SQL Loader. When possible, I created a external table and run some queries to make sure the text file looks good. Then I loaded them as a table or materialized view. Basically I run the one of the following queries to load the data.
create table tbl_actual_tab as select * from tbl_external_table;
 or
create materialized view tbl_actual_tab as select * from tbl_external_table;

How to Query a Text File In Oracle

With Oracle external tables, we can run SQL queries against text files without physically loading them into the database. To create a external table, simply specify the "ORGANIZATION EXTERNAL" parameters after "Create table". Once it is created, you can query the text file like a regular table.

 CREATE TABLE "BDM"."TBL_TRADE_EXT" (
 "SYMBOL" VARCHAR2(32),
 "DT" VARCHAR2(32),
 "OPEN" NUMBER, "HIGH" NUMBER,
 "LOW" NUMBER, "CLOSE" NUMBER,
 "ADJ_CLOSE" NUMBER )
 ORGANIZATION EXTERNAL (
TYPE ORACLE_LOADER DEFAULT DIRECTORY "DIR_MKT_DATA"
 ACCESS PARAMETERS ( records delimited by '\n' SKIP 1 fields terminated by "," )
 LOCATION ( 'all.csv' ) )