Wednesday, April 02, 2014

Why Is It Sometimes Not a Good Idea to Export Database Tables as Text Files?

It is common practice to export a database table as a text file in order to deliver data to another party. However, by converting a database table into a text file, we lose valuable information about the data, and that can cause problems.
If a column contains special characters such as a newline (ASCII code 10, or 0A in hex), the records may not load back into a database successfully, because most import utilities treat a newline as the end of a record. For example, the record with id 3 in the following table has a newline character between "data" and "analytics". This can be verified with the function utl_raw.CAST_TO_RAW(), which returns the raw bytes of the string in hex.
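
For reference, record 3 can be reproduced by concatenating CHR(10) into the string. A minimal sketch of how the table above could have been built (the insert statements are a reconstruction from the hex dump shown below, not the original script):

create table tbl_test (id number, message varchar2(64));
insert into tbl_test values (1, 'Hello');
insert into tbl_test values (2, 'World');
insert into tbl_test values (3, 'data ' || chr(10) || ' analytics');
commit;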

SQL> select * from tbl_test;

        ID MESSAGE
---------- ----------------------------------------------------------------
         1 Hello
         2 World
         3 data
            analytics
SQL> select message, utl_raw.CAST_TO_RAW(message) ascii_code from tbl_test where id = 3;

MESSAGE                          ASCII_CODE
-------------------------------- ---------------------------------------------------------
data                             64617461200A20616E616C7974696373
 analytics
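
Before exporting, we can locate rows with embedded newlines using INSTR and CHR (a hypothetical check, not part of the original session; against this table it would return only record 3):

select id, message from tbl_test where instr(message, chr(10)) > 0;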
If we dump this table to a text file, record 3 will not be loaded correctly because the newline normally marks the end of a record. Other characters that complicate things include the comma (,), single quote ('), and double quote ("). Unfortunately, some ETL tools also convert database tables into text files and then load them back into another database, and we have run into problems with such tools.
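
If a flat file cannot be avoided, one workaround is to escape the problem characters before export and have the receiving side reverse the substitution. A sketch using REPLACE (the \n and \r escape sequences here are an arbitrary convention that both sides must agree on):

select id || '|' ||
       replace(replace(message, chr(13), '\r'), chr(10), '\n') as rec
  from tbl_test;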
How, then, do we deliver our data to another party without using text files? One approach is to use Oracle utilities such as Data Pump (expdp/impdp) or the older exp/imp, which preserve the data exactly. We may also use a database link to fetch the data from a remote database directly into a table, e.g., create table tbl_abc as select * from tbl_abc@remote_db.
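As a sketch, the Data Pump route might look like the following (the schema, directory object, and file names are placeholders; the directory object must exist and be readable and writable by the user):

expdp scott/tiger tables=tbl_test directory=DATA_PUMP_DIR dumpfile=tbl_test.dmp logfile=exp_tbl_test.log
impdp scott/tiger tables=tbl_test directory=DATA_PUMP_DIR dumpfile=tbl_test.dmp logfile=imp_tbl_test.log

And the database link approach, assuming a TNS alias remote_tns for the remote database (credentials are placeholders):

create database link remote_db connect to scott identified by tiger using 'remote_tns';
create table tbl_abc as select * from tbl_abc@remote_db;

Because both the dump file and the database link carry the data in Oracle's own format, embedded newlines and quotes arrive intact.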
