Thursday, November 29, 2018

Load Text Into Oracle Table CLOB Column Using Python

A text file contains Chinese texts that describe close to 4 thousand small towns. The ID of a town is always the first line followed by the text. The text could be as long as 22k characters. Thus, we have to use CLOB to store these texts.
1
宁明县爱店镇成立于1992年3月,位于中越陆地边境公母山下,与越南谅山省禄平县接壤,
边境线长达25.5公里,...
2
林逢镇位居田东县城东部,地处右江河谷中心地带、大西南出海通道咽喉要地,东经南宁可达北海、
防城及广东沿海各地,...
From SQL Developer client, I connect to the Oracle database and create a table with a CLOB column to store the text.
#create a table to hold texts
create table t_text (id number, val clob);
I load the file into the table on Amazone RDS Oracle database using the following scripts.
import cx_Oracle
connection = cx_Oracle.connect("xxxxx","xxx",\
                               "orcl.xxxxxx.us-east-1.rds.amazonaws.com:1521/orcl",\
                               encoding = "UTF-8", nencoding = "UTF-8")
cur = connection.cursor()
cur.setinputsizes(VAL = cx_Oracle.CLOB)
f = open('id_baike2.txt',  encoding="utf8")
line1 = f.readline()
line2 = f.readline()
while line2:
    cur.execute("INSERT INTO t_text (ID, VAL) VALUES (:ID, :VAL)", \
                ID=int(line1),VAL=line2)
    line1 = f.readline()
    line2 = f.readline()
f.close()
connection.commit()
cur.close()
connection.close()

No comments: