Problem
We have the following text file.
$ cat ads_log_small.csv ADS_ID,DEVICE_OS,NUM_IMPRESSION,NUM_CONVERSION 32, Android, 1, 0 32, Android, 1, 0 32, Android, 1, 0 32, Android, 2, 0 32, Android, 1, 0 32, Android, 2, 0 32, Android, 1, 0 32, Android, 3, 0 32, Android, 2, 0
If we load records in the text file into a database table, the original order of records in the file is lost. In the post SAS data set vs. relational database table, we mentioned that,Unlike SAS data set, a database table is a set where records no order (unless we explicitly sort them by keys). If we want to preserve the original sequence of the record, how do we add a line number to the original file?
Solution
We can use Linux (or Cygwin on Windows) command awk to add a line number to the file.
$ cat ads_log_small.csv | awk '{if (NR==1) print "LINE_NUMBER" ,",",$0; else print NR-1,",",$0}'
LINE_NUMBER , ADS_ID,DEVICE_OS,NUM_IMPRESSION,NUM_CONVERSION
1 , 32, Android, 1, 0
2 , 32, Android, 1, 0
3 , 32, Android, 1, 0
4 , 32, Android, 2, 0
5 , 32, Android, 1, 0
6 , 32, Android, 2, 0
7 , 32, Android, 1, 0
8 , 32, Android, 3, 0
9 , 32, Android, 2, 0
We can write the output to a new file.
$ cat ads_log_small.csv | awk '{if (NR==1) print "LINE_NUMBER" ,",",$0; else print NR-1,",",$0}' > ads_log_new.csv
No comments:
Post a Comment