Friday, July 20, 2012

SAS data set vs. relational database table

There is a key difference between SAS data sets and relational database tables. In SAS data set, like a text file, there is a natural, fixed order of records (or observations in SAS term). The order of a data set will not change unless we specifically modify it using method like "proc sort". So in SAS, it makes perfect sense to say "extract the first 500 observations from a data set".

However, in a relational database table, there is no natural order defined for records in a table. To say "exact the first 500 records from a table" does not make sense. We have to specifically define an order. Thus in a relational database, the following statements all make sense.
"Extract 500 records randomly from a table."
"Extract the top 500 records with the highest account numbers from a table (assuming account numbers are unique)".

Caution needs to be taken when converting SAS data set to a relational data table. The natural order of records in the SAS data set will be lost. One way to preserve the natural order of a SAS data set is to  create an extra column based on SAS system variable _N_.


3 comments:

Jhon anderson said...

Useful post.Thanks for sharing.
SAS Course in Chennai | SAS Institutes in Chennai | SAS Training Institutes in Chennai

Mathew Stephen said...

Thank you for taking time to provide us some of the useful and exclusive information with us.
Regards,
SAS Training Institute in Chennai | SAS Training in Chennai | SAS courses in Chennai

sivanesan said...

Nice post. Very interesting to read. Thank you for Sharing.
erp in chennai