SQL Server to SAS ODBC: The Partial Truth

Issue:

There are occasions when your ODBC connection from SAS to SQL Server returns partial data without an error flag, and you’d never know that you were missing data from looking at the SAS log.

 

I was running a SAS script in SAS Enterprise Guide (SASEG) which uses SQL pass-through to interface and retrieve data from Microsoft SQL Server. In this instance I wanted to create a dataset of all coupon redemptions in a given time frame. Below is the SAS script I was running (grayed out a few proprietary database names just to be safe).

sas code

 

Checking the log, my script ran without error and created a dataset with 15,330 rows.

 

first sas try

 

Later in the day, I wanted to change some code further down in the script. Since our database is not live, I would anticipate that a single query would return the exact same result set each time it’s executed.  However, I noticed that my row counts were changing each time I ran the script.  The second time I ran the script, it returned 2902 rows, and then the third time I ran the script it returned 13,816 rows!

third sas trysecond sas try

What was going on?!?

Confused, I took the exact SQL code and executed it directly in the SQL Server Management studio.  This time I noticed an error “Arithmetic overflow error converting expression to data type bigint.” However, even though there was an error, SQL Server still returned 1737 rows.sql server output 1

I ran the exact query a second time. This time it returned 3813 rows!

sql-server-output-2

 

It seemed that SQL server was encountering an error with a particular record and would return all rows up to the encounter, but somehow the ERROR message never made its way to SAS.  Also, since there was no “order by” clause in the query, the data results seemed to be ordering by some SQL Server clustering technique that might have been triggered by a random seed (which explains why my row counts always changed).

sql code

After removing the “cast” function, the query returned 19,994 records without error, and consistently returned 19,994 records every refresh thereafter.  I had discovered the issue, but I remain very concerned that SAS did not acknowledge the partial list and overflow error at all in the execution log.  If my row counts had not changed, I would have never realized that there was an error in the data.

 

Possible Solutions:

1) Execute every SQL Server script in SQL Server Management Studio prior to pasting into SAS. This could be a huge pain, because macros in pass-through scripts must be resolved prior to running.  An alternative to this would be to stress test the SQL scripts in SQL Server Management Studio, where we execute every possible permutation of the SAS query in order to try and protect against any subset.

 

2)  <TBD> Is there an error catching option? I have tried errorstop, stopontruncate, and exitcode options in the proc sql statement, but none of them seem to warn the user of this error. Please comment on this post if you have a suggestion, and I will update the post! Thanks.

.