Snowflake vs Redshift: some syntax differences.
Snowflake ·Moving code from Redshift to Snowflake
As I said here, my company is in the process of moving from Redshift to Snowflake. There are loads of reasons for doing it: to mention a few, better handling of permissions, less maintenance and less effort to optimise query performance.
A really good article as a starting point.
I start reading around for differences between the two syntax in preparation to move lots of SQL scripts from Redshift to Snowflake, and this article explains well the major differences regarding:
- Window functions
- Timestamps’ operations and time-zones
- Casing and quotes
- General less forgiveness of Snowflake over Redshift in functions and datatype conversions
This article helped me a lot as starting point, to understand what could be the potential issue when the query was not running, but sometimes Snowflake error messages aren’t clear enough, and it’s hard to figure out what the issue is. So during the migration we decided to keep a log of all the differences in syntax that we found.
I’ll list them below, hopefully is helpful for other people dealing with a Redshift to Snowflake migration.
More timestamps issues
GETDATE()
is no longer supported, but you can use CURRENT_TIMESTAMP
to get the current timestamp and CURRENT_DATE
to get the current date. This syntax is also compatible with Redshift so you can safely replace all your GETDATE()
with CURRENT_TIMESTAMP
, remembering to convert to a specific timezone if needed.
DATEPART
is not allowed in Snowflake, but you can use the equivalent DATE_PART
, with exactly the same structure, checkout the documentation here.
More cast issues
When casting, Snowflake is less forgiving than Redshift, but has built in functions to better deal with those, TRY_CAST
that will return NULL
if the conversion cannot be performed. The function can be used only if the input is of type string
, when the input is numeric use instead TRY_TO_DECIMAL
, TRY_TO_NUMBER
, TRY_TO_NUMERIC
.
The function CONVERT
does not exist any longer, but CAST
is a safe alternative.
Boolean and NULL
With Redshift a syntax like:
CASE WHEN a is true THEN 'Yes' ELSE 'No' END
will work, in Snowflake is
is allowed only for NULL
identity verification, not for BOOLEAN
, so the above should be rewritten as
-- similar:
CASE WHEN a = true THEN 'Yes' ELSE 'No' END
-- using IFF:
IFF( a, 'Yes', 'No')
Snowflake, in fact, have an IFF
function, less verbose than CASE
for when only one condition need to be checked.
Snowflake does not infer a BOOL
type from `` and 1
, so something like
-- here b is a number with value 1 or 0:
SELECT CASE WHEN b THEN 'Yes' ELSE 'No' END
will not work on Snowflake, you can convert to a condition on the field, so b=1
or do TO_BOOLEAN(b)
or TRY_BOOLEAN(b)
for a safer result.
Another difference is in the Redshift function ISNULL(a,0)
, this is no longer available and it needs to be replaced with IFNULL(a,0)
, where this returns `` when the field a
is NULL
. Or even better you can use COALESCE(a,0)
or NVL(a,0)
that are supported by both databases, and return the same result.
String functions
Some issues with string function might occur too. The string concatenation operator is ||
and +
gives an error message:
-- working in Snowflake:
SELECT 'Snowflake' || ' is great'
-- not working in Snowflake:
SELECT 'Snowflake' + ' is great'
To remove left and right trailing spaces you would use TRIM
, valid in both databases, but BTRIM
that was doing the same in Redshift is not a valid function in Snowflake.
The only syntax allowed in both databases to get the length of a string is LENGTH
and LEN
is not valid anymore.
The equivalent of the hash function FUNC_SHA1
is SHA1
, so use this one with the same structure.
Parsing JSON
Contrary to Redshift, Snowflake allows a better handling of unstructured data so you can query JSON objects more easily. Suppose you have a JSON in the format:
my_json = '{"f2":
{"f3":1},
"f4":
{"f5":99,
"f6":"star"}
}'
to get star
from it in Redshift you would need:
select json_extract_path_text( my_json,'f4','f6')
in Snowflake:
select parse_json( my_json:f4.f6 )
To know more about how to deal with JSON and semi-structured data, have a look at this document or this post in the Snowflake community.
Sometimes the syntax differences are hard to spot, and you end up losing a lot of time troubleshooting, a good idea is try to comment out pieces of your SQL and then test out functions and syntax in an easier example.
Hope you find this useful. What other differences did you spot moving your queries to Snowflake?