The differences between Azure Synapse and SQL Server
1 min readMay 22, 2022
- Recursive CTEs are not available as of now
- Modifying a table with a CTE is not possible (unless you have hash index)
- Introduced in Synapse are Distributions: Hash, Round Robin and Replicate. In general Hash is the best table if you are able to get a decent column to index on. Round robin will always do some “Shuffling” when querying data that causes it to be slower. The more unique this column the better. That or ensure that the data will have the least skew possible.
Distributions in synapse will determine how you assign data to each of the 60 distributions in your database. The column you use for this distribution should follow the following rules:
- Is used in
JOIN
,GROUP BY
,DISTINCT
,OVER
, andHAVING
clauses. When two large fact tables have frequent joins, query performance improves when you distribute both tables on one of the join columns. When a table is not used in joins, consider distributing the table on a column that is frequently in theGROUP BY
clause. - Is not used in
WHERE
clauses. This could narrow the query to not run on all the distributions. - Is not a date column.
WHERE
clauses often filter by date. When this happens, all the processing could run on only a few distributions. Ie. if 10 users are trying to filter on the same date, this means all of the calculations will be done by one distribution.