You’ll find a list of resources for dedicated SQL pool and serverless SQL pool functions of Synapse SQL
Key design decisions
The articles below highlight concepts and design decisions for Synapse SQL development:
Specific coding techniques, tips, and recommendations for development:
Best practices for dedicated SQL pools in Azure Synapse Analytics
Reduce cost with pause and scale
Tune query performance
- Performance tuning with materialized views
- Performance tuning with ordered clustered columnstore index
- Performance tuning with result set caching
Group INSERT statements into batches
Use PolyBase to load and export data quickly
Load then query external tables
Hash distribute large tables
Do not over-partition
Minimize transaction sizes
Reduce query result sizes
Use the smallest possible column size
Use temporary heap tables for transient data
Optimize clustered columnstore tables
Use larger resource class to improve query performance
Use smaller resource class to increase concurrency
Use DMVs to monitor and optimize your queries
Dedicated SQL pools have several DMVs that can be used to monitor query execution.
- Monitor your workload using DMVs
- DBCC PDW_SHOWEXECUTIONPLAN
Best practices for serverless SQL pool in Azure Synapse Analytics
Use appropriate data types
Optimize repeating queries
Use CETAS to enhance query performance and joins
Copy and transform data (CETAS)
Distribution Advisor in Azure Synapse SQL
Performance tuning with materialized views using dedicated SQL pool in Azure Synapse Analytics
A standard view computes its data each time when the view is used. There’s no data stored on disk.
A materialized view pre-computes, stores, and maintains its data in dedicated SQL pool just like a table.
Performance tuning with ordered clustered columnstore index
- By default, for each table created without an index option, an internal component (index builder) creates a non-ordered clustered columnstore index (CCI) on it.
- Ordered clustered columnstore indexes by enabling efficient segment elimination, resulting in much faster performance by skipping large amounts of ordered data that don’t match the query predicate.
- When creating an ordered CCI, the dedicated SQL pool engine sorts the existing data in memory by the order key(s) before the index builder compresses them into index segments.
Queries with all these patterns typically run faster with ordered CCI.
- The queries have equality, inequality, or range predicates
- The predicate columns and the ordered CCI columns are the same.
Data loading performance
The performance of data loading into an ordered CCI table is similar to a partitioned table. Loading data into an ordered CCI table can take longer than a non-ordered CCI table because of the data sorting operation
Reduce segment overlapping
The number of overlapping segments depends on the size of data to sort, the available memory, and the maximum degree of parallelism (MAXDOP) setting during ordered CCI creation.