This is the starting point of a series of blog posts on Azure SQL DB performance. This post contains the setup and links to all the different database SKUs I've tested, so you won't have to scroll endlessly to get to the part that interests you. I hope you'll find this useful!
The trigger for this undertaking was a number of questions from my employer: what database tier and SKU (Stock Keeping Unit) should we choose for customer C? Well, the answer depends on their budget and the performance they're expecting, and we didn't have a conclusive answer. Microsoft will provide you with their statistics on database performance, but what does that mean in real life? Because no matter what the good people in Redmond can come up with as databases, we as data engineers can do worse. Much worse.
So, I created some horrible queries to insert, select and delete data and fired them off at a database. My own laptop with NVMe disks provided the baseline; after that, I gradually worked my way up from the Basic DTU database to the Hyperscale database.
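To give an idea of the kind of deliberately bad workload this involves, here is a sketch of my own (not one of the actual test queries, which live in the repo; the table name `dbo.StressTable` is made up): a row-by-row insert loop that generates plenty of log and IO pressure.

```sql
-- Hypothetical example of a deliberately inefficient insert workload:
-- one row per statement, each in its own implicit transaction,
-- with string building on every iteration. Nothing is batched.
DECLARE @i int = 0;
WHILE @i < 1000
BEGIN
    INSERT INTO dbo.StressTable (payload, created_at)
    VALUES (REPLICATE(CONVERT(varchar(36), NEWID()), 10), SYSUTCDATETIME());
    SET @i += 1;
END;
```

Run with multiple threads, loops like this make the storage and log throughput limits of a tier visible very quickly.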
To make sure my results are comparable, I've chosen to use SQLQueryStress, an open-source tool for running queries against databases. It lets me choose the number of iterations and the number of active threads, and it reports some numbers back.
To enrich my data a bit, I’m using SQL Server Management Studio to collect wait statistics and disk latency. These numbers will help when analyzing differences.
There are a lot more options available; these are the ones I chose to build on to compare the databases.
Besides my own measurements, I checked the Microsoft documentation for the documented limits, and from the portal I added the monthly cost. For the serverless options, I chose to calculate one month of running full-time. This isn't completely fair, as the use case for this tier is to run only when needed. But what would be a fair comparison? 15 hours? 20 hours? I'll compare against the price of the provisioned database, so you'll see the tipping point where provisioned might be cheaper.
All of the databases have been deployed as is. No changes to the configuration, MAXDOP, etc. have been made; the performance shown is what you get out of the box. There's no doubt you'll have to change settings to suit your workload. You should therefore look at all the numbers as relative to each other, not as absolute measurements.
To get my results, I created a snapshot of the wait statistics and the disk latency data before starting the insert, select or delete queries. After the respective query ends, I collect the latest wait statistics and disk latency to see what has been bothering the server most. I've checked the Azure portal as well to see the load on Data IO, Log IO and processor. These last measures are hard to use because of the high level of aggregation. The raw numbers can be found in the databases, but even with those, it's hard to get a clear picture other than that the query is horrible and must be tuned ;).
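A minimal sketch of this snapshot-and-diff approach (the temp table name is my own; `sys.dm_db_wait_stats` is the database-scoped wait statistics DMV in Azure SQL Database, and `sys.dm_io_virtual_file_stats` holds cumulative IO stall counters) could look like this:

```sql
-- Snapshot the wait statistics before the test run.
SELECT wait_type, waiting_tasks_count, wait_time_ms, signal_wait_time_ms
INTO #wait_snapshot
FROM sys.dm_db_wait_stats;  -- on a VM or Managed Instance: sys.dm_os_wait_stats

-- ... run the insert/select/delete workload here ...

-- Diff against the live DMV to see what the workload actually waited on.
SELECT w.wait_type,
       w.wait_time_ms - ISNULL(s.wait_time_ms, 0) AS wait_time_ms_delta
FROM sys.dm_db_wait_stats AS w
LEFT JOIN #wait_snapshot AS s ON s.wait_type = w.wait_type
WHERE w.wait_time_ms - ISNULL(s.wait_time_ms, 0) > 0
ORDER BY wait_time_ms_delta DESC;

-- Disk latency per file; these counters are cumulative since startup,
-- so snapshot and diff them the same way for a per-test picture.
SELECT DB_NAME(database_id) AS database_name, file_id,
       io_stall_read_ms  / NULLIF(num_of_reads, 0)  AS avg_read_latency_ms,
       io_stall_write_ms / NULLIF(num_of_writes, 0) AS avg_write_latency_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL);
```

The WHERE clause filters out wait types the workload never touched, which keeps the diff readable.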
I've documented the results in my GitHub as well; you can find them here. In the repo, you'll find the queries and Excel sheets used to document the findings.
Azure offers two purchasing models: DTU and vCore. With DTU, you increase performance by scaling up; with the vCore model, compute and storage scale independently.
A DTU, or Database Transaction Unit, is a bundled measure of compute, storage and IO. vCore gives you a number of cores from an underlying host. Within the vCore model there are a number of choices, the most important being serverless versus provisioned, the former being more cost-efficient for intermittent workloads.
If you're going provisioned, there are the General Purpose, Business Critical and Hyperscale tiers to give you more performance and/or more storage. Finally, there's the Managed Instance, which has a different architecture.
I will use a VM (L3 tier) with a SQL Server 2019 instance installed as a comparison, but I won't test all the possible SKUs there. You'll find an extensive list of performance results in the main overview, including the Gen4 databases. These are, at the time of writing (first half of 2022), deprecated and being moved to Gen5, so while you'll find them listed, I wasn't able to test them, simply because they didn't show up in the portal. What I also didn't test was what happens when you change the minimum number of cores on serverless. That would create an enormous number of extra tests; they might show up in the future, but for this first run it wasn't feasible.
Choosing the right tier
I'm sure you'd already like to know the outcome. It's really hard to give a complete verdict, but in the closing post of this series I'll share my thoughts. I highly encourage you to read through the whole series to get a feel for all the differences.
Choices depend on prices as well, so I've included the pricing I encountered when testing all the tiers and SKUs. Your prices will probably differ, because they depend on the contract you have with Microsoft, a CSP or another third party. Prices can and will change too, so again: look at them as relative to each other.
I'm hugely indebted to Paul Randal of SQLskills. Without his wait statistics reference, much of this series would never have been possible. I've rephrased his documentation to fit these blogs; any error or misconception in that part is mine and mine alone. Please check Paul's documentation for the full description of each wait type.
At the time of writing (July–September 2022), the information was correct and checked against the Microsoft documentation. I'll try to keep things up to date, but this isn't my only job ;). As these are my tests, my documentation and my blogs, errors may have occurred. They are my errors.
Thanks for reading!