Join Joey D'Antoni for an in-depth discussion in this video Why cursors are bad, part of SQL Server Performance for Developers.
- [Instructor] So one common anti-pattern we see, particularly with application developers that are new to working with databases, is the use of cursors. Here we're going to talk about why cursors are bad, and typically this is hard to understand 'cause we have a pretty common problem set that we look for in application code. You need the equivalent of a for loop. In terms of applications we do this all the time. When you're developing an application, it's quite frequent that you'll build a for loop into your application code. However, with databases, it's possible to have millions, billions, or even trillions of records.
Typically you'd want to do for loop when you had to iterate through a small set of records maybe five to ten, definitely less than thousands. However what a cursor does is iterates through each record one at a time, and that can be problematic in terms of how the performance works overall, so... The other thing where we have this scenario is you think you need to perform row-based processing when you might not be able to perform set-based processing, and we talk about set-based processing versus row-based processing a cursor is a good example of row-based processing.
That's where we open our cursor, we read each row into that cursor as a variable and then we do some sort of operation on it. A set-based operation is where we take two tables, or maybe even a temp table, and compare those values, and then join all of those records in one set. So, even if we're doing millions of records that can take place in milliseconds 'cause it's all one operation. Whereas with a row-based operation, we incur a great deal of overhead and our scale becomes linear, so if we look at a comparison of the two.
Set-based operations are almost always going to scale better. They're just faster and at certain scale, you're just never going to be able to complete a row-based operation, 'cause they're going to scale linearly, so as you get more rows, it's definitely going to increase the time that it's going to take to complete your operation. Set-base operations are also typically easier for folks to read. Cursors are pretty verbose in terms of text sets available. And cursors also use a lot more memory, and since they operate longer, they create more scenarios where we have blocking problems, and this can affect how other users have access to your data, whether it's reading or writing the data.
So let's take a quick look at a cursor example and we'll walk through the code, and then we'll show how it runs. I'm opening a cursor. I'm using the AdventureWorks2016CTP3 database that we've been using. And on line five here I'm declaring a cursor. Line six, I'm using the LOCAL STATIC option. We're opening that cursor for this query, which is select ProductId from Production.Product for DaysToManufacture is less than or equal to one. So let's go ahead and execute that query, just to get an idea of how it's going to run.
And so that runs pretty quickly, it ran subsecond there, but so that's going to be our cursor, so it's going to load each of those values into the cursor and for each one it's going to perform this operation and that's going to be select star from Production.ProductInventory, where the ProductID equals that id. So for each record here, which was 400 rows, this cursor's going to iterate once, so it's going to do 400 operations, so let's take a look at how long this is going to take to run real quick.
So there, that ran in 16 seconds. We could've done this whole thing with just a simple interjoin between these two tables and done it in one operation though, that would've taken milliseconds, but instead we operated on a row by row fashion. While this can be useful for some maintenance operations that maybe have to iterate through each database, for application code, that's going to be user facing, this is really something you're never going to want to do.
- Query execution
- How to read an execution plan
- What not to do with SQL Server
- Why cursors are bad
- Why scalar UDFs are expensive
- Datatypes and design
- What is a columnstore index?
- Transaction isolation