Join Joey D'Antoni for an in-depth discussion in this video Let’s look at the T-SQL it puts out, part of SQL Server: Developer and DBA Collaboration.
- [Instructor] If you're a developer, you may have had discussions with your DBAs about the benefits and costs of using object-relational mapping tools like Entity Framework. Your DBAs have probably complained about the SQLs that these tools output. Compared to writing your own code to access data, and in general your own forms and things like that, as a developer you're going to be way more productive using these object-relational mapping tools like Entity Framework. However, there are several traps that are easy to fall in that give these tools a reputation for performing very poorly.
There's no inherent performance problem but if you let the tools create the database for you, it can be slightly problematic. You may not get a good database design. Your T-SQL code is not going to be optimal. Really, at its heart, Entity Framework is a way of exposing .net objects without actually knowing their values, but then fetching and updating those values from the database when you need them. It's important to be aware of when and how EF is going to hit the database, and that's a process called materialization. One of them problems with this and why Entity Framework has such a bad reputation amongst DBAs is that material-ation process, tends to retrieve all the rows and it lets the application deal with them.
So as you scale, you retrieve a lot of rows from the database. The other problem Entity Framework has is that it leads to some very ugly code. So why should you concern yourself with this when your application works perfectly? Just ask a simple question. How big is your development environment? How big is your test environment? And most of all, how much data are you going to have in production and how big will it be in the next few years? Because some of these operations will work fine on 100 megabyte database when you fetch half the database because all of that data's going to live in memory.
But as you get to hundreds of gigs of data and you need half of that database, that can be really painful. And tracking those SQL server performance issues will be very difficult, especially if the code is extremely ugly. Trust me, the code is ugly. So rather than keep talking about how bad the code that these tools produce, we're going to take a look at some. But the one thing I want you to take away from this video is to write your own T-SQL code. Ideally, write it in stored procedures. We'll talk more about that in a subsequent video, but let's look at the code.
So here we're over in Management Studio and you can see some red errors here 'cuz I don't have these objects created in my database 'cuz this was actual code that we've extracted from a real customer that just shows outputted into the framework. So here we see we're selecting a lot of columns starting on line three, going line through 24. We're selecting all these columns from a nested subselect called UnionAll, starting on line 26 to line 36, where we have this K statement where C14 equals 1.
We're using a casting things as a bit. Then we're nesting another query here starting on line 40. Nesting another select statement starting on line 47, going all the way to another nested select on line 63. This is kind of an ugly query, but wait it gets worse, 'cuz we're unioning all on another select query to bring in the same columns on lines 82 through 94. We're doing a left outer join, so we're joining all the rows from the first query to all the rows on the second query where this cass value as unique identifier equals other value where it's null.
And then we're doing an order by all the way on line 101. This query's not inherently bad, but it is very hard to un-nest. Fortunately, we spent a lot of time already un-nesting the query to show you what the actual query looks like. It's a lot simpler. You can see it all on one screen. We're selecting 13 columns from a table. We're casting zero as a bid value as C1 and 1001 is C2 where column 10 equals this GUID value, that's a unique identifier.
This still has some problems. We're doing an implicit conversion on an identifier. However, this is much easier, a, for the optimizer to generate an execution plan for it and then b, for you or your DBA to take a look at intune, because it's a much simpler query. You can understand what's going on a lot better, even though it took some time to write. The other code was spit out by the Entity Framework tool. This was something that was hand written by a consultant. Much easier process to do this, this way.
- Schema design
- Building good queries
- Writing stored procedures
- Building a CI/CD pipeline for database deployment