In this video, learn about the difference between declarative and procedural languages.
- [Instructor] SQL is the language of tabular data. It started as the query language for relational databases, but now it's used in data analytics tools, like Apache Spark and Kafka Distributed Streaming Platforms. SQL was developed in the 1970s and is still widely used almost 50 years later. SO why is SQL so popular? The short answer is that it is well suited for working with tabular data. Much of the data we use lends itself to tabular structures. Retailers track customer data and tables. Healthcare companies manage patient's data with relational tables. In fact, it's hard to think of an industry that can't make use of SQL and tabular data. Another reason SQL is popular is that it's declarative. We specify what we want without having to concern ourselves with how to get what we want. For example, we can state what columns we want to retrieve and filter the resulting set of rows by some condition, like in this example. The results of our queries are tables, so we can use SQL to query the results of our SQL queries. Now I should point out that there are statements that can result in output that is not strictly tabular. For example, some commands that have been added to recent versions of SQL support rollups and cubes and other data structures that are useful in reporting, but don't preserve the relational structure. If you are a developer, you're probably used to working with procedural languages. In languages like Java, Python, and C++ you specify how to do something. So for example, you might specify how to loop through an array. In procedural languages, you directly manipulate data structures and perform operations on them in an order that you specify. SQL statements have to be mapped from their declarative form into a procedural form in order to return the data specified in the query. Now the procedural version of a query is known as an execution plan. It's a sequence of steps that include things like scanning, filtering, and joining data in such a way that when the execution plan is done it produces the output that you specified in your declarative query. When we develop procedural code, we're often thinking about how to make it efficient. The same thing applies to execution plans. While there may be many ways to execute a query, some are more efficient than others. For example, if you want to return data about a subset of customer purchases where the customer ID starts with, oh let's say, 3014. Then you can look through all the rows in some 10 million row database, but that's highly inefficient. A much better approach is to use an index, which points to the location of the datablocks containing the rows that you're interested in. In this course, you'll learn about query plans and the different kinds of steps that are used in them. We'll discuss the trade-offs of different ways to map a declarative query to a procedural execution plan and perhaps most importantly, we'll learn about techniques for structuring your data and crafting your queries to produce efficient execution plans.
- How SQL executes queries
- Working with PostgreSQL tools for tuning
- Bitmap and hash indexes
- Using different types of indexes to improve performance
- Challenges with joining tables
- When to use partitioning to improve performance
- Collecting statistics about data in tables