From the course: SAS® 9.4 Cert Prep: Part 07 Using SQL in SAS

Demo: Reading and filtering data with SQL

- [Instructor] Let's use PROC SQL to select columns and filter rows from an existing SAS table and create a report. So we'll start with the most basic PROC SQL query. We began with just proc sql semicolon and then I'm going to add a select statement. Select and asterisk will select all columns and I want to read from the table pg1.storm_final. The select and from classes provide the minimum information that is needed in a query. I'll submit the step. First, I'll look at the log. And just verify there are no warnings or errors. And then we'll look at the results. Our query selected all columns and all rows from the storm_final table. The results were returned in a report. Well, let's go back to the program and change the select clause. Instead of all columns, now I'd like to choose a subset. I'll include Season, comma, Name. Remember all column lists need to be separated with commas in PROC SQL. After Name, StartDate. And after StartDate, I'll use the format equal option. And we'll display the values as mmddyyy10. We can still use standard SAS formats. Comma and the last column will be MaxWindMPH. The from clause doesn't change at all. I'll run that step. And a new report is created with just those four columns and notice StartDate is formatted. What about computing a new column? For example, remember Name is in uppercase, what if I want it instead to be proper case? In the SELECT clause, I can take advantage of functions or other expressions to create or modify columns. I'll use the propcase function, the argument will be Name and then I use the keyword as followed by the column I want to create. In this case, I want to overwrite back to the same column, Name. I'll run again. And Name looks great. Okay, what else can I do in my code? Let's take advantage of some of the other clauses in an SQL query. I'll add a where clause after the from clause. And the expression can match the way we would have written the where statement in a SAS procedure or data step. So where MaxWindMPH is greater than 156, and Season greater than 2,000. And one more clause in our query. I'll take advantage of SQL's order by clause. We'll sort the values by MaxWindMPH and we'll do it in descending sequence and then within MaxWindMPH, we'll sort by name. Again, notice the comma separating those columns. Finally, I can add a title statement. Because PROC SQL is creating a report, my first title will be International Storms since 2000. And my second title will be Category 5 Wind greater than 156. And at the end of my program, I'll add a title null statement to clear those titles. Let's see the final result. Looks great. We have the titles for our report, we've subset the data to include only storms with wind greater than 156 and you'll notice the sort order with the strongest storm Patricia at the top of the report. Now, one thing I really want to emphasize with this program is that I want you to remember that the query is one single statement with these multiple clauses. So notice, there's only a single semicolon at the conclusion of that select statement. Unlike data steps and procedures that we've written before, this time all of the criteria for the report is included in a single statement.

Contents