From the course: SAS® 9.4 Cert Prep: Part 12 Combining Tables

Processing a one-to-one merge

- [Instructor] Let's look at an example of merging tables. Suppose we want to combine class and class teachers in a single table. Notice that the name column is in both tables, and both tables are sorted by name. This is a one-to-one merge since each value of name is in both tables. To merge tables in the data step, you use a merge statement rather than a set statement. You can list multiple tables on the merge statement as long as each table has the common matching column that is listed on the by statement. When a BY statement is used in a data step, the data must be in sorted order. Typically you would use PROC SORT steps to arrange the rows of the input tables by the matching column before the data step merge. So here's the data step merge that will join our two tables. Both tables are listed in the merge statement, and the common column "name" is listed in the by statement. Let's see how SAS processes the code behind the scenes. The data step merge process is very similar to how you would envision matching two lists by hand if the values are in sorted order. SAS simply compares rows sequentially as it reads from the multiple tables matching rows based on the value of the common column. In the compilation phase, all of the columns from the first table listed on the merge statement and their attributes are added to the PDV. SAS then examines the second table on the merge statement. Any additional columns and their attributes that are not already in the PDV are added. If there are any other statements in the data step that create new columns, they are also added to the PDV. Finally any other compile-time statements are processed. In the execution phase, SAS begins by examining the by column value for the first row in each table. If they match then both rows are read into the PDV, additional statements are executed, and at the end of the data step the row is written to the output table. SAS returns to the top of the data step for the next iteration and advances to row two in both tables. Again the values in name match so both rows are read into the PDV, and so on. That sequential comparison process continues until all rows are read from each table listed on the merge statement.

Contents