From the course: Implementing a Data Warehouse with Microsoft SQL Server 2012

Using Data Quality Services to find duplicate data

I'd like to talk about using Data Quality Services to find duplicate data. When we import data from multiple sources, we often end up with duplicates. If two records are exactly the same, it's fairly easy to write a database query that finds them. It becomes more of a challenge when records are similar but not identical. For example, if two customer records have the same first name, last name, email address, and phone number but a slightly different mailing address, we might determine that this is in fact the same person: a duplicated record. On the other hand, if two customer records share the same first, middle, and last name but have different addresses, phone numbers, and email addresses, we might determine that these are in fact two different human beings who just by coincidence have the same name. We can automate some of this process by using…
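
To make the distinction between exact and similar records concrete, here is a minimal Python sketch of weighted fuzzy matching. It is not how Data Quality Services is implemented or configured. The field names, weights, threshold, and sample customers are all assumptions made up for illustration, and the string comparison uses Python's standard `difflib` module as a stand-in for a real matching engine.

```python
from difflib import SequenceMatcher

# Hypothetical field weights (assumed for illustration; they sum to 1.0).
WEIGHTS = {
    "first_name": 0.20,
    "last_name": 0.20,
    "email": 0.25,
    "phone": 0.20,
    "address": 0.15,
}


def similarity(a, b):
    """Return a 0..1 similarity ratio for two strings, ignoring case and outer spaces."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def match_score(r1, r2):
    """Weighted average of per-field similarities between two records."""
    return sum(w * similarity(r1[f], r2[f]) for f, w in WEIGHTS.items())


def is_exact_duplicate(r1, r2):
    """True only when every compared field is identical."""
    return all(r1[f] == r2[f] for f in WEIGHTS)


def is_probable_duplicate(r1, r2, threshold=0.85):
    """True when the weighted score reaches the (assumed) threshold."""
    return match_score(r1, r2) >= threshold


# Made-up sample records.
a = {"first_name": "Ann", "last_name": "Lee", "email": "ann.lee@example.com",
     "phone": "555-0100", "address": "12 Oak Street, Apt 4"}
# Same person, slightly different mailing address.
b = dict(a, address="12 Oak St., Apt 4")
# Same name, but different email, phone, and address.
c = {"first_name": "Ann", "last_name": "Lee", "email": "a.lee.1984@othermail.test",
     "phone": "212-7431", "address": "900 Pine Avenue"}

for label, other in (("a vs b", b), ("a vs c", c)):
    print(label,
          "exact:", is_exact_duplicate(a, other),
          "probable:", is_probable_duplicate(a, other))
```

In this sketch, records `a` and `b` are not exact duplicates but clear the threshold, while `a` and `c` share a name and fall below it. In practice the weights and threshold would need tuning against real data, and borderline pairs would usually go to a person for review.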
