Managing and Analyzing Data in Excel 2010
Using a specialized array formula to identify data that's been duplicated


with Dennis Taylor

  1. 1m 32s
    1. Welcome
      54s
    2. Using the exercise files
      38s
  2. 25m 18s
    1. Sorting from the Sort menu
      4m 37s
    2. Sorting from the toolbar
      4m 2s
    3. Multi-key sorting
      3m 4s
    4. Sorting based on the order of data in custom lists
      4m 44s
    5. Sorting by color font, color background, or icon
      3m 57s
    6. Sorting columns
      2m 11s
    7. Sorting data in random order
      2m 43s
  3. 19m 1s
    1. Using single- and multiple-column text filtering
      5m 8s
    2. Taking a look at special numeric filters
      1m 54s
    3. Harnessing special date filters
      2m 5s
    4. Creating a top-ten list by value or percent
      3m 11s
    5. Creating custom filters
      1m 40s
    6. Copying and sorting filtered lists
      3m 7s
    7. Recognizing the limitations of standard filtering
      1m 56s
  4. 11m 16s
    1. Setting up subtotals
      4m 20s
    2. Creating multiple levels and copying subtotals
      6m 56s
  5. 13m 22s
    1. Using the Advanced Filter for complex OR criteria
      4m 30s
    2. Using the Advanced Filter for complex multiple-field criteria
      5m 37s
    3. Using the Advanced Filter to create unique lists from repeating field data
      3m 15s
  6. 10m 44s
    1. Using the Remove Duplicates command
      2m 30s
    2. Using a specialized array formula to identify data that's been duplicated
      5m 10s
    3. Using an array formula to count the number of unique items in a list
      3m 4s
  7. 10m 31s
    1. Using SUMIF, COUNTIF, and related functions for quick data analysis
      6m 48s
    2. Using database functions like DSUM, DAVERAGE, and DMAX
      3m 43s
  8. 34s
    1. Next steps
      34s

Managing and Analyzing Data in Excel 2010
1h 32m | Appropriate for all | Oct 27, 2011

In this course, Dennis Taylor shares easy-to-use database commands and methods for maintaining an Excel database. The course covers sorting, adding subtotals, auto-filtering, and using the Excel Advanced Filter feature and specialized database functions.

Topics include:
  • Multiple key sorting
  • Single and multiple column numeric filters
  • Creating a top-ten list with values or percentages
  • Setting up subtotals
  • Creating multiple-field criteria filters
  • Creating unique lists from repeating field data
  • Using the Remove Duplicates command
  • Finding duplicate data with specialized arrays
  • Counting the number of unique items in a list
  • Using SUMIF and COUNTIF functions
  • Working with the database functions such as DSUM and DMAX
Subjects:
Business Data Analysis
Software:
Excel
Author:
Dennis Taylor

Using a specialized array formula to identify data that's been duplicated

As fast and easy as the Remove Duplicates command is to get rid of duplicate data, what it does not do for you is indicate what has been duplicated. So for example, in this list here which does have duplicates in it, there is a Juan Bishop in Row 5, also one in Row 10. If we want to identify which records have been duplicated before getting rid of them, what we need to do is sort the data. So I do have another list here, an exact list, but it has been sorted, clicking over here.

And in this list we will see two Juan Bishops together. If you do want to identify duplicate records here, what we need to do is after doing the sorting, insert a new column to the left of Column A. And what we are going to do here is to put in a formula. I want to show you or start off at least with the long way and then show you a nice quick way using an array formula to identify which records have been duplicated. So, here's the formula I am going to put in here. I can start probably in Row 2.

It's good to see everything in place. =if. Now you may or may not be familiar with the If function but this I think almost explains itself. Here is what we'd like to say. And we want to include the word And. And for the moment we are only thinking of rows 2 and 3 and we know they are not duplicated but they could be. So we'd like to indicate if this cell B2=B3, comma, and if C2=C3, comma, and if D2=D3, comma, and on and on and on.

Now this is going to get really long, but if they are all equal, and why don't I just cut it short here for the moment and then show you a better way to do this? But we're on the path here for this to make sense. If these pairs are all equal to one another, I'll just close this with a right parenthesis, comma. In other words, if all those pairings are true, that they are all matched, then what do we want to see here over in cell A2? "Dup". But if that's not the case, if any one of those is not the same, then we want to put in the word "Unique". Right parenthesis and Enter.
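
Written out, the long form dictated above looks roughly like the sketch below. It assumes the new helper column is A and the first record is in row 2, and it stops at column D, the last pair spelled out in the narration; exactly where the on-screen formula was cut short is not stated.

```excel
=IF(AND(B2=B3,C2=C3,D2=D3),"Dup","Unique")
```

Each further column to be checked adds one more comparison inside AND, which is why this version grows quickly on a wide list.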

And those are unique and we can copy this down the column relatively quickly simply by double-clicking the lower right-hand corner. So all these are unique and sure enough down here in Row 50 there we go, 53, that's a duplicate, and we'll find others eventually. Now we've got several hundred names. No reason to scroll through all of this. We could have 70,000 or 700,000. This at least identifies the records that have been duplicated. Possibly you can use a filter and just show the duplicated ones. We could do that.

You will need a dummy heading up here. Put in anything for the moment. If we introduced filtering right now we could then on this column right here simply show those that are dup and then see the list of the records that have been duplicated. Sometimes that's important, sometimes not. But this is the way to get there. Let me remove the filter. To complete this formula and actually do it all the way across, in other words not just to Column F but all the way across, seems like it would take a lot of work.

In other words, to really do the check here fully we'd want to check all the way over, in this example, to Column L. But here is a better way to do this: there is a special kind of formula in Excel called an array formula. So what we'd like to do here is actually change this to read B2:L2=B3:L3. In other words, as I delete all this, all the cells B2 all the way over to L2 we are going to check all of those and we are going to compare them with the corresponding cells in the row below.

And this is a lot shorter than what we just saw. Now, this is what we call an array formula. In order for this to work properly I need to press Ctrl+Shift+Enter, not simply Enter. Ctrl+Shift+Enter, there we go, and I'll double-click to recopy all this. We'll still find our duplicates down there in the same location, but it's a much shorter formula. Let me once again display this a little larger so we can see it, make it a little bit wider, and so what we are seeing again in English-- let me make this even wider so we can see it.
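
For reference, the array version described above can be written as the sketch below, assuming the same layout (helper column A, data in columns B through L, first record in row 2). It is typed without curly braces; Excel adds them itself when the formula is committed with Ctrl+Shift+Enter.

```excel
=IF(AND(B2:L2=B3:L3),"Dup","Unique")
```

Once entered this way, the formula bar shows it wrapped in braces, as {=IF(AND(B2:L2=B3:L3),"Dup","Unique")}. The comparison B2:L2=B3:L3 produces one TRUE or FALSE per column, and AND returns TRUE only when every one of them is TRUE.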

We're going to compare B2 with B3 and C2 with C3 and on and on and on all the way over to Column L. If all those are the same, we've got a duplicate. If any one of them is different it's going to be unique. So this is how to identify the records that are duplicates. It's an unusual kind of formula. You may or may not have seen array formulas, but I think you'll see pretty clearly. Any time there is the need to identify the duplicate records, a formula like this will get the job done. If you'd like more information on array formulas and how they work, you might want to check out another course in this series, Advanced Formulas and Functions for Excel 2010. There is an entire chapter on array formulas there.
