(computer boot noises) - Hello, and welcome to Code Clinic. My name is Ray Villalobos. Code Clinic is a course where a unique problem is introduced to a collection of lynda.com authors. In response, each author will create a solution using their programming language of choice. You can learn several things from Code Clinic: different approaches to solving a problem, the pros and cons of different languages, and some tips and tricks to incorporate into your own coding practices.
We'll work on a problem centered around image analysis. In one sense, this is simply data analysis. Images are really nothing more than specialized and well-defined sets of data. An image consists of pixels. Pixels consist of data representing the color of the pixel and, in some cases, the pixel's transparency. The pixels are arranged in rows and columns. When assembled correctly, they represent an image. Our brains are good at recognizing patterns, but computers are not.
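To make the "rows and columns of pixel data" idea concrete, here is a minimal sketch in Python. It is only an illustration and not part of the course material: the tiny `image` grid and the helper names `dimensions` and `pixel_at` are made up for this example, and the pixels are written by hand as (red, green, blue, alpha) tuples rather than decoded from a real file.

```python
# A tiny 3-pixel-wide, 2-pixel-tall image stored row by row.
# Each pixel is a hypothetical (red, green, blue, alpha) tuple, 0-255 per channel.
# Alpha 255 means fully opaque, alpha 0 means fully transparent.
image = [
    [(255, 0, 0, 255), (0, 255, 0, 255), (0, 0, 255, 255)],
    [(0, 0, 0, 255), (255, 255, 255, 128), (128, 128, 128, 0)],
]


def dimensions(img):
    """Return (width, height) of a row-major pixel grid."""
    return len(img[0]), len(img)


def pixel_at(img, x, y):
    """Return the pixel in column x, row y (both counted from zero)."""
    return img[y][x]


width, height = dimensions(image)
red, green, blue, alpha = pixel_at(image, 1, 1)
print(f"{width}x{height} image; pixel (1, 1) has alpha {alpha}")
```

Seen this way, an image is just a two-dimensional table of numbers, which is why image analysis can be treated as data analysis.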
Think about CAPTCHA security devices, those puzzles you sometimes see when logging in to a website. The CAPTCHA asks what letters and numbers are in the image, information obscured by random lines and sometimes by overlapping transparent blocks of color. All those intersecting shapes make it difficult for a computer program to separate the background noise from the actual data. Another example is the test used to determine color blindness. Letters and numbers are hidden in a circle filled with different colored dots.
If you are color blind, you will not be able to see the numbers. For a computer program this can be incredibly difficult, as it requires detecting an edge as well as recognizing the overall shape. It's difficult even for the most advanced programmer. In this challenge, we're trying to solve a common problem for many photographers: plagiarism. A photographer will take a picture and post it on the internet, only to discover someone has stolen their image and placed a subset of that image on their own website. For example, here's an image, and then a cropped version of that image.
It would be extremely handy if there were a program searching the internet for cropped versions of an original image, so that a photographer could protect their rights. In fact, Google Image Search will do just that. We're curious about how it works and what the required code might look like. So here's the challenge: Given two images, determine if one image is a subset of the other. We'll assume they are both JPEG files, and that the resolution is the same, as well as the bit depth. We've provided a set of images.
The program should return a table showing which images are cropped versions of other images. Perhaps you want to pause and create a solution of your own. How would you solve the problem? In the next videos, I'll show you how I solved this challenge.
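As one possible way to think about the challenge, here is a brute-force sketch in Python. It is not the solution shown in the course videos, only an illustration under stated assumptions: images are already decoded into row-major grids of color tuples (a real program would need an image library to read the JPEG files), and the names `pixels_match`, `find_crop`, `crop_table`, and the `tolerance` parameter are hypothetical. The tolerance is there because JPEG compression is lossy, so a cropped and re-saved image rarely matches the original pixel for pixel.

```python
def pixels_match(a, b, tolerance=0):
    """True if every channel of pixel a is within tolerance of pixel b."""
    return all(abs(ca - cb) <= tolerance for ca, cb in zip(a, b))


def find_crop(full, crop, tolerance=0):
    """Return the (x, y) offset where crop appears inside full, or None."""
    full_h, full_w = len(full), len(full[0])
    crop_h, crop_w = len(crop), len(crop[0])
    if crop_h > full_h or crop_w > full_w:
        return None  # a crop cannot be larger than its source
    # Slide the crop over every possible top-left position in the full image.
    for y in range(full_h - crop_h + 1):
        for x in range(full_w - crop_w + 1):
            if all(
                pixels_match(full[y + dy][x + dx], crop[dy][dx], tolerance)
                for dy in range(crop_h)
                for dx in range(crop_w)
            ):
                return (x, y)
    return None


def crop_table(images, tolerance=0):
    """Map (crop_name, full_name) to the offset where the crop was found."""
    table = {}
    for crop_name, crop in images.items():
        for full_name, full in images.items():
            if crop_name == full_name:
                continue
            # Only treat a strictly smaller image as a candidate crop.
            if len(crop) * len(crop[0]) >= len(full) * len(full[0]):
                continue
            offset = find_crop(full, crop, tolerance)
            if offset is not None:
                table[(crop_name, full_name)] = offset
    return table


# Made-up test data: a 4x4 image, a 2x2 region cut from it, and an unrelated 2x2 image.
full = [[(x * 10, y * 10, x + y) for x in range(4)] for y in range(4)]
crop = [row[1:3] for row in full[2:4]]
other = [[(200, 200, 200)] * 2 for _ in range(2)]

for (crop_name, full_name), offset in crop_table(
    {"full": full, "crop": crop, "other": other}
).items():
    print(f"{crop_name} is a crop of {full_name} at offset {offset}")
```

This approach compares every pixel at every position, so it becomes slow on large photographs. It is meant to show the shape of the problem, not an efficient answer.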
Visit other courses in the series to see how to solve the exact same challenges in languages like C++, C#, Java, PHP, Python, R, Ruby, and Swift.
Q: I am unable to access the Lake Pend Oreille data from outside the U.S.
A: A static copy of this data is provided here for lynda.com members outside of the U.S.