Learn how the code is divided into three source code files, or modules. Explore how (1) the first handles file input, (2) the second uploads the file to IBM Watson and digests the JSON results, and (3) the third creates a new JPEG file with boxes drawn around the found faces. For improvements, learn how the boxes could be made thicker and how code could be added to process PNG and GIF images, so that faces are boxed in those formats as well.
- [Instructor] My solution to the face recognition challenge had me creating more code than for any other Code Clinic project. I split the task into four steps. First, obtain the input filename and classify it as JPEG, PNG, or GIF. Second, interface with Watson to upload the image and obtain the JSON output. Third, interpret the output to display the proper results: the number of faces and the filename. And the fourth step is to box the faces found in the original image, which is optional, and I didn't quite complete it.
To keep this project organized, I split the task among three modules and set up a shared header file, findfaces.h. The findfaces.h file defines the variables, prototypes, and constants used in the code. It also holds the Watson API key, which you must add yourself if you want to compile the code on your own. The main module contains the main function, which prompts for a filename if one isn't typed at the command prompt. I further added a file extension function to process the name and return an enumerated value depending on the extension, and therefore the image type.
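To make that file extension step concrete, here is a minimal sketch of how such a function could look. The enum constants and the names file_extension and equal_nocase are assumptions for illustration; the identifiers used in the actual main module aren't shown here.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical names: the original enum and function are not shown */
enum image_type { IMG_UNKNOWN, IMG_JPEG, IMG_PNG, IMG_GIF };

/* Compare two strings, ignoring letter case; returns 1 when they match */
static int equal_nocase(const char *a, const char *b)
{
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;    /* both strings must end at the same point */
}

/* Classify a filename by the text that follows its final dot */
enum image_type file_extension(const char *filename)
{
    const char *dot = strrchr(filename, '.');

    /* No dot at all, or the last dot belongs to a directory name */
    if (dot == NULL || strchr(dot, '/') != NULL)
        return IMG_UNKNOWN;

    dot++;              /* step past the dot to the extension itself */
    if (equal_nocase(dot, "jpg") || equal_nocase(dot, "jpeg"))
        return IMG_JPEG;
    if (equal_nocase(dot, "png"))
        return IMG_PNG;
    if (equal_nocase(dot, "gif"))
        return IMG_GIF;
    return IMG_UNKNOWN;
}
```

A call such as file_extension("family.JPG") would return IMG_JPEG, and a switch-case structure can then branch on that value to pick the routine for each image type.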
That enumerated value helps with the final step, which is boxing the faces in a new image file. The switch case structure at line 44 processes the different image types. The next module is the workhorse, watson.c. I wrote this code as a standalone program first, which helped me speak with Watson and then process the JSON data returned to parse out what I needed. You may notice that none of the functions deal with the image structure as an argument. This structure, picture image, holds the relevant data, but it just became easier to make that structure a global variable and define it externally on line four.
I had way too many problems passing this structure as a pointer to the various functions. Things got out of hand, and the code became terrible to debug. The external variable solution worked best, though I'm not a fan of global variables in C code. The extract_faces function is my kludge for being unable to tame any JSON library in the C language. Seriously, I tried. It was just faster to code this function, which finds all the instances of the text face_location in the JSON string, and then pulls out the necessary coordinates.
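Here's a rough sketch of that search-and-extract idea, not the original function. It assumes each face_location object in the JSON holds integer height, width, left, and top members and contains no nested objects. The face_box structure, the read_int_field helper, and the function's parameters are hypothetical; the original works on a global structure instead.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical structure holding one face rectangle */
struct face_box { int left, top, width, height; };

/* Look for key between from and limit, then read the integer after its colon.
   Returns 1 on success, 0 when the key is missing from that range. */
static int read_int_field(const char *from, const char *limit,
                          const char *key, int *out)
{
    const char *p = strstr(from, key);

    if (p == NULL || p >= limit)
        return 0;
    p = strchr(p + strlen(key), ':');
    if (p == NULL || p >= limit)
        return 0;
    *out = (int)strtol(p + 1, NULL, 10);   /* strtol skips leading spaces */
    return 1;
}

/* Scan the JSON text for every "face_location" item, store its coordinates,
   and return how many were found (at most max_faces) */
int extract_faces(const char *json, struct face_box *faces, int max_faces)
{
    const char *tag = "\"face_location\"";
    const char *p = json;
    const char *end;
    struct face_box b;
    int count = 0;

    while (count < max_faces && (p = strstr(p, tag)) != NULL) {
        p += strlen(tag);
        end = strchr(p, '}');       /* assumed end of this face's object */
        if (end == NULL)
            break;
        if (read_int_field(p, end, "\"left\"", &b.left) &&
            read_int_field(p, end, "\"top\"", &b.top) &&
            read_int_field(p, end, "\"width\"", &b.width) &&
            read_int_field(p, end, "\"height\"", &b.height))
            faces[count++] = b;
        p = end;                    /* continue the search after this object */
    }
    return count;
}
```

The return value doubles as the face count, because every face_location item that parses cleanly adds one entry to the array. It's still a kludge: a real JSON parser would cope with nesting, escapes, and reordered data far more safely.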
The extract_faces function also counts the number of face_location items found, which equals the number of faces in the image. The last module is do_jpeg. It reads the JPEG file specified, draws boxes around the faces, and then saves a new file with the suffix -faces. It was my plan to write modules for GIF and PNG images as well, but I ran out of time. For improvements, I would like to make the box thicker for larger images. My test images were smaller, so the box looked really good on them, but on a larger image the same box would look too thin.
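One way the thicker box could work is sketched below. It assumes the decoded image sits in memory as interleaved 8-bit RGB rows, and the rule of one pixel of thickness per 500 pixels of the longer side is purely an assumption, not something from the original code. The names box_thickness, put_pixel, and draw_box are hypothetical.

```c
#include <stddef.h>

/* Assumed rule: one pixel of thickness per 500 pixels of the longer side,
   and never thinner than one pixel */
int box_thickness(int width, int height)
{
    int longer = width > height ? width : height;
    int t = longer / 500;

    return t < 1 ? 1 : t;
}

/* Color one pixel red, ignoring coordinates that fall outside the image */
static void put_pixel(unsigned char *rgb, int width, int height, int x, int y)
{
    size_t i;

    if (x < 0 || y < 0 || x >= width || y >= height)
        return;
    i = ((size_t)y * width + x) * 3;    /* three bytes per pixel: R, G, B */
    rgb[i] = 255;
    rgb[i + 1] = 0;
    rgb[i + 2] = 0;
}

/* Draw a rectangle outline as nested one-pixel rings, working inward */
void draw_box(unsigned char *rgb, int width, int height,
              int left, int top, int box_w, int box_h, int thickness)
{
    int t, x, y;

    for (t = 0; t < thickness; t++) {
        int x0 = left + t;
        int y0 = top + t;
        int x1 = left + box_w - 1 - t;
        int y1 = top + box_h - 1 - t;

        if (x0 > x1 || y0 > y1)
            break;                      /* the rings have filled the box */
        for (x = x0; x <= x1; x++) {    /* top and bottom edges */
            put_pixel(rgb, width, height, x, y0);
            put_pixel(rgb, width, height, x, y1);
        }
        for (y = y0; y <= y1; y++) {    /* left and right edges */
            put_pixel(rgb, width, height, x0, y);
            put_pixel(rgb, width, height, x1, y);
        }
    }
}
```

Drawing the rings inward keeps the outer edge of the box on the coordinates Watson reports, so a thicker line never spills past the face rectangle.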
Also, if you use this code on an image that's been rotated, the results may look odd. The EXIF orientation data is just plain weird, so if you see the boxes but the image hasn't been oriented properly, that's why. And of course, I'd like to add the routines that would process PNG and GIF images as well. I've worked with JPEG data before, but my PNG and GIF kung fu just isn't strong enough to process those files.