From the course: Microsoft Power Apps: AI Builder

Object detection model requirements

- [Instructor] We're going to use AI Builder to build an object detection model, which does exactly what it sounds like it will do. Given an image and a list of objects, the AI model that we build will identify those objects within the image. Our process is exactly the same as it was for our form processing model, but there are some specific requirements that are different for objects than they are for forms.

First, we have three different object detection domains, and one of these is actually new, so I wouldn't be surprised if by the time you're viewing this course, there are four or five different domains. The first is for objects on retail shelves, and this would be used if you were going to take a physical inventory, for example. The second is brand logos, and that object detection domain is optimized for identifying corporate logos. Finally, we have everything else, and that's common objects. So if it's not objects on a retail shelf, and it's not brand logos, then it is common objects.

There are two ways that we get our list of object names that we want to identify. The first is simply to type in a list. We'll actually be working with fruit, so we'll type in a list that includes lemon, lime, apple, and tomato. If you're surprised that tomatoes are a fruit, don't trust me, check Wikipedia. Alternatively, we can have our object names selected from an entity in the Common Data Service. You might do this, for example, if you had a list of inventory items that you wish to use. You can't combine the two: either you type in a list, or you use the Common Data Service.

Our sample images have some specific requirements. The first is format. These are the three formats that we can use right now: JPG, PNG, and bitmap. And the maximum size for any of the sample images or test images is six megabytes. What this means is that if you pull out your multi-megapixel camera and take images, you will probably have to compress them. The easiest thing to do is to change the settings on your camera to take images that have fewer pixels so you don't have to do that. But if you need to compress images because they've already been taken and you're using what you've been given, there are several services online where you can upload images, have them compressed, and then download them again.

For each of the objects that we want to identify, we need at least 15 images, or we can't train the model. And this really is a minimum. If you imagine that you want to be able to identify all different kinds of tomatoes, then you're going to need a number of images of tomatoes, and 15 is a pretty small tomato sample, so often you'll be training with 50 images for each object. You also want to have a similar number of images for each object. You don't want to have 15 images of limes and 500 images of tomatoes. A good rule for keeping your image samples a similar size is to take whatever object you have the smallest number of images for, double that, and not have more than that doubled number for any of the other objects. So if there's one object that I only have 15 images for, I shouldn't have more than 30 for any of the others.

We want our images to be varied, but also to be representative. What do I mean by that? First, we'd like to be capturing the objects against different backgrounds. Let's go back to the domain where we're detecting objects on retail shelves. Retail shelves vary widely. There are endcaps and regular shelves.
Sometimes you'll have a display that sits in front of a counter. You'll want to capture your objects against different backgrounds when you take pictures, not necessarily those exact backgrounds, but different backgrounds, because if every picture you take shows the same background, it's going to be harder when you actually use the model against a variety of backgrounds.

Next, different lighting is important. When you're actually using an application like this in a retail setting, the lighting will be varied, so you'll want to make sure you have some light that is daylight, some light that is fluorescent, and some light that is incandescent or LED if you can. Do the best you can with this. Camera angles, though, definitely: sometimes you'll be taking a picture that is straight on with a product, and sometimes it'll be slightly offset. You'll be above the product, or below the product, or taking a picture that shows the top. So you'll want to get different camera angles on each of the items.

We also want to have different sizes and even different numbers. As well as having a lime, we could have a basket of limes. We could have small limes, and we could have larger limes. And one way we can deal with size is to be closer to the item when we take a picture and farther away from the item. Again, if the images that you're working with have already been provided, you're in a process of deciding which images you want to use, so apply these rules for creating a set of varied, representative images as you're reviewing the images that you might use.

Once you have your object names and a set of representative, varied images that you can use to train your model, you are ready to start object detection with AI Builder.
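Before you upload anything, it can help to sanity-check your sample images against the requirements discussed in this lesson. The following is a minimal sketch, not part of AI Builder itself; it assumes a hypothetical folder layout with one subfolder per object name (for example, samples/lemon), and it simply applies the rules above: JPG, PNG, or bitmap format, no more than six megabytes per image, at least 15 images per object, and no object with more than double the image count of the smallest set.

```python
import os
from collections import Counter

# Thresholds from the lesson: allowed formats, 6 MB per image,
# at least 15 images per object, and a 2x balance rule.
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}
MAX_BYTES = 6 * 1024 * 1024
MIN_IMAGES_PER_OBJECT = 15

def check_samples(root="samples"):
    """Check one subfolder per object name (hypothetical layout)."""
    counts = Counter()
    problems = []
    for object_name in sorted(os.listdir(root)):
        folder = os.path.join(root, object_name)
        if not os.path.isdir(folder):
            continue
        for file_name in os.listdir(folder):
            path = os.path.join(folder, file_name)
            ext = os.path.splitext(file_name)[1].lower()
            if ext not in ALLOWED_EXTENSIONS:
                problems.append(f"{path}: unsupported format {ext}")
                continue
            if os.path.getsize(path) > MAX_BYTES:
                problems.append(f"{path}: larger than 6 MB, compress it first")
                continue
            counts[object_name] += 1

    for object_name, count in counts.items():
        if count < MIN_IMAGES_PER_OBJECT:
            problems.append(f"{object_name}: only {count} images, need at least 15")

    # Balance rule: no object should have more than double the
    # image count of the object with the fewest images.
    if counts:
        smallest = min(counts.values())
        for object_name, count in counts.items():
            if count > 2 * smallest:
                problems.append(
                    f"{object_name}: {count} images is more than double "
                    f"the smallest set ({smallest})"
                )

    return counts, problems

if __name__ == "__main__":
    counts, problems = check_samples()
    print(dict(counts))
    for problem in problems:
        print(problem)
```

Run it against your sample folder and fix anything it flags; any images it reports as over six megabytes are the ones you'd resize or compress before uploading.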
