Get an introduction to the Vision API and explore the possibilities it brings to your applications, as well as its importance within Microsoft Cognitive Services.
- [Instructor] Let's talk about the Vision API. The Vision API is a core part of Microsoft Cognitive Services. Yes, it is just a web service that Microsoft hosts, but what does that mean? Put simply, the Vision API is a REST-based service running inside Azure. You submit an image to it, and the API returns a JSON object describing what it saw in the image. Yes, you do have to authenticate, and the image has to be in a standard format like JPG or PNG; I'll cover all of this in the demos.
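The flow just described (authenticate, submit an image, get JSON back) can be sketched in a few lines of Python. This is a minimal sketch, not the code used in the demos. It assumes the Computer Vision "analyze" route with the `v3.2` path segment and the `Ocp-Apim-Subscription-Key` header. The endpoint host, the key, and the function names `build_analyze_request` and `analyze_image` are placeholders chosen for illustration.

```python
import json
import urllib.parse
import urllib.request


def build_analyze_request(endpoint, subscription_key, image_bytes,
                          features=("Description", "Faces")):
    """Build (but do not send) a POST request carrying raw image bytes.

    Assumption: the resource exposes the /vision/v3.2/analyze route;
    adjust the version segment to match your own resource.
    """
    query = urllib.parse.urlencode({"visualFeatures": ",".join(features)})
    url = f"{endpoint.rstrip('/')}/vision/v3.2/analyze?{query}"
    headers = {
        # Authentication: the key issued for your Azure resource.
        "Ocp-Apim-Subscription-Key": subscription_key,
        # The body is the binary image itself (JPG, PNG, and so on).
        "Content-Type": "application/octet-stream",
    }
    return urllib.request.Request(url, data=image_bytes,
                                  headers=headers, method="POST")


def analyze_image(endpoint, subscription_key, image_bytes):
    """Send the request and return the decoded JSON response as a dict."""
    request = build_analyze_request(endpoint, subscription_key, image_bytes)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

To try it, read an image file in binary mode and pass the bytes to `analyze_image` along with your own endpoint and key. A failed authentication or an unsupported image format surfaces as a `urllib.error.HTTPError`.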
And I was amazed at how well it actually works. It's almost like a person seeing the picture and telling you what they saw, and in some ways it's even better than a person. So what kinds of things can the Vision API recognize in a picture? It can tell you things like, "I see a dog sitting on grass," or, "I see a train," perhaps a train riding by the ocean, toward a sunset. Yes, it can recognize all of that.
Or it can say, "I see a famous person in the picture," whoever it might be, or, "I see the US Senate," or, "I see faces in this picture, and the faces are approximately at certain locations." It can even tell you whether the faces are male or female, along with their approximate ages, and there are many other possibilities. So let's check this out in the demos.
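To make the shape of that answer concrete, here is a hedged sketch of reading a caption and face details out of the returned JSON. The `SAMPLE` dictionary is hand-written for illustration, not real service output. The field names (`description.captions`, `faces`, `faceRectangle`, `age`, `gender`) follow the response layout of earlier Computer Vision API versions and may be absent in newer ones, so the code treats every field as optional. `summarize` is a hypothetical helper name.

```python
# Hand-written illustration of a response; not actual service output.
SAMPLE = {
    "description": {
        "tags": ["grass", "dog", "person"],
        "captions": [
            {"text": "a woman sitting on grass with a dog", "confidence": 0.92}
        ],
    },
    "faces": [
        {
            "age": 34,
            "gender": "Female",
            "faceRectangle": {"left": 120, "top": 80, "width": 64, "height": 64},
        }
    ],
}


def summarize(result):
    """Pick the most confident caption and list each face with its box."""
    captions = result.get("description", {}).get("captions", [])
    best = max(captions, key=lambda c: c.get("confidence", 0.0), default=None)

    faces = []
    for face in result.get("faces", []):
        rect = face.get("faceRectangle", {})
        faces.append({
            "age": face.get("age"),        # may be missing in newer versions
            "gender": face.get("gender"),  # may be missing in newer versions
            "box": (rect.get("left"), rect.get("top"),
                    rect.get("width"), rect.get("height")),
        })

    return {"caption": best["text"] if best else None, "faces": faces}


if __name__ == "__main__":
    print(summarize(SAMPLE))
```

Using `.get` with defaults keeps the helper from raising when a feature was not requested or the service omits a field.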
- Exploring the possibilities of the Vision API
- Submitting an image to the Vision API for processing
- Asking the Vision API to recognize faces
- Working with the Speech API
- Writing speech-to-text code
- Working with the Language API
- Getting languages for translation
- Language Understanding (LUIS) concepts