The Vision framework and NLP APIs are both domain specific. With Vision, you can easily build computer vision machine learning features into your app. Supported features include face tracking, face detection, landmarks, text detection, and rectangle detection. The natural language processing APIs in Foundation use machine learning to deeply understand text through features such as language identification, tokenization, lemmatization, part-of-speech tagging, and named entity recognition.
- [Instructor] In addition to Core ML, Apple also released two other, more domain-specific machine learning frameworks: Vision and NLP. You already know that the Vision framework has something to do with images, but now let's explore what the Vision framework can actually do, in more detail. Apple says Vision gives you a high-level, on-device solution to computer vision problems through one simple API. So, you do not have to be a computer vision expert. And we can do things like face detection and face landmark detection, so that we can figure out where the mouth is, where the eyes are, and so on.
We can do rectangle detection, barcode detection, or even object tracking. So, you don't have to be a computer vision expert; you can just say, I want to know where the faces are, because Vision handles the complexity for you. The same things that apply to Core ML are also great for Vision because, again, it protects the user's privacy, it reduces data cost, it reduces server cost, and it's always available. And this is so important because no data from your user is going to leave their device.
So, everything is protected, with lower costs for your users and for yourself as a developer, so this is extremely great. So, how does Vision actually work? This is a three-step process. First of all, we are going to define a Vision request. So, for example, we have an image and we want to do a face rectangles request, which will request the rectangle around the face. So, this is step one. We always do that. Then, what we also need is a request handler, and a completion handler for the request, such as the face detection request's completion handler, which always has two parameters.
This completion handler always receives a request object, a VNRequest object, and an error object. So, this is part two. And then, what we get in this handler are observations, which could be something like face observations or classification observations. We're going to see that in a later example, but now you already know that dealing with Vision is always this three-step job: we create a request, a request handler, and we deal with observations. It's as simple as that. So, that is Vision.
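The three steps described above can be sketched in Swift like this. This is a minimal sketch, assuming a `CGImage` named `cgImage` has already been loaded from somewhere in your app:

```swift
import Vision

// Step 1: define a Vision request (here: face rectangle detection).
let faceRequest = VNDetectFaceRectanglesRequest { request, error in
    // Step 3: deal with the observations delivered to the handler.
    guard error == nil,
          let observations = request.results as? [VNFaceObservation] else { return }
    for face in observations {
        print("Found a face at \(face.boundingBox)")
    }
}

// Step 2: create a request handler for the image and perform the request.
// `cgImage` is an assumption: any CGImage you have loaded will do.
let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
try? handler.perform([faceRequest])
```

Note how the completion handler receives exactly the two parameters mentioned above, the request and an optional error, and how the observations come back through the request's `results` property.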
Let's have a look at natural language processing, or NLP. And NLP really is all about natural language text. And this text could either be typed using a keyboard, it could be recognized handwriting, or it could be transcribed speech. And what we can do, after processing, is extract that information, use it somewhere, or even apply more intelligence using a Core ML model. And now let's have a look at what we can actually do with this processing power.
What we can do, for example, is language identification, which is very powerful. Also tokenization, which means identifying words, sentences, paragraphs, and so on. We can do part-of-speech analysis, which means that we can identify verbs, nouns, prepositions, and so on. We can do lemmatization, which is extremely powerful for enhancing search algorithms. We are going to see that in a practical example later. And a lemma is just the base form of a word. So, for example, for the verb produced, the lemma is produce.
So, you can maybe imagine how we can use that for an enhanced search algorithm by searching for both words, not just produce, but also produced. You're going to see that in detail later. And what we can also do is named entity recognition, which is extremely cool because this allows us to identify names, organizations, and places in a given sentence. And all of that is made possible by one single class called NSLinguisticTagger, and this is a class in Foundation, which means that it is available on all Apple platforms.
You can use it on macOS, watchOS, tvOS, and iOS; all platforms are supported. And it works together extremely well with Core ML, and it works in real time. So, again, all of the advantages of Vision and Core ML are also true for the NSLinguisticTagger class, and it is extremely easy to use.
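Lemmatization and named entity recognition with NSLinguisticTagger can be sketched like this. The sample sentence is a made-up illustration; the API calls are the standard Foundation ones:

```swift
import Foundation

// Example input text (an assumption for illustration).
let text = "Tim Cook introduced new products in Cupertino."
let options: NSLinguisticTagger.Options = [.omitPunctuation, .omitWhitespace, .joinNames]

// One tagger can drive several tag schemes at once.
let tagger = NSLinguisticTagger(tagSchemes: [.lemma, .nameType],
                                options: Int(options.rawValue))
tagger.string = text
let range = NSRange(location: 0, length: text.utf16.count)

// Lemmatization: reduce each word to its base form,
// e.g. "introduced" yields the lemma "introduce".
tagger.enumerateTags(in: range, unit: .word, scheme: .lemma, options: options) { tag, _, _ in
    if let lemma = tag?.rawValue {
        print("Lemma: \(lemma)")
    }
}

// Named entity recognition: find people, places, and organizations.
let entityTags: [NSLinguisticTag] = [.personalName, .placeName, .organizationName]
tagger.enumerateTags(in: range, unit: .word, scheme: .nameType, options: options) { tag, tokenRange, _ in
    if let tag = tag, entityTags.contains(tag) {
        let entity = (text as NSString).substring(with: tokenRange)
        print("\(tag.rawValue): \(entity)")
    }
}
```

This is exactly the lemma-based search enhancement from the transcript: by tagging both the query and the text with the `.lemma` scheme, a search for produce can also match produced.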
- What are machine learning, Core ML, Vision, and NLP?
- Adding a machine learning model to a project
- Getting predictions from machine learning models
- Converting existing machine learning models for Core ML
- Classifying images and detecting objects with Vision and Core ML
- Analyzing natural language text with NSLinguisticTagger