From the course: NLP with Python for Machine Learning Essential Training

Count vectorization

- [Instructor] Now that we've learned a little bit about what vectorization is and why it's necessary, we're going to jump into learning how to actually implement three different types of vectorizers over the next three lessons. Keep in mind that all three of these methods will generate very similar document-term matrices, where there's one row per document, or text message in our case, and the columns represent each word, or potentially a combination of words, as we'll see in the next lesson. The main difference between the three is what's in the actual cells of the matrix. So we'll start with count vectorization. This is basically the example that we ran through at the end of the last lesson. Count vectorization creates the document-term matrix and then simply counts the number of times each word appears in a given document, or text message in our case, and that count is what's stored in the corresponding cell, so it's pretty straightforward. So again, we're going to use our…
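
To make the idea of a document-term matrix of counts concrete, here is a minimal sketch in plain Python. It is not the lesson's own code: the function name `count_vectorize`, the whitespace tokenization, and the three sample messages are all assumptions made up for illustration. It builds one row per message and one column per distinct word, and each cell holds the number of times that word appears in that message.

```python
from collections import Counter


def count_vectorize(documents):
    """Build a document-term count matrix from a list of strings.

    Hypothetical helper for illustration only.
    """
    # Naive tokenization (an assumption of this sketch):
    # lowercase each document and split on whitespace.
    tokenized = [doc.lower().split() for doc in documents]

    # Vocabulary: every distinct word, sorted so the column order is deterministic.
    vocabulary = sorted({word for tokens in tokenized for word in tokens})

    # One row per document, one column per vocabulary word.
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)  # word -> occurrences in this document
        matrix.append([counts[word] for word in vocabulary])
    return vocabulary, matrix


# Made-up text messages, not taken from the course data set.
messages = ["free prize call now", "call me now now", "see you at lunch"]
vocabulary, matrix = count_vectorize(messages)

print(vocabulary)
for row in matrix:
    print(row)
```

In the second message the word "now" appears twice, so its cell holds 2, while words that never appear in a message get 0. A real pipeline would typically use a library vectorizer with proper tokenization and a sparse matrix rather than nested lists.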