From the course: Advanced NLP with Python for Machine Learning


Build a model on word2vec embeddings

- [Instructor] In this video, we'll take a similar approach to the last video, but we'll use vectors created from word2vec as the input to our random forest model instead of vectors created from TF-IDF. So let's start by reading in our data, and I'll just note that we're importing the gensim package, as that's what we're using for our word2vec model. Now, since our text messages are already cleaned and tokenized, we don't have to use the gensim preprocessing function that we saw before. We can jump right into fitting our word2vec model. Just like with TF-IDF, or any model for that matter, we'll train this on only our training set, and we'll use the same parameter settings we used previously. So we'll create vectors of length 100. We'll use a window of five words before and after the key word to understand the context in which the word is used. And we'll learn a word vector for any word that appears at least twice in the…
