Using doc2vec for sentiment analysis
Now that we know how to train word embeddings, we can extend the same methodology to learn document embeddings. We explore how to do this in the following sections.
Getting ready
In the previous sections about word2vec methods, we managed to capture positional relationships between words. What we have not done is capture the relationship of words to the document (or movie review) that they come from. One extension of word2vec that captures a document effect is called doc2vec.
The basic idea of doc2vec is to introduce a document embedding alongside the word embeddings, which may help to capture the tone of the document. For example, just knowing that the words movie and love are near each other may not help us determine the sentiment of a review. The reviewer may be saying that they love the movie or that they do not love the movie. But if the review is long enough and more negative words are found in the document, maybe we can pick up on...
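To make the idea of pairing a document embedding with word embeddings concrete, here is a minimal sketch in plain Python. It is not a trained model: the two toy reviews, the dimensions WORD_DIM and DOC_DIM, and the helper combined_input are hypothetical names introduced for illustration, and averaging the word vectors before concatenating the document vector is only one possible way to combine them.

```python
import random

random.seed(0)

# Hypothetical toy data: two tokenized reviews with opposite sentiment.
reviews = [
    ["i", "love", "this", "movie"],
    ["i", "do", "not", "love", "this", "movie"],
]

# Build a vocabulary that maps each word to an integer id.
vocab = sorted({word for review in reviews for word in review})
word_to_id = {word: i for i, word in enumerate(vocab)}

WORD_DIM = 4  # assumed size of each word embedding
DOC_DIM = 3   # assumed size of each document embedding


def random_vector(dim):
    """Return a randomly initialized embedding (untrained)."""
    return [random.uniform(-1.0, 1.0) for _ in range(dim)]


# One embedding per word, shared across all documents.
word_embeddings = [random_vector(WORD_DIM) for _ in vocab]
# One embedding per document (here, per review).
doc_embeddings = [random_vector(DOC_DIM) for _ in reviews]


def combined_input(doc_id, context_words):
    """Average the context word vectors, then append the document vector."""
    vectors = [word_embeddings[word_to_id[w]] for w in context_words]
    mean_words = [sum(column) / len(vectors) for column in zip(*vectors)]
    return mean_words + doc_embeddings[doc_id]


# The same two context words, taken from two different reviews.
positive = combined_input(0, ["love", "movie"])
negative = combined_input(1, ["love", "movie"])

print(len(positive))                             # WORD_DIM + DOC_DIM values
print(positive[:WORD_DIM] == negative[:WORD_DIM])  # word part is shared
print(positive[WORD_DIM:] == negative[WORD_DIM:])  # document part is not
```

The word portion of the two inputs is identical because both reviews contain love and movie, while the document portion differs. During training, that document portion is what would be free to absorb the overall tone of each review.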