Summary
Sentiment analysis and entity recognition are two powerful social media analytics techniques for adding context to user content. Because sport stirs strong sentiment and emotion among audiences, the dataset for this chapter consisted of tweets about the English Football Premier League, collected through the Twitter API. We used the Twitter REST and Streaming APIs to collect the data, applied the basic cleaning explained in Chapter 2, Harnessing Social Data - Connecting, Capturing, and Cleaning, and introduced new cleaning methods such as device detection from Twitter API metadata. Sentiment analysis allows us to categorize text as positive, negative, or neutral. We also learned that the accuracy of sentiment analysis is limited, especially for ambiguous expressions. We used the VADER (Valence Aware Dictionary and sEntiment Reasoner) module from NLTK for sentiment analysis. We also saw that we can build our own sentiment analysis algorithm through machine learning on test and...
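As a reminder of what device detection from Twitter API metadata can look like, here is a minimal sketch. It assumes each tweet is a dictionary whose `source` field holds an HTML anchor naming the posting client, which is how the Twitter API reports it. The function names `extract_client` and `detect_device` and the keyword buckets are hypothetical, chosen for illustration, and are not necessarily the approach used in the chapter.

```python
import re
from collections import Counter
from html import unescape

# Captures the link text of the anchor tag stored in a tweet's `source` field.
_ANCHOR_TEXT = re.compile(r"<a\b[^>]*>(.*?)</a>", re.IGNORECASE | re.DOTALL)

# Illustrative keyword buckets (an assumption); extend them to fit your data.
DEVICE_KEYWORDS = (
    ("iphone", "iPhone"),
    ("ipad", "iPad"),
    ("android", "Android"),
    ("web", "Web"),
)


def extract_client(source):
    """Return the client name from a `source` value, or "Unknown" if absent."""
    if not source:
        return "Unknown"
    match = _ANCHOR_TEXT.search(source)
    # Fall back to the raw string when the value is not wrapped in an anchor.
    label = match.group(1) if match else source
    return unescape(label).strip() or "Unknown"


def detect_device(source):
    """Map a `source` value to a coarse device bucket."""
    client = extract_client(source)
    if client == "Unknown":
        return client
    lowered = client.lower()
    for keyword, device in DEVICE_KEYWORDS:
        if keyword in lowered:
            return device
    return "Other"


# Hand-written sample records, not real collected data.
sample_tweets = [
    {"source": '<a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a>'},
    {"source": '<a href="http://twitter.com/download/android" rel="nofollow">Twitter for Android</a>'},
    {"source": '<a href="http://twitter.com" rel="nofollow">Twitter Web Client</a>'},
    {"source": ""},
]

device_counts = Counter(detect_device(t.get("source")) for t in sample_tweets)
print(device_counts)
```

Bucketing by keyword keeps the sketch short, but any client name that matches no keyword falls into "Other", so it is worth inspecting the raw client names before settling on the buckets.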