Creating word clouds
You may have seen word clouds produced by Wordle or other software before. If not, you will see them soon enough in this chapter. A couple of Python libraries can create word clouds, but their output does not yet seem to match the quality that Wordle produces. We can create a word cloud via the Wordle web page at http://www.wordle.net/advanced. Wordle requires a list of words and weights, one pair per line, in the following format:
Word1 : weight
Word2 : weight
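As a rough illustration of this format, the following sketch turns word counts into `word : weight` lines. It uses `collections.Counter` from the standard library rather than NLTK, and the helper name `to_wordle_lines`, its `top_n` parameter, and the toy token list are all made up for this example; they are not part of the book's code.

```python
from collections import Counter


def to_wordle_lines(counts, top_n=None):
    """Format word counts as 'word : weight' lines, most frequent first.

    Hypothetical helper for illustration; top_n=None keeps every word.
    """
    return [f"{word} : {weight}" for word, weight in counts.most_common(top_n)]


# Toy input, invented for this sketch (not taken from the movie_reviews corpus).
tokens = ["film", "movie", "film", "plot", "film", "movie"]
counts = Counter(tokens)

# Join the lines so the result can be pasted into the Wordle form.
wordle_text = "\n".join(to_wordle_lines(counts, top_n=2))
print(wordle_text)
```

With real data, the weights would come from the word frequencies computed over the corpus instead of a hand-written list.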
Modify the code from the previous example to print the word list. As a metric, we will use word frequency and keep only the words in the top percent by frequency. This requires nothing new. The final code is in the ch-09.ipynb file in this book's code bundle:
from nltk.corpus import movie_reviews
from nltk.corpus import stopwords
from nltk import FreqDist
import string
# Build the lookup sets once so that membership tests are fast
sw = set(stopwords.words('english'))
punctuation = set(string.punctuation)
def isStopWord(word):
...