Creating a word cloud from a StackOverflow job listing
Now let's look at creating a word cloud. A word cloud is an image that shows the frequency of keywords within a body of text. The larger a word appears in the image, the more often it occurs in the text, and so the more significant it appears to be.
Getting ready
We will use the Word Cloud library to create our word cloud. The source for the library is available at https://github.com/amueller/word_cloud. This library can be installed into your Python environment using pip install wordcloud.
How to do it
The script to create the word cloud is in the 08/04_so_word_cloud.py file. This recipe continues from the Stack Overflow recipes in Chapter 7 to provide a visualization of the data.
- Start by importing the WordCloud class from the wordcloud library and the frequency distribution class, FreqDist, from NLTK:
from wordcloud import WordCloud
from nltk.probability import FreqDist
- The word cloud is then generated from the frequency distribution of the words we collected from the job listing:
freq_dist...
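To illustrate the idea behind this step, here is a minimal sketch rather than the recipe's actual script. It assumes a hypothetical list of tokens, sample_tokens, in place of the words collected from the job listing, and it uses collections.Counter as a stand-in for FreqDist (which is itself a Counter subclass). The helper names word_frequencies and render_word_cloud are illustrative only; the image size and background color are arbitrary choices.

```python
from collections import Counter


def word_frequencies(tokens):
    """Count how often each token occurs, ignoring case."""
    return Counter(token.lower() for token in tokens)


def render_word_cloud(frequencies, out_path):
    """Render a word cloud image from a word -> count mapping."""
    # Imported lazily so the counting helper works without the package.
    from wordcloud import WordCloud

    cloud = WordCloud(width=800, height=400, background_color="white")
    # Words with higher counts are drawn larger.
    cloud.generate_from_frequencies(frequencies)
    cloud.to_file(out_path)
    return cloud


# Hypothetical tokens standing in for the words from the job listing.
sample_tokens = ["python", "sql", "Python", "aws", "python", "sql"]
frequencies = word_frequencies(sample_tokens)
```

With the wordcloud package installed, calling render_word_cloud(frequencies, "sample_word_cloud.png") should write an image in which "python" is drawn largest, since it has the highest count in the sample.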