FastText word vectors
The second major focus of fastText is creating word embeddings for the input text. During training, fastText looks at the supplied text corpus and builds a high-dimensional vector space model in which it tries to encapsulate as much meaning as possible. The aim of creating the vector space is that the vectors of similar words should be near each other. In fastText, these word vectors are then saved in two files, similar to what you have seen in text classification: a .bin file and a .vec file.
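To make the idea of "similar words being near each other" concrete, here is a minimal Python sketch. It assumes the usual plain-text layout of a .vec file: a header line with the vocabulary size and the dimension, followed by one word per line with its vector components. The three words and all of the numbers in SAMPLE_VEC are invented for illustration and do not come from a trained model. The helper names parse_vec and cosine are also hypothetical and are not part of fastText.

```python
import math

# Toy stand-in for the contents of a .vec file.
# The values are invented for illustration, not produced by fastText.
SAMPLE_VEC = """\
3 4
king 0.9 0.1 0.4 0.0
queen 0.8 0.2 0.5 0.1
banana 0.0 0.9 0.0 0.7
"""


def parse_vec(text):
    """Parse .vec-style text into a dict mapping each word to its vector."""
    lines = text.strip().splitlines()
    # Header line: "<vocabulary size> <dimension>"
    vocab_size, dim = (int(n) for n in lines[0].split())
    vectors = {}
    for line in lines[1:]:
        parts = line.split()
        word, values = parts[0], [float(x) for x in parts[1:]]
        if len(values) != dim:
            raise ValueError(f"expected {dim} values for {word!r}")
        vectors[word] = values
    if len(vectors) != vocab_size:
        raise ValueError("header does not match the number of words")
    return vectors


def cosine(a, b):
    """Cosine similarity: close to 1.0 when two vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


vectors = parse_vec(SAMPLE_VEC)
related = cosine(vectors["king"], vectors["queen"])
unrelated = cosine(vectors["king"], vectors["banana"])
print(f"king/queen:  {related:.3f}")
print(f"king/banana: {unrelated:.3f}")
```

With these toy numbers, the king and queen vectors score a much higher cosine similarity than king and banana, which is the behavior you would hope to see in a well-trained model. Cosine similarity is used here as one common measure of closeness; it is an illustrative choice rather than a statement about how fastText itself compares vectors.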
In this section, we will look at the creation and use of word vectors using the fastText command line.
Creating word vectors
We will now look at how to create word vectors in fastText. You will probably be building a solution for a specific domain, in which case my advice is to generate the raw text from that domain. If such raw text is not available to you, you can fall back on Wikipedia, which is a huge...