Depending on your operating system, you will need to make sure that some dependencies are installed on your machine. In this section, you will learn how to install fastText on Linux, Windows, or macOS, and which additional dependencies you should install depending on your usage. I recommend installing all of the software packages, as we will be exploring all the various ways we can use fastText in this book.
FastText works on Windows, Linux, and macOS. FastText is built using the C++ language, so you will first need a good C++ compiler.
The list of prerequisite software that you need to install is as follows:
- GCC with C++ support (g++); if you are using Clang instead, you will need version 3.3 or newer
- CMake
- Python 3.5 (you can work with Python 2.7, but we are going to focus on Python 3 in this book)
- NumPy and SciPy
- pybind11
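Before installing anything, it can help to see which of these tools are already present. The following is a minimal sketch, assuming a POSIX shell and the usual command names (g++, cmake, make, python3); check_tool is a hypothetical helper name introduced here for illustration, and the script only looks for commands on your PATH, not for Python packages such as NumPy:

```shell
# Report whether a command is available on PATH.
# check_tool is a hypothetical helper, not part of fastText.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
    fi
}

# The command names below are assumptions; adjust them for your system.
for tool in g++ cmake make python3; do
    check_tool "$tool"
done
```

Anything reported as missing can then be installed with the package manager for your distribution, as described in the following sections.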
Optional requirements, depending on your system, are as follows:
- Zip
- Docker
- Git
Installing dependencies on RHEL machines supporting the yum package manager
On Linux machines, you will need to have g++ installed. On Fedora/CentOS, which support the yum package manager, open the Terminal, or connect to the server where you are installing fastText using your favorite SSH tool, and run the following command:
$ sudo yum install gcc-c++
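To confirm that the compiler works, one option is to build and run a tiny program. This is a sketch rather than an official check: it assumes that g++ accepts the -std=c++11 flag (fastText relies on C++11 features) and that mktemp is available; compiler_smoke_test is a hypothetical function name:

```shell
# Compile and run a tiny C++11 program to confirm that g++ works.
# compiler_smoke_test is a hypothetical helper, not part of fastText.
compiler_smoke_test() {
    if ! command -v g++ >/dev/null 2>&1; then
        echo "g++ not found"
        return 0
    fi
    tmpdir="$(mktemp -d)"
    cat > "$tmpdir/hello.cpp" <<'EOF'
#include <iostream>
int main() {
    auto msg = "compiler ok";  // 'auto' type deduction needs C++11
    std::cout << msg << std::endl;
    return 0;
}
EOF
    # Build the program, run it only if the build succeeded, then clean up.
    g++ -std=c++11 "$tmpdir/hello.cpp" -o "$tmpdir/hello" && "$tmpdir/hello"
    rm -rf "$tmpdir"
}

compiler_smoke_test
```

If the compiler is installed correctly, the program prints a short confirmation message; a compile error here usually points to an incomplete or very old toolchain.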
CMake may already be installed by default. The official documentation gives build instructions for both make and cmake. I recommend installing cmake on your machine and using it to build fastText. You can install cmake directly with yum, just as before:
$ sudo yum install cmake
To get a full list of cmake commands, take a look at the following link: https://cmake.org/cmake/help/v3.2/manual/cmake.1.html.
To install the optional software, run the following command:
$ sudo yum install zip docker git-core
If you are starting on a new server and running yum commands there, you may encounter the following warning:
Failed to set locale, defaulting to C
In this case, install the glibc language pack:
$ sudo yum install glibc-langpack-en
Now, you can jump to the installation instructions for Anaconda to install the Python dependencies.
Installing dependencies on Debian-based machines such as Ubuntu
On Ubuntu and Debian machines, apt-get or apt is your package manager. apt is essentially a wrapper around apt-get and other similar tools, so you should be able to use them interchangeably. I will be showing apt commands here, but if you are using an older version of Ubuntu or Debian and find that apt does not work on your machine, you can replace apt with apt-get and it should work. Also, consider upgrading your machine if possible.
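If you want a script that works on both newer and older systems, the fallback can be automated. The following is a minimal sketch, assuming a POSIX shell; pick_apt and PKG are hypothetical names used only for illustration:

```shell
# Prefer apt when it is available, otherwise fall back to apt-get.
# pick_apt is a hypothetical helper introduced for this example.
pick_apt() {
    if command -v apt >/dev/null 2>&1; then
        echo "apt"
    elif command -v apt-get >/dev/null 2>&1; then
        echo "apt-get"
    else
        echo "none"
    fi
}

PKG="$(pick_apt)"
echo "package manager: $PKG"
```

When PKG is apt or apt-get, you could then run a command such as sudo "$PKG" update; a value of none means that the machine is probably not Debian-based.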
As on Fedora, to install the C++ compiler, open a Terminal or SSH into the server where you are going to install fastText and run the following commands. This will also install the make command:
$ sudo apt update
$ sudo apt install build-essential
Now install cmake:
$ sudo apt install cmake
To install the optional requirements, run the following command:
$ sudo apt install zip docker.io git
Now, check the Anaconda section to see how to install Anaconda for the Python dependencies.
Note
The apt command only works from Ubuntu 16 onwards. If you are using an older Ubuntu version, you should use the apt-get command.
Installing dependencies on Arch Linux using pacman
The package manager of choice on Arch Linux is pacman, and you can run the following command to install the essential build tools:
$ sudo pacman -S cmake make gcc-multilib
This should install make, cmake, and the g++ compiler that you need to build fastText. Although Arch distributions already have Python 3.x installed, I would recommend installing Anaconda as described later in this chapter so that you don't miss out on any of the Python dependencies.
To install the optional requirements, run the following command:
$ sudo pacman -S p7zip git docker