The fastText command line


The following is the list of commands that you can use with the fastText command line; running the fasttext binary without any arguments prints it:

$ ./fasttext
usage: fasttext <command> <args>

The commands supported by fasttext are:

  supervised              train a supervised classifier
  quantize                quantize a model to reduce the memory usage
  test                    evaluate a supervised classifier
  predict                 predict most likely labels
  predict-prob            predict most likely labels with probabilities
  skipgram                train a skipgram model
  cbow                    train a cbow model
  print-word-vectors      print word vectors given a trained model
  print-sentence-vectors  print sentence vectors given a trained model
  print-ngrams            print ngrams given a trained model and word
  nn                      query for nearest neighbors
  analogies               query for analogies
  dump                    dump arguments,dictionary,input/output vectors

The supervised, skipgram, and cbow commands are for training a model. The predict and predict-prob commands are for making predictions with a supervised model. The test, print-word-vectors, print-sentence-vectors, print-ngrams, nn, and analogies commands can be used to evaluate and inspect a trained model. The dump command prints a model's arguments (hyperparameters), dictionary, and input/output vectors, and quantize is used to compress the model.
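
For illustration, here is a minimal supervised workflow (data/train.txt and data/test.txt are placeholder files in which each line is one document prefixed with a __label__<name> tag):

$ ./fasttext supervised -input data/train.txt -output model   # writes model.bin and model.vec
$ ./fasttext test model.bin data/test.txt                     # prints precision and recall at k=1
$ ./fasttext predict model.bin data/test.txt 1                # prints the top predicted label for each line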

The hyperparameters that you can use for training are listed in the sections that follow.

The fastText supervised command

$ ./fasttext supervised
Empty input or output path.

The following arguments are mandatory:
  -input              training file path
  -output             output file path

The following arguments are optional:
  -verbose            verbosity level [2]

The following arguments for the dictionary are optional:
  -minCount           minimal number of word occurences [1]
  -minCountLabel      minimal number of label occurences [0]
  -wordNgrams         max length of word ngram [1]
  -bucket             number of buckets [2000000]
  -minn               min length of char ngram [0]
  -maxn               max length of char ngram [0]
  -t                  sampling threshold [0.0001]
  -label              labels prefix [__label__]

The following arguments for training are optional:
  -lr                 learning rate [0.1]
  -lrUpdateRate       change the rate of updates for the learning rate [100]
  -dim                size of word vectors [100]
  -ws                 size of the context window [5]
  -epoch              number of epochs [5]
  -neg                number of negatives sampled [5]
  -loss               loss function {ns, hs, softmax} [softmax]
  -thread             number of threads [12]
  -pretrainedVectors  pretrained word vectors for supervised learning []
  -saveOutput         whether output params should be saved [false]

The following arguments for quantization are optional:
  -cutoff             number of words and ngrams to retain [0]
  -retrain            whether embeddings are finetuned if a cutoff is applied [false]
  -qnorm              whether the norm is quantized separately [false]
  -qout               whether the classifier is quantized [false]
  -dsub               size of each sub-vector [2]
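
As an illustrative sketch, the following run overrides a few of the defaults listed above (the input path is a placeholder):

$ ./fasttext supervised -input data/train.txt -output model \
    -lr 0.5 -epoch 25 -dim 50 -wordNgrams 2 -loss hs

Here, -wordNgrams 2 adds bigram features, and -loss hs switches to hierarchical softmax, which speeds up training when there are many labels.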

The fastText skipgram command

$ ./fasttext skipgram
Empty input or output path.

The following arguments are mandatory:
  -input              training file path
  -output             output file path

The following arguments are optional:
  -verbose            verbosity level [2]

The following arguments for the dictionary are optional:
  -minCount           minimal number of word occurences [5]
  -minCountLabel      minimal number of label occurences [0]
  -wordNgrams         max length of word ngram [1]
  -bucket             number of buckets [2000000]
  -minn               min length of char ngram [3]
  -maxn               max length of char ngram [6]
  -t                  sampling threshold [0.0001]
  -label              labels prefix [__label__]

The following arguments for training are optional:
  -lr                 learning rate [0.05]
  -lrUpdateRate       change the rate of updates for the learning rate [100]
  -dim                size of word vectors [100]
  -ws                 size of the context window [5]
  -epoch              number of epochs [5]
  -neg                number of negatives sampled [5]
  -loss               loss function {ns, hs, softmax} [ns]
  -thread             number of threads [12]
  -pretrainedVectors  pretrained word vectors for supervised learning []
  -saveOutput         whether output params should be saved [false]

The following arguments for quantization are optional:
  -cutoff             number of words and ngrams to retain [0]
  -retrain            whether embeddings are finetuned if a cutoff is applied [false]
  -qnorm              whether the norm is quantized separately [false]
  -qout               whether the classifier is quantized [false]
  -dsub               size of each sub-vector [2]
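
For example, assuming an unlabeled corpus at the placeholder path data/corpus.txt, you could train a skipgram model with a custom subword range and then query its nearest neighbors:

$ ./fasttext skipgram -input data/corpus.txt -output model -minn 2 -maxn 5 -dim 300
$ ./fasttext nn model.bin   # interactive: type a word to list its nearest neighbors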

The fastText cbow command

$ ./fasttext cbow
Empty input or output path.

The following arguments are mandatory:
  -input              training file path
  -output             output file path

The following arguments are optional:
  -verbose            verbosity level [2]

The following arguments for the dictionary are optional:
  -minCount           minimal number of word occurences [5]
  -minCountLabel      minimal number of label occurences [0]
  -wordNgrams         max length of word ngram [1]
  -bucket             number of buckets [2000000]
  -minn               min length of char ngram [3]
  -maxn               max length of char ngram [6]
  -t                  sampling threshold [0.0001]
  -label              labels prefix [__label__]

The following arguments for training are optional:
  -lr                 learning rate [0.05]
  -lrUpdateRate       change the rate of updates for the learning rate [100]
  -dim                size of word vectors [100]
  -ws                 size of the context window [5]
  -epoch              number of epochs [5]
  -neg                number of negatives sampled [5]
  -loss               loss function {ns, hs, softmax} [ns]
  -thread             number of threads [12]
  -pretrainedVectors  pretrained word vectors for supervised learning []
  -saveOutput         whether output params should be saved [false]

The following arguments for quantization are optional:
  -cutoff             number of words and ngrams to retain [0]
  -retrain            whether embeddings are finetuned if a cutoff is applied [false]
  -qnorm              whether the norm is quantized separately [false]
  -qout               whether the classifier is quantized [false]
  -dsub               size of each sub-vector [2]
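
The cbow command takes the same arguments; for example, again with a placeholder corpus path:

$ ./fasttext cbow -input data/corpus.txt -output model -epoch 10 -ws 10
$ echo "king" | ./fasttext print-word-vectors model.bin   # prints the vector learned for "king"

As a rule of thumb, cbow trains faster, while skipgram tends to produce better representations for rare words.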
