You're reading from Scala for Machine Learning: Leverage Scala and Machine Learning to construct and study systems that can learn from data

Product type Paperback

Published in Dec 2014

Publisher

ISBN-13 9781783558742

Length 624 pages

Edition 1st Edition

Languages

Scala

Concepts

Machine Learning

Author (1):

R. Nicolas

Table of Contents (20 chapters)

Scala for Machine Learning

Credits

About the Author

About the Reviewers

www.PacktPub.com

Preface

1. Getting Started

2. Hello World!

3. Data Preprocessing

4. Unsupervised Learning

5. Naïve Bayes Classifiers

6. Regression and Regularization

7. Sequential Data Models

8. Kernel Models and Support Vector Machines

9. Artificial Neural Networks

10. Genetic Algorithms

11. Reinforcement Learning

12. Scalable Frameworks

Basic Concepts

Index

A

abstraction, Scala
- about / Abstraction
- higher-kind projection / Higher-kind projection
- covariant functors for vectors / Covariant functors for vectors
- contravariant functors for co-vectors / Contravariant functors for co-vectors
- monads / Monads
Actor model
- about / The Actor model
- components / The Actor model
actors
- about / Scalability
adaptive modeling / Model categorization
Akka.io
- about / An overview
Akka framework
- about / An overview, Akka
- URL / Akka
- master-workers / Master-workers
- futures / Futures
Algebird
- about / Abstraction
algebraic libraries
- about / Algebraic and numerical libraries
- jBlas 1.2.3 / Algebraic and numerical libraries
- Colt 1.2.0 / Algebraic and numerical libraries
- AlgeBird 2.10 / Algebraic and numerical libraries
- Breeze 0.8 / Algebraic and numerical libraries
alternative preprocessing techniques
- autoregressive models / Alternative preprocessing techniques
- curve-fitting algorithms / Alternative preprocessing techniques
- nonlinear dynamic systems / Alternative preprocessing techniques
- Hidden Markov models / Alternative preprocessing techniques
annual dividend yield
- about / Fundamental analysis
Apache Commons Math
- URL / Don't reinvent the wheel!
- about / Apache Commons Math
- description / Description
- licensing / Licensing
- installation / Installation
- installation, for Mac OS X / Installation
- installation, for Windows / Installation
Apache Spark
- about / Apache Spark
- features / Why Spark?
- design principles / Design principles
- deployment modes / Deploying Spark
- performance evaluation / Performance evaluation
- pros / Pros and cons
- cons / Pros and cons
Apache Spark (Akka)
- about / Scalability
artificial neural networks
- feed-forward neural networks / Feed-forward neural networks
- advantages / Benefits and limitations
- disadvantages / Benefits and limitations
autonomous systems / The problem
Autoregressive Integrated Moving Average (ARIMA) / Alternative preprocessing techniques
Autoregressive Moving Average (ARMA) / Alternative preprocessing techniques

B

batch gradient descent algorithm / Selecting an optimizer
batch training / Online training versus batch training
Baum-Welch estimator
- about / The Baum-Welch estimator (EM)
Bayesian network
- about / Probabilistic graphical models
Berkeley Data Analytics Stack (BDAS)
- reference / Apache Spark
Bernoulli mixture model
- about / Model
Bernoulli model
- about / The Multivariate Bernoulli classification
bias-variance decomposition
- about / Bias-variance decomposition
bias input / Mathematical background
binary SVC
- about / The binary SVC
- LIBSVM / LIBSVM
- design / Design
- configuration parameters / Configuration parameters
- interface to LIBSVM / Interface to LIBSVM
- training / Training
- classification / Classification
- c-penalty and margin / C-penalty and margin
- kernel evaluation / Kernel evaluation
- applications in risk analysis / Applications in risk analysis
Breeze Scala libraries / Abstraction
Broyden-Fletcher-Goldfarb-Shanno (BFGS) / BFGS

C

C-Epsilon SVM formulation / The nonseparable case – the soft margin
cake pattern / Step 3 – instantiation
- about / Configurability
case classes
- versus companion objects / Companion objects versus case classes
- versus enumerations / Enumerations versus case classes
- advantages / Enumerations versus case classes
cash per share
- about / Fundamental analysis
categories, NP problems
- about / NP problems
- P-problems / NP problems
- NP problems / NP problems
- NP-complete problems / NP problems
- NP-hard problems / NP problems
centroid / K-means clustering
Cholesky decomposition
- about / Cholesky factorization
Cholesky factorization
- about / Cholesky factorization
chromosomes / Evolutionary computing
class constructor template
- about / Class constructor template
classification model, evaluation factors
- accuracy / Key quality metrics
- precision / Key quality metrics
- recall / Key quality metrics
- F-measure or F-score / Key quality metrics
- G-measure / Key quality metrics
classification model, terminology
- true positives (TP) / Key quality metrics
- true negatives (TN) / Key quality metrics
- false positives (FP) / Key quality metrics
- false negatives (FN) / Key quality metrics
class prior
- about / Formalism
class prior probability
- about / Formalism
cluster assignment, K-means clustering
- about / Step 2 – cluster assignment
cluster configuration, K-means clustering
- about / Step 1 – cluster configuration
- clusters, defining / Defining clusters
- clusters, initializing / Initializing clusters
clustering
- about / Clustering
- expectation-maximization algorithm / The expectation-maximization algorithm
clustering algorithms
- K-means clustering / Clustering, K-means clustering
- EM / Clustering
co-vector
- about / Higher-kind projection
code snippets
- format / Code snippets format
common discriminative kernels
- about / Common discriminative kernels
companion objects
- versus case classes / Companion objects versus case classes
complex adaptive systems / Introduction to LCS
components, XCS
- about / XCS components
- application to portfolio management / Application to portfolio management, The XCS core data
- XCS rules / XCS rules
- covering / Covering
- implementation example / An implementation example
computational workflow
- overview / An overview of computational workflows
conditional dependency / Training
conditional independence / A model by any other name
- about / Probabilistic graphical models
conditional random field (CRF)
- about / Conditional random fields, Introduction to CRF
- linear chain CRF / Linear chain CRF
- potential functions / Linear chain CRF
- identity potential functions / Linear chain CRF
- transition feature functions / Linear chain CRF
- state feature functions / Linear chain CRF
- text analytics / Regularized CRFs and text analytics
- versus HMM / Comparing CRF and HMM
configurability
- about / Configurability
configuration parameters, SVM
- SVM formulation / The SVM formulation
- SVM kernel function / The SVM kernel function
- SVM execution / The SVM execution
confusion matrix / F-score for multinomial classification
conjugate directions
- about / Conjugate gradient
conjugate gradient
- about / Conjugate gradient
connectionism
- about / The biological background
constructive tuning strategy / Regularization
Consumer Price Index (CPI)
- about / Fundamental analysis
consumer price index (CPI)
- about / Introducing the multinomial Naïve Bayes
continuation-passing style (CPS) / Beyond actors – reactive programming
control learning / A solution – Q-learning
convolution neural networks
- about / Convolution neural networks
- local receptive fields / Local receptive fields
- weights, sharing / Sharing of weights
- convolution layers / Convolution layers
- subsampling layers / Subsampling layers
- fully connected hidden layer and output layer / Putting it all together
core parking
- about / Performance evaluation
Counter class
- about / Counter
covariant functor
- about / Covariant functors for vectors
cross-validation, model
- about / Cross-validation
- one-fold cross validation / One-fold cross validation
- K-fold cross validation / K-fold cross validation
crossover operator, genetic algorithm implementation
- about / Crossover
- population / Population
- chromosomes / Chromosomes
- genes / Genes
curve fitting
- about / Supervised learning

D

Darwinian process / The origin
data, profiling
- about / Profiling data
- immutable statistics / Immutable statistics
- Z-score / Z-Score and Gauss
data chunks / 0xdata Sparkling Water
data clustering
- about / Clustering
data elements / 0xdata Sparkling Water
data extraction
- about / Data extraction
data frames / 0xdata Sparkling Water
data partitioning
- about / Clustering
data segmentation
- about / Clustering
DataSourceConfig class
- pathName parameter / Data extraction
- normalize parameter / Data extraction
- reverseOrder parameter / Data extraction
- headerLines parameter / Data extraction
DBpedia / Basics of information retrieval
decision-making agent / Concepts
decision boundary / Plotting data
decoding, hidden Markov model (HMM)
- about / Decoding – CF-3
- Viterbi algorithm / The Viterbi algorithm
def
- about / Understanding the problem
dependency injection
- about / Configurability
deployment modes, Spark
- standalone / Deploying Spark
- local / Deploying Spark
- YARN cluster manager / Deploying Spark
- Apache Mesos resource manager / Deploying Spark
descriptive models / Model categorization
designing
- about / Model versus design
design principles, Spark
- about / Design principles
- in-memory persistency / In-memory persistency
- laziness / Laziness
- transforms / Transforms and actions
- actions / Transforms and actions
- shared variables / Shared variables
design template, for classifiers
- about / Design template for immutable classifiers
destructive tuning strategy / Regularization
DFT-based filtering
- about / DFT-based filtering
dimension reduction
- about / Dimension reduction, Dimension reduction
- principal components analysis / Principal components analysis
- non-linear models / Non-linear models
directed graphical models
- about / Probabilistic graphical models
discrete Fourier transform (DFT) / PCA
- about / Discrete Fourier transform
discrete Kalman filter
- about / The discrete Kalman filter
- recursive algorithm / The discrete Kalman filter, The recursive algorithm
- optimal estimator / The discrete Kalman filter
- state space estimation / The state space estimation
- benefits / Benefits and drawbacks
- drawbacks / Benefits and drawbacks
- alternative preprocessing techniques / Alternative preprocessing techniques
discretization / Value encoding
dividend coverage ratio
- about / Fundamental analysis
DMatrix class
- about / DMatrix class
DNA / Evolutionary computing
Domain Specific Languages (DSL)
- about / Maintainability
dynamic programming
- about / Overview of dynamic programming

E

earnings per share (EPS)
- about / Fundamental analysis
Eigenvalue decomposition
- about / Eigenvalue decomposition
encapsulation
- about / Encapsulation
- package scope / Encapsulation
- class or object scope / Encapsulation
encoding scheme, genetic encoding
- about / The encoding scheme
- flat encoding / Flat encoding
- hierarchical encoding / Hierarchical encoding
enumerations
- versus case classes / Enumerations versus case classes
- advantages / Enumerations versus case classes
epoch / The training epoch
Erlang programming language / The Actor model
error backpropagation, training epoch
- about / Step 2 – error backpropagation
- weights' adjustment / Weights' adjustment
- error propagation / The error propagation
- computational model / The computational model
error handling, monadic data transformation
- about / Error handling
- input value / Error handling
- output value / Error handling
error insensitive zone
- about / An overview
evaluation
- about / Evaluation
- execution profile / The execution profile
- impact of learning rate / Impact of the learning rate
- impact of momentum factor / The impact of the momentum factor
- impact of number of hidden layers / The impact of the number of hidden layers
- test case / Test case
evaluation, hidden Markov model (HMM)
- about / Evaluation – CF-1
- alpha algorithm / Alpha – the forward pass
- beta algorithm / Beta – the backward pass
evidence
- about / Formalism
evolution
- about / Evolution
- origin / The origin
- NP problems / NP problems
- evolutionary computing / Evolutionary computing
exchange-traded funds (ETFs) / Test case
ExecutionContextTaskSupport
- about / Processing a parallel collection
expectation-maximization (EM)
- about / Training – CF-2
expectation-maximization algorithm
- about / The expectation-maximization algorithm
- Gaussian mixture models / Gaussian mixture models
- overview / Overview of EM
- implementation / Implementation
- classification / Classification
- testing / Testing
- online EM algorithm / The online EM algorithm
experimenting, with Spark
- about / Experimenting with Spark
- Spark, deploying / Deploying Spark
- Spark shell, using / Using Spark shell
- MLlib / MLlib
- RDD generation / RDD generation
- K-means, using Spark / K-means using Spark
exponential moving average
- about / The exponential moving average
exponential normalization / Softmax
extended Kalman filter (EKF) / Benefits and drawbacks
Extended Kalman Filters (EKF) / The discrete Kalman filter
extended learning classifier systems
- about / Extended learning classifier systems
- exploration phase / Extended learning classifier systems
- exploitation phase / Extended learning classifier systems
- components / XCS components

F

K-fold cross validation / K-fold cross validation
F-score for binomial classification
- about / F-score for binomial classification
F-score for multinomial classification
- about / F-score for multinomial classification
- macro method / F-score for multinomial classification
- micro method / F-score for multinomial classification
Fast Fourier Transform (FFT)
- about / Discrete Fourier transform
features extraction
- about / Extracting features
features maps / Sharing of weights
features selection
- about / Selecting features
Federal Fund rate
- about / Fundamental analysis
Federal fund rate (FDF)
- about / Introducing the multinomial Naïve Bayes
feed-forward neural network (FFNN) / The biological background
feed-forward neural networks
- about / Feed-forward neural networks
- biological background / The biological background
- mathematical background / Mathematical background
FFNN without a hidden layer / The multilayer perceptron
finances 101
- about / Finances 101
- fundamental analysis / Fundamental analysis
- technical analysis / Technical analysis
- options trading / Options trading
- financial data sources / Financial data sources
first order predicate logic
- about / First order predicate logic
fitness functions, genetic algorithms
- about / The fitness score
- fixed fitness function / The fitness score
- evolutionary fitness function / The fitness score
- approximate fitness function / The fitness score
fixed lag smoothing / Fixed lag smoothing
fork-join pool
- about / Processing a parallel collection
ForkJoinTaskSupport
- about / Processing a parallel collection
Fourier analysis
- about / Fourier analysis
- discrete Fourier transform (DFT) / Discrete Fourier transform
- DFT-based filtering / DFT-based filtering
- market cycles, detecting / Detection of market cycles
Fourier transform
- about / Fourier analysis
frameworks
- about / Tools and frameworks
frequency domain
- about / Discrete Fourier transform
fully connected neural network / The network topology
function approximation / Quantization
- about / Supervised learning
functors
- about / Abstraction
fundamental analysis
- about / Fundamental analysis
futures, Akka framework
- about / Futures
- Actor life cycle / The Actor life cycle
- blocking on / Blocking on futures
- future callbacks, handling / Handling future callbacks

G

Gauss-Newton technique
- about / Gauss-Newton
generalized autoregressive conditional heteroscedasticity (GARCH) / Alternative preprocessing techniques
generic Lp-norm
- about / Ln roughness penalty
genes / Evolutionary computing
genetic algorithms
- about / Genetic algorithms and machine learning
- discrete model parameters / Genetic algorithms and machine learning
- reinforcement learning / Genetic algorithms and machine learning
- neural network architecture / Genetic algorithms and machine learning
- ensemble learning / Genetic algorithms and machine learning
- components / Genetic algorithm components
- fitness score / The fitness score
- implementation / Implementation
- tests / Tests
- advantages / Advantages and risks of genetic algorithms
- disadvantages / Advantages and risks of genetic algorithms
genetic algorithms, for trading strategies
- about / GA for trading strategies
- trading strategies, defining / Definition of trading strategies
- test case / A test case
genetic encoding
- about / Genetic algorithm components, Encoding
- value encoding / Value encoding
- predicate encoding / Predicate encoding
- solution encoding / Solution encoding
- encoding scheme / The encoding scheme
genetic fitness functions
- about / Genetic algorithm components
genetic operators
- about / Genetic algorithm components, Genetic operators
- selection / Genetic operators, Selection
- crossover / Genetic operators, Crossover
- mutation / Genetic operators, Mutation
- transposition operator / Genetic operators
GNU Lesser General Public License (LGPL) / Licensing
GoogleFinancials / Data sources
gradient descent / Ordinary least squares regression
gradient descent methods
- about / Steepest descent
- steepest descent / Steepest descent
- conjugate gradient / Conjugate gradient
- stochastic gradient descent / Stochastic gradient descent
graph-structured CRF / Introduction to CRF
graphical models / Probabilistic graphical models
gross domestic product (GDP)
- about / Introducing the multinomial Naïve Bayes
Gross Domestic Product (GDP)
- about / Fundamental analysis

H

Hadoop Distributed File System (HDFS) / Step 2 – loading data
Hadoop distributed file system (HDFS) / Apache Spark
hard margin / The separable case – the hard margin
Hessian matrix
- about / Jacobian and Hessian matrices
hidden layers / The multilayer perceptron
hidden Markov model (HMM)
- about / The hidden Markov model
- components / The hidden Markov model
- canonical forms / The hidden Markov model
- notations / Notations
- lambda model / The lambda model
- design / Design
- evaluation / Evaluation – CF-1
- training / Training – CF-2
- decoding / Decoding – CF-3
- canonical forms, implementing / Putting it all together
- training, test case 1 / Test case 1 – training
- evaluation, test case 2 / Test case 2 – evaluation
- as filtering technique / HMM as a filtering technique
- performance consideration / Performance consideration
Hidden Naïve Bayes (HNB) / Training
hinge loss / The nonseparable case – the soft margin
HMM constructor
- config / Putting it all together
- xt / Putting it all together
- form / Putting it all together
- quantize / Putting it all together
- f / Putting it all together
hyperplane / Binomial classification

I

implementation, genetic algorithms
- about / Implementation
- software design / Software design
- key components / Key components
- selection operator / Selection
- population growth, controlling / Controlling the population growth
- GA configuration / The GA configuration
- crossover operator / Crossover
- mutation operator / Mutation
- reproduction / Reproduction
- solver / Solver
implementation, Q-learning
- about / Implementation
- software design / Software design
- states and actions / The states and actions
- search space / The search space, The policy and action-value
- Q-learning components / The Q-learning components
- Q-learning training / The Q-learning training
- tail recursion to the rescue / Tail recursion to the rescue
- validation / The validation
- prediction / The prediction
information retrieval and text mining
- about / Basics of information retrieval
input forward propagation, training epoch
- about / Step 1 – input forward propagation
- computational flow / The computational flow
- error functions / Error functions
- operating modes / Operating modes
- softmax / Softmax
insensitive error
- about / An overview

J

Jacobian matrix
- about / Jacobian and Hessian matrices
Java
- about / Java
JBlas/Linpack
- URL / Don't reinvent the wheel!
JFreeChart
- about / JFreeChart
- description / Description
- licensing / Licensing
- installation / Installation
- installation, for Mac OS X / Installation
- installation, for Windows / Installation
JFreeChart library
- about / Bias-variance decomposition

K

K-fold cross-validation scheme / Assessing a model
K-means clustering
- about / K-means clustering
- similarity, measuring / Measuring similarity
- algorithm, defining / Defining the algorithm
- cluster configuration / Step 1 – cluster configuration
- cluster assignment / Step 2 – cluster assignment
- reconstruction/error minimization / Step 3 – reconstruction/error minimization
- classification / Step 4 – classification
- curse of dimensionality / The curse of dimensionality
- evaluation, setting up / Setting up the evaluation
- results, evaluating / Evaluating the results
- number of clusters, tuning / Tuning the number of clusters
- validation / Validation
Kalman smoothing
- about / Kalman smoothing
kernel functions
- about / Kernel functions, An overview
- common discriminative kernels / Common discriminative kernels
- linear kernel (dot product) / Common discriminative kernels
- polynomial kernel / Common discriminative kernels
- radial basis function (RBF) / Common discriminative kernels
- sigmoid kernel / Common discriminative kernels
- Laplacian kernel / Common discriminative kernels
- log kernel / Common discriminative kernels
- kernel monadic composition / Kernel monadic composition
kernel trick
- about / The kernel trick
key components, genetic algorithm implementation
- population / Population
- chromosomes / Chromosomes
- genes / Genes
key quality metrics
- about / Key quality metrics

L

L1 regularization / Ln roughness penalty
L2 regularization / Ln roughness penalty
Lagrange multipliers
- about / Lagrange multipliers
Laplace / The zero-frequency problem
lasso regularization
- about / Ln roughness penalty
Latent Dirichlet allocation (LDA)
- about / Probabilistic graphical models
lazy methods
- about / Computation on demand
LDL decomposition / LDL decomposition
learning classifier systems (LCS)
- about / Learning classifier systems, Introduction to LCS
- components / Introduction to LCS
- features / Why LCS?
- terminology / Terminology
- benefits / Benefits and limitations of learning classifier systems
- limitations / Benefits and limitations of learning classifier systems
learning vector quantization / Clustering
least squares problem / Numerical optimization
lemmatization / Basics of information retrieval
Levenberg-Marquardt
- about / Levenberg-Marquardt
Levenshtein distance / Basics of information retrieval
libraries
- about / Other libraries and frameworks
libraries directory
- about / List of libraries and tools
LIBSVM
- about / LIBSVM
- URL, for downloading / LIBSVM
- URL, for documentation / LIBSVM
- benefits / LIBSVM
LIBSVM, Java classes
- svm_model / LIBSVM
- svm_node / LIBSVM
- svm_parameter / LIBSVM
- svm_problem / LIBSVM
- svm / LIBSVM
Lidstone / The zero-frequency problem
likelihood
- about / Formalism
Limited memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) / L-BFGS
linear algebra
- about / Linear algebra
- QR decomposition / QR decomposition
- LU factorization / LU factorization
- LDL decomposition / LDL decomposition
- Cholesky factorization / Cholesky factorization
- singular value decomposition (SVD) / Singular Value Decomposition
- Eigenvalue decomposition / Eigenvalue decomposition
- algebraic libraries / Algebraic and numerical libraries
- numerical libraries / Algebraic and numerical libraries
linear chain CRF / Introduction to CRF
linear chain structured graph CRF / Introduction to CRF
linear regression
- about / Linear regression
- one-variate linear regression / One-variate linear regression
- ordinary least squares regression / Ordinary least squares regression
- versus SVR / SVR versus linear regression
linear SVM
- about / The linear SVM
- separable case (hard margin) / The separable case – the hard margin
- nonseparable case (soft margin) / The nonseparable case – the soft margin
LogBinRegression constructor
- obsSet / Step 5 – implementing the classifier
- expected / Step 5 – implementing the classifier
- maxIters / Step 5 – implementing the classifier
- eta / Step 5 – implementing the classifier
- eps / Step 5 – implementing the classifier
logistic regression
- about / Logistic regression
- logistic function / Logistic function
- binomial classification / Binomial classification
- design / Design
- training workflow / The training workflow
- classification / Classification
low-band filter
- about / The exponential moving average
LU factorization
- about / LU factorization
- basic LU factorization / LU factorization
- with pivot / LU factorization

M

machine learning
- features / Why machine learning?
machine learning algorithms
- taxonomy / Taxonomy of machine learning algorithms
machine learning problems
- classification / Classification
- prediction / Prediction
- optimization / Optimization
- regression / Regression
maintainability
- about / Maintainability
Markov decision processes
- about / Markov decision processes
- Markov property / Markov decision processes, The Markov property
- first order discrete Markov chain / The first order discrete Markov chain
master-workers, Akka
- about / Master-workers
- exchange of messages / Exchange of messages
- worker actors / Worker actors
- workflow controller / The workflow controller
- master actor / The master actor
- master with routing / Master with routing
- discrete Fourier transform (DFT) / Distributed discrete Fourier transform
- limitations / Limitations
mathematical abstractions
- about / Supporting mathematical abstractions
- variable declaration / Step 1 – variable declaration
- model definition / Step 2 – model definition
- instantiation / Step 3 – instantiation
mathematical concepts
- about / Mathematics
- linear algebra / Linear algebra
- first order predicate logic / First order predicate logic
- Jacobian matrix / Jacobian and Hessian matrices
- Hessian matrix / Jacobian and Hessian matrices
- optimization techniques / Summary of optimization techniques
- dynamic programming / Overview of dynamic programming
mathematical notation / Mathematical notation for the curious
maximum margin classifiers
- kernel trick / Max-margin classification
mean squared error (MSE) / One-variate linear regression
measurement noise covariance / The measurement equation
message-passing mechanisms
- fire-and-forget or tell / The Actor model
- send-and-receive or ask / The Actor model
metaphor for graphical models / Probabilistic graphical models
methodology
- defining / Defining a methodology
Michigan approach / Why LCS?
mixins
- about / Composing mixins to build a workflow
mixins, composing for building workflow
- about / Composing mixins to build a workflow
- problem, understanding / Understanding the problem
- modules, defining / Defining modules
- workflow, instantiating / Instantiating the workflow
model
- about / A model by any other name
- features / A model by any other name
- attributes / A model by any other name
- variables / A model by any other name
- parametric / A model by any other name
- differential / A model by any other name
- probabilistic / A model by any other name
- graphical / A model by any other name
- directed graphs / A model by any other name
- numerical method / A model by any other name
- chemistry / A model by any other name
- taxonomy / A model by any other name
- grammar and lexicon / A model by any other name
- inference logic / A model by any other name
- versus design / Model versus design
- features, selecting / Selecting features
- features, extracting / Extracting features
model, assessing
- about / Assessing a model
- validation / Validation
- cross-validation / Cross-validation
- bias-variance decomposition / Bias-variance decomposition
- overfitting / Overfitting
model categorization
- about / Model categorization
- predictive models / Model categorization
- descriptive models / Model categorization
- adaptive modeling / Model categorization
modeling
- about / Modeling, Model versus design
monadic composition
- about / Monads
monadic data transformation
- about / Monadic data transformation
- explicit model / Monadic data transformation, Explicit models
- implicit model / Monadic data transformation, Implicit models
- error handling / Error handling
monads
- about / Abstraction, Monads
Monitor class
- about / Monitor
morphism / Error handling
moving averages
- about / Moving averages
- simple moving average / The simple moving average
- weighted moving average / The weighted moving average
- exponential moving average / The exponential moving average
multilayer perceptron
- about / The multilayer perceptron
- activation function / The activation function
- network topology / The network topology
- design / Design
- UML class diagram / Design
- configuration / Configuration
- network components / Network components
- model / The model
- problem types (modes) / Problem types (modes)
- online training, versus batch training / Online training versus batch training
- training epoch / The training epoch
- training and classification / Training and classification
multinomial Naïve Bayes model
- about / Introducing the multinomial Naïve Bayes
- formalism / Formalism
- frequentist perspective / The frequentist perspective
- predictive model / The predictive model
- zero-frequency problem / The zero-frequency problem
Multivariate Bernoulli classification
- about / The Multivariate Bernoulli classification
- model / Model
- implementation / Implementation
mutation operator, genetic algorithm implementation
- about / Mutation
- population / Population
- chromosomes / Chromosomes
- genes / Genes

N

n-grams / Basics of information retrieval
natural language processing (NLP) / The feature functions model
Naïve Bayes
- applying, to text mining / Naïve Bayes and text mining
Naïve Bayes algorithm
- pros / Pros and cons
- cons / Pros and cons
Naïve Bayes classifiers
- about / Naïve Bayes classifiers
- multinomial Naïve Bayes / Introducing the multinomial Naïve Bayes
Naïve Bayes classifiers implementation
- about / Implementation
- design / Design
- training / Training
- classification / Classification
- F1 validation / F1 validation
- feature extraction / Feature extraction
- testing / Testing
Naïve Bayes models
- about / Probabilistic graphical models
- mathematical notation / Formalism
net profit margin
- about / Fundamental analysis
net sales
- about / Fundamental analysis
network components, multilayer perceptron
- about / Network components
- network topology / The network topology
- input and hidden layers / Input and hidden layers
- output layer / The output layer
- synapses / Synapses
- connections / Connections
- initialization weights / The initialization weights
non-linear models, dimension reduction
- about / Non-linear models
- kernel PCA / Kernel PCA
- manifolds / Manifolds
nonlinear least squares minimization
- about / Nonlinear least squares minimization
- Gauss-Newton / Gauss-Newton
- Levenberg-Marquardt / Levenberg-Marquardt
nonlinear SVM
- about / The nonlinear SVM
- max-margin classification / Max-margin classification
- kernel trick / The kernel trick
NP problems
- categories / NP problems
- about / NP problems
Nu-SVM / The nonseparable case – the soft margin
numerical optimization
- about / Numerical optimization
- Newton / Numerical optimization
- Quasi-Newton / Numerical optimization

O

observation
- about / Extracting features
one-class SVC
- used, for anomaly detection / Anomaly detection with one-class SVC
one-variate linear regression
- about / One-variate linear regression
- implementation / Implementation
- test case / Test case
online training / Online training versus batch training
operating income
- about / Fundamental analysis
operating profit margin
- about / Fundamental analysis
optimal substructures
- about / Overview of dynamic programming
optimization techniques
- about / Summary of optimization techniques
- gradient descent methods / Steepest descent
- Quasi-Newton algorithms / Quasi-Newton algorithms
- nonlinear least squares minimization / Nonlinear least squares minimization
- Lagrange multipliers / Lagrange multipliers
OptionModel class / The OptionModel class
OptionProperty class / The OptionProperty class
options trading
- about / Options trading
option trading, with Q-learning
- about / Option trading using Q-learning
- OptionProperty class / The OptionProperty class, The OptionModel class
- quantization / Quantization
ordinary least squares regression
- about / Ordinary least squares regression
- design / Design
- implementation / Implementation
- trending, test case 1 / Test case 1 – trending
- feature selection, test case 2 / Test case 2 – feature selection
overfitting
- about / Overfitting, The frequentist perspective
overlapping substructures
- about / Overview of dynamic programming
overload operators
- about / Overloading
- += / Overloading
- + / Overloading

P

padding / Value encoding
parallel collections, Scala
- about / Processing a parallel collection
- benchmark framework / The benchmark framework
- performance evaluation / Performance evaluation
Parallel Colt
- URL / Don't reinvent the wheel!
Partial Least Square Regression (PLSR) / Evaluation
partially connected neural networks / The network topology
pay-out ratio
- about / Fundamental analysis
penalized least squares regression / Ln roughness penalty
performance considerations
- about / Performance considerations
- K-means / K-means
- EM / EM
- PCA / PCA
performance evaluation, Spark
- about / Performance evaluation
- parameters, tuning / Tuning parameters
- tests / Tests
- performance considerations / Performance considerations
Pittsburgh approach / Why LCS?
Pool
- about / Key components
posterior probability
- about / Formalism
Predicted Residual Error Sum of Squares (PRESS) / Evaluation
predictive model
- about / The predictive model
predictive models / Model categorization
price/book value ratio (PB)
- about / Fundamental analysis
price/earnings ratio (PE)
- about / Fundamental analysis
price/sales ratio (PS)
- about / Fundamental analysis
price patterns
- about / Price patterns
Price to Earnings/Growth (PEG)
- about / Fundamental analysis
primal problem / The nonseparable case – the soft margin
principal components analysis, dimension reduction
- about / Principal components analysis
- algorithm / Algorithm
- implementation / Implementation
- test case / Test case
- evaluation / Evaluation
probabilistic graphical models
- about / Probabilistic graphical models
probabilistic kernels
- about / Common discriminative kernels
probabilistic reasoning
- about / Probabilistic graphical models
propositional logic
- about / First order predicate logic
protein sequence annotation
- about / An overview

Q

Q-learning
- about / A solution – Q-learning
- Bellman optimality equations / The Bellman optimality equations
- temporal difference, for model-free learning / Temporal difference for model-free learning
- action-value iterative update / Action-value iterative update
- implementation / Implementation
- for option trading / Option trading using Q-learning
- implementing / Putting it all together
- evaluation / Evaluation
QR decomposition / Ordinary least squares regression
QStar class / The Viterbi algorithm
quantization / Value encoding
Quasi-Newton algorithms
- about / Quasi-Newton algorithms
- Broyden-Fletcher-Goldfarb-Shanno (BFGS) / BFGS
- Limited memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) / L-BFGS

R

real-world Bayesian network
- example / Probabilistic graphical models
recombination
- about / Evolutionary computing
reconstruction/error minimization, K-means clustering
- about / Step 3 – reconstruction/error minimization
- K-means components, creating / Creating K-means components
- tail recursive implementation / Tail recursive implementation
- iterative implementation / Iterative implementation
recursive algorithm, discrete Kalman filter
- about / The recursive algorithm
- prediction phase / Prediction
- correction / Correction
- Kalman smoothing / Kalman smoothing
- fixed lag smoothing / Fixed lag smoothing
- experimentation / Experimentation
regression model / Design
regression weights
- about / One-variate linear regression
regularization
- about / Regularization, Ln roughness penalty
- Ln roughness penalty / Ln roughness penalty
- ridge regression / Ridge regression
reinforcement learning
- about / Model categorization, Reinforcement learning
- problem / The problem
- Q-learning / A solution – Q-learning
- terminologies / Terminology
- value of a policy / Value of a policy
- pros / Pros and cons of reinforcement learning
- cons / Pros and cons of reinforcement learning
reinforcement learning agent
- overview architecture / Concepts
reproducing kernel Hilbert spaces
- about / Common discriminative kernels
residuals mean square (RMS) / Step 5 – minimizing the sum of square errors
resilient distributed dataset (RDD) / Apache Spark
- transformation / Apache Spark
- action / Apache Spark
Resilient Distributed Datasets (RDD)
- about / Computation on demand
ridge regression
- about / Ln roughness penalty, Ridge regression
- design / Design
- implementation / Implementation
- test case / Test case
Riemann metric
- about / Kernel monadic composition

S

Scala
- about / Why Scala?, Scala, Scala
- features / Why Scala?
- abstraction / Abstraction
- scalability / Scalability
- configurability / Configurability
- maintainability / Maintainability
- computation / Computation on demand
- time series / Time series in Scala
- object creation / Object creation
- streams / Streams
- parallel collections / Parallel collections
scalability
- about / Scalability
scalability, with Actors
- about / Scalability with Actors
- Actor model / The Actor model
- partitioning / Partitioning
- reactive programming / Beyond actors – reactive programming
Scalable frameworks
- about / An overview
Scala plugin for Eclipse
- reference / Scala
Scala plugin for IntelliJ IDEA
- reference / Scala
Scala programming
- about / Scala programming
- libraries directory / List of libraries and tools
- code snippets format / Code snippets format
- encapsulation / Encapsulation
- class constructor template / Class constructor template
- companion objects, versus case classes / Companion objects versus case classes
- enumerations, versus case classes / Enumerations versus case classes
- overload operators / Overloading
- design template, for classifiers / Design template for immutable classifiers
- data extraction / Data extraction
- financial data sources / Data sources
- document extraction / Extraction of documents
- DMatrix class / DMatrix class
- Counter class / Counter
- Monitor class / Monitor
Scalaz
- about / Abstraction
semi-supervised learning
- about / Semi-supervised learning
Sequential Minimal Optimization (SMO) / The nonseparable case – the soft margin
- about / LIBSVM
short interest
- about / Fundamental analysis
short interest ratio
- about / Fundamental analysis
shrinkage
- about / Ln roughness penalty
Simple Build Tool (SBT)
- about / Scala
simple build tool (sbt) / Deploying Spark
simple moving average
- about / The simple moving average
simple workflow
- writing / Writing a simple workflow
- problem, scoping / Step 1 – scoping the problem
- data loading / Step 2 – loading data
- data, preprocessing / Step 3 – preprocessing the data
- immutable normalization / Immutable normalization
- patterns, discovering / Step 4 – discovering patterns
- data, analyzing / Analyzing data
- data, plotting / Plotting data
- classifier, implementing / Step 5 – implementing the classifier
- optimizer, selecting / Selecting an optimizer
- model, training / Training the model
- observations, classifying / Classifying observations
- model, evaluating / Step 6 – evaluating the model
singular value decomposition / Ordinary least squares regression
singular value decomposition (SVD) / PCA
- about / Singular Value Decomposition
smoothing factor for counters
- about / The zero-frequency problem
smoothing kernels
- about / Common discriminative kernels
soft margin / The nonseparable case – the soft margin
source code
- about / Source code
- context, versus view bounds / Context versus view bounds
- presentation / Presentation
- primitive types / Primitive types
- type conversions / Type conversions
- implicit conversion / Type conversions
- immutability / Immutability
- Scala iterators, performance / Performance of Scala iterators
Spark ecosystem
- about / Apache Spark
Sparkling Water
- about / 0xdata Sparkling Water
spectral density estimation
- purpose / Fourier analysis
stackable trait injection / Composing mixins to build a workflow
state space estimation, discrete Kalman filter
- about / The state space estimation
- transition equation / The transition equation
- measurement equation / The measurement equation
steepest descent
- about / Steepest descent
stemming / Basics of information retrieval
stimuli / The biological background
stochastic gradient descent / Ordinary least squares regression
- about / Stochastic gradient descent
substructures
- about / Overview of dynamic programming
sum of squared errors (SSE) / One-variate linear regression
supervised learning
- about / Supervised learning
supervised machine learning algorithms
- about / Supervised learning
- generative models / Generative models
- discriminative models / Discriminative models
support vector machines (SVMs)
- about / Support vector machines
- linear SVM / The linear SVM
- nonlinear SVM / The nonlinear SVM
SVC
- about / Support vector classifiers – SVC
- binary SVC / The binary SVC
- one-class SVC / Anomaly detection with one-class SVC
SVM
- components / Design
- configuration parameters / Configuration parameters
- performance considerations / Performance considerations
SVM dual problem
- kernel trick / Max-margin classification
SVMLight
- about / LIBSVM
SVR
- about / Support vector regression
- overview / An overview
- versus linear regression / SVR versus linear regression

T

tagging model / Basics of information retrieval
TaskSupport
- about / Processing a parallel collection
taxonomy, machine learning algorithms
- about / Taxonomy of machine learning algorithms
- unsupervised learning / Unsupervised learning
- supervised learning / Supervised learning
- semi-supervised learning / Semi-supervised learning
- reinforcement learning / Reinforcement learning
technical analysis
- about / Technical analysis
- trading data / Trading data
- trading signal and strategy / Trading signals and strategy
- price patterns / Price patterns
technical analysis, terminology
- bearish or bearish position / Terminology
- bullish or bullish position / Terminology
- long position / Terminology
- neutral position / Terminology
- oscillator / Terminology
- overbought / Terminology
- oversold / Terminology
- relative strength index (RSI) / Terminology
- resistance / Terminology
- short position / Terminology
- support / Terminology
- technical indicator / Terminology
- trading range / Terminology
- trading signal / Terminology
- volatility / Terminology
temporal difference
- about / Temporal difference for model-free learning
terminology, LCS
- environment / Terminology
- agent / Terminology
- predicate / Terminology
- compound predicate / Terminology
- action / Terminology
- rule / Terminology
- classifier / Terminology
- rule fitness or score / Terminology
- sensors / Terminology
- input data stream / Terminology
- rule matching / Terminology
- covering / Terminology
- predictor / Terminology
terminology, reinforcement learning
- environment / Terminology
- agent / Terminology
- state / Terminology
- goal / Terminology
- absorbing state / Terminology
- terminal state / Terminology
- action / Terminology
- policy / Terminology
- best policy / Terminology
- reward / Terminology
- episode / Terminology
- horizon / Terminology
test case, evaluation
- about / Test case
- implementation / Implementation
- evaluation of models / Evaluation of models
- impact of the hidden layers' architecture / Impact of the hidden layers' architecture
test case, trading strategy
- about / A test case
- trading strategies, creating / Creating trading strategies
- optimizer, configuring / Configuring the optimizer
- best trading strategy, finding / Finding the best trading strategy
testing, Naïve Bayes
- about / Testing
- textual information, retrieving / Retrieving the textual information
- text mining classifier, evaluating / Evaluating the text mining classifier
tests, genetic algorithms
- about / Tests
- weighted score / The weighted score
- unweighted score / The unweighted score
text analytics, conditional random field (CRF)
- about / Regularized CRFs and text analytics
- feature functions model / The feature functions model
- design / Design
- implementation / Implementation
- CRF classifier, configuring / Configuring the CRF classifier
- CRF model, training / Training the CRF model
- CRF model, applying / Applying the CRF model
- tests / Tests
- training convergence profile / The training convergence profile
- impact, of size of training set / Impact of the size of the training set
- impact, of L2 regularization factor / Impact of the L2 regularization factor
text mining
- about / Naïve Bayes and text mining
- Naïve Bayes, applying to / Naïve Bayes and text mining
text mining methodology
- implementing / Implementation
- documents, analyzing / Analyzing documents
- frequency of relative terms, extracting / Extracting the frequency of relative terms
- features, generating / Generating the features
ThreadPoolTaskSupport
- about / Processing a parallel collection
time series, in Scala
- about / Time series in Scala
- types and operations / Types and operations
- magnet pattern / The magnet pattern
- transpose operator / The transpose operator
- differential operator / The differential operator
- lazy views / Lazy views
tools
- about / Tools and frameworks
trading signal / Trading signals and strategy
trading strategies
- about / Definition of trading strategies
- trading operators / Trading operators
- cost function / The cost function
- trading signals / Trading signals
- trading strategies / Trading strategies
- trading signal encoding / Trading signal encoding
training, hidden Markov model (HMM)
- about / Training – CF-2
- Baum-Welch estimator / The Baum-Welch estimator (EM)
training, Naïve Bayes classifiers implementation
- about / Training
- class likelihood / Class likelihood
- binomial model / Binomial model
- multinomial model / The multinomial model
- classifier components / Classifier components
training and classification, multilayer perceptron
- about / Training and classification
- regularization / Regularization
- model generation / The model generation
- Fast Fisher-Yates shuffle / The Fast Fisher-Yates shuffle
- prediction / Prediction
- model fitness / Model fitness
training epoch, multilayer perceptron
- about / The training epoch
- input forward propagation / Step 1 – input forward propagation
- error backpropagation / Step 2 – error backpropagation
- exit condition / Step 3 – exit condition
- implementing / Putting it all together
training workflow, logistic regression
- about / The training workflow
- optimizer, configuring / Step 1 – configuring the optimizer
- Jacobian matrix, computing / Step 2 – computing the Jacobian matrix
- convergence of optimizer, managing / Step 3 – managing the convergence of the optimizer
- least squares problem, defining / Step 4 – defining the least squares problem
- sum of square errors, minimizing / Step 5 – minimizing the sum of square errors
- binomial multivariate logistic regression, testing / Test
trending / Test case 1 – trending
two-step lag smoothing algorithm / Experimentation
Typesafe Activator
- URL / Akka

U

unsupervised learning
- about / Unsupervised learning
- data clustering / Clustering
- dimension reduction / Dimension reduction

V

validation, model
- about / Validation
- key quality metrics / Key quality metrics
- F-score for binomial classification / F-score for binomial classification
- F-score for multinomial classification / F-score for multinomial classification
variance-bias trade-off
- about / Bias-variance decomposition
vector quantization
- about / Clustering
view bounds / Context versus view bounds
Viterbi algorithm
- about / The Viterbi algorithm
- psi / The Viterbi algorithm
- qStar / The Viterbi algorithm
- delta / The Viterbi algorithm
ViterbiPath class / Putting it all together
ViterbiPath object / Putting it all together

W

weighted moving average
- about / The weighted moving average
WordNet / Basics of information retrieval
workflow computational model
- about / A workflow computational model
- mathematical abstractions, supporting / Supporting mathematical abstractions
- mixins, combining to build workflow / Composing mixins to build a workflow
- modularization / Modularization

X

0xdata H2O / 0xdata Sparkling Water
0xdata Sparkling Water
- about / 0xdata Sparkling Water

Y

1-year Treasury bill (1yTB)
- about / Introducing the multinomial Naïve Bayes
Yahoo Finances / Step 1 – scoping the problem
YahooFinancials / Data sources

Z

zero-frequency problem
- about / The zero-frequency problem
