Hands-On Ensemble Learning with R

A beginner's guide to combining the power of machine learning algorithms using ensemble techniques

Product type Paperback
Published in Jul 2018
Publisher Packt
ISBN-13 9781788624145
Length 376 pages
Edition 1st Edition
Author Tattar

Table of Contents

Contributors
Preface
1. Introduction to Ensemble Techniques
2. Bootstrapping
3. Bagging
4. Random Forests
5. The Bare Bones Boosting Algorithms
6. Boosting Refinements
7. The General Ensemble Technique
8. Ensemble Diagnostics
9. Ensembling Regression Models
10. Ensembling Survival Models
11. Ensembling Time Series Models
12. What's Next?
Bibliography
Index

Index

A

  • adabag packages
    • using / Using the adabag and gbm packages
  • adaptive boosting / Adaptive boosting
  • Adaptive boosting algorithm
    • about / Why does boosting work?
    • working / Why does boosting work?
  • additive effect / Exponential smoothing state space model
  • advantages, extreme gradient boosting implementation
    • parallel computing / The xgboost package
    • regularization / The xgboost package
    • cross-validation / The xgboost package
    • pruning / The xgboost package
    • missing values / The xgboost package
    • saving and reloading / The xgboost package
    • cross platform / The xgboost package
  • amyotrophic lateral sclerosis (ALS) / Squared-error loss function
  • area under curve (AUC) / Complementary statistical tests
  • auto-correlation function (ACF) / Core concepts and metrics
  • Auto-regressive Integrated Moving Average (ARIMA) models / Auto-regressive Integrated Moving Average (ARIMA) models

B

  • bagging
    • comparing, with random forests / Comparing bagging, random forests, and boosting
    • comparing, with boosting / Comparing bagging, random forests, and boosting
    • for regression data / Bagging and Random Forests
  • bagging technique
    • describing / Bagging and time series
  • board stiffness dataset / Board Stiffness
  • Bootstrap AGGregatING (bagging) / Bagging
  • boot package / The boot package
  • Bootstrap
    • about / Bootstrap – a statistical method
    • standard error of correlation coefficient / The standard error of correlation coefficient
    • parametric bootstrap / The parametric bootstrap
    • eigen values / Eigen values
    • rule of thumb / Rule of thumb
  • bootstrap hypothesis testing problems / Bootstrap and testing hypotheses

C

  • Chi-square Automatic Interaction Detector (CHAID) / Random Forests
  • chi-square test / Chi-square and McNemar test
  • Classification and Regression Trees (CART) / Random Forests
    • advantages / Random Forests
    • drawbacks / Random Forests
  • classification trees / Classification trees and pruning
  • class prediction / Class prediction
  • Cohen's statistic / Cohen's statistic
  • complementary statistical tests
    • about / Complementary statistical tests
    • permutation test / Permutation test
    • chi-square test / Chi-square and McNemar test
    • McNemar test / Chi-square and McNemar test
    • ROC test / ROC test
  • complexity parameter (Cp) / Classification trees and pruning
  • contingency table
    • about / Pairwise measure
  • correlation coefficient measure / Correlation coefficient measure
  • Cox proportional hazards models / Regression models – parametric and Cox proportional hazards models

D

  • data
    • pre-processing / Pre-processing the housing data
    • housing / Pre-processing the housing data
  • datasets
    • about / Datasets
    • hypothyroid datasets / Hypothyroid
    • waveform datasets / Waveform
    • German Credit / German Credit
    • Iris / Iris
    • Pima Indians Diabetes / Pima Indians Diabetes
    • US Crime / US Crime
    • Overseas Visitors / Overseas visitors
    • Primary Biliary Cirrhosis / Primary Biliary Cirrhosis
    • multishapes / Multishapes
    • board stiffness dataset / Board Stiffness
    • selecting / The right model dilemma!
  • decision tree
    • about / Decision tree
    • for hypothyroid classification / Decision tree for hypothyroid classification
  • disagreement measure / Disagreement measure
    • for ensemble / Disagreement measure for ensemble
  • double-fault measure / Double-fault measure

E

  • ensemble
    • need for / An ensemble purview
    • disagreement measure / Disagreement measure for ensemble
  • ensemble diagnostics
    • about / What is ensemble diagnostics?
  • ensemble diversity
    • about / Ensemble diversity
    • numeric prediction / Numeric prediction
    • class prediction / Class prediction
  • ensemble survival models / Ensemble survival models
  • ensembling
    • working / Why does ensembling work?
    • by voting / Ensembling by voting
    • by averaging / Ensembling by averaging
  • ensembling, by averaging
    • about / Ensembling by averaging
    • simple averaging / Simple averaging
    • weight averaging / Weight averaging
  • ensembling, by voting
    • majority voting / Majority voting
    • weighted voting / Weighted voting
  • entropy measure / Entropy measure
  • Exponential Distribution / Core concepts of survival analysis
  • exponential models
    • reference / Exponential smoothing state space model
  • exponential smoothing state space model / Exponential smoothing state space model

F

  • functional-delta theorem / Nonparametric inference

G

  • Gamma Distribution / Core concepts of survival analysis
  • gbm package
    • about / The gbm package
    • reference / The gbm package
    • boosting, for count data / Boosting for count data
    • boosting, for survival data / Boosting for survival data
  • gbm packages
    • using / Using the adabag and gbm packages
  • general boosting algorithm / The general boosting algorithm
  • German Credit
    • about / German Credit
    • reference / German Credit
  • German credit dataset / Classification trees and pruning
  • gradient boosting algorithm
    • about / Gradient boosting
    • building, from scratch / Building it from scratch
    • squared-error loss function / Squared-error loss function

H

  • h2o package
    • about / The h2o package
    • reference / The h2o package
  • hazards regression model / Regression models – parametric and Cox proportional hazards models
  • hypothyroid dataset
    • about / Hypothyroid
    • reference / Hypothyroid

I

  • interrater agreement
    • about / Interrating agreement
    • entropy measure / Entropy measure
    • Kohavi-Wolpert measure / Kohavi-Wolpert measure
    • measurement / Measurement of interrater agreement
  • Iris dataset
    • about / Iris
  • iterative reweighted least squares (IRLS) algorithm / Logistic regression model

J

  • jackknife technique
    • about / The jackknife technique
    • for mean and variance / The jackknife method for mean and variance
    • pseudovalues method for survival data / Pseudovalues method for survival data

K

  • k-NN bagging / k-NN bagging
  • k-NN classifier / k-NN classifier, Analyzing waveform data
  • Kaplan-Meier estimator / Nonparametric inference
  • Kohavi-Wolpert measure / Kohavi-Wolpert measure

L

  • linear regression model / Linear regression model
  • logistic regression model
    • about / Logistic regression model
    • for hypothyroid classification / Logistic regression for hypothyroid classification

M

  • McNemar test / Chi-square and McNemar test
  • memoryless property / Core concepts of survival analysis
  • metrics / Core concepts and metrics
  • missForest function
    • reference / Missing data imputation
  • missing data
    • handling, random forests used / Missing data imputation
  • modeling dilemma / The right model dilemma!
  • multishapes dataset / Multishapes
  • multivariate statistics / Visualization and variable reduction

N

  • Naïve Bayes classifier
    • about / Naïve Bayes classifier
    • for hypothyroid classification / Naïve Bayes for hypothyroid classification
  • Nelson-Aalen estimator / Nonparametric inference
  • neural networks / Neural networks
    • about / Neural networks
    • for hypothyroid classification / Neural network for hypothyroid classification
  • nonparametric inference / Nonparametric inference
  • number prediction / Numeric prediction

O

  • Overseas Visitors dataset
    • about / Overseas visitors
    • reference / Overseas visitors

P

  • pairwise measure
    • about / Pairwise measure
    • disagreement measure / Disagreement measure
    • Yule's coefficient / Yule's or Q-statistic
    • Q-statistic / Yule's or Q-statistic
    • correlation coefficient measure / Correlation coefficient measure
    • Cohen's statistic / Cohen's statistic
    • double-fault measure / Double-fault measure
  • partial auto-correlation function (PACF)
    • about / Core concepts and metrics
    • reference / Core concepts and metrics
  • partial likelihood function / Regression models – parametric and Cox proportional hazards models
  • permutation test / Permutation test
  • Pima Indians Diabetes dataset / Pima Indians Diabetes
  • Primary Biliary Cirrhosis dataset
    • about / Primary Biliary Cirrhosis
  • Principal Component Analysis (PCA) / Visualization and variable reduction
  • proximity plots
    • using / Proximity plots
  • pruning / Classification trees and pruning

Q

  • Q-statistic / Yule's or Q-statistic

R

  • random forest
    • used, for clustering / Clustering with Random Forest
  • Random Forest algorithm
    • about / Random Forests
  • random forest nuances / Random Forest nuances
  • random forests
    • comparing, with bagging / Comparisons with bagging
    • used, for handling missing data / Missing data imputation
    • used, for clustering / Clustering with Random Forest
  • Random Forests
    • for regression data / Bagging and Random Forests
  • raters
    • about / Pairwise measure
  • regression models
    • bootstrapping / Bootstrapping regression models
    • about / Regression models, Regression models – parametric and Cox proportional hazards models
    • linear regression model / Linear regression model
    • neural networks / Neural networks
    • regression tree / Regression tree
    • prediction / Prediction for regression models
    • boosting / Boosting regression models
    • stacking methods / Stacking methods for regression models
    • Cox proportional hazards models / Regression models – parametric and Cox proportional hazards models
    • hazards regression model / Regression models – parametric and Cox proportional hazards models
  • regression tree / Regression tree
  • residual bootstrapping method / Bootstrapping regression models
  • ROC test / ROC test

S

  • split function / Bagging and Random Forests
  • stack ensembling / Stack ensembling
  • stacking methods
    • for regression models / Stacking methods for regression models
  • statistical/machine learning models
    • about / Statistical/machine learning models
    • logistic regression model / Logistic regression model
    • neural networks / Neural networks
    • Naïve Bayes classifier / Naïve Bayes classifier
    • decision tree / Decision tree
    • support vector machines / Support vector machines
  • support vector machines
    • about / Support vector machines
    • for hypothyroid classification / SVM for hypothyroid classification
  • survival analysis
    • about / Core concepts of survival analysis
  • Survival Models
    • bootstrapping / Bootstrapping survival models*
  • survival tree
    • about / Survival tree

T

  • time series datasets
    • about / Time series datasets
    • AirPassengers dataset / AirPassengers
    • co2 time series data / co2
    • uspop / uspop
    • gas time series data / gas
    • car sales data / Car Sales
    • austres time series dataset / austres
    • WWWusage time series dataset / WWWusage
  • time series models
    • bootstrapping / Bootstrapping time series models*
    • about / Essential time series models
    • Naïve forecasting / Naïve forecasting
    • seasonal / Seasonal, trend, and loess fitting
    • trend / Seasonal, trend, and loess fitting
    • loess fitting / Seasonal, trend, and loess fitting
    • exponential smoothing state space model / Exponential smoothing state space model
    • Auto-regressive Integrated Moving Average (ARIMA) models / Auto-regressive Integrated Moving Average (ARIMA) models
    • auto-regressive neural networks / Auto-regressive neural networks
    • linear model (LM) / Messing it all up
    • messing up / Messing it all up
    • ensembling / Ensemble time series models
  • time series visualization / Time series visualization

U

  • US Crime dataset / US Crime

V

  • variable clustering / Variable clustering
  • variable importance / Variable importance
    • for decision trees and random forests / Variable importance
  • variable reduction
    • about / Visualization and variable reduction
    • techniques / Visualization and variable reduction
  • visualization / Visualization and variable reduction

W

  • waveform datasets / Waveform
  • Weibull Distribution / Core concepts of survival analysis

X

  • xgboost package
    • about / The xgboost package
    • reference / The xgboost package
  • xgboost technique
    • reference / The xgboost package

Y

  • Yule's coefficient / Yule's or Q-statistic