Mastering Elasticsearch: Further your knowledge of the Elasticsearch server by learning more about its internals, querying, and data handling

Product type Paperback

Published in Feb 2015

Publisher

ISBN-13 9781783553792

Length 434 pages

Edition 2nd Edition

Languages

Java

Tools

Elasticsearch

Concepts

Enterprise Search

Author (1):

Marek Rogozinski

Table of Contents (19) Chapters

Mastering Elasticsearch Second Edition

Credits

About the Author

Acknowledgments

About the Author

Acknowledgments

About the Reviewers

www.PacktPub.com

Preface

1. Introduction to Elasticsearch

2. Power User Query DSL

3. Not Only Full Text Search

4. Improving the User Search Experience

5. The Index Distribution Architecture

6. Low-level Index Control

7. Elasticsearch Administration

8. Improving Performance

9. Developing Elasticsearch Plugins

Index

A

additional configuration options, significant terms aggregation
- about / Additional configuration options
- background set filtering / Background set filtering
- minimum document count / Minimum document count
- execution hint / Execution hint
- more options / More options
additional term suggester options
- about / Additional term suggester options
- lowercase_terms / Additional term suggester options
- max_edits / Additional term suggester options
- prefix_length / Additional term suggester options
- min_word_length / Additional term suggester options
- shard_size / Additional term suggester options
- max_inspections / Additional term suggester options
- min_doc_freq / Additional term suggester options
- max_term_freq / Additional term suggester options
- accuracy / Additional term suggester options
- string_distance / Additional term suggester options
additive smoothing
- URL / Configuring smoothing models
advices, for high query rate scenarios
- about / Advices for high query rate scenarios
- filter caches / Filter caches and shard query caches
- shard query caches / Filter caches and shard query caches
- think about optimal query structure / Think about the queries
- routing, using / Using routing
- queries, parallelizing / Parallelize your queries
- field data cache / Field data cache and breaking the circuit
- circuit, breaking / Field data cache and breaking the circuit
- size and shard_size properties, controlling / Keeping size and shard_size under control
allocation awareness
- about / Allocation awareness
- forcing / Forcing allocation awareness
Amazon EC2 discovery
- about / The Amazon EC2 discovery
- plugin installation / The EC2 plugin installation
- generic configuration / The EC2 plugin's generic configuration
- optional configuration options / Optional EC2 discovery configuration options
- nodes scanning configuration / The EC2 nodes scanning configuration
analysis
- about / Analyzing your data
analysis binder
- implementing / Implementing the analysis binder
analyzer indices component
- implementing / Implementing the analyzer indices component
analyzer module
- implementing / Implementing the analyzer module
analyzer plugin
- implementing / Implementing the analyzer plugin
analyzer provider
- implementing / Implementing the analyzer provider
AND operator
- about / Understanding the basics
Apache Lucene
- about / Introducing Apache Lucene
- architecture / Overall architecture
- inverted index / Overall architecture
- data analysis / Analyzing your data
Apache Lucene scoring
- altering / Altering Apache Lucene scoring
- similarity models / Available similarity models
- per-field similarity, setting / Setting a per-field similarity
- similarity model configuration / Similarity model configuration
- default similarity model, selecting / Choosing the default similarity model
Apache Lucene scoring mechanism
- about / Default Apache Lucene scoring explained
- document, returning by Lucene / When a document is matched
- TF/IDF scoring formula / TF/IDF scoring formula
- Elasticsearch point of view / Elasticsearch point of view
- example / An example
Apache Maven
- URL / Creating the Apache Maven project structure
Apache Maven project structure
- creating / Creating the Apache Maven project structure
- basics / Understanding the basics
- about / The structure of the Maven Java project
- pom.xml / The idea of POM
- build process, running / Running the build process
- Maven assembly plugin / Introducing the assembly Maven plugin
- assembly Maven plugin / Introducing the assembly Maven plugin
Application Program Interface (API) / Communicating with Elasticsearch
architecture, Apache Lucene
- about / Overall architecture
- document / Overall architecture
- field / Overall architecture
- term / Overall architecture
- token / Overall architecture
architecture, Elasticsearch
- features / Key concepts behind Elasticsearch architecture
ASCII folding filter
- about / Analyzing your data
Azure repository
- about / The Azure repository

B

background set / Choosing significant terms
backups
- about / Backing up
- performing / Backing up
- saving, in cloud / Saving backups in the cloud
- S3 repository / The S3 repository
- HDFS repository / The HDFS repository
- Azure repository / The Azure repository
basic concepts, Elasticsearch
- index / Index
- document / Document
- type / Type
- mapping / Mapping
- node / Node
- cluster / Cluster
- shard / Shard
- replica / Replica
basic options, phrase suggester
- highlight / Basic configuration
- gram_size / Basic configuration
- confidence / Basic configuration
- max_errors / Basic configuration
- separator / Basic configuration
- force_unigrams / Basic configuration
- token_limit / Basic configuration
- collate / Basic configuration
- real_word_error_likelihood / Basic configuration
basic queries
- about / Query categorization, Basic queries
- examples / Basic queries
basic queries use cases
- values, searching in range / Searching for values in range
- simplified query, for multiple terms / Simplified query for multiple terms
- lower scoring partial queries, ignoring / Ignoring lower scoring partial queries
benchmarking queries
- about / Benchmarking queries
- cluster configuration, preparing / Preparing your cluster configuration for benchmarking
- benchmarks, running / Running benchmarks
- benchmarks, controlling / Controlling currently run benchmarks
best fields matching
- about / Best fields matching
Boolean operators
- AND / Understanding the basics
- OR / Understanding the basics
- NOT / Understanding the basics
- + / Understanding the basics
- - / Understanding the basics
budget / The tiered merge policy
byte code
- about / Knowing about garbage collector

C

caches
- clearing / Clearing the caches
- all caches, clearing / Index, indices, and all caches clearing
- specific caches, clearing / Clearing specific caches
candidate generators
- configuring / Configuring candidate generators
- about / Configuring candidate generators
Cat API
- about / The human-friendly status API – using the Cat API
- basics / The basics
- using / Using the Cat API
- common arguments / Common arguments
- examples / The examples
circuit breakers
- using / Using circuit breakers
- field data circuit breaker / The field data circuit breaker
- request circuit breaker / The request circuit breaker
- total circuit breaker / The total circuit breaker
class custom analyzer
- implementing / Implementing the class custom analyzer
cluster
- about / Cluster
cluster-level recovery configuration
- about / Cluster-level recovery configuration
- indices.recovery.concurrent_streams / Cluster-level recovery configuration
- indices.recovery.max_bytes_per_sec / Cluster-level recovery configuration
- indices.recovery.compress / Cluster-level recovery configuration
- indices.recovery.file_chunk_size / Cluster-level recovery configuration
- indices.recovery.translog_ops / Cluster-level recovery configuration
- indices.recovery.translog_size / Cluster-level recovery configuration
common term suggester options
- about / Common term suggester options
- text / Common term suggester options
- field / Common term suggester options
- analyzer / Common term suggester options
- size / Common term suggester options
- sort / Common term suggester options
- suggest_mode / Common term suggester options
communication, Elasticsearch
- about / Communicating with Elasticsearch
- data indexing / Indexing data
- data querying / Querying data
completion suggester
- about / The completion suggester
- logic / The logic behind the completion suggester
- using / Using the completion suggester
- data indexing / Indexing data
- data, querying / Querying data
- custom weights / Custom weights
- additional parameters / Additional parameters
compound queries
- about / Query categorization, Compound queries
- examples / Compound queries
compound queries use cases
- matched documents, boosting / Boosting some of the matched documents
concurrent merge scheduler
- about / The concurrent merge scheduler
configuration options, log byte size merge policy
- merge_factor / The log byte size merge policy
- min_merge_size / The log byte size merge policy
- max_merge_size / The log byte size merge policy
- maxMergeDocs / The log byte size merge policy
- calibrate_size_by_deletes / The log byte size merge policy
configuration options, log doc merge policy
- merge_factor / The log doc merge policy
- min_merge_docs / The log doc merge policy
- max_merge_docs / The log doc merge policy
- calibrate_size_by_deletes / The log doc merge policy
configuration options, tiered merge policy
- index.merge.policy.expunge_deletes_allowed / The tiered merge policy
- index.merge.policy.floor_segment / The tiered merge policy
- index.merge.policy.max_merge_at_once / The tiered merge policy
- index.merge.policy.max_merge_at_once_explicit / The tiered merge policy
- index.merge.policy.max_merged_segment / The tiered merge policy
- index.merge.policy.segments_per_tier / The tiered merge policy
- index.reclaim_deletes_weight / The tiered merge policy
- index.compound_format / The tiered merge policy
cross fields matching
- about / Cross fields matching
curl tool
- URL / Indexing data
custom analysis plugin
- creating / Creating the custom analysis plugin
- implementation details / Implementation details
- TokenFilter, implementing / Implementing TokenFilter
- TokenFilter factory, implementing / Implementing the TokenFilter factory
- class custom analyzer, implementing / Implementing the class custom analyzer
- analyzer provider, implementing / Implementing the analyzer provider
- analysis binder, implementing / Implementing the analysis binder
- analyzer indices component, implementing / Implementing the analyzer indices component
- analyzer module, implementing / Implementing the analyzer module
- analyzer plugin, implementing / Implementing the analyzer plugin
- testing / Testing our custom analysis plugin
- building / Building our custom analysis plugin
- installing / Installing the custom analysis plugin
- checking / Checking whether our analysis plugin works
custom REST action
- creating / Creating custom REST action
- assumptions / The assumptions
- implementation / Implementation details

D

data-only nodes
- configuring / Configuring data-only nodes
data analysis
- about / Analyzing your data
- indexing / Indexing and querying
- querying / Indexing and querying
data field caches
- issues / The problem with field data cache
data node
- about / Node
- configuring / Configuring master and data nodes
data nodes
- about / Data nodes
default shard allocation behavior
- altering / Altering the default shard allocation behavior
- allocation awareness / Allocation awareness
- filtering / Filtering
- runtime allocation, updating / Runtime allocation updating
- total shards allowed per node, defining / Defining total shards allowed per node
- total shards allowed per physical server, defining / Defining total shards allowed per physical server
default similarity model
- selecting / Choosing the default similarity model
default store type
- about / The default store type
- for Elasticsearch 1.3.0 / The default store type for Elasticsearch 1.3.0 and higher
- for Elasticsearch versions older than 1.3.0 / The default store type for Elasticsearch versions older than 1.3.0
desired merge scheduler
- setting / Setting the desired merge scheduler
DFR similarity
- configuring / Configuring the DFR similarity
direct generators
- about / Configuring candidate generators
- configuring / Configuring direct generators
discovery module
- about / Discovery and recovery modules
- configuration / Discovery configuration
- Zen discovery configuration / Zen discovery
divergence from randomness similarity model
- about / Available similarity models
document
- about / Document
documents
- relations / Relations between documents
documents grouping
- about / Documents grouping
- top hits aggregation / Top hits aggregation
- example / An example
- additional parameters / Additional parameters
document types
- about / Type
doc values / Doc values
- used, for optimizing queries / Using doc values to optimize your queries
- example, of usage / The example of doc values usage

E

EC2 discovery configuration options
- cloud.aws.region / Optional EC2 discovery configuration options
- cloud.aws.ec2.endpoint / Optional EC2 discovery configuration options
- cloud.aws.protocol / Optional EC2 discovery configuration options
- cloud.aws.proxy_host / Optional EC2 discovery configuration options
- cloud.aws.proxy_port / Optional EC2 discovery configuration options
- discovery.ec2.ping_timeout / Optional EC2 discovery configuration options
EC2 nodes scanning configuration
- discovery.ec2.host_type / The EC2 nodes scanning configuration
- discovery.ec2.groups / The EC2 nodes scanning configuration
- discovery.ec2.availability_zones / The EC2 nodes scanning configuration
- discovery.ec2.any_group / The EC2 nodes scanning configuration
- discovery.ec2.tag / The EC2 nodes scanning configuration
EC2 plugin's generic configuration
- cloud.aws.access_key / The EC2 plugin's generic configuration
- cloud.aws.secret_key / The EC2 plugin's generic configuration
Elasticsearch
- about / Introducing Elasticsearch
- basic concepts / Basic concepts
- key concepts / Key concepts behind Elasticsearch architecture
- workings / Workings of Elasticsearch
- startup process / The startup process
- failure detection / Failure detection
- communicating with / Communicating with Elasticsearch
- query rewrite / Query rewrite explained
- filters / Handling filters and why it matters
- scaling / Scaling Elasticsearch
- informing, about REST action / Informing Elasticsearch about our REST action
- informing, about custom analyzer / Informing Elasticsearch about our custom analyzer
Elasticsearch, using for high load scenarios
- about / Using Elasticsearch for high load scenarios
- general Elasticsearch-tuning advices / General Elasticsearch-tuning advices
- advices, for high query rate scenarios / Advices for high query rate scenarios
- high indexing throughput scenarios / High indexing throughput scenarios and Elasticsearch
Elasticsearch Azure plugin, settings
- container / The Azure repository
- base_path / The Azure repository
- chunk_size / The Azure repository
Elasticsearch caching
- about / Understanding Elasticsearch caching
- filter cache / The filter cache
- field data cache / The field data cache
- shard query cache / The shard query cache
- circuit breakers, using / Using circuit breakers
- caches, clearing / Clearing the caches
- all caches, clearing / Index, indices, and all caches clearing
examples, Cat API
- master node information, obtaining / Getting information about the master node
- node information, obtaining / Getting information about the nodes
exclude parameter / What include, exclude, and require mean
expectations on nodes, gateway module
- gateway.expected_nodes / Expectations on nodes
- gateway.expected_data_nodes / Expectations on nodes
- gateway.expected_master_nodes / Expectations on nodes

F

factors, for calculating score property of document
- document boost / When a document is matched
- filter boost / When a document is matched
- coordination factor / When a document is matched
- inverse document frequency / When a document is matched
- length norm / When a document is matched
- term frequency / When a document is matched
- query norm / When a document is matched
failure detection, Elasticsearch
- about / Failure detection
federated search
- about / Federated search
- test clusters / The test clusters
- tribe node, creating / Creating the tribe node
- indices conflicts, handling / Handling indices conflicts
- write operation, blocking / Blocking write operations
field data cache
- about / The field data cache
- doc values / Field data or doc values
- field data / Field data or doc values
- node-level field data cache configuration / Node-level field data cache configuration
- index-level field data cache configuration / Index-level field data cache configuration
- field data formats / Field data formats
- field data loading / Field data loading
field data cache filtering
- about / The field data cache filtering
- information, adding / Adding field data filtering information
- filtering, by term frequency / Filtering by term frequency
- filtering, by regex / Filtering by regex
- filtering, by regex and term frequency / Filtering by regex and term frequency
- example / The filtering example
field data circuit breaker
- about / The field data circuit breaker
field data formats
- about / Field data formats
- string-based fields / String-based fields
- numeric-based fields / Numeric fields
- geographical-based fields / Geographical-based fields
fields
- querying / Querying fields
filter cache
- about / The filter cache
- types / Filter cache types
- node-level filter cache configuration / Node-level filter cache configuration
- index-level filter cache configuration / Index-level filter cache configuration
filters
- about / Handling filters and why it matters
- comparing, with query / Filters and query relevance
- working / How filters work
- bool filter / Bool or and/or/not filters
- and filter / Bool or and/or/not filters
- or filter / Bool or and/or/not filters
- not filter / Bool or and/or/not filters
- performance considerations / Performance considerations
- post filtering / Post filtering and filtered query
- filtered query / Post filtering and filtered query
- filtering method, selecting / Choosing the right filtering method
flushing
- about / The transaction log
foreground set / Choosing significant terms
full text search queries
- about / Query categorization, Full text search queries
- examples / Full text search queries
full text search queries use cases
- Lucene query syntax, using / Using Lucene query syntax in queries
- user queries, handling without errors / Handling user queries without errors

G

garbage collection problems
- dealing with / Dealing with garbage collection problems
garbage collector
- about / Knowing about garbage collector, More information on the garbage collector work
- Java memory / Java memory
- life cycle / The life cycle of Java objects and garbage collections
- collection problems, dealing with / Dealing with garbage collection problems
- logging, turning on / Turning on logging of garbage collection work
- JStat, using / Using JStat
- memory dumps, creating / Creating memory dumps
- adjusting / Adjusting the garbage collector work in Elasticsearch
- standard start up script, using / Using a standard start up script
- service wrapper / Service wrapper
- swapping on Unix-like systems, avoiding / Avoid swapping on Unix-like systems
gateway configuration properties
- gateway.recover_after_nodes / Configuration properties
- gateway.recover_after_data_nodes / Configuration properties
- gateway.recover_after_master_nodes / Configuration properties
- gateway.recover_after_time / Configuration properties
gateway module
- about / The gateway and recovery configuration
- gateway recovery process / The gateway recovery process
- configuration properties / Configuration properties
- expectations on nodes / Expectations on nodes
- local gateway / The local gateway
- low-level recovery configuration / Low-level recovery configuration
general Elasticsearch-tuning advices
- store, selecting / Choosing the right store
- index refresh rate / The index refresh rate
- thread pools tuning / Thread pools tuning
- merge process, adjusting / Adjusting the merge process
- data distribution / Data distribution
global options, _bench REST endpoint
- name / Running benchmarks
- competitors / Running benchmarks
- num_executor_nodes / Running benchmarks
- percentiles / Running benchmarks
- iteration / Running benchmarks
- concurrency / Running benchmarks
- multiplier / Running benchmarks
- warmup / Running benchmarks
- clear_caches / Running benchmarks
Groovy
- about / Short Groovy introduction
- using, as scripting language / Using Groovy as your scripting language
- variable, defining in scripts / Variable definition in scripts
- conditional statements / Conditionals
- loops / Loops
- example / An example

H

HDFS repository
- about / The HDFS repository
- settings / The HDFS repository
high indexing throughput scenarios
- about / High indexing throughput scenarios and Elasticsearch
- bulk indexing / Bulk indexing
- doc values, versus indexing speed / Doc values versus indexing speed
- document fields, controlling / Keep your document fields under control
- index architecture / The index architecture and replication
- replication / The index architecture and replication
- write-ahead log, tuning / Tuning write-ahead log
- storage type / Think about storage
- RAM buffer, for indexing / RAM buffer for indexing
horizontal scaling
- about / Horizontal scaling
- replicas, creating automatically / Automatically creating replicas
- redundancy / Redundancy and high availability
- high availability / Redundancy and high availability
- cost / Cost and performance flexibility
- performance flexibility / Cost and performance flexibility
- continuous upgrades / Continuous upgrades
- multiple Elasticsearch instances, on single physical machine / Multiple Elasticsearch instances on a single physical machine
- nodes' roles, for larger clusters / Designated nodes' roles for larger clusters
Hot Threads API
- about / Very hot threads
- threads parameter / Very hot threads
- interval parameter / Very hot threads
- type parameter / Very hot threads
- snapshots parameter / Very hot threads
- usage clarification / Usage clarification for the Hot Threads API
- response / The Hot Threads API response
human-friendly status API
- Cat API / The human-friendly status API – using the Cat API
hybrid filesystem store
- about / The hybrid filesystem store

I

I/O throttling
- about / When it is too much for I/O – throttling explained
- controlling / Controlling I/O throttling
I/O throttling configuration
- about / Configuration
- throttling type, configuring / The throttling type
- maximum throughput per second / Maximum throughput per second
- node throttling defaults / Node throttling defaults
- performance considerations / Performance considerations
- example / The configuration example
IB similarity
- configuring / Configuring the IB similarity
implementation, custom analysis plugin
- TokenFilter class extension / Implementation details
- AbstractTokenFilterFactory extension / Implementation details
- custom analyzer / Implementation details
- analyzer provider / Implementation details
- AnalysisModule.AnalysisBinderProcessor / Implementation details
- AbstractComponent class / Implementation details
- AbstractModule extension / Implementation details
- AbstractPlugin extension / Implementation details
implementation, custom REST action
- about / Implementation details
- REST action class, using / Using the REST action class
- plugin class / The plugin class
- Elasticsearch, informing / Informing Elasticsearch about our REST action
include parameter
- about / What include, exclude, and require mean
index
- about / Index
- updating / Updating the index and committing changes
- changes, committing / Updating the index and committing changes
- default refresh time, changing / Changing the default refresh time
- transaction log / The transaction log
index-level filter cache configuration
- about / Index-level filter cache configuration
- index.cache.filter.type / Index-level filter cache configuration
- index.cache.filter.max_size / Index-level filter cache configuration
- index.cache.filter.expire / Index-level filter cache configuration
index-level recovery settings
- about / Index-level recovery settings
- quorum / Index-level recovery settings
- quorum-1 / Index-level recovery settings
- full / Index-level recovery settings
- full-1 / Index-level recovery settings
- integer value / Index-level recovery settings
index distribution architecture
- right amount of shards and replicas, selecting / Choosing the right amount of shards and replicas
- sharding / Sharding and overallocation
- over allocation / Sharding and overallocation
- example, over allocation / A positive example of overallocation
- multiple shards, versus multiple indices / Multiple shards versus multiple indices
- replicas / Replicas
indexing
- altering / NRT, flush, refresh, and transaction log
indices conflicts
- handling / Handling indices conflicts
indices recovery API
- about / The indices recovery API
inverted index
- about / Overall architecture

J

Java memory
- about / Java memory
- eden space / Java memory
- survivor space / Java memory
- tenured generation / Java memory
- permanent generation / Java memory
- code cache / Java memory
Java objects
- life cycle / The life cycle of Java objects and garbage collections
Java service wrapper
- URL / Service wrapper
Java Virtual Machine (JVM) / Communicating with Elasticsearch
JSON document
- URL / Communicating with Elasticsearch

L

Laplace smoothing model
- about / Configuring smoothing models
Least Recently Used cache type (LRU) / Node-level filter cache configuration
limitations, significant terms aggregation
- about / There are limits
- memory consumption / Memory consumption
- avoiding, as top level aggregation / Shouldn't be used as top-level aggregation
- approximated counts / Counts are approximated
- floating point fields, avoiding / Floating point fields are not allowed
linear interpolation smoothing model
- about / Configuring smoothing models
LM Dirichlet similarity
- configuring / Configuring the LM Dirichlet similarity
LM Jelinek Mercer similarity
- configuring / Configuring the LM Jelinek Mercer similarity
log byte size merge policy
- about / The log byte size merge policy
- configuration options / The log byte size merge policy
log doc merge policy
- about / The log doc merge policy
- configuration options / The log doc merge policy
low-level recovery configuration
- about / Low-level recovery configuration
- cluster-level recovery configuration / Cluster-level recovery configuration
- index-level recovery settings / Index-level recovery settings
lowercase filter
- about / Analyzing your data
Lucene analyzer
- about / Analyzing your data
Lucene expressions
- about / Lucene expressions explained
- basics / The basics
- example / An example
Lucene index
- about / Getting deeper into Lucene index
- norm / Norms
- term vectors / Term vectors
- posting formats / Posting formats
- doc values / Doc values
Lucene query language
- about / Lucene query language
- basics / Understanding the basics
- Boolean operators / Understanding the basics
- fields, querying / Querying fields
- term modifiers / Term modifiers
- special characters, handling / Handling special characters

M

mapping
- about / Mapping
master-only nodes
- configuring / Configuring master-only nodes
master election
- about / Master node
- configuration / The master election configuration
- Zen discovery fault detection / Zen discovery fault detection and configuration
- Zen discovery configuration / Zen discovery fault detection and configuration
master eligible nodes
- about / Master eligible nodes
master node
- about / Node, Master node
- configuring / Configuring master and data nodes
- Amazon EC2 discovery / The Amazon EC2 discovery
- discovery implementations / Other discovery implementations
Maven Assembly plugin
- about / Introducing the assembly Maven plugin
- using / Introducing the assembly Maven plugin
- URL / Introducing the assembly Maven plugin
memory store
- about / The memory store
- properties / Additional properties
merge
- tiered merge policy / The tiered merge policy
merge policy
- selecting / Choosing the right merge policy
- log byte size merge policy / The log byte size merge policy
- log doc merge policy / The log doc merge policy
merge schedulers
- about / Scheduling
- concurrent merge scheduler / The concurrent merge scheduler
- serial merge scheduler / The serial merge scheduler
- desired merge scheduler, selecting / Setting the desired merge scheduler
MMap filesystem store
- about / The MMap filesystem store
most fields matching
- about / Most fields matching
multicast Zen discovery configuration
- about / Multicast Zen discovery configuration
- discovery.zen.ping.multicast.address / Multicast Zen discovery configuration
- discovery.zen.ping.multicast.port / Multicast Zen discovery configuration
- discovery.zen.ping.multicast.group / Multicast Zen discovery configuration
- discovery.zen.ping.multicast.buffer_size / Multicast Zen discovery configuration
- discovery.zen.ping.multicast.ttl / Multicast Zen discovery configuration
- discovery.zen.ping.multicast.enabled / Multicast Zen discovery configuration
- discovery.zen.ping.unicast.concurrent_connects / The unicast Zen discovery configuration
multimatch
- controlling / Controlling multimatching
- types / Multimatch types
- best fields matching / Best fields matching
- cross fields matching / Cross fields matching
- most fields matching / Most fields matching
- phrase matching / Phrase matching
- phrase with prefixes matching / Phrase with prefixes matching
multiple Elasticsearch instances, on single physical machine
- about / Multiple Elasticsearch instances on a single physical machine
- shard and its replicas, preventing from being on same node / Preventing the shard and its replicas from being on the same node
multiple language stemming filters
- about / Analyzing your data
multiple shards
- versus multiple indices / Multiple shards versus multiple indices
multi_match query / Controlling multimatching
Mustache template engine
- about / The Mustache template engine
- URL / The Mustache template engine

N

N-gram smoothing models
- URL / Configuring smoothing models
near real-time GET
- about / Near real-time GET
nested documents
- about / The nested documents
new I/O filesystem store
- about / The new I/O filesystem store
node
- about / Node
- data node / Node
- master node / Node
- tribe node / Node
node-level filter cache configuration
- about / Node-level filter cache configuration
nodes' roles
- about / Preventing the shard and its replicas from being on the same node
- master eligible node / Designated nodes' roles for larger clusters
- data node / Designated nodes' roles for larger clusters
- query aggregator node / Designated nodes' roles for larger clusters
norms
- about / Norms
not analyzed queries
- about / Query categorization, Not analyzed queries
- examples / Not analyzed queries
not analyzed queries use cases
- results, limiting to given tags / Limiting results to given tags
- efficient query time stopwords handling / Efficient query time stopwords handling
NOT operator / Understanding the basics

O

object type
- about / The object type
Okapi BM25 similarity
- configuring / Configuring the Okapi BM25 similarity
Okapi BM25 similarity model
- about / Available similarity models
old generation / Java memory
online book store
- implementing / The story
options array, properties
- text / Understanding the REST endpoint suggester response
- score / Understanding the REST endpoint suggester response
- freq / Understanding the REST endpoint suggester response
OR operator
- about / Understanding the basics
overallocation
- about / Sharding and overallocation
- example / A positive example of overallocation

P

parameters, for transaction log configuration
- about / The transaction log configuration
- index.translog.flush_threshold_period / The transaction log configuration
- index.translog.flush_threshold_ops / The transaction log configuration
- index.translog.flush_threshold_size / The transaction log configuration
- index.translog.interval / The transaction log configuration
- index.gateway.local.sync / The transaction log configuration
- index.translog.disable_flush / The transaction log configuration
parent-child relationship
- about / Parent–child relationship
- in cluster / Parent–child relationship in the cluster
pattern queries
- about / Query categorization, Pattern queries
pattern queries use cases
- autocomplete functionality, using prefixes / Autocomplete using prefixes
- pattern matching / Pattern matching
- matching phrases / Matching phrases
- spans / Spans, spans everywhere
per-field similarity
- setting / Setting a per-field similarity
phrase matching
- about / Phrase matching
phrase suggester
- about / The phrase suggester
- usage example / Usage example
- configuration / Configuration
- basic configuration / Basic configuration
- basic options / Basic configuration
- smoothing models, configuring / Configuring smoothing models
- candidate generators, configuring / Configuring candidate generators
- direct generators, configuring / Configuring direct generators
phrase with prefixes matching
- about / Phrase with prefixes matching
plugin class, custom REST action
- about / The plugin class
- constructor / The plugin class
- onModule method / The plugin class
- name method / The plugin class
- description method / The plugin class
position aware queries
- about / Query categorization, Position aware queries
posting formats / Posting formats
preference parameter
- about / Introducing the preference parameter
- _primary property / Introducing the preference parameter
- _primary_first property / Introducing the preference parameter
- _local property / Introducing the preference parameter
- _only_node-wJq0kPSHTHCovjuCsVK0-A property / Introducing the preference parameter
- _prefer_node-wJq0kPSHTHCovjuCsVK0-A property / Introducing the preference parameter
- _shards-0,1 property / Introducing the preference parameter
- custom, string value property / Introducing the preference parameter

Q

query aggregator nodes
- about / Query aggregator nodes
Query API
- about / Querying data
query categorization
- about / Query categorization
- basic queries / Query categorization, Basic queries
- compound queries / Query categorization, Compound queries
- not analyzed queries / Query categorization
- full text search queries / Query categorization, Full text search queries
- pattern queries / Query categorization, Pattern queries
- similarity supporting queries / Query categorization, Similarity supporting queries
- score altering queries / Query categorization, Score altering queries
- position aware queries / Query categorization, Position aware queries
- structure aware queries / Query categorization, Structure aware queries
Query DSL
- about / Choosing the right query for the job
query execution preference
- about / Query execution preference
- preference parameter / Introducing the preference parameter
query processing-only nodes
- configuring / Configuring the query processing-only nodes
query relevance improvement
- about / Improving the query relevance
- data / Data
- quest / The quest for relevance improvement
- standard query / The standard query
- multi match query / The multi match query
- phrases / Phrases comes into play
- garbage, removing / Let's throw the garbage away
- phrase queries, boosting / Now, we boost
- misspelling-proof search, making / Performing a misspelling-proof search
- faceting / Drill downs with faceting
query rescoring
- about / Query rescoring, What is query rescoring?
- example query / An example query
- structure, rescore query / Structure of the rescore query
- rescore parameters / Rescore parameters
- scoring mode, selecting / Choosing the scoring mode
query rewrite
- about / Query rewrite explained
- working / Query rewrite explained
- prefix query example / Prefix query as an example
- Apache Lucene / Getting back to Apache Lucene
- properties / Query rewrite properties
query templates
- about / Query templates, Introducing query templates
- providing, as string value / Templates as strings
- Mustache template engine / The Mustache template engine
- conditional expressions / Conditional expressions
- loops / Loops
- default values / Default values
- storing, in files / Storing templates in files

R

real-time GET operation
- about / Near real-time GET
recovery module
- about / Discovery and recovery modules
relations, between documents
- about / Relations between documents
- object type / The object type
- nested documents / The nested documents
- parent-child relationship / Parent–child relationship
- alternatives / A few words about alternatives
replica
- about / Replica
replicas
- about / Replicas
repository
- about / Saving backups in the cloud
request circuit breaker
- about / The request circuit breaker
require parameter / What include, exclude, and require mean
rescore parameters
- window_size / Rescore parameters
- query_weight / Rescore parameters
- rescore_query_weight / Rescore parameters
REST action class
- using / Using the REST action class
- constructor / The constructor
- requests, handling / Handling requests
- response, writing / Writing response
REST action plugin
- building / Building the REST action plugin
- installing / Installing the REST action plugin
- checking / Checking whether the REST action plugin works
rewrite property
- about / Query rewrite properties
- scoring_boolean / Query rewrite properties
- constant_score_boolean / Query rewrite properties
- constant_score_filter / Query rewrite properties
- top_terms_N / Query rewrite properties
- top_terms_boost_N / Query rewrite properties
routing
- about / Routing explained
- shards and data / Shards and data
- testing / Let's test routing
- indexing with / Indexing with routing
- implementing / Routing in practice
- querying / Querying
- aliases / Aliases
- multiple routing values / Multiple routing values
runtime allocation
- updating / Runtime allocation updating
- index level updates / Index level updates
- cluster level updates / Cluster level updates

S

S3 repository
- about / The S3 repository
- creating / The S3 repository
scaling
- about / Scaling Elasticsearch
- vertical scaling / Vertical scaling
- horizontal scaling / Horizontal scaling
score
- about / Default Apache Lucene scoring explained
score altering queries
- about / Query categorization, Score altering queries
score altering queries use cases
- newer books, favoring / Favoring newer books
- importance of books, decreasing with certain value / Decreasing importance of books with certain value
score_mode parameter
- values / Choosing the scoring mode
scoring
- about / Default Apache Lucene scoring explained
scripting, in full text context
- about / Scripting in full text context
- field-related information / Field-related information
- shard level information / Shard level information
- term level information / Term level information
- advanced term information / More advanced term information
scripting changes
- security issues / Security issues
- Groovy / Groovy – the new default scripting language
- MVEL language, removing / Removal of MVEL language
scripting changes, Elasticsearch versions
- about / Scripting changes between Elasticsearch versions
- scripting changes / Scripting changes
segment merging
- about / Segment merging under control
- merge policy, selecting / Choosing the right merge policy
- scheduling / Scheduling
segments merge
- about / Overall architecture
serial merge scheduler
- about / The serial merge scheduler
settings, HDFS repository
- uri / The HDFS repository
- path / The HDFS repository
- load_default / The HDFS repository
- conf_location / The HDFS repository
- chunk_size / The HDFS repository
- conf.<key> / The HDFS repository
- concurrent_streams / The HDFS repository
settings, memory store
- about / Additional properties
- cache.memory.direct / Additional properties
- cache.memory.small_buffer_size / Additional properties
- cache.memory.large_buffer_size / Additional properties
- cache.memory.small_cache_size / Additional properties
- cache.memory.large_cache_size / Additional properties
settings, S3 repository
- about / The S3 repository
- bucket / The S3 repository
- region / The S3 repository
- base_path / The S3 repository
- server_side_encryption / The S3 repository
- chunk_size / The S3 repository
- buffer_size / The S3 repository
- max_retries / The S3 repository
shard
- about / Shard
sharding
- about / Sharding and overallocation
shard query cache
- about / The shard query cache
- setting up / Setting up the shard query cache
significant terms aggregation
- about / Significant terms aggregation
- example / An example
- significant terms, selecting / Choosing significant terms
- multiple values analysis / Multiple values analysis
- using, on full text search fields / Significant terms aggregation and full text search fields
- additional configuration options / Additional configuration options
- limitations / There are limits
similarity models
- Okapi BM25 / Available similarity models
- divergence from randomness (DFR) / Available similarity models
- information-based model / Available similarity models
- LM Dirichlet / Available similarity models
- LM Jelinek Mercer / Available similarity models
- configuration / Similarity model configuration
- configuring / Configuring the chosen similarity model
- TF/IDF similarity, configuring / Configuring the TF/IDF similarity
- Okapi BM25 similarity, configuring / Configuring the Okapi BM25 similarity
- DFR similarity, configuring / Configuring the DFR similarity
- IB similarity, configuring / Configuring the IB similarity
- LM Dirichlet similarity, configuring / Configuring the LM Dirichlet similarity
- LM Jelinek Mercer similarity, configuring / Configuring the LM Jelinek Mercer similarity
similarity supporting queries
- about / Query categorization, Similarity supporting queries
similarity supporting queries use cases
- similar terms, searching / Finding terms similar to a given one
- documents with similar field values, searching / Finding documents with similar field values
simple filesystem store
- about / The simple filesystem store
single point of failure (SPOF) / Key concepts behind Elasticsearch architecture
smoothing models
- about / Configuring smoothing models
- configuring / Configuring smoothing models
- stupid backoff model / Configuring smoothing models
- Laplace smoothing model / Configuring smoothing models
- linear interpolation smoothing model / Configuring smoothing models
special characters
- handling / Handling special characters
split-brain / The master election configuration
SSD (solid state drives) / Performance considerations
startup process, Elasticsearch
- about / The startup process
store module
- about / Choosing the right directory implementation – the store module
store types
- about / The store type
- simple filesystem store / The simple filesystem store
- new I/O filesystem store / The new I/O filesystem store
- MMap filesystem store / The MMap filesystem store
- hybrid filesystem store / The hybrid filesystem store
- memory store / The memory store
- default store type / The default store type
structure aware queries
- about / Query categorization, Structure aware queries
structure aware queries use cases
- parent documents with nested document, returning / Returning parent documents having a certain nested document
- parent document score, affecting with nested document score / Affecting parent document score with the score of nested documents
stupid backoff smoothing model
- about / Configuring smoothing models
suggester
- _suggest REST endpoint / Using the _suggest REST endpoint
- REST endpoint suggester response / Understanding the REST endpoint suggester response
- suggestion requests, including in query / Including suggestion requests in query
- term suggester / The term suggester
- phrase suggester / The phrase suggester
- completion suggester / The completion suggester
suggesters
- about / Suggesters

T

term modifiers
- about / Term modifiers
term suggester
- configuration / Configuration
- common options / Common term suggester options
term vectors
- about / Term vectors
TF/IDF algorithm / Default Apache Lucene scoring explained
TF/IDF scoring formula
- about / TF/IDF scoring formula
- Lucene conceptual scoring formula / Lucene conceptual scoring formula
- Lucene practical scoring formula / Lucene practical scoring formula
TF/IDF similarity
- configuring / Configuring the TF/IDF similarity
tiered merge policy
- about / The tiered merge policy
- configuration options / The tiered merge policy
TokenFilter
- implementing / Implementing TokenFilter
TokenFilter factory
- implementing / Implementing the TokenFilter factory
total circuit breaker
- about / The total circuit breaker
total shards allowed per node
- defining / Defining total shards allowed per node
total shards allowed per physical server
- defining / Defining total shards allowed per physical server
- inclusion / Inclusion
- requirement / Requirement
- exclusion / Exclusion
- disk-based allocation / Disk-based allocation
transaction log
- about / The transaction log
- configuration / The transaction log configuration
tribe node / Federated search
- about / Node
- creating / Creating the tribe node
- unicast discovery, using / Using the unicast discovery for tribes
- data, reading with / Reading data with the tribe node
- master-level read operations / Master-level read operations
- data, writing with / Writing data with the tribe node
- master-level write operations / Master-level write operations

U

unicast Zen discovery configuration
- about / The unicast Zen discovery configuration
- discovery.zen.ping.unicast.hosts / The unicast Zen discovery configuration
use cases, queries
- about / The use cases
- example data / Example data
- basic queries use cases / Basic queries use cases
- compound queries use cases / Compound queries use cases
- not analyzed queries use cases / Not analyzed queries use cases
- full text search queries use cases / Full text search queries use cases
- pattern queries use cases / Pattern queries use cases
- similarity supporting queries use cases / Similarity supporting queries use cases
- score altering queries use cases / Score altering queries use cases
- structure aware queries use cases / Structure aware queries use cases
user spelling mistakes, correcting
- about / Correcting user spelling mistakes
- data, testing / Testing data
- technical details / Getting into technical details

V

vertical scaling
- about / Vertical scaling

W

write operations
- blocking / Blocking write operations

Y

YAML
- URL / Writing response
young generation heap space / Java memory

Z

Zen discovery
- about / Zen discovery
- multicast Zen discovery configuration / Multicast Zen discovery configuration
- unicast Zen discovery configuration / The unicast Zen discovery configuration
