Mastering Elasticsearch

Further your knowledge of the Elasticsearch server by learning more about its internals, querying, and data handling

Product type: Paperback
Published in: Feb 2015
ISBN-13: 9781783553792
Length: 434 pages
Edition: 2nd Edition
Author (1): Marek Rogozinski
Table of Contents

Mastering Elasticsearch Second Edition
Credits
About the Author
Acknowledgments
About the Author
Acknowledgments
About the Reviewers
www.PacktPub.com
Preface
1. Introduction to Elasticsearch
2. Power User Query DSL
3. Not Only Full Text Search
4. Improving the User Search Experience
5. The Index Distribution Architecture
6. Low-level Index Control
7. Elasticsearch Administration
8. Improving Performance
9. Developing Elasticsearch Plugins
Index

Index

A

  • additional configuration options, significant terms aggregation
    • about / Additional configuration options
    • background set filtering / Background set filtering
    • minimum document count / Minimum document count
    • execution hint / Execution hint
    • more options / More options
  • additional term suggester options
    • about / Additional term suggester options
    • lowercase_terms / Additional term suggester options
    • max_edits / Additional term suggester options
    • prefix_length / Additional term suggester options
    • min_word_length / Additional term suggester options
    • shard_size / Additional term suggester options
    • max_inspections / Additional term suggester options
    • min_doc_freq / Additional term suggester options
    • max_term_freq / Additional term suggester options
    • accuracy / Additional term suggester options
    • string_distance / Additional term suggester options
  • additive smoothing
    • URL / Configuring smoothing models
  • advices, for high query rate scenarios
    • about / Advices for high query rate scenarios
    • filter caches / Filter caches and shard query caches
    • shard query caches / Filter caches and shard query caches
    • think about optimal query structure / Think about the queries
    • routing, using / Using routing
    • queries, parallelizing / Parallelize your queries
    • field data cache / Field data cache and breaking the circuit
    • circuit, breaking / Field data cache and breaking the circuit
    • size and shard_size properties, controlling / Keeping size and shard_size under control
  • allocation awareness
    • about / Allocation awareness
    • forcing / Forcing allocation awareness
  • Amazon EC2 discovery
    • about / The Amazon EC2 discovery
    • plugin installation / The EC2 plugin installation
    • generic configuration / The EC2 plugin's generic configuration
    • optional configuration options / Optional EC2 discovery configuration options
    • nodes scanning configuration / The EC2 nodes scanning configuration
  • analysis
    • about / Analyzing your data
  • analysis binder
    • implementing / Implementing the analysis binder
  • analyzer indices component
    • implementing / Implementing the analyzer indices component
  • analyzer module
    • implementing / Implementing the analyzer module
  • analyzer plugin
    • implementing / Implementing the analyzer plugin
  • analyzer provider
    • implementing / Implementing the analyzer provider
  • AND operator
    • about / Understanding the basics
  • Apache Lucene
    • about / Introducing Apache Lucene
    • architecture / Overall architecture
    • inverted index / Overall architecture
    • data analysis / Analyzing your data
  • Apache Lucene scoring
    • altering / Altering Apache Lucene scoring
    • similarity models / Available similarity models
    • per-field similarity, setting / Setting a per-field similarity
    • similarity model configuration / Similarity model configuration
    • default similarity model, selecting / Choosing the default similarity model
  • Apache Lucene scoring mechanism
    • about / Default Apache Lucene scoring explained
    • document, returning by Lucene / When a document is matched
    • TF/IDF scoring formula / TF/IDF scoring formula
    • Elasticsearch point of view / Elasticsearch point of view
    • example / An example
  • Apache Maven
    • URL / Creating the Apache Maven project structure
  • Apache Maven project structure
    • creating / Creating the Apache Maven project structure
    • basics / Understanding the basics
    • about / The structure of the Maven Java project
    • pom.xml / The idea of POM
    • build process, running / Running the build process
    • Maven assembly plugin / Introducing the assembly Maven plugin
    • assembly Maven plugin / Introducing the assembly Maven plugin
  • Application Program Interface (API) / Communicating with Elasticsearch
  • architecture, Apache Lucene
    • about / Overall architecture
    • document / Overall architecture
    • field / Overall architecture
    • term / Overall architecture
    • token / Overall architecture
  • architecture, Elasticsearch
    • features / Key concepts behind Elasticsearch architecture
  • ASCII folding filter
    • about / Analyzing your data
  • Azure repository
    • about / The Azure repository

B

  • background set / Choosing significant terms
  • backups
    • about / Backing up
    • performing / Backing up
    • saving, in cloud / Saving backups in the cloud
    • S3 repository / The S3 repository
    • HDFS repository / The HDFS repository
    • Azure repository / The Azure repository
  • basic concepts, Elasticsearch
    • index / Index
    • document / Document
    • type / Type
    • mapping / Mapping
    • node / Node
    • cluster / Cluster
    • shard / Shard
    • replica / Replica
  • basic options, phrase suggester
    • highlight / Basic configuration
    • gram_size / Basic configuration
    • confidence / Basic configuration
    • max_errors / Basic configuration
    • separator / Basic configuration
    • force_unigrams / Basic configuration
    • token_limit / Basic configuration
    • collate / Basic configuration
    • real_word_error_likelihood / Basic configuration
  • basic queries
    • about / Query categorization, Basic queries
    • examples / Basic queries
  • basic queries use cases
    • values, searching in range / Searching for values in range
    • simplified query, for multiple terms / Simplified query for multiple terms
    • lower scoring partial queries, ignoring / Ignoring lower scoring partial queries
  • benchmarking queries
    • about / Benchmarking queries
    • cluster configuration, preparing / Preparing your cluster configuration for benchmarking
    • benchmarks, running / Running benchmarks
    • benchmarks, controlling / Controlling currently run benchmarks
  • best fields matching
    • about / Best fields matching
  • Boolean operators
    • AND / Understanding the basics
    • OR / Understanding the basics
    • NOT / Understanding the basics
    • + / Understanding the basics
    • - / Understanding the basics
  • budget / The tiered merge policy
  • byte code
    • about / Knowing about garbage collector

C

  • caches
    • clearing / Clearing the caches
    • all caches, clearing / Index, indices, and all caches clearing
    • specific caches, clearing / Clearing specific caches
  • candidate generators
    • configuring / Configuring candidate generators
    • about / Configuring candidate generators
  • Cat API
    • about / The human-friendly status API – using the Cat API
    • basics / The basics
    • using / Using the Cat API
    • common arguments / Common arguments
    • examples / The examples
  • circuit breakers
    • using / Using circuit breakers
    • field data circuit breaker / The field data circuit breaker
    • request circuit breaker / The request circuit breaker
    • total circuit breaker / The total circuit breaker
  • class custom analyzer
    • implementing / Implementing the class custom analyzer
  • cluster
    • about / Cluster
  • cluster-level recovery configuration
    • about / Cluster-level recovery configuration
    • indices.recovery.concurrent_streams / Cluster-level recovery configuration
    • indices.recovery.max_bytes_per_sec / Cluster-level recovery configuration
    • indices.recovery.compress / Cluster-level recovery configuration
    • indices.recovery.file_chunk_size / Cluster-level recovery configuration
    • indices.recovery.translog_ops / Cluster-level recovery configuration
    • indices.recovery.translog_size / Cluster-level recovery configuration
  • common term suggester options
    • about / Common term suggester options
    • text / Common term suggester options
    • field / Common term suggester options
    • analyzer / Common term suggester options
    • size / Common term suggester options
    • sort / Common term suggester options
    • suggest_mode / Common term suggester options
  • communication, Elasticsearch
    • about / Communicating with Elasticsearch
    • data indexing / Indexing data
    • data querying / Querying data
  • completion suggester
    • about / The completion suggester
    • logic / The logic behind the completion suggester
    • using / Using the completion suggester
    • data indexing / Indexing data
    • data, querying / Querying data
    • custom weights / Custom weights
    • additional parameters / Additional parameters
  • compound queries
    • about / Query categorization, Compound queries
    • examples / Compound queries
  • compound queries use cases
    • matched documents, boosting / Boosting some of the matched documents
  • concurrent merge scheduler
    • about / The concurrent merge scheduler
  • configuration options, log byte size merge policy
    • merge_factor / The log byte size merge policy
    • min_merge_size / The log byte size merge policy
    • max_merge_size / The log byte size merge policy
    • maxMergeDocs / The log byte size merge policy
    • calibrate_size_by_deletes / The log byte size merge policy
  • configuration options, log doc merge policy
    • merge_factor / The log doc merge policy
    • min_merge_docs / The log doc merge policy
    • max_merge_docs / The log doc merge policy
    • calibrate_size_by_deletes / The log doc merge policy
  • configuration options, tiered merge policy
    • index.merge.policy.expunge_deletes_allowed / The tiered merge policy
    • index.merge.policy.floor_segment / The tiered merge policy
    • index.merge.policy.max_merge_at_once / The tiered merge policy
    • index.merge.policy.max_merge_at_once_explicit / The tiered merge policy
    • index.merge.policy.max_merged_segment / The tiered merge policy
    • index.merge.policy.segments_per_tier / The tiered merge policy
    • index.reclaim_deletes_weight / The tiered merge policy
    • index.compound_format / The tiered merge policy
  • cross fields matching
    • about / Cross fields matching
  • curl tool
    • URL / Indexing data
  • custom analysis plugin
    • creating / Creating the custom analysis plugin
    • implementation details / Implementation details
    • TokenFilter, implementing / Implementing TokenFilter
    • TokenFilter factory, implementing / Implementing the TokenFilter factory
    • class custom analyzer, implementing / Implementing the class custom analyzer
    • analyzer provider, implementing / Implementing the analyzer provider
    • analysis binder, implementing / Implementing the analysis binder
    • analyzer indices component, implementing / Implementing the analyzer indices component
    • analyzer module, implementing / Implementing the analyzer module
    • analyzer plugin, implementing / Implementing the analyzer plugin
    • testing / Testing our custom analysis plugin
    • building / Building our custom analysis plugin
    • installing / Installing the custom analysis plugin
    • checking / Checking whether our analysis plugin works
  • custom REST action
    • creating / Creating custom REST action
    • assumptions / The assumptions
    • implementation / Implementation details

D

  • data-only nodes
    • configuring / Configuring data-only nodes
  • data analysis
    • about / Analyzing your data
    • indexing / Indexing and querying
    • querying / Indexing and querying
  • data field caches
    • issues / The problem with field data cache
  • data node
    • about / Node
    • configuring / Configuring master and data nodes
  • data nodes
    • about / Data nodes
  • default shard allocation behavior
    • altering / Altering the default shard allocation behavior
    • allocation awareness / Allocation awareness
    • filtering / Filtering
    • runtime allocation, updating / Runtime allocation updating
    • total shards allowed per node, defining / Defining total shards allowed per node
    • total shards allowed per physical server, defining / Defining total shards allowed per physical server
  • default similarity model
    • selecting / Choosing the default similarity model
  • default store type
    • about / The default store type
    • for Elasticsearch 1.3.0 / The default store type for Elasticsearch 1.3.0 and higher
    • for Elasticsearch versions older than 1.3.0 / The default store type for Elasticsearch versions older than 1.3.0
  • desired merge scheduler
    • setting / Setting the desired merge scheduler
  • DFR similarity
    • configuring / Configuring the DFR similarity
  • direct generators
    • about / Configuring candidate generators
    • configuring / Configuring direct generators
  • discovery module
    • about / Discovery and recovery modules
    • configuration / Discovery configuration
    • Zen discovery configuration / Zen discovery
  • divergence from randomness similarity model
    • about / Available similarity models
  • document
    • about / Document
  • documents
    • relations / Relations between documents
  • documents grouping
    • about / Documents grouping
    • top hits aggregation / Top hits aggregation
    • example / An example
    • additional parameters / Additional parameters
  • document types
    • about / Type
  • doc values / Doc values
    • used, for optimizing queries / Using doc values to optimize your queries
    • example, of usage / The example of doc values usage

E

  • EC2 discovery configuration options
    • cloud.aws.region / Optional EC2 discovery configuration options
    • cloud.aws.ec2.endpoint / Optional EC2 discovery configuration options
    • cloud.aws.protocol / Optional EC2 discovery configuration options
    • cloud.aws.proxy_host / Optional EC2 discovery configuration options
    • cloud.aws.proxy_port / Optional EC2 discovery configuration options
    • discovery.ec2.ping_timeout / Optional EC2 discovery configuration options
  • EC2 nodes scanning configuration
    • discovery.ec2.host_type / The EC2 nodes scanning configuration
    • discovery.ec2.groups / The EC2 nodes scanning configuration
    • discovery.ec2.availability_zones / The EC2 nodes scanning configuration
    • discovery.ec2.any_group / The EC2 nodes scanning configuration
    • discovery.ec2.tag / The EC2 nodes scanning configuration
  • EC2 plugin's generic configuration
    • cloud.aws.access_key / The EC2 plugin's generic configuration
    • cloud.aws.secret_key / The EC2 plugin's generic configuration
  • Elasticsearch
    • about / Introducing Elasticsearch
    • basic concepts / Basic concepts
    • key concepts / Key concepts behind Elasticsearch architecture
    • workings / Workings of Elasticsearch
    • startup process / The startup process
    • failure detection / Failure detection
    • communicating with / Communicating with Elasticsearch
    • query rewrite / Query rewrite explained
    • filters / Handling filters and why it matters
    • scaling / Scaling Elasticsearch
    • informing, about REST action / Informing Elasticsearch about our REST action
    • informing, about custom analyzer / Informing Elasticsearch about our custom analyzer
  • Elasticsearch, using for high load scenarios
    • about / Using Elasticsearch for high load scenarios
    • general Elasticsearch-tuning advices / General Elasticsearch-tuning advices
    • advices, for high query rate scenarios / Advices for high query rate scenarios
    • high indexing throughput scenarios / High indexing throughput scenarios and Elasticsearch
  • Elasticsearch Azure plugin, settings
    • container / The Azure repository
    • base_path / The Azure repository
    • chunk_size / The Azure repository
  • Elasticsearch caching
    • about / Understanding Elasticsearch caching
    • filter cache / The filter cache
    • field data cache / The field data cache
    • shard query cache / The shard query cache
    • circuit breakers, using / Using circuit breakers
    • caches, clearing / Clearing the caches
    • all caches, clearing / Index, indices, and all caches clearing
  • examples, Cat API
    • master node information, obtaining / Getting information about the master node
    • node information, obtaining / Getting information about the nodes
  • exclude parameter / What include, exclude, and require mean
  • expectations on nodes, gateway module
    • gateway.expected_nodes / Expectations on nodes
    • gateway.expected_data_nodes / Expectations on nodes
    • gateway.expected_master_nodes / Expectations on nodes

F

  • factors, for calculating score property of document
    • document boost / When a document is matched
    • filter boost / When a document is matched
    • coordination factor / When a document is matched
    • inverse document frequency / When a document is matched
    • length norm / When a document is matched
    • term frequency / When a document is matched
    • query norm / When a document is matched
  • failure detection, Elasticsearch
    • about / Failure detection
  • federated search
    • about / Federated search
    • test clusters / The test clusters
    • tribe node, creating / Creating the tribe node
    • indices conflicts, handling / Handling indices conflicts
    • write operation, blocking / Blocking write operations
  • field data cache
    • about / The field data cache
    • doc values / Field data or doc values
    • field data / Field data or doc values
    • node-level field data cache configuration / Node-level field data cache configuration
    • index-level field data cache configuration / Index-level field data cache configuration
    • field data formats / Field data formats
    • field data loading / Field data loading
  • field data cache filtering
    • about / The field data cache filtering
    • information, adding / Adding field data filtering information
    • filtering, by term frequency / Filtering by term frequency
    • filtering, by regex / Filtering by regex
    • filtering, by regex and term frequency / Filtering by regex and term frequency
    • example / The filtering example
  • field data circuit breaker
    • about / The field data circuit breaker
  • field data formats
    • about / Field data formats
    • string-based fields / String-based fields
    • numeric-based fields / Numeric fields
    • geographical-based fields / Geographical-based fields
  • fields
    • querying / Querying fields
  • filter cache
    • about / The filter cache
    • types / Filter cache types
    • node-level filter cache configuration / Node-level filter cache configuration
    • index-level filter cache configuration / Index-level filter cache configuration
  • filters
    • about / Handling filters and why it matters
    • comparing, with query / Filters and query relevance
    • working / How filters work
    • bool filter / Bool or and/or/not filters
    • and filter / Bool or and/or/not filters
    • or filter / Bool or and/or/not filters
    • not filter / Bool or and/or/not filters
    • performance considerations / Performance considerations
    • post filtering / Post filtering and filtered query
    • filtered query / Post filtering and filtered query
    • filtering method, selecting / Choosing the right filtering method
  • flushing
    • about / The transaction log
  • foreground set / Choosing significant terms
  • full text search queries
    • about / Query categorization, Full text search queries
    • examples / Full text search queries
  • full text search queries use cases
    • Lucene query syntax, using / Using Lucene query syntax in queries
    • user queries, handling without errors / Handling user queries without errors

G

  • garbage collection problems
    • dealing with / Dealing with garbage collection problems
  • garbage collector
    • about / Knowing about garbage collector, More information on the garbage collector work
    • Java memory / Java memory
    • life cycle / The life cycle of Java objects and garbage collections
    • collection problems, dealing with / Dealing with garbage collection problems
    • logging, turning on / Turning on logging of garbage collection work
    • JStat, using / Using JStat
    • memory dumps, creating / Creating memory dumps
    • adjusting / Adjusting the garbage collector work in Elasticsearch
    • standard start up script, using / Using a standard start up script
    • service wrapper / Service wrapper
    • swapping on Unix-like systems, avoiding / Avoid swapping on Unix-like systems
  • gateway configuration properties
    • gateway.recover_after_nodes / Configuration properties
    • gateway.recover_after_data_nodes / Configuration properties
    • gateway.recover_after_master_nodes / Configuration properties
    • gateway.recover_after_time / Configuration properties
  • gateway module
    • about / The gateway and recovery configuration
    • gateway recovery process / The gateway recovery process
    • configuration properties / Configuration properties
    • expectations on nodes / Expectations on nodes
    • local gateway / The local gateway
    • low-level recovery configuration / Low-level recovery configuration
  • general Elasticsearch-tuning advices
    • store, selecting / Choosing the right store
    • index refresh rate / The index refresh rate
    • thread pools tuning / Thread pools tuning
    • merge process, adjusting / Adjusting the merge process
    • data distribution / Data distribution
  • global options, _bench REST endpoint
    • name / Running benchmarks
    • competitors / Running benchmarks
    • num_executor_nodes / Running benchmarks
    • percentiles / Running benchmarks
    • iteration / Running benchmarks
    • concurrency / Running benchmarks
    • multiplier / Running benchmarks
    • warmup / Running benchmarks
    • clear_caches / Running benchmarks
  • Groovy
    • about / Short Groovy introduction
    • using, as scripting language / Using Groovy as your scripting language
    • variable, defining in scripts / Variable definition in scripts
    • conditional statements / Conditionals
    • loops / Loops
    • example / An example

H

  • HDFS repository
    • about / The HDFS repository
    • settings / The HDFS repository
  • high indexing throughput scenarios
    • about / High indexing throughput scenarios and Elasticsearch
    • bulk indexing / Bulk indexing
    • doc values, versus indexing speed / Doc values versus indexing speed
    • document fields, controlling / Keep your document fields under control
    • index architecture / The index architecture and replication
    • replication / The index architecture and replication
    • write-ahead log, tuning / Tuning write-ahead log
    • storage type / Think about storage
    • RAM buffer, for indexing / RAM buffer for indexing
  • horizontal scaling
    • about / Horizontal scaling
    • replicas, creating automatically / Automatically creating replicas
    • redundancy / Redundancy and high availability
    • high availability / Redundancy and high availability
    • cost / Cost and performance flexibility
    • performance flexibility / Cost and performance flexibility
    • continuous upgrades / Continuous upgrades
    • multiple Elasticsearch instances, on single physical machine / Multiple Elasticsearch instances on a single physical machine
    • nodes' roles, for larger clusters / Designated nodes' roles for larger clusters
  • Hot Threads API
    • about / Very hot threads
    • threads parameter / Very hot threads
    • interval parameter / Very hot threads
    • type parameter / Very hot threads
    • snapshots parameter / Very hot threads
    • usage clarification / Usage clarification for the Hot Threads API
    • response / The Hot Threads API response
  • human-friendly status API
    • Cat API / The human-friendly status API – using the Cat API
  • hybrid filesystem store
    • about / The hybrid filesystem store

I

  • I/O throttling
    • about / When it is too much for I/O – throttling explained
    • controlling / Controlling I/O throttling
  • I/O throttling configuration
    • about / Configuration
    • throttling type, configuring / The throttling type
    • maximum throughput per second / Maximum throughput per second
    • node throttling defaults / Node throttling defaults
    • performance considerations / Performance considerations
    • example / The configuration example
  • IB similarity
    • configuring / Configuring the IB similarity
  • implementation, custom analysis plugin
    • TokenFilter class extension / Implementation details
    • AbstractTokenFilterFactory extension / Implementation details
    • custom analyzer / Implementation details
    • analyzer provider / Implementation details
    • AnalysisModule.AnalysisBinderProcessor / Implementation details
    • AbstractComponent class / Implementation details
    • AbstractModule extension / Implementation details
    • AbstractPlugin extension / Implementation details
  • implementation, custom REST action
    • about / Implementation details
    • REST action class, using / Using the REST action class
    • plugin class / The plugin class
    • Elasticsearch, informing / Informing Elasticsearch about our REST action
  • include parameter
    • about / What include, exclude, and require mean
  • index
    • about / Index
    • updating / Updating the index and committing changes
    • changes, committing / Updating the index and committing changes
    • default refresh time, changing / Changing the default refresh time
    • transaction log / The transaction log
  • index-level filter cache configuration
    • about / Index-level filter cache configuration
    • index.cache.filter.type / Index-level filter cache configuration
    • index.cache.filter.max_size / Index-level filter cache configuration
    • index.cache.filter.expire / Index-level filter cache configuration
  • index-level recovery settings
    • about / Index-level recovery settings
    • quorum / Index-level recovery settings
    • quorum-1 / Index-level recovery settings
    • full / Index-level recovery settings
    • full-1 / Index-level recovery settings
    • integer value / Index-level recovery settings
  • index distribution architecture
    • right amount of shards and replicas, selecting / Choosing the right amount of shards and replicas
    • sharding / Sharding and overallocation
    • over allocation / Sharding and overallocation
    • example, over allocation / A positive example of overallocation
    • multiple shards, versus multiple indices / Multiple shards versus multiple indices
    • replicas / Replicas
  • indexing
    • altering / NRT, flush, refresh, and transaction log
  • indices conflicts
    • handling / Handling indices conflicts
  • indices recovery API
    • about / The indices recovery API
  • inverted index
    • about / Overall architecture

J

  • Java memory
    • about / Java memory
    • eden space / Java memory
    • survivor space / Java memory
    • tenured generation / Java memory
    • permanent generation / Java memory
    • code cache / Java memory
  • Java objects
    • life cycle / The life cycle of Java objects and garbage collections
  • Java service wrapper
    • URL / Service wrapper
  • Java Virtual Machine (JVM) / Communicating with Elasticsearch
  • JSON document
    • URL / Communicating with Elasticsearch

L

  • Laplace smoothing model
    • about / Configuring smoothing models
  • Least Recently Used cache type (LRU) / Node-level filter cache configuration
  • limitations, significant terms aggregation
    • about / There are limits
    • memory consumption / Memory consumption
    • avoiding, as top level aggregation / Shouldn't be used as top-level aggregation
    • approximated counts / Counts are approximated
    • floating point fields, avoiding / Floating point fields are not allowed
  • linear interpolation smoothing model
    • about / Configuring smoothing models
  • LM Dirichlet similarity
    • configuring / Configuring the LM Dirichlet similarity
  • LM Jelinek Mercer similarity
    • configuring / Configuring the LM Jelinek Mercer similarity
  • log byte size merge policy
    • about / The log byte size merge policy
    • configuration options / The log byte size merge policy
  • log doc merge policy
    • about / The log doc merge policy
    • configuration options / The log doc merge policy
  • low-level recovery configuration
    • about / Low-level recovery configuration
    • cluster-level recovery configuration / Cluster-level recovery configuration
    • index-level recovery settings / Index-level recovery settings
  • lowercase filter
    • about / Analyzing your data
  • Lucene analyzer
    • about / Analyzing your data
  • Lucene expressions
    • about / Lucene expressions explained
    • basics / The basics
    • example / An example
  • Lucene index
    • about / Getting deeper into Lucene index
    • norm / Norms
    • term vectors / Term vectors
    • posting formats / Posting formats
    • doc values / Doc values
  • Lucene query language
    • about / Lucene query language
    • basics / Understanding the basics
    • Boolean operators / Understanding the basics
    • fields, querying / Querying fields
    • term modifiers / Term modifiers
    • special characters, handling / Handling special characters

M

  • mapping
    • about / Mapping
  • master-only nodes
    • configuring / Configuring master-only nodes
  • master election
    • about / Master node
    • configuration / The master election configuration
    • Zen discovery fault detection / Zen discovery fault detection and configuration
    • Zen discovery configuration / Zen discovery fault detection and configuration
  • master eligible nodes
    • about / Master eligible nodes
  • master node
    • about / Node, Master node
    • configuring / Configuring master and data nodes
    • Amazon EC2 discovery / The Amazon EC2 discovery
    • discovery implementations / Other discovery implementations
  • Maven Assembly plugin
    • about / Introducing the assembly Maven plugin
    • using / Introducing the assembly Maven plugin
    • URL / Introducing the assembly Maven plugin
  • memory store
    • about / The memory store
    • properties / Additional properties
  • merge
    • tiered merge policy / The tiered merge policy
  • merge policy
    • selecting / Choosing the right merge policy
    • log byte size merge policy / The log byte size merge policy
    • log doc merge policy / The log doc merge policy
  • merge schedulers
    • about / Scheduling
    • concurrent merge scheduler / The concurrent merge scheduler
    • serial merge scheduler / The serial merge scheduler
    • desired merge scheduler, selecting / Setting the desired merge scheduler
  • MMap filesystem store
    • about / The MMap filesystem store
  • most fields matching
    • about / Most fields matching
  • multicast Zen discovery configuration
    • about / Multicast Zen discovery configuration
    • discovery.zen.ping.multicast.address / Multicast Zen discovery configuration
    • discovery.zen.ping.multicast.port / Multicast Zen discovery configuration
    • discovery.zen.ping.multicast.group / Multicast Zen discovery configuration
    • discovery.zen.ping.multicast.buffer_size / Multicast Zen discovery configuration
    • discovery.zen.ping.multicast.ttl / Multicast Zen discovery configuration
    • discovery.zen.ping.multicast.enabled / Multicast Zen discovery configuration
    • discovery.zen.ping.unicast.concurrent_connects / The unicast Zen discovery configuration
  • multimatch
    • controlling / Controlling multimatching
    • types / Multimatch types
    • best fields matching / Best fields matching
    • cross fields matching / Cross fields matching
    • most fields matching / Most fields matching
    • phrase matching / Phrase matching
    • phrase with prefixes matching / Phrase with prefixes matching
  • multiple Elasticsearch instances, on single physical machine
    • about / Multiple Elasticsearch instances on a single physical machine
    • shard and its replicas, preventing from being on same node / Preventing the shard and its replicas from being on the same node
  • multiple language stemming filters
    • about / Analyzing your data
  • multiple shards
    • versus multiple indices / Multiple shards versus multiple indices
  • multi_match query / Controlling multimatching
  • Mustache template engine
    • about / The Mustache template engine
    • URL / The Mustache template engine

N

  • N-gram smoothing models
    • URL / Configuring smoothing models
  • near real-time GET
    • about / Near real-time GET
  • nested documents
    • about / The nested documents
  • new I/O filesystem store
    • about / The new I/O filesystem store
  • node
    • about / Node
    • data node / Node
    • master node / Node
    • tribe node / Node
  • node-level filter cache configuration
    • about / Node-level filter cache configuration
  • nodes' roles
    • about / Preventing the shard and its replicas from being on the same node
    • master eligible node / Designated nodes' roles for larger clusters
    • data node / Designated nodes' roles for larger clusters
    • query aggregator node / Designated nodes' roles for larger clusters
  • norms
    • about / Norms
  • not analyzed queries
    • about / Query categorization, Not analyzed queries
    • examples / Not analyzed queries
  • not analyzed queries use cases
    • results, limiting to given tags / Limiting results to given tags
    • efficient query time stopwords handling / Efficient query time stopwords handling
  • NOT operator / Understanding the basics

O

  • object type
    • about / The object type
  • Okapi BM25 similarity
    • configuring / Configuring the Okapi BM25 similarity
  • Okapi BM25 similarity model
    • about / Available similarity models
  • old generation / Java memory
  • online book store
    • implementing / The story
  • options array, properties
    • text / Understanding the REST endpoint suggester response
    • score / Understanding the REST endpoint suggester response
    • freq / Understanding the REST endpoint suggester response
  • OR operator
    • about / Understanding the basics
  • overallocation
    • about / Sharding and overallocation
    • example / A positive example of overallocation

P

  • parameters, for transaction log configuration
    • about / The transaction log configuration
    • index.translog.flush_threshold_period / The transaction log configuration
    • index.translog.flush_threshold_ops / The transaction log configuration
    • index.translog.flush_threshold_size / The transaction log configuration
    • index.translog.interval / The transaction log configuration
    • index.gateway.local.sync / The transaction log configuration
    • index.translog.disable_flush / The transaction log configuration
  • parent-child relationship
    • about / Parent–child relationship
    • in cluster / Parent–child relationship in the cluster
  • pattern queries
    • about / Query categorization, Pattern queries
  • pattern queries use cases
    • autocomplete functionality, using prefixes / Autocomplete using prefixes
    • pattern matching / Pattern matching
    • matching phrases / Matching phrases
    • spans / Spans, spans everywhere
  • per-field similarity
    • setting / Setting a per-field similarity
  • phrase matching
    • about / Phrase matching
  • phrase suggester
    • about / The phrase suggester
    • usage example / Usage example
    • configuration / Configuration
    • basic configuration / Basic configuration
    • basic options / Basic configuration
    • smoothing models, configuring / Configuring smoothing models
    • candidate generators, configuring / Configuring candidate generators
    • direct generators, configuring / Configuring direct generators
  • phrase with prefixes matching
    • about / Phrase with prefixes matching
  • plugin class, custom REST action
    • about / The plugin class
    • constructor / The plugin class
    • onModule method / The plugin class
    • name method / The plugin class
    • description method / The plugin class
  • position aware queries
    • about / Query categorization, Position aware queries
  • posting formats / Posting formats
  • preference parameter
    • about / Introducing the preference parameter
    • _primary property / Introducing the preference parameter
    • _primary_first property / Introducing the preference parameter
    • _local property / Introducing the preference parameter
    • _only_node:wJq0kPSHTHCovjuCsVK0-A property / Introducing the preference parameter
    • _prefer_node:wJq0kPSHTHCovjuCsVK0-A property / Introducing the preference parameter
    • _shards:0,1 property / Introducing the preference parameter
    • custom, string value property / Introducing the preference parameter

Q

  • query aggregator nodes
    • about / Query aggregator nodes
  • Query API
    • about / Querying data
  • query categorization
    • about / Query categorization
    • basic queries / Query categorization, Basic queries
    • compound queries / Query categorization, Compound queries
    • not analyzed queries / Query categorization
    • full text search queries / Query categorization, Full text search queries
    • pattern queries / Query categorization, Pattern queries
    • similarity supporting queries / Query categorization, Similarity supporting queries
    • score altering queries / Query categorization, Score altering queries
    • position aware queries / Query categorization, Position aware queries
    • structure aware queries / Query categorization, Structure aware queries
  • Query DSL
    • about / Choosing the right query for the job
  • query execution preference
    • about / Query execution preference
    • preference parameter / Introducing the preference parameter
  • query processing-only nodes
    • configuring / Configuring the query processing-only nodes
  • query relevance improvement
    • about / Improving the query relevance
    • data / Data
    • quest / The quest for relevance improvement
    • standard query / The standard query
    • multi match query / The multi match query
    • phrases / Phrases comes into play
    • garbage, removing / Let's throw the garbage away
    • phrase queries, boosting / Now, we boost
    • misspelling-proof search, making / Performing a misspelling-proof search
    • faceting / Drill downs with faceting
  • query rescoring
    • about / Query rescoring, What is query rescoring?
    • example query / An example query
    • structure, rescore query / Structure of the rescore query
    • rescore parameters / Rescore parameters
    • scoring mode, selecting / Choosing the scoring mode
  • query rewrite
    • about / Query rewrite explained
    • working / Query rewrite explained
    • prefix query example / Prefix query as an example
    • Apache Lucene / Getting back to Apache Lucene
    • properties / Query rewrite properties
  • query templates
    • about / Query templates, Introducing query templates
    • providing, as string value / Templates as strings
    • Mustache template engine / The Mustache template engine
    • conditional expressions / Conditional expressions
    • loops / Loops
    • default values / Default values
    • storing, in files / Storing templates in files

R

  • real-time GET operation
    • about / Near real-time GET
  • recovery module
    • about / Discovery and recovery modules
  • relations, between documents
    • about / Relations between documents
    • object type / The object type
    • nested documents / The nested documents
    • parent-child relationship / Parent–child relationship
    • alternatives / A few words about alternatives
  • replica
    • about / Replica
  • replicas
    • about / Replicas
  • repository
    • about / Saving backups in the cloud
  • request circuit breaker
    • about / The request circuit breaker
  • require parameter / What include, exclude, and require mean
  • rescore parameters
    • window_size / Rescore parameters
    • query_weight / Rescore parameters
    • rescore_query_weight / Rescore parameters
  • REST action class
    • using / Using the REST action class
    • constructor / The constructor
    • requests, handling / Handling requests
    • response, writing / Writing response
  • REST action plugin
    • building / Building the REST action plugin
    • installing / Installing the REST action plugin
    • checking / Checking whether the REST action plugin works
  • rewrite property
    • about / Query rewrite properties
    • scoring_boolean / Query rewrite properties
    • constant_score_boolean / Query rewrite properties
    • constant_score_filter / Query rewrite properties
    • top_terms_N / Query rewrite properties
    • top_terms_boost_N / Query rewrite properties
  • routing
    • about / Routing explained
    • shards and data / Shards and data
    • testing / Let's test routing
    • indexing with / Indexing with routing
    • implementing / Routing in practice
    • querying / Querying
    • aliases / Aliases
    • multiple routing values / Multiple routing values
  • runtime allocation
    • updating / Runtime allocation updating
    • index level updates / Index level updates
    • cluster level updates / Cluster level updates

S

  • S3 repository
    • about / The S3 repository
    • creating / The S3 repository
  • scaling
    • about / Scaling Elasticsearch
    • vertical scaling / Vertical scaling
    • horizontal scaling / Horizontal scaling
  • score
    • about / Default Apache Lucene scoring explained
  • score altering queries
    • about / Query categorization, Score altering queries
  • score altering queries use cases
    • newer books, favoring / Favoring newer books
    • importance of books, decreasing with certain value / Decreasing importance of books with certain value
  • score_mode parameter
    • values / Choosing the scoring mode
  • scoring
    • about / Default Apache Lucene scoring explained
  • scripting, in full text context
    • about / Scripting in full text context
    • field-related information / Field-related information
    • shard level information / Shard level information
    • term level information / Term level information
    • advanced term information / More advanced term information
  • scripting changes
    • security issues / Security issues
    • Groovy / Groovy – the new default scripting language
    • MVEL language, removing / Removal of MVEL language
  • scripting changes, Elasticsearch versions
    • about / Scripting changes between Elasticsearch versions
    • scripting changes / Scripting changes
  • segment merging
    • about / Segment merging under control
    • merge policy, selecting / Choosing the right merge policy
    • scheduling / Scheduling
  • segments merge
    • about / Overall architecture
  • serial merge scheduler
    • about / The serial merge scheduler
  • settings, HDFS repository
    • uri / The HDFS repository
    • path / The HDFS repository
    • load_default / The HDFS repository
    • conf_location / The HDFS repository
    • chunk_size / The HDFS repository
    • conf.<key> / The HDFS repository
    • concurrent_streams / The HDFS repository
  • settings, memory store
    • about / Additional properties
    • cache.memory.direct / Additional properties
    • cache.memory.small_buffer_size / Additional properties
    • cache.memory.large_buffer_size / Additional properties
    • cache.memory.small_cache_size / Additional properties
    • cache.memory.large_cache_size / Additional properties
  • settings, S3 repository
    • about / The S3 repository
    • bucket / The S3 repository
    • region / The S3 repository
    • base_path / The S3 repository
    • server_side_encryption / The S3 repository
    • chunk_size / The S3 repository
    • buffer_size / The S3 repository
    • max_retries / The S3 repository
  • shard
    • about / Shard
  • sharding
    • about / Sharding and overallocation
  • shard query cache
    • about / The shard query cache
    • setting up / Setting up the shard query cache
  • significant terms aggregation
    • about / Significant terms aggregation
    • example / An example
    • significant terms, selecting / Choosing significant terms
    • multiple values analysis / Multiple values analysis
    • using, on full text search fields / Significant terms aggregation and full text search fields
    • additional configuration options / Additional configuration options
    • limitations / There are limits
  • similarity models
    • Okapi BM25 / Available similarity models
    • divergence from randomness (DFR) / Available similarity models
    • information-based model / Available similarity models
    • LM Dirichlet / Available similarity models
    • LM Jelinek Mercer / Available similarity models
    • configuration / Similarity model configuration
    • configuring / Configuring the chosen similarity model
    • TF/IDF similarity, configuring / Configuring the TF/IDF similarity
    • Okapi BM25 similarity, configuring / Configuring the Okapi BM25 similarity
    • DFR similarity, configuring / Configuring the DFR similarity
    • IB similarity, configuring / Configuring the IB similarity
    • LM Dirichlet similarity, configuring / Configuring the LM Dirichlet similarity
    • LM Jelinek Mercer similarity, configuring / Configuring the LM Jelinek Mercer similarity
  • similarity supporting queries
    • about / Query categorization, Similarity supporting queries
  • similarity supporting queries use cases
    • similar terms, searching / Finding terms similar to a given one
    • documents with similar field values, searching / Finding documents with similar field values
  • simple filesystem store
    • about / The simple filesystem store
  • single point of failure (SPOF) / Key concepts behind Elasticsearch architecture
  • smoothing models
    • about / Configuring smoothing models
    • configuring / Configuring smoothing models
    • stupid backoff model / Configuring smoothing models
    • Laplace smoothing model / Configuring smoothing models
    • linear interpolation smoothing model / Configuring smoothing models
  • special characters
    • handling / Handling special characters
  • split-brain / The master election configuration
  • SSD (solid state drives) / Performance considerations
  • startup process, Elasticsearch
    • about / The startup process
  • store module
    • about / Choosing the right directory implementation – the store module
  • store types
    • about / The store type
    • simple filesystem store / The simple filesystem store
    • new I/O filesystem store / The new I/O filesystem store
    • MMap filesystem store / The MMap filesystem store
    • hybrid filesystem store / The hybrid filesystem store
    • memory store / The memory store
    • default store type / The default store type
  • structure aware queries
    • about / Query categorization, Structure aware queries
  • structure aware queries use cases
    • parent documents with nested document, returning / Returning parent documents having a certain nested document
    • parent document score, affecting with nested document score / Affecting parent document score with the score of nested documents
  • stupid backoff smoothing model
    • about / Configuring smoothing models
  • suggester
    • _suggest REST endpoint / Using the _suggest REST endpoint
    • REST endpoint suggester response / Understanding the REST endpoint suggester response
    • suggestion requests, including in query / Including suggestion requests in query
    • term suggester / The term suggester
    • phrase suggester / The phrase suggester
    • completion suggester / The completion suggester
  • suggesters
    • about / Suggesters

T

  • term modifiers
    • about / Term modifiers
  • term suggester
    • configuration / Configuration
    • common options / Common term suggester options
  • term vectors
    • about / Term vectors
  • TF/IDF algorithm / Default Apache Lucene scoring explained
  • TF/IDF scoring formula
    • about / TF/IDF scoring formula
    • Lucene conceptual scoring formula / Lucene conceptual scoring formula
    • Lucene practical scoring formula / Lucene practical scoring formula
  • TF/IDF similarity
    • configuring / Configuring the TF/IDF similarity
  • tiered merge policy
    • about / The tiered merge policy
    • configuration options / The tiered merge policy
  • TokenFilter
    • implementing / Implementing TokenFilter
  • TokenFilter factory
    • implementing / Implementing the TokenFilter factory
  • total circuit breaker
    • about / The total circuit breaker
  • total shards allowed per node
    • defining / Defining total shards allowed per node
  • total shards allowed per physical server
    • defining / Defining total shards allowed per physical server
    • inclusion / Inclusion
    • requirement / Requirement
    • exclusion / Exclusion
    • disk-based allocation / Disk-based allocation
  • transaction log
    • about / The transaction log
    • configuration / The transaction log configuration
  • tribe node / Federated search
    • about / Node
    • creating / Creating the tribe node
    • unicast discovery, using / Using the unicast discovery for tribes
    • data, reading with / Reading data with the tribe node
    • master-level read operations / Master-level read operations
    • data, writing with / Writing data with the tribe node
    • master-level write operations / Master-level write operations

U

  • unicast Zen discovery configuration
    • about / The unicast Zen discovery configuration
    • discovery.zen.ping.unicast.hosts / The unicast Zen discovery configuration
  • use cases, queries
    • about / The use cases
    • example data / Example data
    • basic queries use cases / Basic queries use cases
    • compound queries use cases / Compound queries use cases
    • not analyzed queries use cases / Not analyzed queries use cases
    • full text search queries use cases / Full text search queries use cases
    • pattern queries use cases / Pattern queries use cases
    • similarity supporting queries use cases / Similarity supporting queries use cases
    • score altering queries use cases / Score altering queries use cases
    • structure aware queries use cases / Structure aware queries use cases
  • user spelling mistakes, correcting
    • about / Correcting user spelling mistakes
    • data, testing / Testing data
    • technical details / Getting into technical details

V

  • vertical scaling
    • about / Vertical scaling

W

  • write operations
    • blocking / Blocking write operations

Y

  • YAML
    • URL / Writing response
  • young generation heap space / Java memory

Z

  • Zen discovery
    • about / Zen discovery
    • multicast Zen discovery configuration / Multicast Zen discovery configuration
    • unicast Zen discovery configuration / The unicast Zen discovery configuration