In this article by Alberto Maria Angelo Paro, the author of the book ElasticSearch 5.0 Cookbook - Third Edition, you will learn the following recipes:
Before you start indexing data in Elasticsearch, you must first create an index, the main container of our data.
An index is similar to the concept of a database in SQL: a container for types (tables in SQL) and documents (records in SQL).
To execute curl via the command line, you need to install curl for your operating system.
The HTTP method to create an index is PUT (but also POST works); the REST URL contains the index name:
http://<server>/<index_name>
To create an index, we will perform the following steps:
curl -XPUT http://127.0.0.1:9200/myindex -d '{
  "settings" : {
    "index" : {
      "number_of_shards" : 2,
      "number_of_replicas" : 1
    }
  }
}'
If everything goes well, Elasticsearch returns:
{"acknowledged":true,"shards_acknowledged":true}
If the index already exists, a 400 error is returned:
{
  "error" : {
    "root_cause" : [
      {
        "type" : "index_already_exists_exception",
        "reason" : "index [myindex/YJRxuqvkQWOe3VuTaTbu7g] already exists",
        "index_uuid" : "YJRxuqvkQWOe3VuTaTbu7g",
        "index" : "myindex"
      }
    ],
    "type" : "index_already_exists_exception",
    "reason" : "index [myindex/YJRxuqvkQWOe3VuTaTbu7g] already exists",
    "index_uuid" : "YJRxuqvkQWOe3VuTaTbu7g",
    "index" : "myindex"
  },
  "status" : 400
}
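The same request can also be prepared from a script. The following is a minimal Python sketch, assuming the local node address used above; the helper name build_create_index_request is hypothetical, and the request is only built here, not sent:

```python
import json
import urllib.request


def build_create_index_request(server, index_name, shards=2, replicas=1):
    """Build (but do not send) a PUT request that creates an index."""
    body = {
        "settings": {
            "index": {
                "number_of_shards": shards,
                "number_of_replicas": replicas,
            }
        }
    }
    return urllib.request.Request(
        url="{}/{}".format(server.rstrip("/"), index_name),
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = build_create_index_request("http://127.0.0.1:9200", "myindex")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To send it against a running node: urllib.request.urlopen(request)
```

Building the body from a dictionary rather than a hand-written string avoids quoting mistakes in the JSON payload.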
Because the index name will be mapped to a directory on your storage, there are some limitations to the index name, and the only accepted characters are:
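During index creation, the replication can be set with two parameters in the settings/index object: number_of_shards, which controls how many primary shards compose the index, and number_of_replicas, which controls how many copies of each primary shard are kept in the cluster.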
The API call initializes a new index, which means:
The index creation API allows defining the mappings at creation time. The parameter required to define them is mappings, and it accepts multiple mappings, so in a single call it is possible to create an index and put the required mappings.
The create index command also allows passing the mappings section, which contains the mapping definitions. It is a shortcut to create an index with mappings, without executing an extra PUT mapping call:
curl -XPUT localhost:9200/myindex -d '{
  "settings" : {
    "number_of_shards" : 2,
    "number_of_replicas" : 1
  },
  "mappings" : {
    "order" : {
      "properties" : {
        "id" : {"type" : "keyword", "store" : "yes"},
        "date" : {"type" : "date", "store" : "no" , "index":"not_analyzed"},
        "customer_id" : {"type" : "keyword", "store" : "yes"},
        "sent" : {"type" : "boolean", "index":"not_analyzed"},
        "name" : {"type" : "text", "index":"analyzed"},
        "quantity" : {"type" : "integer", "index":"not_analyzed"},
        "vat" : {"type" : "double", "index":"no"}
      }
    }
  }
}'
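A mistyped field type is only reported when Elasticsearch rejects the request, so it can help to check the body locally first. The following is a minimal Python sketch of such a check; KNOWN_TYPES is a deliberately partial, hypothetical list that covers only the field types used in this recipe, not the full set Elasticsearch supports:

```python
import json

# Partial list (assumption): only the field types that appear in this recipe.
KNOWN_TYPES = {"keyword", "text", "date", "boolean", "integer", "double"}


def unknown_field_types(body):
    """Return {"type_name.field": bad_type} for every unrecognised field type."""
    problems = {}
    for type_name, mapping in body.get("mappings", {}).items():
        for field, definition in mapping.get("properties", {}).items():
            if definition.get("type") not in KNOWN_TYPES:
                problems["{}.{}".format(type_name, field)] = definition.get("type")
    return problems


# A shortened create-index body with one intentionally mistyped field type.
body = json.loads("""
{
  "settings": {"number_of_shards": 2, "number_of_replicas": 1},
  "mappings": {
    "order": {
      "properties": {
        "id": {"type": "keyword"},
        "sent": {"type": "boolea+n"},
        "quantity": {"type": "integer"}
      }
    }
  }
}
""")
print(unknown_field_types(body))
```

An empty result means every field type was found in the local list; it does not guarantee that Elasticsearch will accept the mapping.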
The counterpart of creating an index is deleting one.
Deleting an index means deleting its shards, mappings, and data. There are many common scenarios when we need to delete an index, such as:
To execute curl via the command line, you need to install curl for your operating system.
You also need the index created in the previous recipe, which is the one we will delete.
The HTTP method used to delete an index is DELETE.
The following URL contains only the index name:
http://<server>/<index_name>
To delete an index, we will perform the following steps:
curl -XDELETE http://127.0.0.1:9200/myindex
If everything goes well, Elasticsearch returns:
{"acknowledged":true}
If the index doesn't exist, a 404 error is returned:
{
  "error" : {
    "root_cause" : [
      {
        "type" : "index_not_found_exception",
        "reason" : "no such index",
        "resource.type" : "index_or_alias",
        "resource.id" : "myindex",
        "index_uuid" : "_na_",
        "index" : "myindex"
      }
    ],
    "type" : "index_not_found_exception",
    "reason" : "no such index",
    "resource.type" : "index_or_alias",
    "resource.id" : "myindex",
    "index_uuid" : "_na_",
    "index" : "myindex"
  },
  "status" : 404
}
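Scripts that call these APIs usually need to tell an acknowledgement apart from an error body. The following Python sketch shows one possible way to do so, using only the fields visible in the responses above; the function name summarize_response is hypothetical:

```python
import json


def summarize_response(body_text):
    """Return an (ok, message) pair for an index API response body."""
    body = json.loads(body_text)
    if body.get("acknowledged"):
        return True, "acknowledged"
    error = body.get("error", {})
    message = "{} ({}): {}".format(
        error.get("type"), body.get("status"), error.get("reason")
    )
    return False, message


success = '{"acknowledged":true}'
not_found = json.dumps({
    "error": {
        "type": "index_not_found_exception",
        "reason": "no such index",
        "index": "myindex",
    },
    "status": 404,
})
print(summarize_response(success))
print(summarize_response(not_found))
```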
When an index is deleted, all the data related to the index is removed from disk and is lost.
During the delete processing, first the cluster is updated, and then the shards are deleted from the storage. This operation is very fast; in a traditional filesystem it is implemented as a recursive delete.
It's not possible to restore a deleted index if there is no backup.
Calling the delete API with the special _all index name removes all the indices. In production, it is good practice to disable the deletion of all indices by adding the following line to elasticsearch.yml:
action.destructive_requires_name: true
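A client-side check can complement the server setting. The sketch below is a hypothetical guard written in Python, not part of Elasticsearch: it refuses to build a delete URL for _all, wildcard, comma-separated, or empty names. Rejecting comma-separated lists is a choice of this sketch and is stricter than the server setting.

```python
def targets_many_indices(index_name):
    """Return True if the name could match more than one index."""
    name = index_name.strip()
    return name in ("", "_all") or "*" in name or "," in name


def safe_delete_url(server, index_name):
    """Return the DELETE URL for exactly one explicitly named index."""
    if targets_many_indices(index_name):
        raise ValueError("refusing destructive call on {!r}".format(index_name))
    return "{}/{}".format(server.rstrip("/"), index_name.strip())


print(safe_delete_url("http://127.0.0.1:9200", "myindex"))

for risky in ("_all", "*", "my*", "a,b"):
    try:
        safe_delete_url("http://127.0.0.1:9200", risky)
    except ValueError as error:
        print(error)
```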
If you want to keep your data but save resources (memory/CPU), a good alternative to deleting indices is to close them.
Elasticsearch allows you to open/close an index to put it into online/offline mode.
To execute curl via the command line, you need to install curl for your operating system.
To open or close an index, we will perform the following steps:
curl -XPOST http://127.0.0.1:9200/myindex/_close
{"acknowledged":true}
curl -XPOST http://127.0.0.1:9200/myindex/_open
{"acknowledged":true}
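Both calls follow the same pattern, so a script can build them with one helper. The following is a minimal Python sketch, assuming the node address used above; build_state_request is a hypothetical name, and the requests are only built, not sent:

```python
import urllib.request


def build_state_request(server, index_name, action):
    """Build (but do not send) the POST request that opens or closes an index."""
    if action not in ("open", "close"):
        raise ValueError("action must be 'open' or 'close'")
    url = "{}/{}/_{}".format(server.rstrip("/"), index_name, action)
    return urllib.request.Request(url, method="POST")


close_request = build_state_request("http://127.0.0.1:9200", "myindex", "close")
open_request = build_state_request("http://127.0.0.1:9200", "myindex", "open")
print(close_request.get_method(), close_request.full_url)
print(open_request.get_method(), open_request.full_url)
```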
When an index is closed, there is no overhead on the cluster (except for the metadata state): the index shards are switched off and they don't use file descriptors, memory, or threads.
There are many use cases when closing an index:
An alias cannot have the same name as an index.
When an index is closed, calling the open API restores its state.
We saw how to build a mapping by indexing documents. This recipe shows how to put a type mapping in an index. This kind of operation can be considered the Elasticsearch version of an SQL CREATE TABLE.
To execute curl via the command line, you need to install curl for your operating system.
The HTTP method to put a mapping is PUT (POST also works).
The URL format for putting a mapping is:
http://<server>/<index_name>/<type_name>/_mapping
To put a mapping in an index, we will perform the following steps:
curl -XPUT 'http://localhost:9200/myindex/order/_mapping' -d '{
  "order" : {
    "properties" : {
      "id" : {"type" : "keyword", "store" : "yes"},
      "date" : {"type" : "date", "store" : "no" , "index":"not_analyzed"},
      "customer_id" : {"type" : "keyword", "store" : "yes"},
      "sent" : {"type" : "boolean", "index":"not_analyzed"},
      "name" : {"type" : "text", "index":"analyzed"},
      "quantity" : {"type" : "integer", "index":"not_analyzed"},
      "vat" : {"type" : "double", "index":"no"}
    }
  }
}'
{"acknowledged":true}
This call checks whether the index exists and then creates one or more type mappings as described in the definition.
When a mapping is inserted and a mapping for this type already exists, the two are merged. If a field is already mapped with a different type and that type cannot be updated, an exception is raised while expanding the fields property. Older releases accepted an ignore_conflicts parameter to suppress this exception, but it was removed in Elasticsearch 2.0, so the conflict has to be resolved in the mapping itself.
The put mapping call allows you to set the type for several indices in one shot: list the indices separated by commas, or apply the mapping to all indices using the _all alias.
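As an illustration of the URL forms this produces, here is a small Python sketch; the helper put_mapping_url and the index names index1 and index2 are hypothetical, and _all has to be passed explicitly so that it is never used by accident:

```python
def put_mapping_url(server, indices, type_name):
    """Build the put-mapping URL for one or more indices (or ["_all"])."""
    if not indices:
        raise ValueError("at least one index name (or '_all') is required")
    target = ",".join(indices)
    return "{}/{}/{}/_mapping".format(server.rstrip("/"), target, type_name)


print(put_mapping_url("http://localhost:9200", ["myindex"], "order"))
print(put_mapping_url("http://localhost:9200", ["index1", "index2"], "order"))
print(put_mapping_url("http://localhost:9200", ["_all"], "order"))
```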
There is no delete operation for mappings: it's not possible to delete a single mapping from an index. To remove or change a mapping, you need to perform the following steps:
After setting our mappings for processing types, we sometimes need to inspect or analyze a mapping to prevent issues. The action to get the mapping for a type helps us to understand its structure, or how it has evolved due to merges and implicit type guessing.
To execute curl via the command line, you need to install curl for your operating system.
The HTTP method to get a mapping is GET.
The URL formats for getting mappings are:
http://<server>/_mapping
http://<server>/<index_name>/_mapping
http://<server>/<index_name>/<type_name>/_mapping
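The three forms differ only in how much of the path is given. A minimal Python sketch that builds them, assuming a hypothetical helper named get_mapping_url:

```python
def get_mapping_url(server, index_name=None, type_name=None):
    """Build the get-mapping URL at cluster, index, or type level."""
    if type_name and not index_name:
        raise ValueError("a type name requires an index name")
    parts = [server.rstrip("/")]
    if index_name:
        parts.append(index_name)
    if type_name:
        parts.append(type_name)
    parts.append("_mapping")
    return "/".join(parts)


print(get_mapping_url("http://localhost:9200"))
print(get_mapping_url("http://localhost:9200", "myindex"))
print(get_mapping_url("http://localhost:9200", "myindex", "order"))
```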
To get the mapping of a type in an index, we will perform the following steps:
curl -XGET 'http://localhost:9200/myindex/order/_mapping?pretty=true'
The pretty argument in the URL is optional, but very handy to pretty print the response output.
{
  "myindex" : {
    "mappings" : {
      "order" : {
        "properties" : {
          "customer_id" : {
            "type" : "keyword",
            "store" : true
          },
          … truncated
        }
      }
    }
  }
}
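The response nests fields under index, then mappings, then type, then properties. The following Python sketch walks that structure and lists each field with its type; the function name list_fields is hypothetical, and the sample contains only the customer_id field shown above because the rest of the response is truncated:

```python
import json


def list_fields(mapping_response):
    """Yield (index, type, field, field_type) for each mapped field."""
    for index_name, index_body in mapping_response.items():
        for type_name, type_body in index_body.get("mappings", {}).items():
            for field, definition in type_body.get("properties", {}).items():
                yield index_name, type_name, field, definition.get("type")


response = json.loads("""
{
  "myindex": {
    "mappings": {
      "order": {
        "properties": {
          "customer_id": {"type": "keyword", "store": true}
        }
      }
    }
  }
}
""")
for row in list_fields(response):
    print(row)
```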
The mapping is stored at the cluster level in Elasticsearch. The call checks both index and type existence and then it returns the stored mapping.
The returned mapping is in a reduced form, which means that the default values for a field are not returned.
Elasticsearch stores only non-default field values to reduce network and memory consumption.
Retrieving a mapping is very useful for several purposes:
If you need to fetch several mappings, it is better to do it at the index level or cluster level to reduce the number of API calls.
We learned how to manage indices and perform operations on documents. We discussed different operations on indices such as create, delete, update, open, and close. These operations are very important because they allow you to better define the container (index) that will store your documents. The index create/delete actions are similar to the SQL create/delete database commands.