Introducing the rewrite module
The `rewrite` module of NGINX is a simple regular expression matcher combined with a virtual stack machine. The first part of any rewrite rule is a regular expression. As such, it is possible to use parentheses to define certain parts as captures, which can later be referenced by positional variables. A positional variable is one whose value depends on the order of the capture in the regular expression. Positional variables are labeled by number, so `$1` references what is matched by the first set of parentheses, `$2` references what is matched by the second set, and so on. For example, consider the following regular expression:
^/images/([a-z]{2})/([a-z0-9]{5})/(.*)\.(png|jpg|gif)$
The first positional variable, `$1`, references a two-letter string that comes immediately after the `/images/` string at the beginning of the URI. The second positional variable, `$2`, refers to a five-character string composed of lowercase letters and the digits 0 to 9. The third positional variable, `$3`, is presumably the name of a file. The last variable to be extracted from this regular expression, `$4`, is one of `png`, `jpg`, or `gif`, which appears at the very end of the URI. The dot before the extension is matched but lies outside the parentheses, so it is not part of `$4`.
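To see what each positional variable would hold, here is a minimal sketch that echoes the captures back to the client. It assumes a regular expression location (`location ~`) and the `return` directive with a text body, used here purely as a debugging aid. The request path in the comments is a made-up example, not something from the original configuration.

```nginx
# Illustrative only: echo the captures for a request such as
#   GET /images/en/ab123/logo.png   (hypothetical path)
# The quotes are required because the expression contains { and }.
location ~ '^/images/([a-z]{2})/([a-z0-9]{5})/(.*)\.(png|jpg|gif)$' {
    # For the path above the captures would be:
    #   $1 = en      two lowercase letters
    #   $2 = ab123   five characters from a-z and 0-9
    #   $3 = logo    everything up to the final dot
    #   $4 = png     the extension, without the dot
    return 200 "one=$1 two=$2 three=$3 four=$4\n";
}
```

A request that does not fit the pattern, for instance one with a three-letter first path segment, would simply not match this location.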
The second part of a rewrite rule is the URI to which the request is rewritten. The URI may contain any positional variable captured in the regular expression indicated by the first argument, or any other variable valid at this level of NGINX's configuration:
/data?file=$3.$4
The rewritten URI is normally processed internally. It is returned to the client in the `Location` header, with either a `301` (`Moved Permanently`) or a `302` (`Found`) HTTP status code indicating the type of redirect to be performed, only when the replacement begins with `http://`, `https://`, or `$scheme`, or when the third parameter requests a redirect. The status code is specified explicitly by using `permanent` (`301`) or `redirect` (`302`) as the third parameter.
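The redirect variants can be sketched as follows. The host names and paths are placeholders chosen for illustration, and the `listen` port is an assumption.

```nginx
server {
    listen 80;                  # assumed port
    server_name example.com;    # placeholder host name

    # permanent: the client receives 301 with Location set to /images/...
    rewrite ^/old-images/(.*)$ /images/$1 permanent;

    # redirect: the client receives 302 instead
    rewrite ^/promo$ /images/en/promo1/banner.png redirect;

    # a replacement starting with a scheme is sent to the client as a
    # redirect even without a flag (302 unless 'permanent' is given)
    rewrite ^/docs/(.*)$ https://docs.example.com/$1;
}
```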
The third parameter to the rewrite rule may also be either `last` or `break`, indicating that no further `rewrite` module directives in the current set will be processed. Using the `last` flag causes NGINX to search for another `location` matching the rewritten URI, whereas `break` keeps the request in the current location. The following rule uses `last`:
rewrite '^/images/([a-z]{2})/([a-z0-9]{5})/(.*)\.(png|jpg|gif)$' /data?file=$3.$4 last;
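To contrast the two flags, the following sketch places one rule with `last` next to one with `break`. The `/download/` and `/files/` paths are hypothetical; the document root is borrowed from the larger example in this section.

```nginx
location /images/ {
    # last: stop rewriting here, then look for a location that
    # matches the new URI (/data)
    rewrite '^/images/([a-z]{2})/([a-z0-9]{5})/(.*)\.(png|jpg|gif)$' /data?file=$3.$4 last;
}

location /download/ {
    # break: stop rewriting, but keep handling the request in this
    # location using the new URI
    rewrite ^/download/(.*)$ /files/$1 break;

    # the file would then be looked up under /home/www/files/
    root /home/www;
}
```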
`break` may also be used as a directive on its own to stop `rewrite` module directive processing within an `if` block, or in any other context in which the `rewrite` module is active. The following snippet presumes that some external method is used to set the `$bwhog` variable to a nonempty and nonzero value when a client has used too much bandwidth. The `limit_rate` directive will then enforce a lower transfer rate. The `break` directive is used here because we entered the `rewrite` module with `if`, and we don't want to process any further such directives:
if ($bwhog) { limit_rate 300k; break; }
Another way to stop the processing of `rewrite` module directives is to `return` control to the main `http` module processing the request. This may mean that NGINX returns information directly to the client, but `return` is often combined with `error_page` to either present a formatted HTML page to the client or activate a different module to finish processing the request. The `return` directive may indicate a status code, a status code with some text, or a status code with a URI. If a bare URL (beginning with `http://`, `https://`, or `$scheme`) is the sole parameter, the status code is understood to be `302`. When text is placed after the status code, this text becomes the body of the response. If a URI is used instead, with a redirect status code such as `301` or `302`, it becomes the value of the `Location` header, to which the client will then be redirected.
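The forms of `return` described above can be sketched side by side. All of the location names and the target host are placeholders invented for this illustration.

```nginx
# status code only
location = /gone      { return 410; }

# status code with text: the text becomes the response body
location = /ping      { return 200 "pong\n"; }

# status code with a URI: the URI goes into the Location header
location = /old-home  { return 301 /home; }

# a bare URL as the sole parameter: treated as a 302 redirect
location = /elsewhere { return https://example.com/; }
```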
As an example, suppose we want to return a short text as the output for a file-not-found error in a particular location. We specify `location` with an equals sign (`=`) to match this URI exactly:
location = /image404.html { return 404 "image not found\n"; }
Any request for this URI would then be answered with an HTTP status code of `404` and the text `image not found`, followed by a newline. So, we can use `/image404.html` at the end of a `try_files` directive or as an error page for the image files.
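As a sketch of the error page variant, the following assumes that image files are served by a regular expression location and that a missing file should produce the short text response. The location pattern is an assumption made for this illustration; the `root` path is borrowed from the larger example in this section.

```nginx
# hypothetical location serving image files
location ~* \.(png|jpg|gif)$ {
    root /data/images;

    # a missing file triggers an internal redirect to /image404.html
    error_page 404 /image404.html;
}

location = /image404.html {
    # our short text answer for images that don't exist
    return 404 "image not found\n";
}
```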
In addition to directives relating to the act of rewriting a URI, the `rewrite` module also includes the `set` directive to create new variables and set their values. This is useful in a number of ways, from creating flags when certain conditions are present, to passing named arguments on to other locations and logging what was done.
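A minimal sketch of `set` used as a flag might look like the following. The variable name `$slow_client` and the user agent test are hypothetical and serve only to show the pattern of setting a default and overriding it under a condition.

```nginx
server {
    # default value of the flag (hypothetical variable name)
    set $slow_client 0;

    # raise the flag when a condition is present; the user agent
    # test is only a stand-in for a real criterion
    if ($http_user_agent ~* "curl") {
        set $slow_client 1;
    }

    location / {
        # act on the flag later in request processing
        if ($slow_client) {
            limit_rate 300k;
        }
    }
}
```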
The following example demonstrates some of these concepts and the usage of the corresponding directives:
http {
    # a special log format referencing variables we'll define later
    log_format imagelog '[$time_local] ' $image_file ' ' $image_type ' '
                        $body_bytes_sent ' ' $status;

    # we want to enable rewrite-rule debugging to see if our rule does
    # what we intend
    rewrite_log on;

    server {
        root /home/www;

        location / {
            # we specify which logfile should receive the rewrite-rule
            # debug messages
            error_log logs/rewrite.log notice;

            # our rewrite rule, utilizing captures and positional variables
            # note the quotes around the regular expression - these are
            # required because we used {} within the expression itself
            rewrite '^/images/([a-z]{2})/([a-z0-9]{5})/(.*)\.(png|jpg|gif)$' /data?file=$3.$4;

            # note that we didn't use the 'last' parameter above; if we had,
            # the variables below would not be set because NGINX would
            # have ended rewrite module processing

            # here we set the variables that are used in the custom log
            # format 'imagelog'
            set $image_file $3;
            set $image_type $4;
        }

        location /data {
            # we want to log all images to this specially-formatted logfile
            # to make parsing the type and size easier
            access_log logs/images.log imagelog;
            root /data/images;

            # we could also have used the $image-variables we defined
            # earlier, but referencing the argument is easier to read
            try_files /$arg_file /image404.html;
        }

        location = /image404.html {
            # our special error message for images that don't exist
            return 404 "image not found\n";
        }
    }
}
The following table summarizes the `rewrite` module directives we discussed in this section:
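| Rewrite module directives | Explanation |
|---|---|
| `break` | Ends the processing of the `rewrite` module directives found within the same context. |
| `if` | Evaluates a condition and, if it is true, applies the `rewrite` module directives enclosed in the braces, using the format `if (condition) { … }`. The condition may be any of the following cases: a variable name (false if the value is empty or `0`); a comparison of a variable with a string using `=` or `!=`; a regular expression match using `~` (case-sensitive) or `~*` (case-insensitive), negated with `!~` or `!~*`; a file existence test with `-f` or `!-f`; a directory existence test with `-d` or `!-d`; a file, directory, or symbolic link existence test with `-e` or `!-e`; an executable file test with `-x` or `!-x`. |
| `return` | Stops processing and returns the specified code to the client. The nonstandard code, `444`, closes the connection without sending any response headers. |
| `rewrite` | Changes the URI from the one matched by the regular expression in the first parameter to the string in the second parameter. If a third parameter is given, it is one of the following flags: `last` (stops processing the `rewrite` module directives and searches for a location matching the changed URI), `break` (stops processing the `rewrite` module directives), `redirect` (returns a temporary redirect with code `302`), or `permanent` (returns a permanent redirect with code `301`). |
| `rewrite_log` | Activates the `notice`-level logging of `rewrite` processing results to the `error_log`. |
| `set` | Sets a given variable to a specific value. |
| `uninitialized_variable_warn` | Controls whether or not warnings about uninitialized variables are logged. |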