Data onboarding – defining field extractions
Splunk has many built-in features, including knowledge of several common source types, which lets it automatically know which fields exist within your data. Splunk, by default, also extracts any key-value pairs present within the log data and all the fields within JSON-formatted logs. However, often the fields within raw log data cannot be interpreted out of the box, and this knowledge must be provided to Splunk to make these fields easily searchable.
The sample data that we will be using in subsequent chapters contains data we wish to present as fields to Splunk. Much of the raw log data contains key-value fields that Splunk will extract automatically, but there is one field we need to tell Splunk how to extract, representing the page response time. To do this, we will be adding a custom field extraction, which will tell Splunk how to extract the field for us.
Getting ready
To step through this recipe, you will need a running Splunk server with the operational intelligence sample data loaded. No other prerequisites are required.
How to do it...
Follow these steps to add a custom field extraction for the response field:
- Log in to your Splunk server.
- In the top right-hand corner, click on the Settings menu and then click on the Fields link.
- Click on the Field extractions link.
- Click on New.
- In the Destination app field, select the search app, and in the Name field, enter response. Set the Apply to dropdown to sourcetype and the named field to access_combined. Set the Type dropdown to Inline, and in the Extraction/Transform field, carefully enter the following regex:
(?i)^(?:[^"]*"){8}\s+(?P<response>.+)
- Click on Save.
- On the Field extractions listing page, find the recently added extraction, and in the Sharing column, click on the Permissions link.
- Update the Object should appear in setting to All apps. In the Permissions section, for the Read column, check Everyone, and in the Write column, check admin. Then, click on Save.
- Navigate to the Splunk search screen and enter the following search over the Last 60 minutes time range:
index=main sourcetype=access_combined
- You should now see a field called response extracted on the left-hand side of the search screen under the Interesting Fields section.
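To see what the recipe's regular expression captures, it can help to try it outside Splunk first. The following is a minimal Python sketch, not part of Splunk itself: the expression is written with `\s+` for the whitespace match, the sample log line is made up for illustration (it assumes an access_combined-style event with four quoted fields followed by the response time), and the `extract_response` helper is a hypothetical name.

```python
import re

# The recipe's pattern: skip everything up to and including the eighth
# double quote, then capture whatever follows the whitespace as "response".
RESPONSE_REGEX = re.compile(r'(?i)^(?:[^"]*"){8}\s+(?P<response>.+)')

# Made-up sample event (an assumption, not taken from the sample data):
# four quoted fields (request, referer, user agent, cookie), then the
# page response time at the end of the line.
SAMPLE_EVENT = (
    '10.192.1.46 - - [21/Jun/2024:10:15:32 +0000] '
    '"GET /viewItem?item=HolyGouda HTTP/1.1" 200 3923 '
    '"http://www.google.com" "Mozilla/5.0 (Windows NT 6.1)" '
    '"JSESSIONID=SD1SL4FF10ADFF1" 638'
)


def extract_response(event: str):
    """Return the captured response value, or None if the pattern does not match."""
    match = RESPONSE_REGEX.match(event)
    return match.group("response") if match else None


if __name__ == "__main__":
    print(extract_response(SAMPLE_EVENT))
```

If an event has fewer than eight double quotes before the response time, the pattern does not match and no response field is produced for that event, which is why the regex must be entered exactly.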
How it works...
All field extractions are maintained in the props.conf and transforms.conf configuration files. The stanzas in props.conf include an extraction class that leverages regular expressions to extract field names and/or values to be used at search time. The transforms.conf file goes further and can be leveraged for more advanced extractions, such as reusing or sharing extractions over multiple sources, source types, or hosts.
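As a rough illustration of what such a props.conf stanza looks like, the sketch below renders the inline extraction from this recipe as text. It assumes the usual `EXTRACT-<class>` form for inline search-time extractions under a source type stanza; the `inline_extraction_stanza` helper is a hypothetical name used only for this example, and the exact file in which Splunk stores the stanza depends on the app and sharing settings.

```python
def inline_extraction_stanza(sourcetype: str, name: str, regex: str) -> str:
    """Render a props.conf stanza for an inline search-time field extraction."""
    # The stanza header names the source type; EXTRACT-<name> holds the regex,
    # whose named capture group becomes the field name.
    return f"[{sourcetype}]\nEXTRACT-{name} = {regex}\n"


# Same values as entered in the Field extractions form in this recipe.
RESPONSE_REGEX = r'(?i)^(?:[^"]*"){8}\s+(?P<response>.+)'
STANZA = inline_extraction_stanza("access_combined", "response", RESPONSE_REGEX)

if __name__ == "__main__":
    print(STANZA)
```

Because the regex is stored under the source type stanza, it applies at search time to every event with sourcetype=access_combined, without re-indexing the data.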
See also
- The Loading the sample data for this book recipe
- The Data onboarding – defining event types and tags recipe