How to deploy Splunk binary and set up its configuration [Tutorial]

Splunk provides binary distributions for Windows and a variety of Unix operating systems. For all Unix operating systems, a compressed .tar file is provided. For some platforms, packages are also provided.

This article is an excerpt taken from the book Implementing Splunk 7 - Third Edition written by James Miller. This book covers the new modules of Splunk: Splunk Cloud and the Machine Learning Toolkit to ease data usage and more.

In this tutorial, you will learn how to deploy the Splunk binary on your systems and how to set up configuration distribution in Splunk.

If your organization uses packages, such as deb or rpm, you should be able to use the provided packages in your normal deployment process. Otherwise, installation starts by unpacking the provided tar to the location of your choice.

The process is the same, whether you are installing the full version of Splunk or the Splunk universal forwarder.

The typical installation process involves the following steps:

Installing the binary

Adding a base configuration

Configuring Splunk to launch at boot

Restarting Splunk

Having worked with many different companies over the years, I can honestly say that none of them used the same product or even methodology for deploying software. Splunk takes a hands-off approach to fit in as easily as possible into customer workflows.

Deploying from a tar file

To deploy from a tar file, the command depends on your version of tar. With a modern version of tar, you can run the following command:

tar xvzf splunk-7.0.x-xxx-Linux-xxx.tgz

Older versions may not handle gzip files directly, so you may have to run the following command:

gunzip -c splunk-7.0.x-xxx-Linux-xxx.tgz | tar xvf -

This will expand into the current directory. To expand into a specific directory, you can usually add -C, depending on your version of tar, as follows:

tar -C /opt/ -xvzf splunk-7.0.x-xxx-Linux-xxx.tgz
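The two variants above can be combined into one helper. This is only a sketch: the function name unpack_tgz is my own, and it assumes that an older tar simply fails when given the z flag, in which case the gunzip pipe is used instead.

```shell
# unpack_tgz TARBALL DEST: extract a gzipped tarball into DEST.
# Tries tar's built-in gzip support first, then falls back to the
# gunzip pipe for tar versions that cannot read gzip files directly.
unpack_tgz() {
  local tarball="$1" dest="$2"
  mkdir -p "$dest" || return 1
  if tar -C "$dest" -xzf "$tarball" 2>/dev/null; then
    return 0
  fi
  # Fallback: decompress separately and extract from stdin inside DEST.
  gunzip -c "$tarball" | (cd "$dest" && tar xf -)
}
```

For example, unpack_tgz splunk-7.0.x-xxx-Linux-xxx.tgz /opt would unpack the distribution under /opt.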

Deploying using msiexec

In Windows, it is possible to deploy Splunk using msiexec. This makes it much easier to automate deployment on a large number of machines. To install silently, you can use the combination of AGREETOLICENSE and /quiet, as follows:

msiexec.exe /i splunk-xxx.msi AGREETOLICENSE=Yes /quiet

If you plan to use a deployment server, you can specify the following value:

msiexec.exe /i splunk-xxx.msi AGREETOLICENSE=Yes DEPLOYMENT_SERVER="deployment_server_name:8089" /quiet

Or, if you plan to overlay an app that contains deploymentclient.conf, you can forgo starting Splunk until that app has been copied into place, as follows:

msiexec.exe /i splunk-xxx.msi AGREETOLICENSE=Yes LAUNCHSPLUNK=0 /quiet

There are options available to start reading data immediately, but I would advise deploying input configurations to your servers, instead of enabling inputs via installation arguments.

Adding a base configuration

If you are using the Splunk deployment server, this is the time to set up deploymentclient.conf. This can be accomplished in several ways, as follows:

On the command line, by running the following code:

$SPLUNK_HOME/bin/splunk set deploy-poll deployment_server_name:8089

By placing a deploymentclient.conf in:

$SPLUNK_HOME/etc/system/local/

By placing an app containing deploymentclient.conf in:

$SPLUNK_HOME/etc/apps/

The third option is what I would recommend because it allows overriding this configuration, via a deployment server, at a later time. We will work through an example later in the Using Splunk deployment server section.

If you are deploying configurations in some other way, for instance with Puppet, be sure to restart the Splunk forwarder processes after deploying the new configuration.

Configuring Splunk to launch at boot

On Windows machines, Splunk is installed as a service that will start after installation and on reboot.

On Unix hosts, the Splunk command line provides a way to create startup scripts appropriate for the operating system that you are using. The command looks like this:

$SPLUNK_HOME/bin/splunk enable boot-start

To run Splunk as another user, provide the flag -user, as follows:

$SPLUNK_HOME/bin/splunk enable boot-start -user splunkuser

The enable boot-start command must still be run as root, but the startup script it generates will run Splunk as the user provided.

If you do not run Splunk as root, and you shouldn't if you can avoid it, be sure that the Splunk installation and data directories are owned by the user specified in the enable boot-start command. You can ensure this by using chown, such as in chown -R splunkuser $SPLUNK_HOME

On Linux, you could then start Splunk using service splunk start.
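To confirm that ownership is correct after running chown -R, a quick check with find can help. The helper name list_not_owned is hypothetical; it only wraps the standard find test for files that do not belong to a given user.

```shell
# list_not_owned DIR USER: print every path under DIR that is not owned
# by USER. Empty output means the whole tree belongs to USER.
list_not_owned() {
  find "$1" ! -user "$2" -print
}
```

Running list_not_owned "$SPLUNK_HOME" splunkuser should print nothing once the installation and data directories belong to splunkuser.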

Configuration distribution in Splunk

As we have covered in some depth, configurations in Splunk are simply directories of plain text files. Distribution essentially consists of copying these configurations to the appropriate machines and restarting the instances. You can either use your own system for distribution, such as Puppet or simply a set of scripts, or use the deployment server included with Splunk.

Using your own deployment system

The advantage of using your own system is that you already know how to use it.

Assuming that you have normalized your apps, as described in the section Using apps to organize configuration, deploying apps to a forwarder or indexer consists of the following steps:

Set aside the existing apps at $SPLUNK_HOME/etc/apps/.

Copy the apps into $SPLUNK_HOME/etc/apps/.

Restart the Splunk forwarder. Note that this needs to be done as the user that is running Splunk, either by calling the service script or by calling su. In Windows, restart the splunkd service.

Assuming that you already have a system for managing configurations, that's it.
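If your own system is simply a set of scripts, the first two steps might look like the following sketch. The function name deploy_apps and the backup directory are assumptions of mine. It sets aside only the apps that are about to be replaced, so apps you do not manage are left alone, and it leaves the restart to the caller.

```shell
# deploy_apps SRC APPS_DIR BACKUP_DIR
# For each app directory in SRC: move any existing copy from APPS_DIR
# to BACKUP_DIR, then copy the new version into APPS_DIR.
deploy_apps() {
  local src="$1" apps_dir="$2" backup_dir="$3" app name
  mkdir -p "$apps_dir" "$backup_dir" || return 1
  for app in "$src"/*/; do
    [ -d "$app" ] || continue          # SRC contains no app directories
    name=$(basename "$app")
    if [ -e "$apps_dir/$name" ]; then
      rm -rf "$backup_dir/$name"       # keep only the most recent backup
      mv "$apps_dir/$name" "$backup_dir/$name" || return 1
    fi
    cp -R "$src/$name" "$apps_dir/$name" || return 1
  done
}
```

A run could look like deploy_apps /tmp/new-apps "$SPLUNK_HOME/etc/apps" /tmp/apps-previous, followed by a restart performed as the user that runs Splunk.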

If you are deploying configurations to indexers, be sure to only deploy the configurations when downtime is acceptable, as you will need to restart the indexers to load the new configurations, ideally in a rolling manner.

Do not deploy configurations until you are ready to restart, as some (but not all) configurations will take effect immediately.

Using the Splunk deployment server

If you do not have a system for managing configurations, you can use the deployment server included with Splunk.

Some advantages of the included deployment server are as follows:

Everything you need is included in your Splunk installation

It will restart forwarder instances properly when new app versions are deployed

It is intelligent enough not to restart when unnecessary

It will remove apps that should no longer be installed on a machine

It will ignore apps that are not managed

The logs for the deployment client and server are accessible in Splunk itself

Some disadvantages of the included deployment server are:

As of Splunk 4.3, there are issues with scale beyond a few hundred deployment clients, at which point tuning is required (although one option is to run multiple deployment server instances).

The configuration is complicated and prone to typos

With these caveats out of the way, let's set up a deployment server for the apps that we laid out before.

Step 1 – deciding where your deployment server will run

For a small installation with fewer than a few dozen forwarders, your main Splunk instance can run the deployment server without any issue. For more than a few dozen forwarders, a separate instance of Splunk makes sense.

Ideally, this instance would run on its own machine. The requirements for this machine are not large, perhaps 4 gigabytes of RAM and two processors, or possibly less. A virtual machine would be fine.

Define a DNS entry for your deployment server, if at all possible. This will make moving your deployment server later much simpler.

If you do not have access to another machine, you could run another copy of Splunk on the same machine that is running some other part of your Splunk deployment. To accomplish this, follow these steps:

Install Splunk in another directory, perhaps /opt/splunk-deploy/splunk/.

Start this instance of Splunk by using /opt/splunk-deploy/splunk/bin/splunk start. When prompted, choose port numbers other than the defaults and note what they are. I would suggest one number higher: 8090 and 8001.

Unfortunately, if you run splunk enable boot-start in this new instance, the existing startup script will be overwritten. To accommodate both instances, you will need to either edit the existing startup script, or rename the existing script so that it is not overwritten.

Step 2 - defining your deploymentclient.conf configuration

Using the address of our new deployment server, ideally a DNS entry, we will build an app named deploymentclient-yourcompanyname. This app will have to be installed manually on forwarders but can then be managed by the deployment server.

This app should look somewhat like this:

deploymentclient-yourcompanyname 
local/deploymentclient.conf 
[deployment-client] 
[target-broker:deploymentServer] 
targetUri=deploymentserver.foo.com:8089
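If you would rather generate this app than create it by hand, a small function is enough. This is a sketch: the function name make_deploymentclient_app is hypothetical, and the file contents are exactly the stanzas shown above with the address passed in as an argument.

```shell
# make_deploymentclient_app APPS_DIR APP_NAME TARGET_URI
# Creates APP_NAME/local/deploymentclient.conf under APPS_DIR.
make_deploymentclient_app() {
  local apps_dir="$1" app_name="$2" target_uri="$3"
  mkdir -p "$apps_dir/$app_name/local" || return 1
  cat > "$apps_dir/$app_name/local/deploymentclient.conf" <<EOF
[deployment-client]

[target-broker:deploymentServer]
targetUri=$target_uri
EOF
}
```

For instance, make_deploymentclient_app "$SPLUNK_HOME/etc/apps" deploymentclient-yourcompanyname deploymentserver.foo.com:8089 writes the app directly into a forwarder's apps directory.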

Step 3 - defining our machine types and locations

Starting with what we defined in the Separate configurations by purpose section, we have, in the locations west and east, the following machine types:

Splunk indexers

db servers

Web servers

App servers

Step 4 - normalizing our configurations into apps appropriately

Let's use the apps that we defined in the section Separate configurations by purpose plus the deployment client app that we created in the Step 2 - defining your deploymentclient.conf configuration section. These apps will live in $SPLUNK_HOME/etc/deployment-apps/ on your deployment server.

Step 5 - mapping these apps to deployment clients in serverclass.conf

To get started, I always start with example 2 from $SPLUNK_HOME/etc/system/README/serverclass.conf.example:

[global] 
[serverClass:AppsForOps] 
whitelist.0=*.ops.yourcompany.com 
[serverClass:AppsForOps:app:unix] 
[serverClass:AppsForOps:app:SplunkLightForwarder]

Let's assume that we have the machines mentioned next. It is very rare for an organization of any size to have consistently named hosts, so I threw in a couple of rogue hosts at the bottom, as follows:

spl-idx-west01 
spl-idx-west02 
spl-idx-east01 
spl-idx-east02 
app-east01 
app-east02 
app-west01 
app-west02 
web-east01 
web-east02 
web-west01 
web-west02 
db-east01 
db-east02 
db-west01 
db-west02 
qa01 
homer-simpson

The structure of serverclass.conf is essentially as follows:

[serverClass:<className>] 
#options that should be applied to all apps in this class 
[serverClass:<className>:app:<appName>] 
#options that should be applied only to this app in this serverclass

Please note that:

<className> is an arbitrary name of your choosing.

<appName> is the name of a directory in $SPLUNK_HOME/etc/deployment-apps/.

The order of stanzas does not matter. Be sure to update <className> if you copy an :app: stanza. This is, by far, the easiest mistake to make.
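Because typos in :app: stanzas are so easy to make, it can be worth checking that every <appName> really exists as a directory. The following is a sketch with a hypothetical function name, missing_apps. It only catches app names with no matching directory, not a stale <className>.

```shell
# missing_apps CONF DEPLOYMENT_APPS_DIR
# Prints "<className>: <appName>" for every :app: stanza in CONF whose
# app has no directory under DEPLOYMENT_APPS_DIR.
missing_apps() {
  local conf="$1" apps_dir="$2" class app
  # Reduce each [serverClass:<className>:app:<appName>] line to "class app".
  sed -n 's/^[[:space:]]*\[serverClass:\([^:]*\):app:\([^]]*\)].*$/\1 \2/p' "$conf" |
  while read -r class app; do
    [ -d "$apps_dir/$app" ] || echo "$class: $app"
  done
}
```

On a deployment server this could be run as missing_apps serverclass.conf "$SPLUNK_HOME/etc/deployment-apps", with the path to serverclass.conf adjusted to wherever yours lives.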

It is important that configuration changes do not trigger a restart of indexers.

Let's apply this to our hosts, as follows:

[global] 
restartSplunkd = True 
#by default trigger a splunk restart on configuration change 
####INDEXERS 
##handle indexers specially, making sure they do not restart 
[serverClass:indexers] 
whitelist.0=spl-idx-* 
restartSplunkd = False 
[serverClass:indexers:app:indexerbase] 
[serverClass:indexers:app:deploymentclient-yourcompanyname] 
[serverClass:indexers:app:props-web] 
[serverClass:indexers:app:props-app] 
[serverClass:indexers:app:props-db] 
#send props-west only to west indexers 
[serverClass:indexers-west] 
whitelist.0=spl-idx-west* 
restartSplunkd = False 
[serverClass:indexers-west:app:props-west] 
#send props-east only to east indexers 
[serverClass:indexers-east] 
whitelist.0=spl-idx-east* 
restartSplunkd = False 
[serverClass:indexers-east:app:props-east] 
####FORWARDERS 
#send event parsing props apps everywhere 
#blacklist indexers to prevent unintended restart 
[serverClass:props] 
whitelist.0=* 
blacklist.0=spl-idx-* 
[serverClass:props:app:props-web] 
[serverClass:props:app:props-app] 
[serverClass:props:app:props-db] 
#send props-west only to west datacenter servers 
#blacklist indexers to prevent unintended restart 
[serverClass:west] 
whitelist.0=*-west* 
whitelist.1=qa01 
blacklist.0=spl-idx-* 
[serverClass:west:app:props-west] 
[serverClass:west:app:deploymentclient-yourcompanyname] 
#send props-east only to east datacenter servers 
#blacklist indexers to prevent unintended restart 
[serverClass:east] 
whitelist.0=*-east* 
whitelist.1=homer-simpson 
blacklist.0=spl-idx-* 
[serverClass:east:app:props-east] 
[serverClass:east:app:deploymentclient-yourcompanyname] 
#define our appserver inputs 
[serverClass:appservers] 
whitelist.0=app-* 
whitelist.1=qa01 
whitelist.2=homer-simpson 
[serverClass:appservers:app:inputs-app] 
#define our webserver inputs 
[serverClass:webservers] 
whitelist.0=web-* 
whitelist.1=qa01 
whitelist.2=homer-simpson 
[serverClass:webservers:app:inputs-web] 
#define our dbserver inputs 
[serverClass:dbservers] 
whitelist.0=db-* 
whitelist.1=qa01 
[serverClass:dbservers:app:inputs-db] 
#define our west coast forwarders 
[serverClass:fwd-west] 
whitelist.0=app-west* 
whitelist.1=web-west* 
whitelist.2=db-west* 
whitelist.3=qa01 
[serverClass:fwd-west:app:outputs-west] 
#define our east coast forwarders 
[serverClass:fwd-east] 
whitelist.0=app-east* 
whitelist.1=web-east* 
whitelist.2=db-east* 
whitelist.3=homer-simpson 
[serverClass:fwd-east:app:outputs-east]

You should organize the patterns and classes in a way that makes sense to your organization and data centers, but I would encourage you to keep it as simple as possible. I would strongly suggest opting for more lines than more complicated logic.
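Before deploying, it can help to dry-run your patterns against a list of hostnames. The sketch below approximates whitelist and blacklist matching with shell glob patterns. That is an assumption: the deployment server does its own matching, so treat this as a sanity check only. The function names are hypothetical, and the patterns are copied from the west and fwd-east classes above.

```shell
# matches_any HOST PATTERN...: succeed if HOST matches any glob PATTERN.
matches_any() {
  local host="$1" pat
  shift
  for pat in "$@"; do
    # An unquoted variable on the pattern side of case is used as a glob.
    case "$host" in
      $pat) return 0 ;;
    esac
  done
  return 1
}

# Hypothetical model of [serverClass:west]: a host belongs to the class
# if a whitelist pattern matches and no blacklist pattern matches.
in_class_west() {
  matches_any "$1" '*-west*' 'qa01' && ! matches_any "$1" 'spl-idx-*'
}

# Hypothetical model of [serverClass:fwd-east], which has no blacklist.
in_class_fwd_east() {
  matches_any "$1" 'app-east*' 'web-east*' 'db-east*' 'homer-simpson'
}

for host in spl-idx-west01 app-west01 qa01 homer-simpson; do
  if in_class_west "$host"; then echo "$host -> west"; fi
  if in_class_fwd_east "$host"; then echo "$host -> fwd-east"; fi
done
```

With these four hosts, spl-idx-west01 matches *-west* but is excluded by the blacklist, which is exactly the behavior that keeps the indexers from restarting.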

A few more things to note about the format of serverclass.conf:

The number following whitelist and blacklist must be sequential, starting with zero. For instance, in the following example, whitelist.3 will not be processed, since whitelist.2 is commented:

[serverClass:foo] 
whitelist.0=a* 
whitelist.1=b* 
# whitelist.2=c* 
whitelist.3=d*
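A gap like this is easy to miss by eye, so a small check can flag it. This is a sketch using awk, with a hypothetical function name, check_sequence. It assumes one setting per line and treats every line starting with [ as a new stanza.

```shell
# check_sequence CONF: report whitelist.N / blacklist.N entries whose
# number is not the next expected one within their stanza.
check_sequence() {
  awk '
    # A new stanza resets the expected index for both list types.
    /^[ \t]*\[/ {
      stanza = $0
      sub(/^[ \t]+/, "", stanza)
      sub(/[ \t]+$/, "", stanza)
      want["whitelist"] = 0
      want["blacklist"] = 0
      next
    }
    # Commented-out entries start with "#" and never match this rule.
    /^[ \t]*(whitelist|blacklist)\.[0-9]+[ \t]*=/ {
      key = $0
      sub(/[ \t]*=.*$/, "", key)   # drop "=value"
      sub(/^[ \t]*/, "", key)      # drop leading whitespace
      split(key, parts, ".")
      kind = parts[1]
      n = parts[2] + 0
      if (n == want[kind]) {
        want[kind]++
      } else {
        printf "%s: found %s.%d, expected %s.%d\n", stanza, kind, n, kind, want[kind]
      }
    }
  ' "$1"
}
```

Run against the example above, it reports that whitelist.3 was found in [serverClass:foo] where whitelist.2 was expected. A file with no gaps produces no output.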

whitelist.x and blacklist.x are tested against these values in the following order:
- clientName as defined in deploymentclient.conf: This is not commonly used but is useful when running multiple Splunk instances on the same machine, or when the DNS is completely unreliable.
- IP address: There is no CIDR matching, but you can use string patterns.
- Reverse DNS: This is the value returned by the DNS for an IP address. If your reverse DNS is not up to date, this can cause you problems, as this value is tested before the value of hostname, as provided by the host itself. If you suspect this, try ping <ip of machine> or something similar to see what the DNS is reporting.
- Hostname as provided by forwarder: This is always tested after reverse DNS, so be sure your reverse DNS is up to date.

When copying :app: lines, be very careful to update the <className> appropriately! This really is the most common mistake made in serverclass.conf.

Step 6 - restarting the deployment server

If serverclass.conf did not previously exist, the Splunk instance running the deployment server must be restarted to activate the deployment server. Once the deployment server is loaded, you can use the following command:

$SPLUNK_HOME/bin/splunk reload deploy-server

This command should be enough to pick up any changes in serverclass.conf and in etc/deployment-apps.

Step 7 - installing deploymentclient.conf

Now that we have a running deployment server, we need to set up the clients to call home. On each machine that will be running the deployment client, the procedure is essentially as follows:

Copy the deploymentclient-yourcompanyname app to $SPLUNK_HOME/etc/apps/

Restart Splunk

If everything is configured correctly, you should see the appropriate apps appear in $SPLUNK_HOME/etc/apps/, within a few minutes. To see what is happening, look at the log $SPLUNK_HOME/var/log/splunk/splunkd.log.

If you have problems, enable debugging on either the client or the server by editing $SPLUNK_HOME/etc/log.cfg, followed by a restart. Look for the following lines:

category.DeploymentServer=WARN 
category.DeploymentClient=WARN

Once found, change them to the following lines and restart Splunk:

category.DeploymentServer=DEBUG 
category.DeploymentClient=DEBUG

After restarting Splunk, you will see the complete conversation in $SPLUNK_HOME/var/log/splunk/splunkd.log. Be sure to change the setting back once you no longer need the verbose logging!
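Switching the two categories back and forth by hand is easy to forget, so you may prefer to script it. The following sketch uses a hypothetical function name, set_deploy_log_level, and assumes the two lines appear in the file exactly as shown above. It rewrites only those two lines, and a restart is still needed afterward.

```shell
# set_deploy_log_level CFG LEVEL
# Sets category.DeploymentServer and category.DeploymentClient in CFG
# to LEVEL (for example DEBUG or WARN), leaving other lines untouched.
set_deploy_log_level() {
  local cfg="$1" level="$2" tmp rc
  tmp=$(mktemp) || return 1
  sed -e "s/^\(category\.DeploymentServer\)=.*/\1=$level/" \
      -e "s/^\(category\.DeploymentClient\)=.*/\1=$level/" "$cfg" > "$tmp" &&
    cat "$tmp" > "$cfg"     # overwrite in place, keeping file permissions
  rc=$?
  rm -f "$tmp"
  return "$rc"
}
```

For example, set_deploy_log_level "$SPLUNK_HOME/etc/log.cfg" DEBUG turns on the verbose logging, and the same call with WARN turns it off again.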

To summarize, we learned how to deploy a binary and set up configuration distribution in Splunk. If you've enjoyed this excerpt, head over to the book, Implementing Splunk 7 - Third Edition to learn how to use the Machine Learning Toolkit and best practices and tips to help you implement Splunk services effectively and efficiently.
