Spark configuration
There are a number of ways to configure your Spark jobs, and this section discusses them. More specifically, as of the Spark 2.x release, there are three locations where the system can be configured:
- Spark properties
- Environment variables
- Logging
Spark properties
As discussed previously, Spark properties control most of the application-specific parameters and can be set using a SparkConf object. Alternatively, these parameters can be set through Java system properties. SparkConf
allows you to configure some of the common properties as follows:
setAppName()      // App name
setMaster()       // Master URL
setSparkHome()    // Set the location where Spark is installed on worker nodes
setExecutorEnv()  // Set single or multiple environment variables to be used when launching executors
setJars()         // Set JAR files to distribute to the cluster
setAll()          // Set multiple parameters together
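As a minimal sketch of how these setters fit together (the application name, master URL, environment variable, and JAR path below are placeholders, not values from this text), a SparkConf can be built and handed to a SparkContext like this:

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("MySparkApp")                      // hypothetical application name
  .setMaster("spark://master:7077")              // hypothetical standalone master URL
  .setExecutorEnv("SPARK_LOCAL_DIRS", "/tmp")    // example executor environment variable
  .setJars(Seq("target/my-app.jar"))             // hypothetical JAR to distribute to the cluster

val sc = new SparkContext(conf)                  // the SparkContext picks up these settings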
An application can be configured to use a number of available cores on your machine. For example...
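As a hedged illustration (the application name is a placeholder), you can request a fixed number of local cores with local[N], or all available cores with local[*]:

// Use exactly 2 cores of the local machine
val confTwoCores = new SparkConf().setAppName("CoreDemo").setMaster("local[2]")

// Use all cores available on the local machine
val confAllCores = new SparkConf().setAppName("CoreDemo").setMaster("local[*]")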