Modern IDEs provide the functionality to generate the required build configuration files, but we will give some generic examples that could be useful not only here, but in future projects. Depending on the IDE you prefer, you might need to install some extra plugins to have things up and running, and a quick Google search should help.
SBT stands for Simple Build Tool. It uses Scala syntax to define how a project is built, how its dependencies are managed, and so on. It uses .sbt files for this purpose. It also supports a setup based on Scala code in .scala files, as well as a mix of both.
To download SBT, go to http://www.scala-sbt.org/1.0/docs/Setup.html and follow the instructions. If you wish to obtain the newest version, then simply Google it and use the result you get back.
The following screenshot shows the structure of a skeleton SBT project:
It is important to show the contents of the main .sbt files.
The version.sbt file looks as follows:
version in ThisBuild := "1.0.0-SNAPSHOT"
It contains the current version that is automatically incremented if a release is made.
The assembly.sbt file has the following content:
assemblyMergeStrategy in assembly := {
  case PathList("javax", "servlet", xs @ _*) => MergeStrategy.first
  case PathList(ps @ _*) if ps.last endsWith ".html" => MergeStrategy.first
  case "application.conf" => MergeStrategy.concat
  case "unwanted.txt" => MergeStrategy.discard
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}
assemblyJarName in assembly := { s"${name.value}_${scalaVersion.value}-${version.value}-assembly.jar" }
artifact in (Compile, assembly) := {
  val art = (artifact in (Compile, assembly)).value
  art.withClassifier(Some("assembly"))
}
addArtifact(artifact in (Compile, assembly), assembly)
It contains information about how to build the assembly JAR: a merge strategy, the final JAR name, and so on. It uses a plugin called sbt-assembly (https://github.com/sbt/sbt-assembly).
The build.sbt file contains the dependencies of the project, some extra information about the compiler, and metadata. The skeleton file looks as follows:
organization := "com.ivan.nikolov"
name := "skeleton-sbt"
scalaVersion := "2.12.4"
scalacOptions := Seq("-unchecked", "-deprecation", "-encoding", "utf8")
// -source and -target are javac flags, so they belong in javacOptions
javacOptions ++= Seq("-target", "1.8", "-source", "1.8")
publishMavenStyle := true
libraryDependencies ++= {
  val sparkVersion = "2.2.0"
  Seq(
    "org.apache.spark" % "spark-core_2.11" % sparkVersion % "provided",
    "com.datastax.spark" % "spark-cassandra-connector_2.11" % "2.0.5",
    "org.scalatest" %% "scalatest" % "3.0.4" % "test",
    "org.mockito" % "mockito-all" % "1.10.19" % "test" // mockito for tests
  )
}
As you can see, here we define the Java version against which we compile, some manifest information, and the library dependencies.
The dependencies for our project are defined in the libraryDependencies section of our SBT file. They have the following format:
"groupId" %[%] "artifactId" % "version" [% "scope"]
If we separate groupId and artifactId with %% instead of %, SBT will automatically use scalaVersion and append _2.12 (for Scala 2.12.*) to artifactId. This syntax is usually used when we include dependencies written in Scala, as the convention there requires the Scala version to be part of artifactId. We can, of course, manually append the Scala version to artifactId and use %. This is also what we do when we import libraries built for a different binary version of Scala (for example, 2.11). In the latter case, however, we need to be careful with binary compatibility. Not all libraries will be built for the version we use, so we either have to test them thoroughly and make sure they won't break our application, change our Scala version, or look for alternatives.
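As a small sketch of the two notations, assuming scalaVersion is set to 2.12.4 as in the skeleton build.sbt, the following two declarations should resolve to the same artifact. Only one of them would be kept in a real build, so the second is commented out:

```scala
// build.sbt fragment (sketch): assumes scalaVersion := "2.12.4"

// %% asks SBT to append the Scala binary version (_2.12) to the artifact name
libraryDependencies += "org.scalatest" %% "scalatest" % "3.0.4" % "test"

// Equivalent declaration with the suffix written by hand and a plain %
// (commented out so the dependency is not declared twice)
// libraryDependencies += "org.scalatest" % "scalatest_2.12" % "3.0.4" % "test"
```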
Note
The Spark and Datastax dependencies shown here will not be needed at any point in this book. They are here just for illustration purposes, and you can safely remove them.
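Older versions of SBT (before 0.13.7) required each statement in .sbt files to be separated from the previous one by a blank line. Newer versions, including the one used here, no longer need this, although keeping each setting on its own line remains good practice. When using .scala files, we just write code in Scala.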
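To illustrate mixing .sbt and .scala files, here is a minimal sketch of a helper object placed under the project directory. The file name Dependencies.scala and the names inside it are hypothetical and not part of the skeleton project; the versions are the ones used in the skeleton build.sbt:

```scala
// project/Dependencies.scala (hypothetical file and object names)
import sbt._

object Dependencies {
  // Versions kept in one place so that they can be reused
  val scalatestVersion = "3.0.4"

  val scalatest = "org.scalatest" %% "scalatest" % scalatestVersion % Test
  val mockito   = "org.mockito" % "mockito-all" % "1.10.19" % Test

  val testDependencies: Seq[ModuleID] = Seq(scalatest, mockito)
}

// In build.sbt, the object can then be referenced directly:
//   libraryDependencies ++= Dependencies.testDependencies
```

Code under the project directory is compiled as part of the build definition, which is why build.sbt can refer to the object without any extra import.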
The %% syntax in the dependencies is syntactic sugar that uses scalaVersion to complete the name of the library. For example, scalatest will become scalatest_2.12 in our case.
SBT allows the engineer to express the same things differently. One example is the preceding dependencies—instead of adding a sequence of dependencies, we can add them one by one. The final result will be the same. There is also a lot of flexibility with other parts of SBT. For more information on SBT, refer to the documentation.
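For instance, the two test dependencies from the skeleton build.sbt could be added one at a time with += instead of as a sequence with ++=. This is only a sketch of the alternative notation; the resulting dependency list should be the same:

```scala
// build.sbt fragment (sketch): dependencies added one by one with +=
libraryDependencies += "org.scalatest" %% "scalatest" % "3.0.4" % "test"
libraryDependencies += "org.mockito" % "mockito-all" % "1.10.19" % "test"
```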
The project/build.properties file defines the sbt version to be used when building and interacting with the application under sbt. It is as simple as the following:
sbt.version = 1.1.0
Finally, there is the project/plugins.sbt file that defines different plugins used to get things up and running. We already mentioned sbt-assembly:
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.5")
Note
There are different plugins online that provide useful functionalities. Here are some common sbt commands that can be run from the root folder in the Terminal of this skeleton project:
- sbt: This opens the sbt console for the current project. All of the commands that follow can be issued from here by omitting the sbt keyword.
- sbt test: This runs the application unit tests.
- sbt compile: This compiles the application.
- sbt assembly: This creates an assembly of the application (a fat JAR) that can be run like any other Java JAR.
Maven holds its configuration in files named pom.xml. It supports multimodule projects easily, while sbt needs some extra work. In Maven, each module simply has its own child pom.xml file.
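To give an idea of the extra work on the sbt side, here is a minimal sketch of a multimodule build.sbt. The module names core and app (and their directories) are hypothetical and not part of the skeleton project; the organization, version, and Scala version are taken from the skeleton files:

```scala
// build.sbt (sketch): the module names "core" and "app" are hypothetical

// Settings shared by every module
lazy val commonSettings = Seq(
  organization := "com.ivan.nikolov",
  version := "1.0.0-SNAPSHOT",
  scalaVersion := "2.12.4"
)

// A library module living in the core/ directory
lazy val core = (project in file("core"))
  .settings(
    commonSettings,
    name := "skeleton-core"
  )

// An application module in app/ that depends on core
lazy val app = (project in file("app"))
  .dependsOn(core)
  .settings(
    commonSettings,
    name := "skeleton-app"
  )

// The root project aggregates the modules, so that commands such as
// compile or test issued at the root run in each of them
lazy val root = (project in file("."))
  .aggregate(core, app)
  .settings(
    commonSettings,
    name := "skeleton-sbt"
  )
```

Unlike Maven, where each module carries its own pom.xml, this layout keeps all module definitions in one file, which is one common choice rather than the only one.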
To download Maven, go to https://maven.apache.org/download.cgi.
The following screenshot shows the structure of a skeleton Maven project:
The main pom.xml file is much longer than the preceding SBT solution. Let's have a look at its parts separately.
At the beginning, there is usually some metadata about the project and different properties that can be used in the POM files:
<modelVersion>4.0.0</modelVersion>
<groupId>com.ivan.nikolov</groupId>
<artifactId>skeleton-mvn</artifactId>
<version>1.0.0-SNAPSHOT</version>
<properties>
  <scala.version>2.12.4</scala.version>
  <scalatest.version>3.0.4</scalatest.version>
  <spark.version>2.2.0</spark.version>
</properties>
Then, there are the dependencies:
<dependencies>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.11</artifactId>
    <version>${spark.version}</version>
    <scope>provided</scope>
  </dependency>
  <dependency>
    <groupId>com.datastax.spark</groupId>
    <artifactId>spark-cassandra-connector_2.11</artifactId>
    <version>2.0.5</version>
  </dependency>
  <dependency>
    <groupId>org.scala-lang</groupId>
    <artifactId>scala-library</artifactId>
    <version>${scala.version}</version>
  </dependency>
  <dependency>
    <groupId>org.scalatest</groupId>
    <artifactId>scalatest_2.12</artifactId>
    <version>${scalatest.version}</version>
    <scope>test</scope>
  </dependency>
  <dependency>
    <groupId>org.mockito</groupId>
    <artifactId>mockito-all</artifactId>
    <version>1.10.19</version>
    <scope>test</scope>
  </dependency>
  <dependency>
    <groupId>junit</groupId>
    <artifactId>junit</artifactId>
    <version>4.12</version>
    <scope>test</scope>
  </dependency>
</dependencies>
Finally, there are the build definitions. Here, we can use various plugins to do different things with our project and give hints to the compiler. The build definitions are enclosed in the <build> tags.
First, we specify some resources:
<sourceDirectory>src/main/scala</sourceDirectory>
<testSourceDirectory>src/test/scala</testSourceDirectory>
<resources>
  <resource>
    <directory>${basedir}/src/main/resources</directory>
  </resource>
</resources>
The first plugin we have used is scala-maven-plugin, which is used when working with Scala and Maven:
<plugin>
  <groupId>net.alchim31.maven</groupId>
  <artifactId>scala-maven-plugin</artifactId>
  <version>3.3.1</version>
  <executions>
    <execution>
      <goals>
        <goal>compile</goal>
        <goal>testCompile</goal>
      </goals>
    </execution>
  </executions>
  <configuration>
    <scalaVersion>${scala.version}</scalaVersion>
  </configuration>
</plugin>
Another plugin we use is maven-assembly-plugin, which is used for building the fat JAR of the application:
<plugin>
  <artifactId>maven-assembly-plugin</artifactId>
  <version>3.1.0</version>
  <configuration>
    <appendAssemblyId>false</appendAssemblyId>
    <descriptorRefs>
      <descriptorRef>jar-with-dependencies</descriptorRef>
    </descriptorRefs>
  </configuration>
  <executions>
    <execution>
      <id>make-assembly</id>
      <phase>package</phase>
      <goals>
        <goal>single</goal>
      </goals>
    </execution>
  </executions>
</plugin>
The complete pom.xml file is equivalent to the preceding sbt files that we presented.
As before, the Spark and Datastax dependencies are here just for illustration purposes.
Note
The use of JUnit to run unit tests in Scala 2.12
If you look into the dependencies in more depth, you will see that we have imported junit, which is a Java testing framework. At first glance, it might seem that we don't actually need it. However, there is a catch. A quick Google search about how to run Scalatest unit tests with Maven would point to resources recommending the use of scalatest-maven-plugin. If we followed those instructions and tried running some tests from the command line, we would get a strange error. This is because we use Scala 2.12, and scalatest-maven-plugin at its current version is not binary compatible with this version of the language.
Like many things in software engineering, we have to find workarounds. Here, we could do two things:
- Use an older version of Scala.
- Force Maven to run our tests.
Of course, the second option is the more desirable. This means that the only thing we need to do in each Scalatest test class we write is to add the following annotation:
@RunWith(classOf[JUnitRunner])
and make sure our test classes contain the word Test in their name.
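As a minimal sketch of this workaround, a test class could look like the following. The class name and the test itself are hypothetical; the imports assume the scalatest 3.0.4 and junit 4.12 dependencies declared in the pom.xml above:

```scala
import org.junit.runner.RunWith
import org.scalatest.{FlatSpec, Matchers}
import org.scalatest.junit.JUnitRunner

// The annotation lets the JUnit runner (and therefore Maven) execute the spec.
// The class name ends in Test so that Maven picks it up as a test class.
@RunWith(classOf[JUnitRunner])
class StringReverseTest extends FlatSpec with Matchers {

  "reverse" should "invert the order of the characters" in {
    "abc".reverse shouldBe "cba"
  }

  it should "leave an empty string unchanged" in {
    "".reverse shouldBe ""
  }
}
```

With a class like this under src/test/scala, running mvn clean test should execute the spec through JUnit without the need for scalatest-maven-plugin.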
Similarly to SBT, you can use Maven from the command line. Some of the commands you might find most useful with the example projects in this book are shown in the next tip.
Note
Useful Maven commands:
- mvn clean test: This runs the application unit tests.
- mvn clean compile: This compiles the application.
- mvn clean package: This creates an assembly of the application (a fat JAR) that can be run like any other Java JAR.