Using the Spark shell
The Spark shell provides a simple way to analyze data interactively. It also helps you learn the Spark APIs by letting you try them out quickly. In addition, its similarity to the Scala shell and its support for the Scala APIs let you adapt quickly to Scala language constructs and make better use of the Spark APIs.
Note
The Spark shell implements a read-evaluate-print loop (REPL): you type in code, the shell evaluates it, and the result is printed on the console immediately, without a separate compile-and-build step to produce an executable.
Start it by running the following in the directory where you installed Spark:
./bin/spark-shell
When the Spark shell launches, it automatically creates the SparkSession and SparkContext objects. The SparkSession is available as spark and the SparkContext is available as sc.
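For example, once the shell is up you can inspect these objects and run a quick computation directly at the prompt. The following is a sample session; the exact res identifiers and object addresses will vary on your machine:

```
scala> spark
res0: org.apache.spark.sql.SparkSession = org.apache.spark.sql.SparkSession@...

scala> sc
res1: org.apache.spark.SparkContext = org.apache.spark.SparkContext@...

scala> sc.parallelize(1 to 5).sum
res2: Double = 15.0
```

Because of the REPL, each result is evaluated and printed as soon as you press Enter, which makes the shell well suited to exploratory work.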
spark-shell can be launched with several options, as shown in the following snippet (the most important ones are in bold):
./bin/spark...