Creating a DataFrame from CSV
In this recipe, we'll look at how to create a new DataFrame from a delimiter-separated values file.
Note
The code for this recipe can be found at https://github.com/arunma/ScalaDataAnalysisCookbook/blob/master/chapter1-spark-csv/src/main/scala/com/packt/scaladata/spark/csv/DataFrameCSV.scala.
How to do it...
This recipe involves four steps:
1. Add the spark-csv support to our project.
2. Create a Spark Config object that gives information on the environment that we are running Spark in.
3. Create a Spark context that serves as an entry point into Spark. Then, we proceed to create an SQLContext from the Spark context.
4. Load the CSV using the SQLContext.
CSV support isn't first-class in Spark, but it is available through an external library from Databricks. So, let's go ahead and add that to our build.sbt. After adding the spark-csv dependency, our complete build.sbt looks like this:

organization := "com.packt"

name := "chapter1-spark-csv"

scalaVersion := "2.10.4"

val sparkVersion...
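The original listing is truncated above, so the remainder shown here is a sketch of how such a build.sbt is typically completed. The specific version numbers and the spark-csv artifact coordinates are assumptions for illustration, not taken from the original text:

```scala
organization := "com.packt"

name := "chapter1-spark-csv"

scalaVersion := "2.10.4"

// Assumed Spark version for this sketch; the original text truncates here
val sparkVersion = "1.4.1"

libraryDependencies ++= Seq(
  // Core Spark and Spark SQL, cross-built for the Scala version above
  "org.apache.spark" %% "spark-core" % sparkVersion,
  "org.apache.spark" %% "spark-sql"  % sparkVersion,
  // The external Databricks CSV data source (assumed version)
  "com.databricks"   %% "spark-csv"  % "1.4.0"
)
```

The `%%` operator tells sbt to append the Scala binary version (here, 2.10) to the artifact name, which is why a single dependency line works across Scala versions.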
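Putting the four steps together, a minimal program might look like the following sketch. It assumes the Spark 1.x API (SparkConf, SparkContext, and SQLContext) and a local master; the object name and the "StudentData.csv" path are placeholders, not taken from the original text:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object DataFrameCSV extends App {

  // Step 2: configure the environment Spark runs in (local mode with 2 threads)
  val conf = new SparkConf().setAppName("csvDataFrame").setMaster("local[2]")

  // Step 3: the SparkContext is the entry point into Spark,
  // and the SQLContext is built from it
  val sc = new SparkContext(conf)
  val sqlContext = new SQLContext(sc)

  // Step 4: load the CSV through the Databricks data source.
  // "StudentData.csv" is a placeholder file path for this sketch.
  val students = sqlContext.read
    .format("com.databricks.spark.csv")
    .option("header", "true") // treat the first line as column names
    .load("StudentData.csv")

  students.printSchema()
  students.show()
}
```

Note that the `sqlContext.read.format(...)` style shown here was introduced in Spark 1.4; on Spark 1.3 the equivalent call is `sqlContext.load("com.databricks.spark.csv", ...)` with an options map.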