VertexRDD and EdgeRDD
A VertexRDD contains the set of vertices (nodes) in a specialized data structure, and an EdgeRDD contains the set of edges (links) between those vertices, again in a specialized data structure. Both VertexRDD and EdgeRDD are built on RDDs: the VertexRDD holds every node in the graph, while the EdgeRDD holds every link between nodes. In this section, we will look at how to create a VertexRDD and an EdgeRDD and then use these objects to build a graph.
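As a rough preview of how the two structures fit together, the following minimal sketch builds a small graph from a vertex collection and an edge collection and then reads back its VertexRDD and EdgeRDD. The object name GraphSketch, the helper buildGraph, the sample names, the "follows" edge label, and the local[*] master are all assumptions made for illustration; they do not come from the text.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, EdgeRDD, Graph, VertexId, VertexRDD}
import org.apache.spark.rdd.RDD

// Hypothetical example object; names and data are illustrative only.
object GraphSketch {

  // Builds a tiny graph with String attributes on both vertices and edges.
  def buildGraph(sc: SparkContext): Graph[String, String] = {
    // Each vertex is a (VertexId, attribute) pair.
    val vertices: RDD[(VertexId, String)] = sc.parallelize(Seq(
      (1L, "Alice"), (2L, "Bob"), (3L, "Carol")))

    // Each edge links a source VertexId to a destination VertexId.
    val edges: RDD[Edge[String]] = sc.parallelize(Seq(
      Edge(1L, 2L, "follows"), Edge(2L, 3L, "follows")))

    Graph(vertices, edges)
  }

  def main(args: Array[String]): Unit = {
    // Assumed local setup so the sketch can run without a cluster.
    val conf = new SparkConf().setAppName("GraphSketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val graph = buildGraph(sc)
      // The graph exposes its vertices and edges as the specialized RDD types.
      val vertexRdd: VertexRDD[String] = graph.vertices
      val edgeRdd: EdgeRDD[String] = graph.edges
      println(s"vertices: ${vertexRdd.count()}, edges: ${edgeRdd.count()}")
    } finally {
      sc.stop()
    }
  }
}
```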
VertexRDD
As seen earlier, the VertexRDD is an RDD containing the vertices and their associated attributes. Each element in the RDD represents a vertex (node) in the graph. To keep every vertex distinct, we need a way of assigning a unique ID to each of the vertices. For this purpose, GraphX defines an important identifier known as VertexId.
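The idea of pairing each vertex with a unique VertexId can be sketched as follows. This is a minimal illustration, not a definitive recipe: the object name VertexRddSketch, the helper buildVertices, the sample names used as vertex attributes, and the local[*] master are assumptions introduced here. The IDs are written with an L suffix because VertexId values are 64-bit.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{VertexId, VertexRDD}
import org.apache.spark.rdd.RDD

// Hypothetical example object; names and data are illustrative only.
object VertexRddSketch {

  // Builds a VertexRDD whose attribute type is String.
  def buildVertices(sc: SparkContext): VertexRDD[String] = {
    // A plain RDD of (VertexId, attribute) pairs; every ID is unique.
    val users: RDD[(VertexId, String)] = sc.parallelize(Seq(
      (1L, "Alice"), (2L, "Bob"), (3L, "Carol")))

    // Wrap the pair RDD in GraphX's specialized vertex structure.
    VertexRDD(users)
  }

  def main(args: Array[String]): Unit = {
    // Assumed local setup so the sketch can run without a cluster.
    val conf = new SparkConf().setAppName("VertexRddSketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val vertices = buildVertices(sc)
      // Each element is one vertex: its ID together with its attribute.
      vertices.collect().foreach { case (id, name) => println(s"$id -> $name") }
    } finally {
      sc.stop()
    }
  }
}
```

Keeping the construction in a small helper makes it easy to reuse the same vertices when an EdgeRDD is added later.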
Note
VertexId is defined as a 64-bit vertex identifier that uniquely identifies a vertex within a graph. It does not need to follow any ordering...