Spark LagContext Programming Guide
- Spark Shell Example
- Start Spark Shell with LAGraph
- Setup environment
- Create Graph from RDD
- Initialize distance-only semiring
- Initialize adjacency matrix for distance-only (common)
- Iterate for distance-only (common)
- Initialize path-augmented semiring
- Initialize adjacency matrix for path-augmented (common)
- Iterate for path-augmented (common)
Spark Shell Example
This section uses the Bellman-Ford algorithm to demonstrate the implementation and application of a linear algebra-based graph algorithm.
Start Spark Shell with LAGraph
The Spark Overview provides instructions for accessing Spark.
To use LAGraph with Spark Shell, the LAGraph jar can be referenced using Spark Shell’s --jars option.
For example:
Setup environment
First, import classes from the LAGraph package and define a couple of functions to facilitate printing results.
Create Graph from RDD
Create a simple graph in the form of an RDD where each edge is represented as an element with an index of (source, destination) and a value of $1.0$. Then set the parallelism and use the SparkContext to obtain a DstrLagContext.
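As a minimal sketch of the edge representation described above, the following builds a small, made-up edge list with plain Scala collections. The vertex ids and the names `edges` and `numVertices` are illustrative assumptions. In Spark Shell, `sc.parallelize(edges)` would turn the sequence into an RDD; the call that obtains a DstrLagContext is not shown because its signature is not given here.

```scala
// Hypothetical 4-vertex graph; each element is ((source, destination), value).
val edges: Seq[((Long, Long), Double)] = Seq(
  ((0L, 1L), 1.0),
  ((1L, 2L), 1.0),
  ((2L, 3L), 1.0),
  ((0L, 3L), 1.0)
)

// Number of vertices, derived from the largest vertex id that appears.
val numVertices: Long = edges.flatMap { case ((s, d), _) => Seq(s, d) }.max + 1

// In Spark Shell, where `sc` is the predefined SparkContext:
// val rdd = sc.parallelize(edges)
```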
Initialize distance-only semiring
Set up a semiring to perform the distance-only version of the algorithm.
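To make the idea concrete, here is a sketch of a distance-only semiring, assuming it is the usual min-plus semiring used in linear-algebraic formulations of Bellman-Ford. The `Semiring` case class and the name `minPlus` are hypothetical and written with plain Scala; they are not the LAGraph API.

```scala
// A semiring bundles an "addition" with identity `zero` and a
// "multiplication" with identity `one`.
final case class Semiring[T](add: (T, T) => T, mul: (T, T) => T, zero: T, one: T)

// Min-plus semiring over Double:
// addition is min (identity +infinity), multiplication is + (identity 0.0).
val minPlus: Semiring[Double] = Semiring[Double](
  (a, b) => math.min(a, b),
  (a, b) => a + b,
  Double.PositiveInfinity,
  0.0
)

// Choosing between two candidate distances keeps the shorter one.
val shorter = minPlus.add(3.0, 5.0)
// Extending a path of length 3.0 by an edge of weight 5.0 sums the two.
val extended = minPlus.mul(3.0, 5.0)
```

With these operations, a missing edge is represented by `zero` (+infinity), which never wins a `min` and stays infinite when extended.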
Initialize adjacency matrix for distance-only (common)
Set up the algorithm using code that is common to both the distance-only and path-augmented semirings.
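The setup step can be sketched on a small dense matrix, assuming min-plus conventions: the diagonal holds the multiplicative identity (0.0), missing edges hold the additive identity (+infinity), and the initial distance vector is 0.0 at the source and +infinity elsewhere. The graph, the source vertex, and all names below are illustrative assumptions; LAGraph's own distributed matrix types are not used.

```scala
val inf = Double.PositiveInfinity
val n = 4       // hypothetical number of vertices
val source = 0  // hypothetical source vertex
val edges = Seq(((0, 1), 1.0), ((1, 2), 1.0), ((2, 3), 1.0))

// Adjacency matrix: a(i)(j) is the weight of edge i -> j,
// 0.0 on the diagonal and +infinity where there is no edge.
val a: Array[Array[Double]] =
  Array.tabulate(n, n)((i, j) => if (i == j) 0.0 else inf)
edges.foreach { case ((i, j), w) => a(i)(j) = w }

// Initial distance vector: 0.0 at the source, +infinity everywhere else.
val d0: Array[Double] = Array.tabulate(n)(i => if (i == source) 0.0 else inf)
```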
Iterate for distance-only (common)
Perform the calculation using code that is common to both the distance-only and path-augmented semirings.
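As a rough illustration of the iteration, the sketch below repeats a min-plus vector-matrix product until the distance vector stops changing, for at most n - 1 rounds. It uses a hypothetical dense matrix with one longer edge (0 -> 3 with weight 4.0) so that relaxation is visible; the names `relax` and `bellmanFord` are assumptions and not part of LAGraph.

```scala
val inf = Double.PositiveInfinity

// Hypothetical weighted adjacency matrix: a(i)(j) is the weight of edge i -> j.
val a: Array[Array[Double]] = Array(
  Array(0.0, 1.0, inf, 4.0),
  Array(inf, 0.0, 1.0, inf),
  Array(inf, inf, 0.0, 1.0),
  Array(inf, inf, inf, 0.0)
)

// One relaxation step: next(j) = min over i of (d(i) + a(i)(j)).
def relax(d: Array[Double]): Array[Double] =
  Array.tabulate(d.length)(j => d.indices.map(i => d(i) + a(i)(j)).min)

// Repeat until the vector stops changing, at most n - 1 times.
def bellmanFord(source: Int): Array[Double] = {
  var d = Array.tabulate(a.length)(i => if (i == source) 0.0 else inf)
  var changed = true
  var k = 0
  while (changed && k < a.length - 1) {
    val next = relax(d)
    changed = !next.sameElements(d)
    d = next
    k += 1
  }
  d
}

val dist = bellmanFord(0)
```

In this example the direct edge to vertex 3 (weight 4.0) is replaced by the three-hop path of total weight 3.0 on the third round.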
Initialize path-augmented semiring
Set up a semiring to perform the path-augmented version of the algorithm.
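One possible shape for a path-augmented semiring is sketched below. It assumes each element carries a distance, a hop count, and a predecessor vertex, with -1 meaning "no predecessor". This encoding, the `PathElem` type, and the operations are illustrative assumptions and may differ from the definitions LAGraph actually uses.

```scala
// Element carried per vertex: distance, hop count, predecessor (-1 = none).
final case class PathElem(dist: Double, hops: Int, pred: Int)

// Additive identity ("no path") and multiplicative identity ("empty path").
val zero = PathElem(Double.PositiveInfinity, Int.MaxValue, -1)
val one = PathElem(0.0, 0, -1)

// "Addition": keep the lexicographically smaller (dist, hops, pred).
def add(x: PathElem, y: PathElem): PathElem =
  if (x.dist != y.dist) { if (x.dist < y.dist) x else y }
  else if (x.hops != y.hops) { if (x.hops < y.hops) x else y }
  else if (x.pred <= y.pred) x else y

// "Multiplication": extend path `p` by edge entry `e`. Distances and hops
// add up; the predecessor comes from the edge unless the edge is `one`.
def mul(p: PathElem, e: PathElem): PathElem =
  if (p.dist.isPosInfinity || e.dist.isPosInfinity) zero
  else PathElem(p.dist + e.dist, p.hops + e.hops, if (e.pred == -1) p.pred else e.pred)

// An edge 7 -> j of weight 1.0 is encoded as PathElem(1.0, 1, 7).
val viaSeven = mul(PathElem(2.0, 2, 5), PathElem(1.0, 1, 7))
// Of two equally long paths, the one with fewer hops is kept.
val best = add(viaSeven, PathElem(3.0, 2, 4))
```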
Initialize adjacency matrix for path-augmented (common)
Using the same code as was used for the distance-only semiring, set up the algorithm for the path-augmented semiring.
Iterate for path-augmented (common)
Using the same code as was used for the distance-only semiring, perform the calculation for the path-augmented semiring.
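The reuse of common code across both semirings can be illustrated with a driver that is generic in the semiring. The sketch below is written with plain Scala collections under assumed definitions: `Semiring`, `PathElem`, `shortestPaths`, `matrix`, and `pathTo` are hypothetical names, the graph is made up, and none of this is the LAGraph API.

```scala
import scala.reflect.ClassTag

// Generic semiring: `add` with identity `zero`, `mul` with identity `one`.
final case class Semiring[T](add: (T, T) => T, mul: (T, T) => T, zero: T, one: T)

// Common driver: start from `one` at the source and `zero` elsewhere, then
// apply next(j) = add over i of mul(d(i), a(i)(j)) for n - 1 rounds.
def shortestPaths[T: ClassTag](sr: Semiring[T], a: Array[Array[T]], source: Int): Array[T] = {
  val n = a.length
  var d = Array.tabulate(n)(i => if (i == source) sr.one else sr.zero)
  for (_ <- 1 until n)
    d = Array.tabulate(n)(j => (0 until n).map(i => sr.mul(d(i), a(i)(j))).reduce(sr.add))
  d
}

val inf = Double.PositiveInfinity

// Distance-only instance (min-plus).
val minPlus = Semiring[Double]((x, y) => math.min(x, y), (x, y) => x + y, inf, 0.0)

// Path-augmented instance (assumed encoding): distance, hops, predecessor (-1 = none).
final case class PathElem(dist: Double, hops: Int, pred: Int)
val pZero = PathElem(inf, Int.MaxValue, -1)
val pOne = PathElem(0.0, 0, -1)
val pAdd: (PathElem, PathElem) => PathElem = (x, y) =>
  if (x.dist != y.dist) { if (x.dist < y.dist) x else y }
  else if (x.hops != y.hops) { if (x.hops < y.hops) x else y }
  else if (x.pred <= y.pred) x else y
val pMul: (PathElem, PathElem) => PathElem = (p, e) =>
  if (p.dist.isPosInfinity || e.dist.isPosInfinity) pZero
  else PathElem(p.dist + e.dist, p.hops + e.hops, if (e.pred == -1) p.pred else e.pred)
val pathAug = Semiring[PathElem](pAdd, pMul, pZero, pOne)

// Hypothetical graph as (source, destination, weight) triples.
val n = 4
val edges = Seq((0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (0, 3, 4.0))

// Common setup: `one` on the diagonal, `zero` for missing edges,
// and a semiring-specific encoding of each edge.
def matrix[T: ClassTag](sr: Semiring[T])(entry: (Int, Double) => T): Array[Array[T]] = {
  val m = Array.tabulate(n, n)((i, j) => if (i == j) sr.one else sr.zero)
  edges.foreach { case (i, j, w) => m(i)(j) = entry(i, w) }
  m
}

val dist = shortestPaths(minPlus, matrix(minPlus)((_, w) => w), 0)
val paths = shortestPaths(pathAug, matrix(pathAug)((i, w) => PathElem(w, 1, i)), 0)

// Walk predecessors back from `v` to recover the vertex sequence.
def pathTo(v: Int): List[Int] = {
  var path = List(v)
  while (paths(path.head).pred != -1) path = paths(path.head).pred :: path
  path
}
```

Only the semiring and the edge encoding change between the two runs; `shortestPaths` and `matrix` are shared, which mirrors the "(common)" steps in this guide.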