Accepted answer

I ended up using MongoDB's official Java driver instead. This was my first experience with Spark and the Scala programming language, so I wasn't yet familiar with the idea of pulling in plain Java JARs.

The solution

I downloaded the necessary JARs and put them in the same directory as the job file (a Scala script), so the directory looked something like this:

/job_directory
|--job.scala
|--bson-3.0.1.jar
|--mongodb-driver-3.0.1.jar
|--mongodb-driver-core-3.0.1.jar

Then I start spark-shell as follows to load the JARs and their classes into the shell environment:

spark-shell --jars "mongodb-driver-3.0.1.jar,mongodb-driver-core-3.0.1.jar,bson-3.0.1.jar"
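To confirm the JARs were actually picked up before loading the job, a quick sanity check in the shell is to import one of the driver classes; the import fails immediately if the classpath is wrong:

import com.mongodb.MongoClient

As a side note, newer Spark versions also accept --packages org.mongodb:mongodb-driver:3.0.1, which resolves the driver and its dependencies from Maven Central so you don't have to download the JARs by hand.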

Next, I execute the following to load the source code of the job into the spark-shell:

:load job.scala

Finally, since :load only compiles and defines the object without running it, I invoke its main method manually:

MainObject.main(Array())

As for the code inside MainObject, it is simply what the tutorial shows:

import com.mongodb.MongoClient

// IP_OF_REMOTE_MONGO and DB_NAME are placeholders for your own values.
val mongo = new MongoClient(IP_OF_REMOTE_MONGO, 27017)
val db = mongo.getDB(DB_NAME)
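For completeness, here is a minimal sketch of what the whole job.scala might look like. The host, the database name, and the getCollectionNames smoke test are illustrative assumptions, not part of the original tutorial:

import com.mongodb.MongoClient

object MainObject {
  def main(args: Array[String]): Unit = {
    // Hypothetical host and database name; substitute your own.
    val mongo = new MongoClient("10.0.0.1", 27017)
    val db = mongo.getDB("mydb")

    // Smoke test: print the collection names in the database.
    println(db.getCollectionNames())

    // Close the client so spark-shell isn't left holding open connections.
    mongo.close()
  }
}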

Hopefully this will help future readers and spark-shell/Scala beginners!

