I ended up using MongoDB's official Java driver instead. This was my first experience with Spark and the Scala programming language, so I wasn't yet familiar with the idea of pulling in plain Java JARs.
The solution
I downloaded the necessary JARs and stored them in the same directory as the job file, which is a Scala file. So the directory looked something like:
/job_directory
|--job.scala
|--bson-3.0.1.jar
|--mongodb-driver-3.0.1.jar
|--mongodb-driver-core-3.0.1.jar
Then I start spark-shell as follows, to load the JARs and their classes into the shell environment:
spark-shell --jars "mongodb-driver-3.0.1.jar,mongodb-driver-core-3.0.1.jar,bson-3.0.1.jar"
Next, I execute the following to load the source code of the job into the spark-shell:
:load job.scala
Finally, I execute the main object of my job like so:
MainObject.main(Array())
As for the code inside MainObject, it is just what the tutorial shows:
import com.mongodb.MongoClient

val mongo = new MongoClient(IP_OF_REMOTE_MONGO, 27017)
val db = mongo.getDB(DB_NAME)
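For reference, a minimal job.scala putting the pieces together might look like the sketch below. This assumes the 3.0.x Java driver API (getDB is the pre-MongoDatabase API used above); the host, database, and collection names are placeholders you would replace with your own.

// job.scala -- minimal sketch, assuming the MongoDB Java driver 3.0.x API.
// "REMOTE_HOST", "DB_NAME" and "COLLECTION_NAME" are placeholders.
import com.mongodb.MongoClient

object MainObject {
  def main(args: Array[String]): Unit = {
    // Connect to the remote MongoDB instance on the default port
    val mongo = new MongoClient("REMOTE_HOST", 27017)
    try {
      val db = mongo.getDB("DB_NAME")
      val collection = db.getCollection("COLLECTION_NAME")
      // A cheap sanity check that the connection actually works
      println(s"Documents in collection: ${collection.count()}")
    } finally {
      mongo.close() // release the driver's connection pool
    }
  }
}

With this file in place, the spark-shell session above (:load job.scala followed by MainObject.main(Array())) runs it end to end.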
Hopefully this will help future readers and spark-shell/Scala beginners!