score:1
Yes. I haven't used MongoDB, but based on other things I've done with Spark, these should all be quite possible.
However, keep in mind that a Spark application is not typically fault-tolerant: the application itself (the "driver") is a single point of failure. There is a related question on this topic, "Resources/Documentation on how does the failover process work for the Spark Driver (and its YARN Container) in yarn-cluster mode", but I don't think it has a really good answer at the moment.
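For what it's worth, if you run on YARN in cluster mode you can at least ask YARN to relaunch the driver when it dies. The following is only a rough sketch: `com.example.MyApp` and `my-app.jar` are made-up placeholders, and the values are arbitrary. Note that this only restarts the application from scratch; it does not recover whatever the driver was doing when it failed, so the driver is still a single point of failure in that sense.

```shell
# Sketch only: class name, jar name and values are hypothetical placeholders.
# spark.yarn.maxAppAttempts: how many times YARN may (re)submit the application;
#   it cannot exceed the cluster-wide yarn.resourcemanager.am.max-attempts.
# spark.yarn.am.attemptFailuresValidityInterval: failures older than this
#   window no longer count toward the attempt limit.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --conf spark.yarn.maxAppAttempts=4 \
  --conf spark.yarn.am.attemptFailuresValidityInterval=1h \
  --class com.example.MyApp \
  my-app.jar
```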
I have no experience running a critical HDFS cluster, so I don't know how much of a problem the single point of failure is in practice. Another option may be to run on top of Amazon S3 or Google Cloud Storage instead. I would expect these to be far more reliable than anything you can cook up yourself: they have large support teams and a lot of money and expertise invested.
Source: stackoverflow.com