score:1
Accepted answer
you can group the array
and calculate the sum
of values.
// raw rdd
val hashedrdd = spark.sparkcontext.parallelize(seq(
("abc",array(("asd",1),("asd",3),("cvd",2),("cvd",2),("xyz",1)))
))
//group by first value and calculate the sum
val y = hashedrdd.map(x => {
(x._1, x._2.groupby(_._1).mapvalues(_.map(_._2).sum))
})
output:
y.foreach(println)
(abc,map(xyz -> 1, asd -> 4, cvd -> 4))
hope this helps!
score:0
one way would be to reduce
on the tuples after groupby
(of the first entry):
@ hashedrdd.map { f => (f._1, f._2.groupby{ _._1 }.map{ _._2.reduce{ (a,b)=>(a._1, a._2+b._2) } } )}.collect
res11: array[(string, map[string, int])] = array(("abc", map("xyz" -> 1, "asd" -> 4, "cvd" -> 4)))
Source: stackoverflow.com
Related Query
- Add Int values of RDD[String,Array[String,Int]]
- Scala - add unapply to Int
- How to add days (as values of a column) to date?
- Add values to Session during testing (FakeRequest, FakeApplication)
- Writing a scala function which can take Int or Double values
- How do add values of selective rows from a list in an functional style?
- How to add null values in an array in spark scala
- Spark Scala - How do I iterate rows in dataframe, and add calculated values as new columns of the data frame
- About how to add a new column to an existing DataFrame with random values in Scala
- In Scala, only add items to Map if Optional values are present
- Map of Iterables - a generic way to add values to an iterable under the given key
- Scala: Add values to sortedSet or array using for loop
- How to add a dot in scala enums values
- Add auto incremented values to scala map for null values
- Add or modify values in a shapeless HMap
- Spark comparing boolean column with string column works differently to comparing int and string where values are equal
- How to check whether multiple columns values of a row are not null and then add a true/false resulting column in Spark Scala
- How can I add values in key value pairs generated in scala
- Add a new calculated column from 2 values in RDD
- How to add List[String] values to a single column in Dataframe
- Why I get different values in scala comparing Long to Int
- Add Column to DataFrame With Aggregated Values
- How to add index in values with array type column
- Add new column to dataframe based on previous values and condition
- Add multiple array values to a list is possible in scala?
- How to add data frame contents in scala ignore null values
- Add global custom values to Play Framework logger
- Is there a way to add literals as columns to a spark dataframe when reading the multiple files at once if the column values depend on the filepath?
- add new column in a dataframe depending on another dataframe's row values
- How to add a column collection based on the maximum and minimum values in a dataframe
More Query from same tag
- (Pattern) matching against strings in Scala
- Difference between 1 :: 2 :: Nil vs. (1 :: (2 :: Nil))?
- Passing Parameters Made in a Case Class to a Trait
- Type-safe usage of Java reflection in Scala
- Scala: Get the id attribute of the last object in a list, or get None if list is empty
- Scala List toString()
- way to connect windows on intellij to spark on cluster
- Meaning of operator ~> in Scala ?
- How to initialize fixture once for all tests in scalatest?
- What does com.typesafe.config.ConfigFactory.load(Config) do?
- How to deal with ^C in JVM console applications?
- Spark shell command lines
- In Scala, does Futures.awaitAll terminate the thread on timeout?
- Why am I getting an "undefined setting" for proj/test:executeTests with the new <task>.all(<scope filter>) API?
- Futures to send concurrent HTTP GET requests
- Scala, Exercise with recursive function
- Scala foreach strange behaviour
- Validate elements in a collection, return Failure for first invalid element
- Spark (Scala) Pairwise subtract all rows in data frame
- Difference between GroupByKey($"col") and GroupBy($"col") in spark scala
- Extending generic Serializables with implicit conversions
- p.nettyException - Handling TooLongFrameException - Play! framework
- scala: shortcut to specify different alternatives in a condition (operator isOneOf)
- S.redirectTo leads always to a blank screen
- Improvements to a custom scala recursion prevention mechanisem
- Importing functions from outside a project in scala [sbt]
- Java API interface
- Recompile build.sbt and project/ before testOnly
- how to pass a Graph as parameter to a Method in Scala
- Scala 2.12 uses Java 1.8; what should we do if we are unable to upgrade to Java 1.8?