score:2
Accepted answer
For a vertical result (one row per field), run one query per field and union the resulting DataFrames:
val dfs = x.map(field => spark.sql(s"select '$field' as fieldName, sum($field) from dftable"))
val withSum = dfs.reduce((x, y) => x.union(y)).distinct()
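As a rough, self-contained sketch of the vertical approach: the sample data, the local session settings, the field list `x`, and the `total` alias below are assumptions for illustration, not part of the original question. Aliasing the sum gives every per-field DataFrame the same column name before the union.

```scala
import org.apache.spark.sql.SparkSession

// Local session for illustration; in spark-shell the existing session is reused.
val spark = SparkSession.builder().master("local[*]").appName("vertical-sums").getOrCreate()
import spark.implicits._

// Hypothetical sample data, registered under the table name used in the answer.
val df = Seq((1, 2, 3), (3, 4, 5), (1, 2, 4)).toDF("A", "B", "C")
df.createOrReplaceTempView("dftable")

// Hypothetical list of fields to sum.
val x = Seq("A", "B", "C")

// One single-row DataFrame per field: (fieldName, total).
val dfs = x.map { field =>
  spark.sql(s"select '$field' as fieldName, sum(`$field`) as total from dftable")
}

// union matches columns by position, so all inputs must share the same layout.
val withSum = dfs.reduce(_ union _)
withSum.show()
```

Each field triggers its own scan of `dftable`, so for a long field list the single-query variants are likely cheaper.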
For a horizontal result (a single row with one sum per field), build the select list as a string and run a single query:
val sums = x.map(y => s"sum($y)").mkString(", ")
spark.sql(s"select $sums from dftable")
Here sums is a string of the form "sum(field1), sum(field2)".
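To see what the generated SQL looks like, the string-building step can be run on its own without Spark. This sketch assumes two hypothetical field names and adds two things the answer does not have: backtick quoting of identifiers and a `sum_<field>` alias for each result column.

```scala
// Hypothetical field names; in the answer they come from `x`.
val x = Seq("field1", "field2")

// Backticks guard against field names with spaces or reserved words;
// the alias gives each result column a predictable name.
val sums = x.map(y => s"sum(`$y`) as `sum_$y`").mkString(", ")
val query = s"select $sums from dftable"

println(query)
// prints: select sum(`field1`) as `sum_field1`, sum(`field2`) as `sum_field2` from dftable
```

Because the field names are spliced directly into the SQL text, this is only safe when they come from a trusted source such as `df.columns`.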
You can also use the DataFrame DSL:
import org.apache.spark.sql.functions._
val sums = for (field <- x) yield { sum(col(field)) }
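// agg takes one Column followed by varargs, so pass the head and the tail separately
df.agg(sums.head, sums.tail: _*)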
The result is the same as with the SQL string: a single row with one sum per field.
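A minimal end-to-end sketch of the DSL variant follows. The sample data, the local session settings, and the `sum_<field>` aliases are assumptions added for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, sum}

// Local session for illustration; in spark-shell the existing session is reused.
val spark = SparkSession.builder().master("local[*]").appName("dsl-sums").getOrCreate()
import spark.implicits._

// Hypothetical sample data and field list.
val df = Seq((1, 2, 3), (3, 4, 5), (1, 2, 4)).toDF("A", "B", "C")
val x = Seq("A", "B", "C")

// One aggregate Column per field, aliased so the output names are predictable.
val sums = x.map(field => sum(col(field)).as(s"sum_$field"))

// agg(expr, exprs*) needs at least one Column, hence head and tail.
val result = df.agg(sums.head, sums.tail: _*)
result.show()
```

Note that `sums.head` throws on an empty field list, so a caller would need to guard against `x.isEmpty`.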
score:1
Here is a complete example that sums every column of a DataFrame:
import org.apache.spark.sql.functions._
import spark.implicits._
val df1 = Seq((1,2,3), (3,4,5), (1,2,4)).toDF("A", "B", "C")
df1.describe().show()
val exprs = df1.columns.map(c => sum(col(c))).toList
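// agg requires a leading Column before the varargs; a throwaway literal fills that slot and is dropped afterwards
df1.agg(lit(1).alias("temp"), exprs: _*).drop("temp").show()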
Output:
+------+------+------+
|sum(A)|sum(B)|sum(C)|
+------+------+------+
|     5|     8|    12|
+------+------+------+
Source: stackoverflow.com