score:1

Accepted answer

Based on your additional information in your comment:

I need this list to use as variables when creating another dataframe via jdbc (I need to make a specific select within postgresql). Is there a more performative way to pass values from a dataframe as parameters in a select?

Given your initial dataset:

val yearsDS: Dataset[Year] = ???

and that you want to do something like:

val desiredColumns: Array[String] = ???

spark.read.jdbc(..).select(desiredColumns.head, desiredColumns.tail: _*)

You could find the column names of yearsDS by doing:

val desiredColumns: Array[String] = yearsDS.columns

Spark derives this from def schema, which is defined on Dataset: columns returns the name of each field in that schema. You can see the definition of def columns in the Spark source for Dataset.
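The desiredColumns.head, desiredColumns.tail: _* form used above is needed because Dataset.select has an overload shaped like select(col: String, cols: String*): one required name followed by varargs. The following is a minimal plain-Scala sketch of that mechanic. The SelectSketch object and its select and selectAll methods are hypothetical stand-ins invented for illustration, not Spark APIs, so the snippet runs without Spark. It also shows one way to guard against an empty array, where calling head would throw.

```scala
object SelectSketch {
  // Hypothetical stand-in with the same parameter shape as
  // Dataset.select(col: String, cols: String*). It only echoes
  // the requested names back so the example needs no Spark.
  def select(col: String, cols: String*): Seq[String] = col +: cols

  // Splits the array into "first name" + "remaining names" and
  // expands the remainder as varargs. Returns None for an empty
  // array instead of throwing on .head.
  def selectAll(columns: Array[String]): Option[Seq[String]] =
    columns.headOption.map(first => select(first, columns.tail: _*))
}
```

With a real Dataset the same split would be applied to yearsDS.columns before calling select on the JDBC-backed DataFrame. Whether an empty column list should be an error or a no-op is a design choice the original answer does not address.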

score:-2

Maybe you have a DataFrame rather than a Dataset. Try using "as" to convert the DataFrame to a Dataset, like this:

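// Year is assumed to be the three-field case class from the question.
import spark.implicits._

val year = Year(1, 1, 1)
val years = List(year, year)
val ds = spark.sparkContext
  .parallelize(years)
  .toDF("day", "month", "Year")
  .as[Year] // convert the untyped DataFrame into a typed Dataset[Year]
println(ds.collect().toList)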
