Accepted answer

You get this "number of columns doesn't match" error because your erdf dataframe contains only one column, and that column holds an array:

+----------------------------+
|value                       |
+----------------------------+
|[a, b, c, d, e, f, g]       |
|[a2, b2, c2, d2, e2, f2, g2]|
+----------------------------+

You can't match this single column against the seven columns listed in your header.

The solution, given this erdf dataframe, is to iterate over your list of header columns and build the columns one by one, each taken from the matching index of the array. Your complete code thus becomes:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

val spark = SparkSession.builder.appName("er").master("local").getOrCreate()
import spark.implicits._
val erresponse = response.body.toString.split("\\\n")
val header = erresponse(0).split(", ") // build header columns list
val body = erresponse.drop(1).map(x => x.split(",").toList).toList
// start from the single "value" array column, add one column per header name, then drop the array
val erdf = header
  .zipWithIndex
  .foldLeft(body.toDF())((acc, elem) => acc.withColumn(elem._1, col("value")(elem._2)))
  .drop("value")

That will give you the following erdf dataframe:

+-------+-------+-------+-------+-------+-------+-------+
|column1|column2|column3|column4|column5|column6|column7|
+-------+-------+-------+-------+-------+-------+-------+
|      a|      b|      c|      d|      e|      f|      g|
|     a2|     b2|     c2|     d2|     e2|     f2|     g2|
+-------+-------+-------+-------+-------+-------+-------+
