score:2
Accepted answer
Well, the error message tells you everything you need to know here - StructType expects a sequence of fields as an argument. So in your case the schema should look like this:
StructType(Seq(
  StructField("comments", ArrayType(StructType(Seq(   // <- Seq[StructField]
    StructField("comId", StringType, true),
    StructField("content", StringType, true))), true), true),
  StructField("createHour", StringType, true),
  StructField("gid", StringType, true),
  StructField("replies", ArrayType(StructType(Seq(    // <- Seq[StructField]
    StructField("content", StringType, true),
    StructField("repId", StringType, true))), true), true),
  StructField("revisions", ArrayType(StructType(Seq(  // <- Seq[StructField]
    StructField("modDate", StringType, true),
    StructField("revId", StringType, true))), true), true)))
score:5
I recently ran into this. I'm using Spark 2.0.2, so I don't know whether this solution works with earlier versions.
import scala.util.Try
import org.apache.spark.sql.Dataset
import org.apache.spark.sql.catalyst.parser.LegacyTypeStringParser
import org.apache.spark.sql.types.{DataType, StructType}

/** Produce a schema JSON string from a Dataset. */
def serializeSchema(ds: Dataset[_]): String = ds.schema.json

/** Produce a StructType schema object from a JSON string. */
def deserializeSchema(json: String): StructType = {
  Try(DataType.fromJson(json)).getOrElse(LegacyTypeStringParser.parse(json)) match {
    case t: StructType => t
    case _ => throw new RuntimeException(s"Failed parsing StructType: $json")
  }
}
Note that I copied the "deserialize" function from a private function on Spark's StructType object, so I don't know how well it will be supported across versions.
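The Try(...).getOrElse(...) line in deserializeSchema is a general fallback pattern: attempt the primary parser, and only if it throws, evaluate the legacy one (getOrElse takes its default by name, so the fallback runs only on failure). A minimal, Spark-free sketch of the same pattern, using a hypothetical parseIntLenient helper:

```scala
import scala.util.Try

// Same shape as deserializeSchema: try the strict parse first; if it
// throws, getOrElse lazily evaluates the more forgiving fallback.
def parseIntLenient(s: String): Int =
  Try(Integer.parseInt(s)).getOrElse(Integer.parseInt(s.trim))
```

Because the default is by-name, the trimmed fallback parse is never executed when the strict parse succeeds.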
Source: stackoverflow.com