score:1
You get the error because you are returning 1 as an Integer. Also, bigint in Hive is actually a Long. So your else is returning a Long and your if is returning an Int, which makes the return type of your UDF Any, and Any isn't supported by Spark DataFrames. (The full list of supported data types is in the Spark SQL documentation.)
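For reference, a minimal sketch of that branch mismatch (the BigInt parameter type is an assumption taken from the second answer below, and the val name is only illustrative):

import org.apache.spark.sql.functions.udf

// the if-branch yields an Int and the else-branch a BigInt, so Scala infers Any as the
// return type, and Spark cannot derive a schema for Any, which triggers the error
val broken = udf { (make: BigInt) => if (make == 0) 1 else make }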
If you use df.schema, it'll show you that what you actually need is LongType:
val df = sqlContext.sql(" select cast(2 as bigint) as a ")
// df: org.apache.spark.sql.DataFrame = [a: bigint]
df.printSchema
// root
// |-- a: long (nullable = false)
df.schema
// res16: org.apache.spark.sql.types.StructType = StructType(StructField(a,LongType,false))
Your UDF should look something like:
val makeSIfTesla = udf {(make: Long) => if(make == 0) 1.toLong else make}
//makeSIfTesla : UserDefinedFunction = UserDefinedFunction(<function1>,LongType,List(LongType))
However, for something as simple as this, you really don't need a UDF. You can use the when-otherwise construct available in Spark:

df.withColumn("x", when($"x" === lit(0), lit(1)).otherwise($"x"))

where x is the column you are passing to your UDF makeSIfTesla.
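As a self-contained sketch of that approach (the column name x and the sqlContext are carried over from the snippets above):

import org.apache.spark.sql.functions.{when, lit}
import sqlContext.implicits._ // brings the $"..." column syntax into scope

// replace 0 with 1 in column x, leaving all other values untouched
val patched = df.withColumn("x", when($"x" === lit(0), lit(1)).otherwise($"x"))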
score:0
Fix the code like below:
val makeSIfTesla = udf {(make: BigInt) => if(make == 0) BigInt(1) else make}
The problem was that 1 is an Int and make is a BigInt, so the method in the udf was returning Any. Any is not supported by the udf function, hence the error you see. Making the types consistent makes the method return BigInt and fixes the issue. You can also make make's type Int, as in the sketch below.
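A sketch of that last alternative (the names are hypothetical, and it assumes the input column is cast to an integer type first, since the column itself is a bigint):

import org.apache.spark.sql.functions.{udf, col}

// both branches are Int, so the return type is Int and Spark maps it to IntegerType
val makeSIfTeslaInt = udf { (make: Int) => if (make == 0) 1 else make }
val result = df.withColumn("make", makeSIfTeslaInt(col("make").cast("int")))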
Source: stackoverflow.com