In the first case, it's because you have an extra " on line 4 (the first when condition).
This will work just fine:
df1.join(df2, df1("id") === df2("id"))
.withColumn("result",
when(
df1("adhar_no") === df2("adhar_no") ||
df1("pan_no") === df2("pan_no") ||
df1("voter_id") === df2("voter_id") ||
df1("dl_no") === df2("dl_no"),"matched"
).otherwise("not matched"))
The second one is because each when must have an output value; that example makes no sense to me. The first one is just fine, but you need to remove your extra " (I assume it's a typo).
Also, as a personal preference or recommendation, I'd rather reference the columns with the dollar syntax. It's way clearer to me and helps me avoid such typos.
Edit with example.
Some crappy test DataFrames:
val df1 = List((1, 10, 100, 1000, 10000), (2, 20, 200, 2000, 20000), (3, 30, 300, 3000, 30000)).toDF("id", "adhar_no", "pan_no", "voter_id", "dl_no")
val df2 = List((1, 10, 100, 1000, 10000), (2, 20, 200, 2000, 20000), (4, 40, 400, 4000, 40000)).toDF("id", "adhar_no", "pan_no", "voter_id", "dl_no")
And then your code, with the ambiguities fixed:
import spark.implicits._ // needed for the $"colName" syntax

df1.as("df1").join(df2.as("df2"), df1("id") === df2("id"))
.withColumn("result", when(
$"df1.adhar_no" === $"df2.adhar_no" ||
$"df1.pan_no" === $"df2.pan_no" ||
$"df1.voter_id" === $"df2.voter_id" ||
$"df1.dl_no" === $"df2.dl_no"
, "matched"
).otherwise("not matched")
)
+---+--------+------+--------+-----+---+--------+------+--------+-----+-------+
| id|adhar_no|pan_no|voter_id|dl_no| id|adhar_no|pan_no|voter_id|dl_no| result|
+---+--------+------+--------+-----+---+--------+------+--------+-----+-------+
| 1| 10| 100| 1000|10000| 1| 10| 100| 1000|10000|matched|
| 2| 20| 200| 2000|20000| 2| 20| 200| 2000|20000|matched|
+---+--------+------+--------+-----+---+--------+------+--------+-----+-------+
Source: stackoverflow.com