
The first case fails because you have an extra " on line 4 (the first when condition).

This will work just fine:

    df1.join(df2, df1("id") === df2("id"))
      .withColumn("result",
        when(
          df1("adhar_no") === df2("adhar_no") ||
          df1("pan_no") === df2("pan_no") ||
          df1("voter_id") === df2("voter_id") ||
          df1("dl_no") === df2("dl_no"), "matched"
        ).otherwise("not matched"))

The second one fails because each when must have an output value; that example makes no sense to me. The first one is fine, but you need to remove your extra " (I assume it is a typo).
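To illustrate that point, here is a minimal sketch, not the code from the question. The data and the `label` helper are made up, and it assumes Spark is on the classpath and a local session is acceptable. Every `when` call takes both a condition and the value to emit, and a chain usually ends with `otherwise`:

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, when}

object WhenValueSketch {
  // Hypothetical helper: each when(...) gets a condition AND an output value
  def label(df: DataFrame): DataFrame =
    df.withColumn("result",
      when(col("adhar_no") === 10, "adhar matched")
        .when(col("pan_no") === 999, "pan matched")
        .otherwise("not matched"))

  def main(args: Array[String]): Unit = {
    // Local session just for this demo (assumption)
    val spark = SparkSession.builder().master("local[*]").appName("when-sketch").getOrCreate()
    import spark.implicits._

    // Made-up rows, not taken from the question
    val people = List((1, 10, 100), (2, 20, 999), (3, 99, 300)).toDF("id", "adhar_no", "pan_no")

    label(people).show()
    spark.stop()
  }
}
```

If the `otherwise` is left out, rows that match no condition get null in the new column.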

Also, as a personal preference, I'd rather reference columns with the dollar syntax. It's much clearer to me and helps me avoid such typos.

Edit: with an example

Some crappy test dataframes:

    import spark.implicits._  // enables toDF and $"..." (spark-shell imports it automatically)
    val df1 = List((1, 10, 100, 1000, 10000), (2, 20, 200, 2000, 20000), (3, 30, 300, 3000, 30000)).toDF("id", "adhar_no", "pan_no", "voter_id", "dl_no")
    val df2 = List((1, 10, 100, 1000, 10000), (2, 20, 200, 2000, 20000), (4, 40, 400, 4000, 40000)).toDF("id", "adhar_no", "pan_no", "voter_id", "dl_no")

And then, your code with the ambiguities fixed:

    df1.as("df1").join(df2.as("df2"), df1("id") === df2("id"))
      .withColumn("result", when(
          $"df1.adhar_no" === $"df2.adhar_no" ||
            $"df1.pan_no" === $"df2.pan_no" ||
            $"df1.voter_id" === $"df2.voter_id" ||
            $"df1.dl_no" === $"df2.dl_no"
          , "matched"
        ).otherwise("not matched")
      )
      .show()

    +---+--------+------+--------+-----+---+--------+------+--------+-----+-------+
    | id|adhar_no|pan_no|voter_id|dl_no| id|adhar_no|pan_no|voter_id|dl_no| result|
    +---+--------+------+--------+-----+---+--------+------+--------+-----+-------+
    |  1|      10|   100|    1000|10000|  1|      10|   100|    1000|10000|matched|
    |  2|      20|   200|    2000|20000|  2|      20|   200|    2000|20000|matched|
    +---+--------+------+--------+-----+---+--------+------+--------+-----+-------+
