score:3

Accepted answer

You can use a window function to count the rows in each (id1, id2, id3) group, then filter on that count, as shown below:

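import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.count
import spark.implicits._ // assumes a SparkSession named `spark`, as in spark-shell

val df = Seq(
  (1, 1, 2, "tom"),
  (1, 1, 2, "tim"),
  (1, 3, 2, "tom"),
  (2, 1, 2, "mary")
).toDF("id1", "id2", "id3", "value")

// One partition per distinct (id1, id2, id3) combination
val window = Window.partitionBy("id1", "id2", "id3")

df.withColumn("count", count("value").over(window)) // rows per partition
  .filter($"count" < 2)                             // keep keys that occur only once
  .drop("count")
  .show(false)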

Output:

+---+---+---+-----+
|id1|id2|id3|value|
+---+---+---+-----+
|1  |3  |2  |tom  |
|2  |1  |2  |mary |
+---+---+---+-----+
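
To make the logic easier to follow, here is a minimal sketch of the same idea using plain Scala collections instead of Spark. It is only an illustration, not part of the original answer: the names rows, countsByKey and uniqueRows are made up here, and it is meant to be pasted into a Scala REPL or run as a script.

```scala
// Plain Scala collections, no Spark required; all names here are illustrative.
val rows = Seq(
  (1, 1, 2, "tom"),
  (1, 1, 2, "tim"),
  (1, 3, 2, "tom"),
  (2, 1, 2, "mary")
)

// Count how many rows share each (id1, id2, id3) key. This is the same
// number the window count attaches to every row of the DataFrame.
val countsByKey: Map[(Int, Int, Int), Int] =
  rows
    .groupBy { case (id1, id2, id3, _) => (id1, id2, id3) }
    .map { case (key, group) => key -> group.size }

// Keep only the rows whose key occurs exactly once (count < 2).
val uniqueRows = rows.filter { case (id1, id2, id3, _) =>
  countsByKey((id1, id2, id3)) < 2
}

uniqueRows.foreach(println)
```

This keeps the same two rows as the Spark version. The window approach avoids the separate lookup step because the per-group count is attached to each row directly.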
