score:1

Accepted answer

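Use p._1 instead of p(0). Each element of the keyed RDD is a tuple of type (Int, String), and in Scala 2 tuple fields are accessed as _1, _2, and so on, not by index.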

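import org.apache.spark.rdd.RDD

// sc is an existing SparkContext (for example, the one provided by spark-shell)
val rdd = sc.parallelize(List("dog", "tiger", "lion", "cat", "spider", "eagle"), 1)

// Key each word by its length, then keep only the pairs whose key is 4
val kvRdd: RDD[(Int, String)] = rdd.keyBy(_.length)
val filterRdd: RDD[(Int, String)] = kvRdd.filter(p => p._1 == 4)

// Display the filtered RDD
println(filterRdd.collect().toList)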

List((4,lion))
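
If the positional accessor reads poorly, the same predicate can also be written with a pattern match that names the key. The sketch below is only an illustration: it uses a plain Scala List in place of the RDD so that it runs without Spark, and the object name TupleFilterSketch is made up for this example. The two lambdas passed to filter have the same shape as the ones RDD.filter accepts.

```scala
// Plain Scala collections stand in for the RDD here (an assumption made
// so the sketch runs without a SparkContext).
object TupleFilterSketch {
  val words: List[String] = List("dog", "tiger", "lion", "cat", "spider", "eagle")

  // Same shape as rdd.keyBy(_.length): pair each word with its length.
  val pairs: List[(Int, String)] = words.map(w => (w.length, w))

  // Positional accessor, as in the answer.
  val byAccessor: List[(Int, String)] = pairs.filter(p => p._1 == 4)

  // Pattern-matching form: binds the key to a name instead of using _1.
  val byPattern: List[(Int, String)] = pairs.filter { case (len, _) => len == 4 }

  def main(args: Array[String]): Unit = {
    println(byAccessor) // List((4,lion))
    println(byPattern)  // List((4,lion))
  }
}
```

Both forms select the same pairs; which one to use is a matter of readability.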

score:1

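RDDs of key-value pairs (type RDD[(K, V)]) have a lookup method that does this directly. It returns every value stored under the given key as a Seq[V]. Note that lookup is an action: the result is a local collection on the driver, whereas filter returns another RDD.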

// b is the RDD keyed by word length, i.e. the result of keyBy(_.length)
b.lookup(4)
// res4: Seq[String] = WrappedArray(lion)

b.lookup(5)
// res6: Seq[String] = WrappedArray(tiger, eagle)
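
To make the relationship between the two answers concrete, here is a minimal sketch that computes the same result with lookup and with filter followed by values and collect. It assumes Spark is on the classpath and creates its own local SparkContext; the names LookupSketch and fiveLetterWords are hypothetical and exist only for this example.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object LookupSketch {
  // Returns the five-letter words twice: once via lookup, once via filter.
  def fiveLetterWords(sc: SparkContext): (List[String], List[String]) = {
    // Same data as in the answers, in a single partition and keyed by length.
    val b = sc
      .parallelize(List("dog", "tiger", "lion", "cat", "spider", "eagle"), 1)
      .keyBy(_.length)

    // lookup is an action: it returns the values for key 5 to the driver.
    val viaLookup = b.lookup(5).toList

    // Equivalent by hand: filter on the key, drop the keys, then collect.
    val viaFilter = b.filter(_._1 == 5).values.collect().toList

    (viaLookup, viaFilter)
  }

  def main(args: Array[String]): Unit = {
    // Local master with one thread (a choice made only for this sketch).
    val conf = new SparkConf().setAppName("lookup-sketch").setMaster("local[1]")
    val sc = new SparkContext(conf)
    try {
      val (viaLookup, viaFilter) = fiveLetterWords(sc)
      println(viaLookup) // List(tiger, eagle)
      println(viaFilter) // List(tiger, eagle)
    } finally {
      sc.stop()
    }
  }
}
```

Use lookup when only the values for one key are needed on the driver; keep filter when the result should stay distributed as an RDD for further transformations.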
