score:0

Accepted answer

It turns out the problem was in how I loaded the WARC files I was using:

val warcs = sc.newAPIHadoopFile(
              warcfile,
              classOf[WarcGzInputFormat],             // InputFormat
              classOf[NullWritable],                  // key
              classOf[WarcWritable]                   // value
            ).cache()

Removing .cache() stops the exceptions. I don't know why, though, so an explanation would still be welcome.
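One possible explanation, which I have not confirmed for this exact case: the Spark API docs for newAPIHadoopFile note that Hadoop's RecordReader reuses the same Writable object for each record, so caching the returned RDD directly stores many references to one mutable object. The sketch below mimics that effect without Spark. ReusedRecord and readAll are hypothetical stand-ins for a Writable and a RecordReader, not part of any library.

```scala
// Hypothetical stand-in for a Hadoop Writable: a mutable holder that gets reused.
final class ReusedRecord(var payload: String)

object WritableReuseSketch {
  // Mimics a RecordReader: yields the SAME object every time, mutated in place.
  def readAll(lines: Seq[String]): Iterator[ReusedRecord] = {
    val shared = new ReusedRecord("")
    lines.iterator.map { line =>
      shared.payload = line
      shared
    }
  }

  def main(args: Array[String]): Unit = {
    val input = Seq("a", "b", "c")

    // Keeping the raw references, which is roughly what caching the records amounts to:
    // every stored element points at the one shared object.
    val keptRefs = readAll(input).toVector
    println(keptRefs.map(_.payload)) // Vector(c, c, c)

    // Copying each record into an immutable value before storing it.
    val copied = readAll(input).map(_.payload).toVector
    println(copied) // Vector(a, b, c)
  }
}
```

If this is what is happening, the usual advice is to map each record to an immutable copy before calling .cache(), rather than dropping the cache altogether. How to copy a WarcWritable depends on the library, so that part is not shown here.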

