Within the code for socketTextStream, Spark creates an instance of SocketInputDStream, which uses a client socket. That means it expects a server to be already running at the address and port you specify. Unless you have some service running a server on port 7777 of your local machine, the error you are seeing is expected.

To see what I mean, try the following (you may not need to set master or appName in your environment).

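import org.apache.spark.SparkConf
import org.apache.spark.streaming.Seconds
import org.apache.spark.streaming.StreamingContext

object MyStream {
  def main(args: Array[String]): Unit = {
    // local[2]: one thread for the socket receiver, one for processing
    val conf = new SparkConf().setMaster("local[2]").setAppName("socketstream")
    val ssc = new StreamingContext(conf, Seconds(10))
    // "" is a placeholder: put the hostname of a public web server here (port 80)
    val lines = ssc.socketTextStream("", 80)
    lines.print()
    ssc.start()            // nothing connects until the context is started
    ssc.awaitTermination()
  }
}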

This doesn't return any content because the app doesn't speak HTTP to the BBC website, but it also doesn't get a connection refused exception, since a server is listening on that port.
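The same distinction can be seen without Spark. Here is a minimal sketch using a plain java.net.Socket; the object name ClientSocketDemo and the helper canConnect are made up for illustration and are not part of Spark. A client socket connects only when something is already listening on the target port.

```scala
import java.io.IOException
import java.net.Socket

// Hypothetical helper for illustration only (not part of Spark).
object ClientSocketDemo {
  // Returns true if a TCP server accepted the connection,
  // false if the connection attempt failed (e.g. "Connection refused").
  def canConnect(host: String, port: Int): Boolean =
    try {
      new Socket(host, port).close() // client socket: needs a listening server
      true
    } catch {
      case _: IOException => false
    }

  def main(args: Array[String]): Unit = {
    // Port 7777 is the one from the question; prints false unless a server is running there.
    println(canConnect("localhost", 7777))
  }
}
```

If this prints false for the host and port you pass to socketTextStream, Spark's receiver will most likely fail in the same way.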

To run a local server on Linux, I would use netcat with a simple command such as

cat data.txt | ncat -l -p 7777

I'm not sure what your best approach is on Windows. You could write another application which listens as a server on that port and sends some data.
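As a rough sketch of such an application, the following uses only java.net.ServerSocket, so it should run anywhere a JVM does. The object name LineServer, the helper serveOnce, and the sample lines are all invented for illustration; port 7777 is taken from the question.

```scala
import java.io.PrintWriter
import java.net.ServerSocket

// Hypothetical helper for illustration only (not part of Spark).
object LineServer {
  // Accepts one client on the given server socket, sends each line
  // terminated by a newline, then closes that client's connection.
  def serveOnce(server: ServerSocket, lines: Seq[String]): Unit = {
    val client = server.accept() // blocks until a client (e.g. Spark's receiver) connects
    try {
      val out = new PrintWriter(client.getOutputStream, true)
      lines.foreach(line => out.println(line))
      out.flush()
    } finally {
      client.close()
    }
  }

  def main(args: Array[String]): Unit = {
    val server = new ServerSocket(7777) // assumed port, matching the question
    try {
      // Keep serving the same sample lines to every client that connects.
      while (true) serveOnce(server, Seq("hello", "from", "the", "server"))
    } finally {
      server.close()
    }
  }
}
```

Start this first, then run the Spark application with socketTextStream("localhost", 7777).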


Make sure to start netcat (or whatever server listens on the port) before you run the program, using the same port you pass to socketTextStream. For example:

nc -lk 8080

