score:15
Within the code for socketTextStream, Spark creates an instance of SocketInputDStream, which uses java.net.Socket:
https://github.com/apache/spark/blob/master/streaming/src/main/scala/org/apache/spark/streaming/dstream/SocketInputDStream.scala#L73
java.net.Socket is a client socket, which means it expects a server to already be running at the address and port you specify. Unless you have some service running a server on port 7777 of your local machine, the error you are seeing is expected.
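The client-socket behaviour can be reproduced without Spark. This is only a minimal sketch using plain java.net classes; ConnectProbe and canConnect are names made up for illustration and are not part of Spark.

```scala
import java.net.{ConnectException, Socket}

// Hypothetical helper (not part of Spark): checks whether anything is
// listening at host:port, the same way a client socket would find out.
object ConnectProbe {
  def canConnect(host: String, port: Int): Boolean =
    try {
      // The constructor connects immediately; it only succeeds if a server is listening.
      new Socket(host, port).close()
      true
    } catch {
      // Thrown when nothing accepts connections on that port ("Connection refused").
      // Other failures, such as an unknown host, are deliberately not handled here.
      case _: ConnectException => false
    }

  def main(args: Array[String]): Unit =
    println(canConnect("localhost", 7777))
}
```

If this reports false for the host and port you pass to socketTextStream, the Spark receiver will fail in the same way.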
To see what I mean, try the following (you may not need to set master or appName in your environment).
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object MyStream {
  def main(args: Array[String]): Unit = {
    // local[2]: the socket receiver occupies one thread, so a second one is needed to process batches
    val conf = new SparkConf().setMaster("local[2]").setAppName("socketstream")
    val ssc = new StreamingContext(conf, Seconds(10))
    // Connects as a client to a server that is already listening on port 80
    val lines = ssc.socketTextStream("bbc.co.uk", 80)
    lines.print()
    ssc.start()
    ssc.awaitTermination()
  }
}
This prints no content, because the app does not speak HTTP to the BBC website, but it also does not throw a connection refused exception, since a server is listening on that port.
To run a local server on Linux, I would use netcat with a simple command such as:
cat data.txt | ncat -l -p 7777
I'm not sure what your best approach is in Windows. You could write another application which listens as a server on that port and sends some data.
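As a rough idea of what such an application could look like, here is a small sketch that works on any platform with a JVM. LineServer and serveOnce are hypothetical names, port 7777 is taken from the question, and the sample lines are placeholders. It accepts a single client and then exits, so it is a stand-in for the netcat command rather than a robust server.

```scala
import java.io.PrintWriter
import java.net.ServerSocket

// Hypothetical stand-in for `ncat -l -p 7777`; not part of Spark.
object LineServer {
  /** Waits for one client on `server` and sends it each line, newline-terminated. */
  def serveOnce(server: ServerSocket, lines: Seq[String]): Unit = {
    val client = server.accept() // blocks until a client (e.g. the Spark receiver) connects
    val out = new PrintWriter(client.getOutputStream, true) // autoflush on println
    try lines.foreach(line => out.println(line))
    finally {
      out.close()
      client.close()
    }
  }

  def main(args: Array[String]): Unit = {
    val server = new ServerSocket(7777) // assumed port, matching the question
    try serveOnce(server, Seq("hello", "spark", "streaming")) // placeholder data
    finally server.close()
  }
}
```

Start this before the streaming job, so that the receiver finds a server listening when it connects.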
score:1
Make sure to start netcat (or whatever server listens on the port) before you run the program, for example: nc -lk 8080
Source: stackoverflow.com