You can get an RDD[Byte] from an RDD[String] by doing rdd.flatMap(s => s.getBytes). Beware, however: getBytes with no argument uses the platform's default charset, so a single character may encode to more than one byte (for example, non-ASCII characters under UTF-8). Pass an explicit charset if you need predictable output.
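To make the encoding explicit you can pass a Charset to getBytes. A minimal sketch (the Spark lines are commented out and assume an existing SparkContext `sc` and a hypothetical input path):

```scala
import java.nio.charset.StandardCharsets

// getBytes() with no argument uses the platform default charset;
// passing StandardCharsets.UTF_8 makes the byte count predictable.
val ascii    = "spark".getBytes(StandardCharsets.UTF_8) // 5 bytes, one per character
val accented = "café".getBytes(StandardCharsets.UTF_8)  // 5 bytes: 'é' encodes as two

// In Spark this becomes (sketch, assuming an existing SparkContext `sc`):
// val bytes: org.apache.spark.rdd.RDD[Byte] =
//   sc.textFile("input.txt").flatMap(_.getBytes(StandardCharsets.UTF_8))
```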
Also, once you have an RDD[Byte], you will need to call, for example, mapPartitions to hand your data to your C code as an Array[Byte]. In that case quite large arrays are passed to your C code, but the C application is invoked only once per partition. Another way would be to use rdd.map(s => s.getBytes), in which case you get an RDD[Array[Byte]] and thus multiple C application runs per partition.
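The per-partition variant could be sketched as follows. This is not runnable on its own: `callCApp` is a hypothetical wrapper around your C code (e.g. via JNI or JNA), and an existing SparkContext is assumed:

```scala
import java.nio.charset.StandardCharsets
import org.apache.spark.rdd.RDD

// Hypothetical native wrapper around the C application (e.g. JNI/JNA).
def callCApp(data: Array[Byte]): Array[Byte] = ???

def processWithC(lines: RDD[String]): RDD[Byte] = {
  val bytes: RDD[Byte] = lines.flatMap(_.getBytes(StandardCharsets.UTF_8))
  bytes.mapPartitions { iter =>
    val chunk: Array[Byte] = iter.toArray // one large array per partition
    callCApp(chunk).iterator              // C app invoked once per partition
  }
}
```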
I think you can also try the pipe() API for launching your C code: it pipelines the RDD's elements to your C application's stdin and returns its stdout as an RDD for further processing.
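As a sketch of the pipe() approach, assuming an existing SparkContext `sc` and that your compiled C binary (here a hypothetical /path/to/capp) is present at the same path on every worker node:

```scala
// pipe() launches the external process once per partition, writes each
// RDD element to its stdin (one element per line), and returns the
// process's stdout lines as an RDD[String].
val lines = sc.textFile("hdfs:///data/input.txt")   // hypothetical path
val result: org.apache.spark.rdd.RDD[String] =
  lines.pipe("/path/to/capp")                       // hypothetical binary
result.saveAsTextFile("hdfs:///data/output")        // hypothetical path
```

Note that pipe() is line-oriented text, so it suits textual input; if your C code needs raw bytes, the mapPartitions approach is the better fit.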
Source: stackoverflow.com