score:1
One approach would be to use cast() to cast the column to FloatType, which essentially converts all non-float values into null:
// CSV file content:
// id,value
// 1,50
// 2,null
// 3,60.5
// 4,a
import org.apache.spark.sql.types._
import spark.implicits._  // enables the $"colName" column syntax

val df = spark.read.
  option("header", true).
  csv("/path/to/csvfile")

val df2 = df.withColumn("val_float", $"value".cast(FloatType))
// +---+-----+---------+
// | id|value|val_float|
// +---+-----+---------+
// | 1| 50| 50.0|
// | 2| null| null|
// | 3| 60.5| 60.5|
// | 4| a| null|
// +---+-----+---------+
You can cast the FloatType column back to StringType afterwards, if necessary.
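The same null-on-failure semantics can be sketched outside Spark in plain Scala, which may help clarify what the cast does: `toFloatOption` (Scala 2.13+) yields `Some(f)` for parseable strings and `None` otherwise, roughly mirroring how `cast(FloatType)` produces null for non-numeric values. This is an illustrative analogue with made-up sample data, not Spark's actual implementation:

```scala
// Plain-Scala analogue of cast(FloatType): parse failures become None,
// just as non-numeric strings become null in the DataFrame column.
val values = Seq("50", "null", "60.5", "a")
val parsed = values.map(_.toFloatOption)
// Seq(Some(50.0), None, Some(60.5), None)

// Filtering then naturally skips the unparseable entries,
// e.g. keep only values greater than 55:
val kept = parsed.flatten.filter(_ > 55f)
// Seq(60.5)
```

In the DataFrame world the equivalent filter would be something like `df2.filter($"val_float" > 55)`, where null values in `val_float` are dropped automatically because the comparison evaluates to null.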
Source: stackoverflow.com