score:1
Your problem is that your if expression returns either String (in case of a match) or Unit (in case of a miss). You can fix your filter easily:
val pairs = lines
  .map(l => (if (l.split(",")(1).toInt < 60) "rest"
             else if (l.split(",")(1).toInt > 110) "sport",
             10))
  .filter(_._1 != ())
() is the single value of type Unit in Scala. But this is not really the right way: the result is still a collection of tuples typed (Any, Int), because the if can produce either a String or Unit. You are losing the type with this if expression.
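For illustration, here is a minimal sketch of the inference at work (plain Scala, no Spark needed; the sample value is hypothetical):

// An `if` without a matching `else` desugars to `else ()`,
// so the expression's type is the least upper bound of
// String and Unit, which is Any.
val hr = 80                            // hypothetical heart rate
val label = if (hr < 60) "rest"
            else if (hr > 110) "sport" // inferred type: Any
val pair = (label, 10)                 // inferred type: (Any, Int)
println(pair)                          // ((),10) -- neither branch matched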
The correct way is either to filter your data first and use an exhaustive if:
val pairs = lines
  .map(_.split(",")(1).toInt)
  .filter(hr => hr < 60 || hr > 110)
  .map(hr => (if (hr < 60) "rest" else "sport", 10))
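As a quick sanity check, here is a hypothetical run with inline sample data (this assumes an existing SparkContext sc, and that lines look like "timestamp,heartRate"):

// Hypothetical sample input; only the second CSV field matters here.
val lines = sc.parallelize(Seq("t1,55", "t2,80", "t3,130"))

val pairs = lines
  .map(_.split(",")(1).toInt)
  .filter(hr => hr < 60 || hr > 110)
  .map(hr => (if (hr < 60) "rest" else "sport", 10))

pairs.collect().foreach(println)  // prints (rest,10) and (sport,10); t2 is dropped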
Or to use collect, which on Spark RDDs is a shortcut for filter followed by map:
val pairs = lines
  .map(_.split(",")(1).toInt)
  .collect {
    case hr if hr < 60  => "rest" -> 10
    case hr if hr > 110 => "sport" -> 10
  }
This variant is probably the most readable.
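One caveat worth knowing: collect on an RDD is overloaded, so make sure you pass the partial function (again assuming an existing SparkContext sc):

// Two different things share the name `collect` on an RDD:
//   rdd.collect()    -- action: fetches the whole RDD to the driver as an Array
//   rdd.collect(pf)  -- transformation: filter + map in one pass, returns an RDD
val hrs = sc.parallelize(Seq(55, 80, 130))
val labeled = hrs.collect {          // transformation: stays distributed
  case hr if hr < 60  => "rest" -> 10
  case hr if hr > 110 => "sport" -> 10
}
labeled.collect().foreach(println)   // action: prints (rest,10) and (sport,10)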
Also, please note how I moved the split into a separate step. This avoids calling split a second time for the second if branch.
UPD: Another approach is to use flatMap, as suggested in the comments:
val pairs = lines
  .flatMap(_.split(",")(1).toInt match {
    case hr if hr < 60  => Some("rest" -> 10)
    case hr if hr > 110 => Some("sport" -> 10)
    case _              => None
  })
It may or may not be more efficient: it avoids the separate filter step, but it adds the cost of wrapping and unwrapping each element in an Option. You can test the performance of the different approaches and tell us the results.
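If you do want to measure it, here is a crude timing sketch (again assuming an existing SparkContext sc; count() forces evaluation, and a real benchmark should warm up the JVM and average several runs):

def time[A](label: String)(body: => A): A = {
  val start = System.nanoTime()
  val result = body
  println(s"$label: ${(System.nanoTime() - start) / 1e6} ms")
  result
}

val hrs = sc.parallelize(1 to 1000000).map(_ % 200)  // hypothetical data

time("filter + map") {
  hrs.filter(hr => hr < 60 || hr > 110)
     .map(hr => (if (hr < 60) "rest" else "sport", 10))
     .count()
}

time("flatMap + Option") {
  hrs.flatMap {
    case hr if hr < 60  => Some("rest" -> 10)
    case hr if hr > 110 => Some("sport" -> 10)
    case _              => None
  }.count()
}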
score:0
Note: not a direct answer to this question, but it might be useful for users who use DataFrames. On a DataFrame, the na.drop function drops rows that are missing values in the specified columns. In this case, you can use sc.parallelize and toDF to construct the DataFrame, and then just call df.na.drop().
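A minimal sketch of that approach, assuming a SparkSession named spark (the column names and sample rows are hypothetical):

import spark.implicits._

// Hypothetical rows; the second column is nullable.
val df = Seq(("t1", Some(55)), ("t2", None), ("t3", Some(130)))
  .toDF("timestamp", "hr")

// Drop rows that have no value in the "hr" column.
val cleaned = df.na.drop(Seq("hr"))
cleaned.show()  // the t2 row is gone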