score:3
Accepted answer
The problem with the SQL is that the result has two columns named key, one from each of the joined tables.
Solution #1: give the key columns different names,
e.g. rename the key column of the left table to k1
and the key column of the right table to k2.
Solution #2: list only the columns you want to keep in the result table:
select a.*, b.val1, b.val2
from tab_a a left outer join tab_b b on a.key = b.key and a.e_date between b.start_date and b.end_date
// since you only want to keep one key column, change the code you have to
// (note: the upper bound of between should be end_date, not start_date)
val result_df = tab_a.join(tab_b, tab_a.col("key") === tab_b.col("key")
  && tab_a.col("e_date").between(tab_b.col("start_date"), tab_b.col("end_date")),
  "left_outer")
// then drop the key column that came from tab_b (or the one from tab_a)
val result_df = tab_a.join(tab_b, tab_a.col("key") === tab_b.col("key")
  && tab_a.col("e_date").between(tab_b.col("start_date"), tab_b.col("end_date")),
  "left_outer").drop(tab_b("key"))
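Solution #1 can be sketched as follows. This is a minimal, untested sketch that assumes the table and column names from the question (`tab_a`, `tab_b`, `key`, `e_date`, `start_date`, `end_date`); renaming the right table's key before the join means the result never contains an ambiguous duplicate column in the first place:

```scala
import org.apache.spark.sql.functions.col

// Rename tab_b's key up front so the join result has no duplicate column name.
val tab_b_renamed = tab_b.withColumnRenamed("key", "k2")

val result_df = tab_a.join(
  tab_b_renamed,
  col("key") === col("k2")
    && col("e_date").between(col("start_date"), col("end_date")),
  "left_outer"
).drop("k2") // the renamed key is redundant after the join, so drop it
```

Note that the `Seq("key")` join syntax, which also deduplicates the key column automatically, only accepts plain column names and cannot express the extra `between` range condition, so the rename-or-drop approach is needed here.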
Source: stackoverflow.com