Accepted answer

For Spark 2.4, aggregate has to be used inside a Spark SQL expr. It is also better to add a type cast to make sure there is no type mismatch between the accumulator and the array elements:

df.withColumn("amount", expr("aggregate(list_val, 0, (x, y) -> x + int(y))"))

// for float type; for double type, replace "float" with "double"
df.withColumn("amount", expr("aggregate(list_val, float(0), (x, y) -> x + float(y))"))
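
For context, here is a minimal self-contained sketch; the sample DataFrame and the string-array column list_val are illustrative assumptions, not from the original question:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.expr

val spark = SparkSession.builder().master("local[*]").getOrCreate()
import spark.implicits._

// list_val holds numbers as strings, so the int(y) cast in the lambda is
// what keeps the accumulator (the integer 0) and the elements compatible
val df = Seq(Seq("1", "2", "3"), Seq("4", "5")).toDF("list_val")

df.withColumn("amount", expr("aggregate(list_val, 0, (x, y) -> x + int(y))"))
  .show()
// amount is 6 for the first row and 9 for the second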

In the Scala API (the aggregate column function is only available from Spark 3.0) that would be

df.withColumn("amount", aggregate($"list_val", lit(0), (x, y) => x + y.cast("int")))

df.withColumn("amount", aggregate($"list_val", lit(0f), (x, y) => x + y.cast("float")))

df.withColumn("amount", aggregate($"list_val", lit(0.0), (x, y) => x + y.cast("double")))
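
A sketch of the Spark 3.0+ variant under the same illustrative assumptions (a df with a string-array column list_val, and spark.implicits._ in scope for the $ syntax):

import org.apache.spark.sql.functions.{aggregate, lit}

// the merge lambda receives Columns, so each element is cast to keep
// the fold consistent with the double accumulator lit(0.0)
val summed = df.withColumn("amount",
  aggregate($"list_val", lit(0.0), (acc, y) => acc + y.cast("double")))
summed.show()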
