score:2
`rangeBetween` considers the actual values in the column. It checks which values are "in range" (including both the start and end values). In your example, the current row's value is the start of the range and that value plus 1 is the end. Since the range is inclusive, all rows with duplicate values are counted as well.
For example, if the start and end values are 1 and 3 respectively, every row whose value falls in this range (1, 2, 3) is included in the sum.
This is in contrast to `rowsBetween`. For that function, only the specified rows are counted: `rowsBetween(Window.currentRow, 1)` will only consider the current row and the next row, whether duplicate values exist or not.
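The difference can be sketched with a small plain-Python simulation of the two frame definitions (this is an illustration of the semantics, not Spark itself; the sample column `[1, 1, 2, 3]` with a duplicate id of 1 is assumed from the question):

```python
# Plain-Python sketch of Spark's rangeBetween vs rowsBetween semantics.
# Sample ordered column values (assumed from the question): duplicate 1s, then 2 and 3.
values = [1, 1, 2, 3]

def range_between_sum(vals, start_off, end_off):
    # rangeBetween: the frame is defined by VALUES. For each row, include
    # every row whose value lies in [value + start_off, value + end_off].
    return [sum(w for w in vals if v + start_off <= w <= v + end_off)
            for v in vals]

def rows_between_sum(vals, start_off, end_off):
    # rowsBetween: the frame is defined by ROW POSITIONS. For each row i,
    # include rows i + start_off .. i + end_off (clipped to the partition).
    n = len(vals)
    return [sum(vals[max(0, i + start_off):min(n, i + end_off + 1)])
            for i in range(n)]

# Frame "current row to 1 ahead":
print(range_between_sum(values, 0, 1))  # -> [4, 4, 5, 3]  (value-based: 1+1+2 for the duplicate 1s)
print(rows_between_sum(values, 0, 1))   # -> [2, 3, 5, 3]  (position-based: exactly two rows each)
```

Note how the duplicate 1s both get 1 + 1 + 2 = 4 under the value-based frame, while the position-based frame sums exactly two physical rows regardless of duplicates.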
score:0
My guess is that, within the same category, rows with the same id are aggregated together (here, the id is 1 and the category is a). That is:
For two rows with the same id in the same category:
sum all the rows with that id; here, that is 1 + 1
for those rows, the "next" id is the first one that differs from theirs; here it is 2, so the sum is 1 + 1 + 2
Not sure my understanding is correct.
Source: stackoverflow.com