Cloudera Impala Cumulative Sum, Average and Example

  • Post author:
  • Post last modified:February 27, 2018
  • Post category:BigData
  • Reading time:3 mins read

You can make use of the Cloudera impala Analytic functions to calculate the cumulative sum or running sum. Sum and Average analytical functions are used along with window options to calculate the Cloudera Impala Cumulative Sum or running sum.

Cloudera Impala Cumulative Sum, Average Syntax:

Below are the Syntax for Cloudera Impala Cumulative SUM, AVG analytic functions. You can defined ORDER BY clause with column inside OVER clause.

SUM([DISTINCT| ALL] expression)[OVER (analytic_clause)]

AVG([DISTINCT| ALL] expression)[OVER (analytic_clause)]

Cloudera Impala Cumulative Sum, Average Examples

Impala Cumulative Sum and Average.

Query:

select
name, amount,
 SUM(amount)
over
 (
 --partition by name
 order by name
 rows between unbounded preceding and current row

) as 'cumulative Sum'
from test;

Output:

name amount cumulative sum
abc 100 100
abc 200 300
bcd 100 400
bcd 100 500
bcd 100 600
cde 400 1000
cde 400 1400
efg 600 2000
efg 600 2600

Query:

select
name, amount,
 AVG(amount)
over
 (
 --partition by name
 order by name
 rows between unbounded preceding and current row

) as 'cumulative avg'
from test;

Output:

name amount cumulative avg
abc 100 100
abc 200 150
bcd 100 133.333333333
bcd 100 125
bcd 100 120
cde 400 166.666666667
cde 400 200
efg 600 250
efg 600 288.888888889

Read: