SQL for Beginners: Metric Growth Rate

EN
Learn how to use window functions to get the value of any other result row in the current query, and how to refer to the value of the previous day or the first user action.
SQL for Beginners: Metric Growth Rate
Published
15.09.2022
|
devtodev

In previous articles, we have already considered various ways to calculate metrics such as DAU/WAU/MAUAverage check, and ARPPU. But sometimes, a more important metric will not be the value of the metric itself, but the growth rate.

In this article, we will look at the window functions that allow us to get the value of any other result row in the current query string. And also refer to the value of the previous day, the first user action, the last value of the selection, and so on.

To get sarted, log in to your devtodev account and find the SQL Report in our demo.

By what % does your income grow each month?

To calculate the change in any metric, we need to know the value of the previous line for each of the lines. To do this, we can use the lag(@param, @offset) over() window function, where the function parameters are the name of the column from which the @param value should be taken, and the number of the @offset row preceding the current one.

select monthgross
, lag(gross,1) over(order by month) as gross_previous_month
from( 
select to_char (eventtime, 'yyyy-mm') as month
, sum(p.priceusd) as gross
from p102968.payments p
where eventtime>=date_trunc('month',current_date)-interval '24 month'
and eventtime date_trunc('month',current_date)
and tester is false
and cheater is false
group by 1)t 
order by month

sql gross
Query result. Screenshot from devtodev demo

With an internal query, we calculated the gross for each month. Then using lag(gross,1) over(order by month) we get the gross value for the previous month. It remains only to calculate the change in the metric: just divide the current gross value by the value of the previous month and convert the result to a percentage form.

select monthgross
,(gross::,numeric / lag(gross,1) over(order by month) 1) * 100 as change
from( 
select to_char (eventtime, 'yyyy-mm') as month
, sum(p.priceusd) as gross
from p102968.payments p
where eventtime>=date_trunc('month',current_date)-interval '24 month'
and eventtime date_trunc('month',current_date)
and tester is false
and cheater is false
group by 1)t 
order by month

sql previous gross
Query result. Screenshot from devtodev demo

If you need to access the value of the next row, use the lead(@param, @offset) over() window function.

Read more: SQL for Beginners: How to Track First In-App Events 

How much has gross changed in the last month compared to the average for the last 3 months?

You can try to solve this problem through the same lag function, calling it 4 times for each of the past weeks separately and then calculating the average value for them. But there is a much more convenient way: using window-frame. Previously, the window was defined only with groups and sort order over(partition by … order by …). This construction has one more possibility: specifying the size of the window. And if we need the average value for the previous 3 periods, we can use the following function: avg(gross) over(order by month rows between 3 preceding and 1 preceding).
For each row, the function will take the average of the gross value over the rows starting from the 3 previous row and ending with the 1 previous row.

The final query will look like this:

select monthgross
,(gross::,numeric / avg(gross) over(order by month rows between 3 preceding and preceding) 1) * 100 as change
from( 
select to_char (eventtime, 'yyyy-mm') as month
, sum(p.priceusd) as gross
from p102968.payments p
where eventtime>=date_trunc('month',current_date)-interval '24 month'
and eventtime date_trunc('month',current_date)
and tester is false
and cheater is false
group by 1)t 
order by month

There are a lot of possibilities for using such windows. For example:

  • sum(gross) over(order by month rows between 1 preceding and 1 preceding) – returns the gross value from the previous row (similar to lag(...,1)).
  • sum(gross) over(order by month rows between unbounded preceding and current row) – will return the sum of gross values ​​for the entire partition (partition by) from its beginning to the current row, including it in the calculation. We would get the same result if we did not specify this window: sum(gross) over(order by month).
  • sum(gross) over(order by month rows between unbounded preceding and unbounded following) – the sum of the values ​​of the entire partition, starting from its beginning and ending with the last value. The analog is sum(gross) over().

Using the lag() and lead() functions, you can also count the conversions of each of the funnel steps, or the conversion from the first step to the last.

User actions before purchase

There can be many uses of window functions, and one of the most common is to look for actions before the user pays. For example, these can be some attempts to buy an in-game item that the user did not have enough resources for. Or you can find purchase window events and find out which paywall users are converting from.

To do this, we must collect information about any user events we need and combine them into one table. In devtodev, all custom events created by users can be found in one table, custom_events. Therefore, we select events from this table and add information about completed payment events to it using union all.

with events as (
select devtodevid, eventtime, name as name
from p102968.custom_events
where created < current_date -interval '1 month'
and tester is false
and cheater is false
union all
select devtodevid, eventtime, 'payment' as name
from p102968.payments
where created < current_date -interval '1 month'
and tester is false
and cheater is false
)

Now each user has a set of events committed by him. For each captured event, we'll find the previous captured event using the lag() window function:

select devtodevid, eventtime, name
, lag(name,1) over(partition by devtodevid order by eventtime) as prev_event
from events

We don't need all the events, but only the payment events and the one preceding it. Therefore, we wrap the result in a nested from() and filter only events that have name='payment'.

select prev_eventcount() as payments
from(
select devtodevid, eventtime, name
, lag(name,1) over(partition by devtodevid order by eventtime) as prev_event
from events)t
where name = 'payment'
group by prev_event 
order by 2

sql prev_event
Query result. Screenshot from devtodev demo

As a result, we get the names of events made before payment and the number of payments made after this event. An empty prev_event value means that the user did not have completed events before the payment, or not all user events were included in our selection. Perhaps, you should add other events to with events as (). For example, levelups, progression, purchases, etc. 

Read more