PostgreSQL – LAG Function
The LAG()
function is used to access the value of a column in a preceding row within the same result set. This function allows you to compare values between the current row and the previous row or to retrieve values from “lagging” rows relative to the current row based on a specified order.
The syntax of the LAG()
function is as follows:
LAG(expression, offset, default_value) OVER (PARTITION BY partition_expression ORDER BY sort_expression)
expression
: The column or expression whose value you want to access from the preceding row.offset
: The number of rows before the current row from which you want to access the value. The default is 1 (i.e., the previous row).default_value
: An optional value to be returned when theLAG()
function is invoked on the first row of a partition. If omitted, the default isNULL
.PARTITION BY
: An optional clause that divides the result set into partitions or groups. The function operates separately within each partition. If omitted, the function operates on the entire result set.ORDER BY
: Specifies the column(s) by which you want to order the result set for accessing preceding row values.
Here’s a simple example of how to use the LAG()
function in PostgreSQL:
Suppose you have a table named “daily_sales” with columns “sale_date” and “revenue”, and you want to calculate the difference in revenue between the current day’s sales and the previous day’s sales:
SELECT sale_date, revenue, LAG(revenue, 1) OVER (ORDER BY sale_date) AS previous_day_revenue FROM daily_sales;
In this example, the LAG()
function retrieves the revenue from the previous day’s sale for each row based on the order of sale dates. The result might look like this:
sale_date | revenue | previous_day_revenue
------------+---------+----------------------
2023-01-05 | 100 | NULL
2023-01-06 | 150 | 100
2023-01-07 | 90 | 150
2023-01-08 | 200 | 90
In this output, the “previous_day_revenue” column contains the revenue from the sale that occurred on the day immediately preceding each current sale.
The LAG()
function is valuable when you need to compare or calculate values from the previous row, especially for understanding trends, changes, or patterns within ordered data.