PostgreSQL – GROUP BY Clause
The GROUP BY
clause is used to group rows from a table based on the values in one or more columns. It’s often used in combination with aggregate functions to perform calculations on groups of rows rather than on individual rows. The GROUP BY
clause is typically used with the SELECT
statement. Here’s the basic syntax:
SELECT column1, column2, aggregate_function(column_name) FROM table_name GROUP BY column1, column2;
column1, column2
: Columns by which you want to group the data.aggregate_function(column_name)
: Aggregate functions likeSUM
,AVG
,COUNT
, etc., applied to columns within each group.
Here’s an example to illustrate how to use the GROUP BY
clause:
Consider a table named orders
with columns order_id
, customer_id
, and amount
. Below is the syntax to calculate the total amount spent by each customer.
SELECT customer_id, SUM(amount) AS total_amount FROM orders GROUP BY customer_id;
In this query, the GROUP BY
clause groups the rows based on the customer_id
column. The SUM
aggregate function then calculates the total amount spent by each customer within their respective groups.
You can also use multiple columns in the GROUP BY
clause to create more granular groups:
SELECT product_category, year, SUM(sales_amount) AS total_sales FROM sales GROUP BY product_category, year;
In this query, the data is grouped by both product_category
and year
, allowing you to calculate total sales for each product category within each year.
Keep in mind that any column you include in the SELECT
clause that is not part of the GROUP BY
clause should be used with an aggregate function. This is because the GROUP BY
clause collapses multiple rows into a single row for each group, and you need to specify how you want to aggregate data within each group.
The GROUP BY
clause is a fundamental tool for performing group-level calculations and analysis in SQL, allowing you to summarize data by different categories or dimensions.