PostgreSQL – Subquery
A subquery (also known as a nested query or inner query) is a query that is embedded within another query. Subqueries are used to retrieve data that will be used in the main query’s conditions, expressions, or selection criteria. They are a powerful tool for combining multiple queries and performing more complex operations in a single SQL statement.
There are two main types of subqueries:
Scalar Subquery
A scalar subquery is a subquery that returns a single value (one row and one column). It can be used in places where a single value is expected, such as within expressions or as a part of conditions.
Example of a scalar subquery:
SELECT name, salary, (SELECT AVG(salary) FROM employees) AS avg_salary FROM employees;
Table Subquery (Row Subquery)
A table subquery is a subquery that returns a result set (multiple rows and columns). It can be used as if it were an actual table in the main query.
Example of a table subquery:
SELECT name FROM employees WHERE department_id IN (SELECT department_id FROM departments WHERE location = 'New York');
Subqueries can be used in various parts of a SQL query, including the SELECT
, FROM
, WHERE
, HAVING
, and JOIN
clauses. They provide a way to perform more complex filtering, calculations, and data manipulations without needing to create temporary tables or use multiple separate queries.
While subqueries can enhance query flexibility and readability, they can also impact performance if not used carefully. In some cases, rewriting queries using joins or other techniques might result in more efficient execution plans.
Below are a few points to keep in mind when using subqueries:
- Ensure that the subquery returns the expected data and the appropriate number of rows and columns.
- Pay attention to the correlation between the subquery and the main query if needed.
- Be mindful of performance implications, as some subqueries can lead to slower query execution.
- Test and analyze the query execution plan to optimize performance.
Subqueries in PostgreSQL offer a way to embed one query within another, allowing you to build more complex queries that retrieve specific data based on conditions or calculations involving other data in the database.