SQL Subquery – The Complete Guide

Updated on October 15, 2024

Article Outline

In relational databases, subqueries are such queries embedded within another one and are often used to fetch data that will be used in the external query. Such queries can be placed in different parts like SELECT, FROM, or WHERE, making it very adaptable in deriving complex data. Subqueries decompose complex SQL statements into small optimally designed fragments which are several comprehensible statements that any user can use to ease the construction and performance of queries.

 

In this blog, we will discuss about the concept of a subquery that you will be using in SQL. We are also going to discuss various forms of subqueries and how they can be utilised in CRUD operations. This will help you to recognise all the usage and importance of subqueries in SQL.

 

What Is a Subquery in SQL?

A subquery is a query which you can place within another query with the help of using the SELECT, WHEE and FROM clauses. The main purpose of this query is to get access to the data which is later used in the outer query. Hence, a subquery can be referred to as a nested query or you can say an inner query.

 

For example, suppose you want to find the names of employees from a data table, whose incomes are above the average. In that case, you must use a subquery in order to determine the average salary and find the names of those employees above the average number.

SELECT name FROM employees WHERE salary > (SELECT AVG(salary) FROM employees);

In this example, the subquery (SELECT AVG(salary) FROM employees) returns the average salary. The outer query then uses this value to compare it with the salary column.

 

There are distinct methods in which subqueries may be used. They can be covered in the WHERE clause for filtering situations or maybe in the FROM clause where subqueries act as transient tables that may be queried further. This makes it easy to break down a complicated task into smaller tasks which then are made into SQL queries in a more organised manner for easier clarification and comprehension.

*Image
Get curriculum highlights, career paths, industry insights and accelerate your technology journey.
Download brochure

Why Use SQL Subqueries?

SQL subqueries can help reduce the complexity of the database queries and thus improve performance. But there are cases when it is clear that using a subquery is much better than simply trying to write a long (or complex) query. Let us explain some of the reasons in steps so that a simple case may assist in acclimatising.

1. Breaking Down Complex Queries

Sometimes, a query is too complicated to write in one step. Subqueries allow you to break it down into smaller, easier parts. This makes the query simpler and easier to read.

Example Table: Employees

id name department salary
1 Alice HR 5000
2 Bob IT 7000
3 Carol Sales 6000
4 Dave IT 8000

Now, let’s say you want to find the names of people who work inside the department with the best common income.

Step 1: First, you would write a query to find the department with the highest average salary.

SELECT department FROM employees GROUP BY department ORDER BY AVG(salary) DESC LIMIT 1;

This query gives you the department with the highest average salary, but what if you want the names of employees in that department? Here’s where a subquery comes in.

Step 2: Use the subquery to plug that result into the main query.

SELECT name FROM employees WHERE department = (SELECT department FROM employees GROUP BY department ORDER BY AVG(salary) DESC LIMIT 1);

Now, this query makes use of the result of the subquery to clear out the outer query and give you the names of personnel operating in the department with the highest average earnings. Without a subquery, writing this in one step would be very difficult.

2. Improving Query Efficiency

Subqueries can also help make your query faster and more efficient.

For instance, if you want to examine each worker’s salary to the overall average earnings inside the business enterprise, you could write:

SELECT name, salary FROM employees WHERE salary > (SELECT AVG(salary) FROM employees);

Here, the subquery calculates the average salary only once, and the outer query uses that result to compare the individual employee salaries. This reduces redundancy and improves performance.

3. Making Queries More Flexible

Subqueries make your queries more flexible. This is because data can be dynamically filtered, calculated or manipulated based on the results of other queries.

For example, you want to find employees who are paid better than the best-paid employees in the Human Resources department. Without writing complicated joins or any query, you can simply write the following:

SELECT name FROM employees WHERE salary > (SELECT MAX(salary) FROM employees WHERE department = 'HR');

In this case, the subquery finds the maximum salary in the HR department, and the outer query filters out employees earning more than that amount. Without subqueries, handling this scenario would be more difficult.

4. Reusing Code

Subqueries also help reduce duplication in your SQL code. By nesting queries, you avoid having to repeat the same logic. For example, calculating averages, sums, or counts can be done once in a subquery and reused as needed in the main query.

5. Avoiding Joins in Certain Cases

In some cases, subqueries can be used to avoid writing complex JOIN operations, making your query easier to understand. For example, instead of joining two tables to compare values, you can write subqueries that provide different comparisons.

Assume that you want to seek and find employees having salaries more than the salary of the average earner in the Sales Department. All of the above would have made use of a JOIN but in this case, a subquery would make it much easier:

SELECT name FROM employees WHERE salary > (SELECT AVG(salary) FROM employees WHERE department = 'Sales');

General Rules to Use Subqueries in SQL

  • A subquery must be enclosed in parentheses. This helps SQL identify it as a separate query.
  • There is no limitation on clauses like SELECT, FROM, WHERE, HAVING etc. where you cannot place a sub-query.
  • When such a subquery is used within a WHERE or HAVING condition, the result set should be one column. A query with multiple columns would lead to an error.
  • The outer query may contain columns from the inner query but the opposite is not allowed.
  • You can use subqueries in INSERT, UPDATE, DELETE, and SELECT statements, making them very flexible.
  • With operators such as “=”, “<”, “>”, and “<>”, the sub-query should only return one value. In any situation where many outputs will be returned, the IN or EXISTS will be more applicable.

Subqueries in CRUD Operations

It is possible to apply subqueries in CRUD (Create, Read, Update, and Delete) Methods. This assists with the manipulation of records and removes useless complexities in the queries. Below we look at the specific application of subqueries in each of the stated operations. The operations we will cover are listed below:

 

  • Subqueries in INSERT statements
  • Subqueries in SELECT statements
  • Subqueries in UPDATE statements
  • Subqueries in DELETE statements

 

We’ll use the following employees table for the examples:

 

id name department age salary
1 Alice HR 30 5000
2 Bob IT 40 7000
3 Carol Sales 25 6000
4 Dave IT 35 8000

Subqueries in INSERT Statements

Subqueries can be used to insert data into one table based on the result of a subquery from another table. This is useful when you want to copy filtered data or perform calculations before inserting records.

Example:

Suppose we have another table senior_employees, and we want to insert employees who are older than the average age of employees in the company.

INSERT INTO senior_employees (id, name, age) SELECT id, name, age FROM employees WHERE age > (SELECT AVG(age) FROM employees);

The subquery (SELECT AVG(age) FROM employees) calculates the average age of all employees.

The outer query selects employees older than the average and inserts them into the senior_employees table.

Subqueries in SELECT Statements

Subqueries in SELECT statements are often used to filter results based on conditions from another query, allowing more refined data selection.

Example:

Let’s say we want to find employees whose salary is higher than the average salary of employees in the Sales department.

SELECT name, salary FROM employees WHERE salary > (SELECT AVG(salary) FROM employees WHERE department = 'Sales');

The subquery (SELECT AVG(salary) FROM employees WHERE department = ‘Sales’) calculates the average salary of employees in the Sales department.

The outer query selects employees whose salary is higher than the average.

Subqueries in UPDATE Statements

Subqueries in UPDATE statements allow you to update rows based on conditions from a related query. This is helpful when you want to modify data for specific records dynamically.

Example:

Suppose we want to give a 5% salary raise to employees who are younger than the average age in the company.

UPDATE employees SET salary = salary * 1.05 WHERE age < (SELECT AVG(age) FROM employees);

The subquery (SELECT AVG(age) FROM employees) calculates the average age of all employees.

The outer query updates the salary for employees who are younger than the average, increasing their salary by 5%.

Subqueries in DELETE Statements

Subqueries in DELETE statements help to remove records that meet specific criteria derived from another query.

Example:

Let’s say we want to delete employees from the employees table who earn much less than the best salary within the IT branch.

DELETE FROM employees WHERE salary < (SELECT MAX(salary) FROM employees WHERE department = 'IT');

The subquery (SELECT MAX(salary) FROM employees WHERE department = ‘IT’) finds the highest salary in the IT department.

The outer query deletes employees whose salary is lower than that value.

What are the types of subqueries?

Subqueries in SQL may be classified primarily based on their functionality and where they are used. Here are the primary varieties of subqueries:

 

1. Single-row Subquery

A single-row subquery returns only one row. It is often used with comparison operators like =, >, <, >=, or <=. If the subquery returns more than one row, an error will occur.

Example:

SELECT name

FROM employees

WHERE salary = (SELECT MAX(salary) FROM employees);

 

This query returns the name of the employee with the highest income. The subquery returns only one value (the highest salary), so it’s a single-row subquery.

2. Multiple-row Subquery

A multiple-row subquery returns more than one row. It is usually paired with operators like IN, ANY, or ALL.

Example:

SELECT name FROM employees WHERE salary IN (SELECT salary FROM employees WHERE department = 'IT');

In this example, the subquery returns all the salaries of employees in the IT department. The outer query selects the names of employees whose salaries match one of those values.

3. Correlated Subquery

A correlated subquery is one that depends on the outer query for its values. It is evaluated once for each row processed by the outer query, making it more dynamic but potentially slower for large datasets.

Example:

SELECT name FROM employees e1 WHERE salary > (SELECT AVG(salary) FROM employees e2 WHERE e1.department = e2.department);

Here, the subquery calculates the average income for every department, and the outer query compares every worker’s profits to the average income of their own branch. The subquery depends on the outer query, making it a correlated subquery.

 

Also read: SQL Interview Questions and Answers

4. Uncorrelated Subquery

An uncorrelated subquery is independent of the outer query. It is executed once and its result is used by the outer query. These subqueries are generally more efficient because they don’t rely on data from the outer query.

Example:

SELECT name FROM employees WHERE salary > (SELECT AVG(salary) FROM employees);

In this case, the subquery calculates the average earnings of all employees, and the outer query uses that end result to filter employees whose revenue is greater than the average value.

5. Scalar Subquery

A scalar subquery returns a single value, much like a single-row subquery, but it’s far particularly used when a single cost is predicted, which includes SELECT or WHERE clauses.

Example:

SELECT name, (SELECT MAX(salary) FROM employees) AS highest_salary FROM employees;

Here, the subquery returns the very best revenue from the employees table, and it’s handled as a scalar value within the SELECT clause.

6. Nested Subquery

A nested subquery is simply a subquery within another subquery. This allows for more complex data filtering or calculations.

Example:

SELECT name FROM employees WHERE salary > (SELECT AVG(salary) FROM employees WHERE department IN (SELECT department FROM departments WHERE location = 'NY'));

In this case, the subquery inside the IN clause first retrieves departments located in New York, and the outer subquery calculates the average income for those departments. The main query then selects employees whose salary is higher than the average.

Conclusion

Subqueries in SQL assist in breaking down complex queries into smaller steps for easy handling of data and arithmetic operations. Therefore, they are useful for data filtering, record updating or inserting where complexities are reduced.

 

As such, the involvement of different subqueries such as single-row, multiple-row and correlated subqueries increases the comprehensiveness and efficiency of writing queries in SQL. They enable the handling of critical queries with much ease, thereby enhancing the execution of database processes. Every SQL programmer must learn how to use subqueries.

FAQs
As the name implies, a subquery is a query inscribed in another query, which contains data that is useful for the external query.
There are several types of subqueries: single-row, multiple-row, correlated subqueries, uncorrelated, and scalar subqueries.
Sub-queries in SQL can be used in the statements like SELECT, INSERT, UPDATE and DELETE and these queries are known as subqueries.
A correlated subquery is a subquery that cannot be executed independently or out of the primary query because it refers to the primary query. Such a type of subquery gives different results for each of the outer query rows.
A correlated subquery is a subquery which is dependent on the outer query whilst no such dependence is present in an uncorrelated subquery.
Yes, subqueries, especially correlated ones do have some performance overhead, particularly on large data tables if not balanced well.
Use operators like IN, ANY, or ALL when the subquery returns multiple results.

Updated on October 15, 2024

Link
left dot patternright dot pattern

Programs tailored for your success

Popular

Management

Data Science

Finance

Technology

Future Tech

Upskill with expert articles

View all
Hero Vired logo
Hero Vired is a leading LearnTech company dedicated to offering cutting-edge programs in collaboration with top-tier global institutions. As part of the esteemed Hero Group, we are committed to revolutionizing the skill development landscape in India. Our programs, delivered by industry experts, are designed to empower professionals and students with the skills they need to thrive in today’s competitive job market.
Blogs
Reviews
Events
In the News
About Us
Contact us
Learning Hub
18003093939     ·     hello@herovired.com     ·    Whatsapp
Privacy policy and Terms of use

|

Sitemap

© 2024 Hero Vired. All rights reserved