Welcome! In this article, we’ll explore two powerful SQL tools—subqueries and Common Table Expressions (CTEs)—that simplify complex queries, making your SQL code more readable, maintainable, and efficient. We’ll dive into examples using tables for orders, customers, and employees, with seed data to illustrate each concept in action.
Step 1: Setting Up Our Tables and Seed Data
For our examples, we’ll use three tables: Orders, Customers, and Employees.
- Orders Table – Stores order details, including the quantity ordered and sales amount.
CREATE TABLE Orders (
OrderKey INT PRIMARY KEY,
CustomerKey INT,
EmployeeKey INT,
Quantity INT,
SalesAmount DECIMAL(10, 2)
);
INSERT INTO Orders (OrderKey, CustomerKey, EmployeeKey, Quantity, SalesAmount)
VALUES
(1, 101, 1001, 3, 300.00),
(2, 102, 1001, 4, 450.00),
(3, 103, 1002, 2, 200.00),
(4, 104, 1002, 5, 500.00),
(5, 105, 1003, 6, 600.00),
(6, 101, 1003, 2, 200.00);
- Customers Table – Stores customer information, including occupation, which we’ll use to filter data.
CREATE TABLE Customers (
CustomerKey INT PRIMARY KEY,
Name NVARCHAR(50),
Occupation NVARCHAR(50)
);
INSERT INTO Customers (CustomerKey, Name, Occupation)
VALUES
(101, 'Alice', 'Management'),
(102, 'Bob', 'Management'),
(103, 'Charlie', 'Staff'),
(104, 'Diana', 'Staff'),
(105, 'Eve', 'Management');
- Employees Table – Stores basic information about employees.
CREATE TABLE Employees (
EmployeeKey INT PRIMARY KEY,
Name NVARCHAR(50)
);
INSERT INTO Employees (EmployeeKey, Name)
VALUES
(1001, 'John'),
(1002, 'Sara'),
(1003, 'Mike');
With our tables in place, let’s jump into using subqueries and CTEs.
What Are Subqueries?
A subquery is a query embedded within another SQL query. It allows you to perform intermediate calculations, filter data, or retrieve summarized data for comparison purposes.
Types of Subqueries:
- Self-contained subqueries: Can run independently of the main query and are easy to troubleshoot.
- Correlated subqueries: Depend on the main query, allowing interaction between the two queries.
Subqueries can return different types of results:
-
Single-valued: Returns one result, useful in clauses like
WHERE
. -
Multi-valued: Returns multiple values, often used with
IN
. -
Table-valued: Returns a set of rows for use within a
FROM
clause.
Example 1: Finding Orders Above the Average Quantity
Our first example will identify orders with quantities above the average order quantity.
- Calculate the average order quantity:
SELECT AVG(Quantity) FROM Orders;
- Use a subquery in the WHERE clause to filter orders that exceed this average:
SELECT OrderKey, Quantity
FROM Orders
WHERE Quantity > (SELECT AVG(Quantity) FROM Orders);
This returns orders where the quantity is higher than the average, comparing each order against a summary value.
Example 2: Filtering Orders by Customer Occupation
Let’s refine the previous example to include only orders placed by customers in management roles. We’ll add another subquery to filter for these customers.
SELECT OrderKey, Quantity
FROM Orders
WHERE Quantity > (SELECT AVG(Quantity) FROM Orders)
AND CustomerKey IN (SELECT CustomerKey FROM Customers WHERE Occupation = 'Management');
Here, the IN
operator allows us to match multiple values from the subquery, selecting only customers with the occupation 'Management'.
Example 3: Calculating Running Totals Using Correlated Subqueries
A correlated subquery is useful for calculating running totals. We’ll calculate cumulative order quantities for each OrderKey
by summing quantities up to that point.
SELECT o1.OrderKey, o1.Quantity,
(SELECT SUM(o2.Quantity)
FROM Orders o2
WHERE o2.OrderKey <= o1.OrderKey) AS RunningTotal
FROM Orders o1
ORDER BY o1.OrderKey;
In this example, the correlated subquery calculates the running total by summing quantities for all previous OrderKey
values, including the current row.
What Are Common Table Expressions (CTEs)?
A Common Table Expression (CTE) is a temporary result set defined using the WITH
clause. CTEs simplify complex queries by creating modular, reusable result sets.
Key Features of CTEs:
- Each column must have a unique name.
- You can define multiple CTEs within a single query.
- CTEs can reference each other but cannot be nested.
Example 4: Calculating Sales Percentage per Customer with CTEs
Imagine we want to find the percentage of sales each customer contributes to the total sales for each employee. We’ll use two CTEs to simplify the calculations.
- Define Total Sales per Employee:
WITH EmployeeSales AS (
SELECT EmployeeKey, SUM(SalesAmount) AS TotalEmployeeSales
FROM Orders
GROUP BY EmployeeKey
),
- Define Sales per Employee-Customer Combination:
EmployeeCustomerSales AS (
SELECT EmployeeKey, CustomerKey, SUM(SalesAmount) AS EmployeeCustomerSales
FROM Orders
GROUP BY EmployeeKey, CustomerKey
)
- Join the CTEs to calculate the percentage contribution:
SELECT e.EmployeeKey, c.CustomerKey,
FORMAT((c.EmployeeCustomerSales / e.TotalEmployeeSales * 100), 'N2') AS PrctSales
FROM EmployeeCustomerSales AS c
LEFT JOIN EmployeeSales AS e ON c.EmployeeKey = e.EmployeeKey
ORDER BY PrctSales DESC;
By using CTEs, this complex calculation becomes more readable and modular. Each CTE represents a step in the calculation, and they can be joined like regular tables.
Subqueries vs. CTEs: Which to Use?
When to use subqueries vs. CTEs depends on:
- Readability: For complex queries, CTEs make the code cleaner and easier to follow.
- Performance: Simple subqueries are often more efficient for one-off operations.
- Reusability: CTEs are ideal for repeated references to the same result set within a query.
Conclusion
Both subqueries and CTEs are valuable tools in SQL, allowing for efficient data manipulation and clearer, more maintainable code. Subqueries are ideal for quick data transformations, while CTEs offer a way to build and reuse modular query components. With these techniques, you’ll be able to simplify complex queries and enhance SQL performance. Happy querying!