MET CS 669 Database Design and Implementation for Business
Lab 5 Submission
This submission template is a convenient document for you to provide the
screenshots and explanations for Lab 5. This submission template is
intended to be used in conjunction with the Lab 5 Instructions document.
The instructions document illustrates how to correctly execute each SQL
construct, explains important theoretical and practical details, and contains
the complete set of instructions on how to complete this lab.
Name:
Date:
Page 1 of 4
Copyright 2016 Boston University. All Rights Reserved.
Section One – Subqueries
2. Subquery explanation
a. Subquery superiority explanation
b. Uncorrelated subquery explanation
3. Two queries combination
a. Screenshot
b. Explanation
5. More and less expensive product use case
7. More complex use case
a. Part identification
b. Independent queries
c. Full query
d. Explanation
9. FROM clause solution
a. Screenshot
b. Explanation
11. Correlated vs. uncorrelated explanation
12. EXISTS solution
Page 2 of 4
a. Screenshot
b. Explanation of changes
c. Explanation of better choice
Use the Ask your Facilitator Discussion Board if you have any questions
regarding the how to approach this assignment.
Save your assignment as lastnameFirstname_lab5.doc and submit it in the
Assignments section of the course.
For help uploading files please refer to the Technical Support page in the syllabus.
Page 3 of 4
Your lab submission will be evaluated according to the following rubric.
Letter
Grade
Qualities Demonstrated by the Lab Submission
A+ ➔
100
The results and explanations are entirely complete and correct for all steps. There are absolutely
no technical or other errors present. There is no known way to improve the logic and makeup of
any of the SQL constructs.
One insignificant technical or other error is present, but otherwise the results and explanations are
entirely complete and correct for all steps. Excluding the insignificant error, there is no known
way to improve the makeup of any of the SQL constructs.
One or two consequential technical or other errors are present, but otherwise the results and
explanations are entirely complete and correct for all steps. Excluding the one or two errors, there
is no known way to improve the makeup of any of the SQL constructs.
A few steps have significantly incomplete or incorrect results or explanations. The results and
explanations are complete and correct for the remainder of the steps. The logic and makeup of
most SQL constructs are sound.
A few steps have significantly incomplete or incorrect results or explanations. The results and
explanations are mostly complete and correct for the remainder of the steps, with the exception of
a few insignificant technical or other errors. The logic and makeup of most SQL constructs are
sound.
About ¼ of the steps have significantly incomplete or incorrect results or explanations. The
results and explanations are complete and correct for the remainder of the steps. The logic and
makeup of at least ¾ of the SQL constructs are sound.
About ¼ of the steps have significantly incomplete or incorrect results or explanations. The
results and explanations are mostly complete and correct for the remainder of the steps, with the
exception of a few insignificant technical or other errors. The logic and makeup of at least ¾ of
the SQL constructs are sound.
About half of the steps have significantly incomplete or incorrect results or explanations. The
results and explanations are complete and correct for the remainder of the steps. The logic and
makeup of at least half of the SQL constructs are sound.
About half of the steps have significantly incomplete or incorrect results or explanations. The
results and explanations are mostly complete and correct for the remainder of the steps, with the
exception of a few insignificant technical or other errors. The logic and makeup of at least half of
the SQL constructs are sound.
About ¾ of the steps have significantly incomplete or incorrect results or explanations. The
results and explanations are complete and correct for the remainder of the steps. The logic and
makeup of at least ¼ of the SQL constructs are sound
All or almost all of the steps have incomplete or incorrect results or explanations. The logic and
makeup of all or almost all of the SQL constructs are unsound.
A ➔ 96
A- ➔ 92
B+ ➔ 88
Correctness,
completeness,
and
constitution
Measures the
correctness and
completeness of
the results, and
the quality of
the constitution
of the SQL
constructs
B ➔ 85
B- ➔ 82
C+ ➔ 78
C ➔ 75
C- ➔ 72
D ➔ 67
F➔0
Page 4 of 4
MET CS 669 Database Design and Implementation for Business
SQL Lab 5 Instructions: Subqueries
Objective
This lab teaches you the mechanics crafting SQL queries that harness the power of
subqueries to handle more complex use cases.
Prerequisites
Before attempting this lab, it is best to read the textbook and lecture material
covering the objectives listed above. While this lab shows you how to create and use
these constructs in SQL, the lab does not explain in full the theory behind the
constructs, as does the lecture and textbook.
Required Software
The examples in this lab will execute in modern versions of Oracle, Microsoft SQL
Server, and PostgreSQL as is. If you are using a different RDBMS, you may need to
modify the SQL for successful execution.
Saving Your Data
If you choose to perform portions of the assignment in different sittings, it is
important to commit your data at the end of each session. This way, you will be sure
to make permanent any data changes you have made in your curent session, so that
you can resume working without issue in your next session. To do so, simply issue
this command:
COMMIT;
Data changes in one session will only be visible only in that session, unless they are
committed, at which time the changes are made permanent in the database.
Lab Completion
Use the submission template provided in the assignment inbox to complete this lab.
Page 1 of 45
Copyright 2016, 2018 Boston University. All Rights Reserved.
Lab Overview
In this lab, we practice value manipulation and subqueries on the schema illustrated
below. Note that this schema and its values was first introduced in Lab 3 in a learning by
example section. If you already created it in your database to follow along with the
examples, you do not need to create it a second time. If you did not create it in Lab 3,
you need to create it for this lab.
1..1
has
Store_location
uses
0..*
store_location_id {pk}
store_name
currency_accepted_id {fk1}
1..1
has
1..*
0..*
Offers
1..1
Currency
Sells
offers_id {pk}
store_location_id {fk1}
purchase_delivery_offering_id {fk2}
currency_id {pk}
currency_name
us_dollars_to_currency_ratio
sells_id {pk}
product_id {fk1}
store_location_id {fk2}
0..*
0..*
has
has
1..1
1..1
Purchase_delivery_offering
purchase_delivery_offering_id {pk}
offering
Product
Sizes
product_id {pk}
product_name
price_in_us_dollars
sizes_id {pk}
size_option
1..1
1..1
has
1..*
Available_in
0..*
has
available_in_id {pk}
product_id {fk1}
sizes_id {fk2}
This schema’s structure supports basic product and currency information for an
international organization, including store locations, the products they sell and their
sizes, purchase and delivery offerings, the currency each location accepts, as well as
conversion factors for converting from U.S. dollars into the accepted currency. This
schema models prices and exchange rates at a specific point in time. While a real‐world
schema would make provision for changes to prices and exchange rates over time, the
Page 2 of 45
tables needed to support this have been intentionally excluded from our schema,
because their addition would add unneeded complexity on your journey of learning
subqueries, expressions, and value manipulation. The schema has just the right amount
of complexity for your learning.
The data for the tables is listed below.
Currencies
Name
British Pound
Canadian Dollar
US Dollar
Euro
Mexican Peso
Ratio
0.66
1.33
1.00
0.93
16.75
Store Locations
Name
Berlin Extension
Cancun Extension
London Extension
New York Extension
Toronto Extension
Currency
Euro
Mexican Peso
British Pound
US Dollar
Canadian Dollar
Product
Name
Cashmere Sweater
Designer Jeans
Flowing Skirt
Silk Blouse
Wool Overcoat
US Dollar Price
$100
$150
$125
$200
$250
Sells
Store Location
Berlin Extension
Berlin Extension
Berlin Extension
Berlin Extension
Cancun Extension
Cancun Extension
Cancun Extension
London Extension
London Extension
Product
Cashmere Sweater
Designer Jeans
Silk Blouse
Wool Overcoat
Designer Jeans
Flowing Skirt
Silk Blouse
Cashmere Sweater
Designer Jeans
Page 3 of 45
London Extension
London Extension
London Extension
New York Extension
New York Extension
New York Extension
New York Extension
New York Extension
Toronto Extension
Toronto Extension
Toronto Extension
Toronto Extension
Toronto Extension
Flowing Skirt
Silk Blouse
Wool Overcoat
Cashmere Sweater
Designer Jeans
Flowing Skirt
Silk Blouse
Wool Overcoat
Cashmere Sweater
Designer Jeans
Flowing Skirt
Silk Blouse
Wool Overcoat
Purchase_delivery_offering
Offering
Purchase In Store
Purchase Online, Ship to Home
Purchase Online, Pickup in Store
Offers
Store Location
Berlin Extension
Cancun Extension
London Extension
London Extension
London Extension
New York Extension
New York Extension
Toronto Extension
Purchase Delivery Offering
Purchase In Store
Purchase In Store
Purchase In Store
Purchase Online, Ship to Home
Purchase Online, Pickup in Store
Purchase In Store
Purchase Online, Pickup in Store
Purchase In Store
Sizes
Size Option
Small
Medium
Large
Various
2
4
6
8
Page 4 of 45
10
12
14
16
Available_in
Product
Cashmere Sweater
Cashmere Sweater
Cashmere Sweater
Designer Jeans
Flowing Skirt
Flowing Skirt
Flowing Skirt
Flowing Skirt
Flowing Skirt
Flowing Skirt
Flowing Skirt
Flowing Skirt
Silk Blouse
Silk Blouse
Silk Blouse
Wool Overcoat
Wool Overcoat
Wool Overcoat
Size Option
Small
Medium
Large
Various
2
4
6
8
10
12
14
16
Small
Medium
Large
Small
Medium
Large
DDL and DML to create and populate the tables in the schema is listed below.
DROP
DROP
DROP
DROP
DROP
DROP
DROP
DROP
TABLE
TABLE
TABLE
TABLE
TABLE
TABLE
TABLE
TABLE
Sells;
Offers;
Available_in;
Store_location;
Product;
Currency;
Purchase_delivery_offering;
Sizes;
CREATE TABLE Currency (
currency_id DECIMAL(12) NOT NULL PRIMARY KEY,
currency_name VARCHAR(255) NOT NULL,
us_dollars_to_currency_ratio DECIMAL(12,2) NOT NULL);
CREATE TABLE Store_location (
store_location_id DECIMAL(12) NOT NULL PRIMARY KEY,
store_name VARCHAR(255) NOT NULL,
currency_accepted_id DECIMAL(12) NOT NULL);
Page 5 of 45
CREATE TABLE Product (
product_id DECIMAL(12) NOT NULL PRIMARY KEY,
product_name VARCHAR(255) NOT NULL,
price_in_us_dollars DECIMAL(12,2) NOT NULL);
CREATE TABLE Sells (
sells_id DECIMAL(12) NOT NULL PRIMARY KEY,
product_id DECIMAL(12) NOT NULL,
store_location_id DECIMAL(12) NOT NULL);
CREATE TABLE Purchase_delivery_offering (
purchase_delivery_offering_id DECIMAL(12) NOT NULL PRIMARY KEY,
offering VARCHAR(255) NOT NULL);
CREATE TABLE Offers (
offers_id DECIMAL(12) NOT NULL PRIMARY KEY,
store_location_id DECIMAL(12) NOT NULL,
purchase_delivery_offering_id DECIMAL(12) NOT NULL);
CREATE TABLE Sizes (
sizes_id DECIMAL(12) NOT NULL PRIMARY KEY,
size_option VARCHAR(255) NOT NULL);
CREATE TABLE Available_in (
available_in_id DECIMAL(12) NOT NULL PRIMARY KEY,
product_id DECIMAL(12) NOT NULL,
sizes_id DECIMAL(12) NOT NULL);
ALTER TABLE Store_location
ADD CONSTRAINT fk_location_to_currency FOREIGN KEY(currency_accepted_id)
REFERENCES Currency(currency_id);
ALTER TABLE Sells
ADD CONSTRAINT fk_sells_to_product FOREIGN KEY(product_id) REFERENCES
Product(product_id);
ALTER TABLE Sells
ADD CONSTRAINT fk_sells_to_location FOREIGN KEY(store_location_id) REFERENCES
Store_location(store_location_id);
ALTER TABLE Offers
ADD CONSTRAINT fk_offers_to_location FOREIGN KEY(store_location_id) REFERENCES
Store_location(store_location_id);
ALTER TABLE Offers
ADD CONSTRAINT fk_offers_to_offering FOREIGN KEY(purchase_delivery_offering_id)
REFERENCES Purchase_delivery_offering(purchase_delivery_offering_id);
ALTER TABLE Available_in
ADD CONSTRAINT fk_available_to_product FOREIGN KEY(product_id)
REFERENCES Product(product_id);
ALTER TABLE Available_in
ADD CONSTRAINT fk_available_to_sizes FOREIGN KEY(sizes_id)
REFERENCES Sizes(sizes_id);
INSERT INTO Currency(currency_id, currency_name, us_dollars_to_currency_ratio)
Page 6 of 45
VALUES(1, 'Britsh Pound', 0.66);
INSERT INTO Currency(currency_id, currency_name,
VALUES(2, 'Canadian Dollar', 1.33);
INSERT INTO Currency(currency_id, currency_name,
VALUES(3, 'US Dollar', 1.00);
INSERT INTO Currency(currency_id, currency_name,
VALUES(4, 'Euro', 0.93);
INSERT INTO Currency(currency_id, currency_name,
VALUES(5, 'Mexican Peso', 16.75);
INSERT
VALUES
INSERT
VALUES
INSERT
VALUES
INTO
(50,
INTO
(51,
INTO
(52,
us_dollars_to_currency_ratio)
us_dollars_to_currency_ratio)
us_dollars_to_currency_ratio)
us_dollars_to_currency_ratio)
Purchase_delivery_offering(purchase_delivery_offering_id, offering)
'Purchase In Store');
Purchase_delivery_offering(purchase_delivery_offering_id, offering)
'Purchase Online, Ship to Home');
Purchase_delivery_offering(purchase_delivery_offering_id, offering)
'Purchase Online, Pickup in Store');
INSERT INTO Sizes(sizes_id,
VALUES(1, 'Small');
INSERT INTO Sizes(sizes_id,
VALUES(2, 'Medium');
INSERT INTO Sizes(sizes_id,
VALUES(3, 'Large');
INSERT INTO Sizes(sizes_id,
VALUES(4, 'Various');
INSERT INTO Sizes(sizes_id,
VALUES(5, '2');
INSERT INTO Sizes(sizes_id,
VALUES(6, '4');
INSERT INTO Sizes(sizes_id,
VALUES(7, '6');
INSERT INTO Sizes(sizes_id,
VALUES(8, '8');
INSERT INTO Sizes(sizes_id,
VALUES(9, '10');
INSERT INTO Sizes(sizes_id,
VALUES(10, '12');
INSERT INTO Sizes(sizes_id,
VALUES(11, '14');
INSERT INTO Sizes(sizes_id,
VALUES(12, '16');
size_option)
size_option)
size_option)
size_option)
size_option)
size_option)
size_option)
size_option)
size_option)
size_option)
size_option)
size_option)
‐‐Cashmere Sweater
INSERT INTO Product(product_id, product_name, price_in_us_dollars)
VALUES(100, 'Cashmere Sweater', 100);
INSERT INTO Available_in(available_in_id, product_id, sizes_id)
VALUES(10000, 100, 1);
INSERT INTO Available_in(available_in_id, product_id, sizes_id)
VALUES(10001, 100, 2);
INSERT INTO Available_in(available_in_id, product_id, sizes_id)
VALUES(10002, 100, 3);
‐‐Designer Jeans
INSERT INTO Product(product_id, product_name, price_in_us_dollars)
VALUES(101, 'Designer Jeans', 150);
INSERT INTO Available_in(available_in_id, product_id, sizes_id)
VALUES(10003, 101, 4);
Page 7 of 45
‐‐Flowing Skirt
INSERT INTO Product(product_id, product_name, price_in_us_dollars)
VALUES(102, 'Flowing Skirt', 125);
INSERT INTO Available_in(available_in_id, product_id, sizes_id)
VALUES(10004, 102, 5);
INSERT INTO Available_in(available_in_id, product_id, sizes_id)
VALUES(10005, 102, 6);
INSERT INTO Available_in(available_in_id, product_id, sizes_id)
VALUES(10006, 102, 7);
INSERT INTO Available_in(available_in_id, product_id, sizes_id)
VALUES(10007, 102, 8);
INSERT INTO Available_in(available_in_id, product_id, sizes_id)
VALUES(10008, 102, 9);
INSERT INTO Available_in(available_in_id, product_id, sizes_id)
VALUES(10009, 102, 10);
INSERT INTO Available_in(available_in_id, product_id, sizes_id)
VALUES(10010, 102, 11);
INSERT INTO Available_in(available_in_id, product_id, sizes_id)
VALUES(10011, 102, 12);
‐‐Silk Blouse
INSERT INTO Product(product_id, product_name, price_in_us_dollars)
VALUES(103, 'Silk Blouse', 200);
INSERT INTO Available_in(available_in_id, product_id, sizes_id)
VALUES(10012, 103, 1);
INSERT INTO Available_in(available_in_id, product_id, sizes_id)
VALUES(10013, 103, 2);
INSERT INTO Available_in(available_in_id, product_id, sizes_id)
VALUES(10014, 103, 3);
‐‐Wool Overcoat
INSERT INTO Product(product_id, product_name, price_in_us_dollars)
VALUES(104, 'Wool Overcoat', 250);
INSERT INTO Available_in(available_in_id, product_id, sizes_id)
VALUES(10015, 104, 1);
INSERT INTO Available_in(available_in_id, product_id, sizes_id)
VALUES(10016, 104, 2);
INSERT INTO Available_in(available_in_id, product_id, sizes_id)
VALUES(10017, 104, 3);
‐‐Berlin Extension
INSERT INTO Store_location(store_location_id, store_name, currency_accepted_id)
VALUES(10, 'Berlin Extension', 4);
INSERT INTO Sells(sells_id, store_location_id, product_id)
VALUES(1000, 10, 100);
INSERT INTO Sells(sells_id, store_location_id, product_id)
VALUES(1001, 10, 101);
INSERT INTO Sells(sells_id, store_location_id, product_id)
VALUES(1002, 10, 103);
INSERT INTO Sells(sells_id, store_location_id, product_id)
VALUES(1003, 10, 104);
INSERT INTO Offers(offers_id, store_location_id, purchase_delivery_offering_id)
VALUES(150, 10, 50);
‐‐Cancun Extension
INSERT INTO Store_location(store_location_id, store_name, currency_accepted_id)
VALUES(11, 'Cancun Extension', 5);
INSERT INTO Sells(sells_id, store_location_id, product_id)
Page 8 of 45
VALUES(1004, 11, 101);
INSERT INTO Sells(sells_id, store_location_id, product_id)
VALUES(1005, 11, 102);
INSERT INTO Sells(sells_id, store_location_id, product_id)
VALUES(1006, 11, 103);
INSERT INTO Offers(offers_id, store_location_id, purchase_delivery_offering_id)
VALUES(151, 11, 50);
‐‐London Extension
INSERT INTO Store_location(store_location_id, store_name, currency_accepted_id)
VALUES(12, 'London Extension', 1);
INSERT INTO Sells(sells_id, store_location_id, product_id)
VALUES(1007, 12, 100);
INSERT INTO Sells(sells_id, store_location_id, product_id)
VALUES(1008, 12, 101);
INSERT INTO Sells(sells_id, store_location_id, product_id)
VALUES(1009, 12, 102);
INSERT INTO Sells(sells_id, store_location_id, product_id)
VALUES(1010, 12, 103);
INSERT INTO Sells(sells_id, store_location_id, product_id)
VALUES(1011, 12, 104);
INSERT INTO Offers(offers_id, store_location_id, purchase_delivery_offering_id)
VALUES(152, 12, 50);
INSERT INTO Offers(offers_id, store_location_id, purchase_delivery_offering_id)
VALUES(153, 12, 51);
INSERT INTO Offers(offers_id, store_location_id, purchase_delivery_offering_id)
VALUES(154, 12, 52);
‐‐New York Extension
INSERT INTO Store_location(store_location_id, store_name, currency_accepted_id)
VALUES(13, 'New York Extension', 3);
INSERT INTO Sells(sells_id, store_location_id, product_id)
VALUES(1012, 13, 100);
INSERT INTO Sells(sells_id, store_location_id, product_id)
VALUES(1013, 13, 101);
INSERT INTO Sells(sells_id, store_location_id, product_id)
VALUES(1014, 13, 102);
INSERT INTO Sells(sells_id, store_location_id, product_id)
VALUES(1015, 13, 103);
INSERT INTO Sells(sells_id, store_location_id, product_id)
VALUES(1016, 13, 104);
INSERT INTO Offers(offers_id, store_location_id, purchase_delivery_offering_id)
VALUES(155, 13, 50);
INSERT INTO Offers(offers_id, store_location_id, purchase_delivery_offering_id)
VALUES(156, 13, 52);
‐‐Toronto Extension
INSERT INTO Store_location(store_location_id, store_name, currency_accepted_id)
VALUES(14, 'Toronto Extension', 2);
INSERT INTO Sells(sells_id, store_location_id, product_id)
VALUES(1017, 14, 100);
INSERT INTO Sells(sells_id, store_location_id, product_id)
VALUES(1018, 14, 101);
INSERT INTO Sells(sells_id, store_location_id, product_id)
VALUES(1019, 14, 102);
INSERT INTO Sells(sells_id, store_location_id, product_id)
VALUES(1020, 14, 103);
INSERT INTO Sells(sells_id, store_location_id, product_id)
Page 9 of 45
VALUES(1021, 14, 104);
INSERT INTO Offers(offers_id, store_location_id, purchase_delivery_offering_id)
VALUES(157, 14, 50);
Page 10 of 45
Section One – Subqueries
Overview
In this section we learn to work with subqueries, which significantly extend the
expressional power of queries. Through the use of subqueries, a single query can extract
result sets that could not be extracted without subqueries. Subqueries enable the query
creator to ask the database for many complex structures in a single query.
The foundation for learning how subqueries work lies in the expressional nature of the
relational model. We learned previously that the operations in the relational model,
such as SELECT, PROJECT, and UNION, perform operations on relations and yield a new
relation as the result. That is, when one operation operates on a relation, and yields a
new relation, we can use a second operation to operate on the result of the first
operation.
We will take a look at these relational operations in a moment, but first let us look at a
simple mathematical example. If we add two plus two to obtain a result of 4, 2 + 2 = 4, we
have applied the plus operation to two numbers. Interestingly, the result of the plus
operation is another number. So we can say that the plus operator operates on two
numbers, and results in a new number.
Because the result is a number, there is no reason why we cannot use that result in
another operation. For example, if we wanted to subtract one from the result, we nest
the operations like so: (2 + 2) ‐ 1 = 3. That is, the plus operation adds two plus two to
arrive at a result of 4, then the minus operation subtracts one from that result, to arrive
at a final result of 3.
Now that we have seen a simple example of nesting operations, let us look at a concrete
example in the relational model. Imagine that we have a Person relation which has two
columns – first_name and last_name.
Person =
first_name
Bill
Jane
last_name
Glass
Smith
We can use the PROJECT operation to create a new relation consisting of only the
last_name column from Person, denoted as PROJECTlast_name(Person).
PROJECTlast_name(Person) =
last_name
Glass
Smith
Page 11 of 45
Because the PROJECT operation creates a new relation, there is no reason why we
cannot apply a second operation to that new relation. In this example, we can apply the
SELECT operation to the result of the PROJECT in order to retrieve only the rows where
the last name is Smith, denoted as SELECTlast_name=Smith(PROJECTlast_name(Person)).
SELECTlast_name=Smith(PROJECTlast_name(Person))
=
last_name
Smith
In other words, wherever a particular operation expects a relation, we can give it an
existing relation, or we can give it the result of another operation. When we have
operations that operate on the results of other operations, we term this as nesting the
operations. It is this nesting ability that gives the relational model, and therefore
relational databases, their expressional power. Think about the concept of nesting
operations until you are sure you understand it well, for it is the foundation of
understanding subqueries.
Relational databases allow queries to be nested inside of other queries. Wherever a SQL
statement expects a table, we can give it an existing table, or we can give it the result of
another query. This is because all queries in relational databases yield result sets which
are of the same form as a relational table ‐‐ a two‐dimensional set of rows and columns.
Let us look at a concrete example of a subquery using a Person table with the same rows
as the Person relation just described. When we issue a SELECT * FROM Person
command, we see the initial Person table with Bill Glass and Jane Smith as row values.
To perform the PROJECTlast_name(Person) operation in SQL, we specify the last_name column
in the command ‐‐ SELECT last_name FROM Person -- and the database returns for
us the values in the last_name column.
Page 12 of 45
Now if we would like to perform the equivalent of the
SELECTlast_name=Smith(PROJECTlast_name(Person)) operation in our database. We can do so through
the use of a subquery, illustrated in the command below.
SELECT * FROM (SELECT last_name from Person)
WHERE last_name = 'Glass'
Did you see what we did in this command? The SELECT last_name FROM Person query
has been put in place of an existing table, in order to perform the PROJECT operation.
We usually see the name of a table in the FROM clause, but here we see the placement
of the subquery. And then the additional WHERE last_name = 'Glass' restriction has
been placed on the outer query, which performs the relational SELECT operation. Thus,
we have used the results from one query in another query, nesting one query inside the
other. This use of subqueries is powerful and allows us to obtain and manipulate data in
a variety of ways.
In the screenshot below, you can see that we obtain the result we expect.
Steps
1. If we were asked to give the price of the Cashmere sweater in Euros we could use two
queries to do so. We could use the query below to find out that the ratio is 0.93.
SELECT us_dollars_to_currency_ratio
FROM
Currency
WHERE currency_name = 'Euro'
We could then manually hardcode that 0.93 value into the next query to obtain the
price in Euros.
SELECT price_in_us_dollars * 0.93 AS price_in_euros
FROM
Product
WHERE product_name = 'Cashmere Sweater'
Page 13 of 45
Doing so gives us the result of €93, as shown in the side‐by‐side screenshots of
Oracle, SQL Server and PostgreSQL below.
Executing two queries independently gives us the results we want, but forces us to
manually hardcode one value into another. For subsequent executions of the query,
if we do not execute the first query to reobtain the conversion ratio and resubstitute
that value into the second query, we risk giving a wrong result since the conversion
factor will change often.
There is a better way. Embedding one query inside the other is more concise and
production‐worthy, and gets us out of the business of manually hardcoding values
from one query into another. Let us try it out in Oracle, SQL Server and PostgreSQL to
see if we still get the same result.
Oracle
SQL Server
Page 14 of 45
PostgreSQL
The result is the same in these databases, except for formatting of course. It is
possible to embed one query in another! Let us explore how the two‐in‐one
combination yields its results by examining the query line‐by‐line.
1: SELECT price_in_us_dollars *
2:
(SELECT us_dollars_to_currency_ratio
3:
FROM
Currency
4:
WHERE currency_name = 'Euro') AS price_in_euros
5: FROM
Product
6: WHERE product_name = 'Cashmere Sweater'
On lines 2‐4, we embed the query that retrieves the ratio for Euros. The embedded
query is termed a subquery because it resides inside of another query. The subquery
has the advantage that should the ratio change over time, the overall query will
always retrieve the correct results. A significant property of this particular subquery
is that it retrieves one column from one row, so it retrieves one single value (in this
case, the ratio). Not all subqueries retrieve a single value, but this one does, so we
can use this subquery wherever a single value is expected. This two‐in‐one
combination is superior because it is more concise and survives changes over time.
Knowing how to retrieve a single value from a subquery is useful, but knowing where
to place the subquery is just as important. Placing a subquery in the column list of a
SELECT statement gives us the ability to directly manipulate values from every row
returned in the outer query. On line 5, we indicate we want to select from the
Product table, on line 6, we indicate we only want the Product named “Cashmere
Sweater”, and on line 1 we indicate we want the U.S. Dollar price of the sweater. The
result of our subquery is thus used to manipulate the U.S. Dollar price for the one
row returned from the outer query. However, if the WHERE clause on line 6 were to
Page 15 of 45
allow for multiple products, the result of the subquery would be used to manipulate
the U.S. Dollar price of all products returned. The principle is that the result of a
subquery placed in the column list is used for every row returned from the outer
query.
Placing a subquery correctly is important, but understanding how the SQL engine
executes a subquery empowers us to make the best use subqueries to solve a wider
variety of problems. In our example, the result of the subquery does not depend on
which product price is retrieved. That is, the ratio of the Euro is the ratio of the Euro;
the ratio does not vary if the prices of the products vary. Therefore the SQL engine
executes the subquery on its own to retrieve its result. We could state that a
subquery will be executed before the outer query is executed, but that is not entirely
correct. A subquery may be executed in parallel with the outer query depending
upon the DBMS and its configuration, but we do know that a subquery’s result must
be available before it is used in the expression in the column list. To be more specific,
this type of subquery is termed an uncorrelated subquery, which means that the
subquery does not reference a table or value in the outer query, and that its results
can be retrieved with or without the existence of the outer query. An uncorrelated
subquery can always be extracted and executed as a query in its own right. In fact, a
simple test to determine whether a subquery is correlated or not is to try and
execute it on its own outside of the outer query. The SQL engine executes an
uncorrelated subquery independently of the outer query, before it needs the
subquery’s results.
For completeness, let us format the result so that we see it as a monetary amount. In
Oracle, we would modify the query as the following screenshot illustrates:
Notice that the result is now “€93.00” instead of “93”. Just as in step 10, we use the
to_char function to format our result. What is different than step 10 is the format
string, “FML999.00” and the fact that a parameter list follows. The “FM” in the
format string instructs the SQL engine to display only as many digits are as necessary
(in this example, displaying “93” instead of “093” or the number prefixed with
Page 16 of 45
spaces), the “L” indicates to use the local currency symbol (which is specified as a
parameter in the next argument), the “999” indicates there may be up to three digits
to the left of the decimal point, and the “.00” indicates that there must always be
two digits to the right of the decimal point. The parameter list “NLS_CURRENCY=€”
indicates the local currency symbol is the Euro symbol. In Windows, holding the Alt
key followed by the numbers 0128 on the number pad inserts the Euro symbol.
Alternatively, the function UNISTR('\20ac') can be used to insert the Euro symbol
without the need to hardcode it by using the Alt key combination, as illustrated
below.
In SQL Server, we would modify the query as the following screenshot indicates:
In PostgreSQL we modify the query as so. FM suppresses padding so the symbol
used is next to the numeric value and D is used as the decimal point and pulled from
the local setting.
Page 17 of 45
Just like with Oracle, we hold the Alt key followed by the numbers 0128 on the
number pad inserts the Euro symbol. The result is now €93.00 instead of 93.0000.
We use the format function as illustrated in step 9 with one difference – we use the
“€” symbol instead of the “$” symbol. To avoid hardcoding the Euro symbol as a
character, we could also concatenate the result of the function nchar(8364), as
shown below.
In this step, we place the subquery in the column list of the outer query, but there
are other options for placement. If the situation were to merit it, we could also place
the subquery in the WHERE clause, the FROM clause, the ORDER BY clause, and in
several other locations in a SQL query. We could also place a subquery inside of
Page 18 of 45
another subquery! Where we place a subquery determines the role its results play in
the outer query. We explore additional placements in other steps. Nevertheless, you
already have a taste of the flexibility and power of that subqueries give you.
2. Explain in your own words:
a. why using a subquery is superior to executing two independent queries and
hardcoding the results of the first in the second.
b. what it means for a subquery to be uncorrelated, including how and when an
uncorrelated subquery is executed in the context of the outer query.
3. You are asked to give the price of a flowing skirt in Cancun. You now have the skills to
retrieve this in a single query! Do so now, making sure to format the result as a
currency. It is convention to use either “$” or “Mex$” as the currency symbol for
Mexican Pesos.
a. Capture the results of the query and its execution.
b. Explain how your solution makes use of a subquery to help retrieve the result, and
the advantages of your solution over a solution that uses two queries.
4. We know how convert currencies in our example schema for display purposes, but
what about making a decision based off the conversion? Imagine that a Mexican
customer wants to know the names and prices of all products that cost less than
2,750 Mexican Pesos. We can start by calculating that for all products with the
following query.
SELECT product_name,
price_in_us_dollars *
(SELECT us_dollars_to_currency_ratio
FROM
Currency
WHERE currency_name = 'Mexican Peso') AS price_in_pesos
FROM
Product
We use a subquery in the column list to obtain the ratio for Mexican Pesos in order
to calculate the product price in Mexican Pesos. The results in Oracle, SQL Server and
PostgreSQL respectively, are below.
Page 19 of 45
Page 20 of 45
We have obtained the price of all products, and now we need to restrict the list to
the products costing less than Mex$2,750, which we can do by adding a subquery to
the WHERE clause, as shown below.
1: SELECT
2:
3:
4:
5:
6: FROM
7: WHERE
8:
9:
10:
product_name,
price_in_us_dollars *
(SELECT us_dollars_to_currency_ratio
FROM
Currency
WHERE currency_name = 'Mexican Peso') AS price_in_pesos
Product
price_in_us_dollars *
(SELECT us_dollars_to_currency_ratio
FROM
Currency
WHERE currency_name = 'Mexican Peso') < 2750
The formatted results are shown for Oracle below.
Page 21 of 45
Notice that, because the results are in the thousands, we use three additional 9 digits
with a comma separator in the format string, to properly format the result. We also
use “Mex$” instead of the U.S. Dollar or Euro symbol.
The formatted results for PostgreSQL are shown below.
Page 22 of 45
The formatted results for SQL Server are shown below.
The only formatting change for SQL Server compared to prior steps is that we use
“Mex$” in the format string, and add “0,0” to the left of the decimal point so that
each group of 3 numbers is separated by a comma.
Notice that in the results, each product in the list cost less then Mex$2,750. It is
possible to add a subquery to a WHERE clause in order to make decisions based on
the result of the subquery.
This example illustrates a construct we have seen – a subquery – used in a clause we
have not seen – the WHERE clause. Let us examine the query line by line. Lines 1‐6
need no additional explanation beyond the fact that a subquery is used in the column
list to calculate the price in Mexican Pesos for each product. Lines 7‐10 contain the
same subquery used in lines 3‐5. That subquery is executed before its result must be
used in the WHERE clause, and the subquery’s result takes the place of a literal value.
The results of that subquery are used to restrict (filter) rows in the result, because it
is located in the WHERE clause. Recall that the conditions specified in the WHERE
clause are applied to each row, and rows that do not meet the conditions are
excluded from the result set. In this specific example, products whose prices are not
less than Mex$2,750 are excluded. Stated differently, products whose prices are
greater than or equal to Mex$2,750 are excluded. Subqueries placed in different
Page 23 of 45
clauses are used for different purposes, but the methodology behind how and when
they are executed does not change.
This example illustrates another important point, which is that more than one
subquery can be embedded in a single query. In this example, the first subquery is
used to retrieve the price in Mexican Pesos, and the second subquery is to restrict
the products retrieved. Each subquery has a useful purpose, and the use of one does
not preclude the use of another.
5. Now that you know how to use subqueries in the WHERE clause, let us use your skills
to address a slightly more involved use case. Imagine that a Canadian store manager
wants to know which products are on the cheaper end and which products are on
the more expensive end. The manager asks you to retrieve the names and prices of
all products that are less than 150 Canadian Dollars or more than 300 Canadian
Dollars. Retrieve the results needed for this use case using a single query, and
capture a screenshot of the query and its execution. Make sure to format the results
in Canadian Dollars. The currency symbol for Canadian Dollars is the same as for U.S.
Dollars, the “$” symbol.
6. Deciding where to place a subquery is important, yet deciding how many values to
retrieve in the subquery is equally important. In prior steps our subqueries always
retrieve a single value, but single‐valued subqueries cannot practically address all use
cases. We need, and thankfully have, more options! In this step we examine
retrieving a list of values, and in subsequent steps we examine retrieving results in
tabular form. Using subqueries to address use cases requires skill and necessitates
making many decisions.
Understanding the concept of a single value is straightforward, but what about a list
of values? Simply put, a list of values in a relational database is a tabular construct
that consists of exactly one column with the one or more rows. In contrast, a single
value consists of exactly one column and exactly one row. When we create a
subquery, we decide the maximum number of rows and columns it may retrieve. If a
subquery retrieves one column, then we have the option to retrieve a single value by
ensuring the subquery always retrieves exactly one row, and we have the option to
retrieve a list of values by allowing it to retrieve as many rows as are needed. For
example, the subquery
SELECT last_name FROM Person WHERE person_id = 5
would presumably retrieve a single value, because it restricts the number of rows
retrieved by a single primary key value. But the subquery
SELECT last_name FROM Person WHERE weight_in_pounds < 130
Page 24 of 45
would presumably retrieve a list of values, because there would be many people that
weigh less than 130 pounds. We control whether a subquery retrieves a list of values
or not by how we write it.
Whatever our decision, we must ensure that the outer query uses the correct
construct to handle the result of the subquery based upon the number of values
returned. The equality operator generally is only used to compare single values. For
example, the test “Does the value in column X equal 5?” makes sense, but the test
“Does the value in column X equal the list of numbers 5, 10, and 20?” does not make
sense. Using SQL syntax, we would say that “WHERE X = 5” makes sense, but
“WHERE X = (1, 2, 3, 4)” does not make sense. We can however use the IN
operator instead, “WHERE X IN (5, 10, 20)”. The IN operator tests whether a
single value is found in a list of values. If X is 5 or 10 or 20, then “X IN (5, 10,
20)” is true; otherwise, it is false. Some constructs in SQL only work with single
values, and some work with lists of values, and we must use the correct class of
constructs with each subquery we create.
Subqueries in conjunction with appropriate SQL constructs can oftentimes be used to
address use cases that have distinct and dissimilar parts. Let’s look at one such
example.
Jill travels internationally and is considering purchasing some items from
some of the store locations while on her travels. She wants flexibility in how
she can order and receive the items, so she would like to see the list of
products and prices (in U.S. Dollars) only for store locations that offer more
than one purchase and delivery option.
This use case is a challenge because it has two parts that require different SQL
strategies. In order to determine which locations are suitable for Jill, we need to
count the number of purchase and delivery options offered by each location, which
means we need to aggregate results. However, because aggregation hides line‐by‐
line details in favor of summarized results, we need to avoid it in order to obtain line‐
by‐line product information for each store. So what do we do about this apparent
conflict? You guessed it. Use a subquery! More complex use cases require more
complex SQL strategies.
A good way to craft a query to address a more complex use case is to create one
independent query for each distinct part, then put them together. For this example,
we first create a query that finds the right stores, then create a query that lists the
names and prices of products of all stores, then combine the two.
Determining which store locations offer more than one purchase and delivery option
is solvable by use of a GROUP BY coupled with a HAVING clause as shown below.
Page 25 of 45
1:
2:
3:
4:
5:
SELECT
FROM
JOIN
GROUP BY
HAVING
Store_location.store_location_id, Store_location.store_name
Store_location
Offers ON Offers.store_location_id = Store_location.store_location_id
Store_location.store_location_id, Store_location.store_name
COUNT(Offers.purchase_delivery_offering_id) > 1
On line 4, the GROUP BY groups the result set by the store_location_id and
secondarily by the store_name, and the HAVING limits the results returned to those
stores with more than one purchase and delivery option. Executing this query yields
results similar to the screenshot below in both Oracle, SQL Server and PostgreSQL,
illustrating that the London Extension and the New York Extension both have more
than one purchase and delivery offering.
Listing the products and prices for all stores is straightforward and only requires basic
joins, as illustrated below.
1:
2:
3:
4:
5:
6:
SELECT Store_location.store_name,
Product.product_name,
Product.price_in_us_dollars
FROM
Store_location
JOIN
Sells ON Sells.store_location_id = Store_location.store_location_id
JOIN
Product ON Product.product_id = Sells.product_id
Notice that we join Store_location to Sells to Product, and list out each store’s name,
product name, and product price. The query’s execution is illustrated below, and the
results are truncated for brevity.
Page 26 of 45
Finally, we embed the first query into the second another to retrieve the results we
need, as illustrated below.
1: SELECT
2:
3:
4: FROM
5: JOIN
6: JOIN
7: WHERE
8:
9:
10:
11:
12:
13:
Store_location.store_name,
Product.product_name,
Product.price_in_us_dollars
Store_location
Sells ON Sells.store_location_id = Store_location.store_location_id
Product ON Product.product_id = Sells.product_id
Store_location.store_location_id IN
(SELECT
Store_location.store_location_id
FROM
Store_location
JOIN
Offers
ON Offers.store_location_id = Store_location.store_location_id
GROUP BY Store_location.store_location_id
HAVING
COUNT(Offers.purchase_delivery_offering_id) > 1)
On lines 8‐13, the first query is embedded as a subquery with one change: the name
of the store is not retrieved. This is because when we use the query in a standalone
fashion, we want to see the name of the stores (seeing the store ID alone is not
helpful), but when we embed the query, we want only to retrieve the store IDs so
that the outer query can limit what it retrieves by those IDs. On lines 1‐6 you see the
Page 27 of 45
second query, the one that lists all products, is present without any changes
compared to the original. Line 7 is where this query gets interesting; it is the glue that
causes the two queries to work together. The outer query only retrieves the products
of stores returned by the subquery by ensuring that the outer query’s
store_location_id is in the list of the ids returned by the subquery. The
“Store_location.store_location_id IN” part of line 7 sets up the condition
that the store_location_id must be in the list of values that follow, and the list of
values that follow are determined by the subquery. Think about the last two
sentences until you are sure you understand how these queries work together; it is
essential you understand how to glue two queries together as illustrated in this
example. Combining queries to solve complex use cases is powerful, but also requires
skilled knowledge of SQL constructs and mechanisms.
Execution of the combined query is shown for Oracle below, with the price formatted
as a monetary amount.
Notice that only the products and prices for the London Extension and New York
Extension are list in the results. Execution in SQL Server is shown below, and the
results are the same.
Page 28 of 45
Execution in PostgreSQL is shown below, and the results are the same.
Page 29 of 45
This query gives Jill what she wants! We have successfully addressed Jill’s use case
using a subquery embedded in an outer query, and she now knows about all
products that are sold in locations that offer more than one purchase and delivery
option.
7. Now it is time for another, more complex use case.
Like Jill, Marcus also travels internationally, and he is interested in
considering some products to purchase. Because his employer sends him to
various locations throughout the world with little notice, he only wants to
consider a product if it is available in all store locations, and is not interested
in products that are available in some but not all store locations. This way,
should he decide to purchase a product, he has the assurance that can
purchase it at any of the locations. Lastly, he is interested in viewing the
sizing options for each product that meets his criteria.
Page 30 of 45
This use case can be solved by creating different queries to solve different parts, then
putting them together to address the entire use case. For example, one of the
subqueries can be put into the WHERE clause to limit what is returned. In your
thinking about how to address this use case, one item should be brought to your
attention – the phrase “all store locations”. By eyeballing the data, we can see that
there are 5 locations. Retrieving products that are sold at exactly 5 locations
addresses Marcus’ request at this present time, but even better is to dynamically
determine the total number of locations in the query itself so that it returns correct
results over time, even if the number of locations changes. For example, if one store
location closes, there would be 4 locations instead of 5. Likewise, if a new store
location is opens, there would be 6 locations. Dynamically retrieving the number of
locations is a superior solution.
a. Think about which parts can be solved with independent queries, then identify and
briefly explain them here.
b. Write an independent query for each part you identified, and capture a screenshot
of the execution and results of each query. Explain the role each result will play in the
final query.
c. Write the full query to address the use case by combining your independent
queries into a single, larger query. Capture a screenshot of its execution and results.
d. Explain how you combined the independent queries to get your results, in terms of
what SQL constructs you used, and the mechanics of how they work together.
8. Some SQL constructs work with the single values, some with lists of values, and some
with tables of values. Recall that all queries, subqueries included, return tabular
results in the form of rows and columns. If a subquery returns one column and one
row, the result can be treated as a single value. Likewise, results with one column
and one or more rows can be treated as a list of values. However, there are no
restrictions on the number of rows or columns when tabular results are expected,
since the results are by definition tabular. In particular, the FROM clause always
expects tabular elements, so a subquery can be used in the FROM clause without
regard to the number of columns and rows it retrieves. Constructs that expect tables
of values allow for flexible subquery creation.
Let us review how a subquery placed in the WHERE clause can filter rows in advanced
ways by reviewing the use case provided in step 6.
Jill travels internationally and is considering purchasing some items from
some of the store locations while on her travels. She wants flexibility in how
she can order and receive the items, so she would like to see the list of
Page 31 of 45
products and prices (in U.S. Dollars) only for store locations that offer more
than one purchase and delivery option.
Next, let us review the solution from step 6 which makes use of a subquery in the
WHERE clause.
1: SELECT
2:
3:
4: FROM
5: JOIN
6: JOIN
7: WHERE
8:
9:
10:
11:
12:
13:
Store_location.store_name,
Product.product_name,
Product.price_in_us_dollars
Store_location
Sells ON Sells.store_location_id = Store_location.store_location_id
Product ON Product.product_id = Sells.product_id
Store_location.store_location_id IN
(SELECT
Store_location.store_location_id
FROM
Store_location
JOIN
Offers
ON Offers.store_location_id = Store_location.store_location_id
GROUP BY Store_location.store_location_id
HAVING
COUNT(Offers.purchase_delivery_offering_id) > 1)
We include the Store_location table on line 4 so that we can retrieve the store name
and also match up the locations to the products they sell. We restrict the store
locations retrieved based upon the results in the subquery on lines 8‐13, thus
employing the subquery as a filtering mechanism. The combined query correctly
addresses the use case.
It makes sense that a subquery in the WHERE clause can be employed as an
advanced filtering mechanism since filtering conditions in general are placed in the
WHERE clause, but you may be surprised to know that a subquery in the FROM clause
can serve the same purpose. This is more easily explained by example, so let us look
at an alternative solution to Jill’s use case.
1:
2:
3:
4:
5:
6:
7:
8:
9:
10:
11:
12:
SELECT locations.store_name,
Product.product_name,
Product.price_in_us_dollars
FROM
(SELECT
Store_location.store_location_id,
Store_location.store_name
FROM
Store_location
JOIN
Offers
ON Offers.store_location_id = Store_location.store_location_id
GROUP BY Store_location.store_location_id, Store_location.store_name
HAVING
COUNT(Offers.purchase_delivery_offering_id) > 1) locations
JOIN
Sells ON Sells.store_location_id = locations.store_location_id
JOIN
Product ON Product.product_id = Sells.product_id
Page 32 of 45
Notice that the WHERE clause subquery in the original solution is moved to the
FROM clause, to lines 4‐10. The subquery retrieves the id and name of each store
location that matches Jill’s criteria, which are locations that have more than one
purchase and delivery option. The word “locations” on line 10 is an alias which
provides a name for the subquery’s results. Once defined, the alias can be used as if
it were a table, and that table consists of whatever rows and columns are retrieved
by the subquery. On line 11, the “locations” alias is used as a part of the join
condition, to join the results from the subquery into the Sells table. Because the
subquery only retrieves locations matching Jill’s criteria, the overall query does the
same, and filtering has been achieved in the FROM clause rather than the WHERE
clause. Subqueries placed in the FROM clause can be flexible and powerful
constructs.
The screenshot below shows execution of the query in Oracle with a formatted price.
The same results are retrieved as in the original solution with the exception of row
ordering, which is insignificant.
Page 33 of 45
The screenshot below shows execution of the query in PostgreSQL.
Page 34 of 45
The screenshot below shows execution of the query in SQL Server.
The results are the same as in the original solution. Both screenshots illustrate that
we can successfully filter rows by using a subquery in the FROM clause.
Subqueries in the FROM clause are useful for more than just filtering. The columns
retrieved by the subquery can actually be returned in the outer query directly! Notice
in the new solution, the Store_location table is no longer directly used in the FROM
clause; the results from the subquery actually take the place of using the
Store_location table. On line 1, the store_name column is used directly from the
subquery through use of the “locations” alias, and there is no need to additionally
join into the Store_location table in order to retrieve the name of the store. A
subquery placed in the FROM clause sometimes take the place of a table.
The fact that two solutions address the same use case, one in step 6 and one in this
step, demonstrates that the same results can be retrieved from different queries.
Given that many queries address a particular use case correctly, is one better, and if
so, which one? There is no universal answer. Typically, we choose the query that
performs the best by selecting the one that outperforms the others. If several queries
Page 35 of 45
perform well, we typically choose the one that is the least complex. Sometimes two
or more queries work equally as well, and we just need to select one of them. With
enough experience, we discern one of the better strategies before we write the
query, and only change strategies if the query we write has a problem we did not
foresee. For Jill’s use case, given the small data set in our schema, either solution –
the one in step 19 with a subquery in the WHERE clause, and the one in this step with
a subquery in the FROM clause – works fine. One solution could outperform the
other if we add millions of products into the schema, but even that depends upon
the particular DBMS used and the particular execution plan the DBMS selects. Some
DBMS would discern that the two queries are functionally equivalent, and choose the
same execution plan for both of them. Which solution is better for a particular use
case depends upon many factors.
9. You provided a solution to Marcus’ use case in step 7 by using a subquery in the
WHERE clause. Using a similar methodology as in the prior step 8, create an
alternative solution by using a subquery in the FROM clause.
a. Capture a screenshot of your new query and the results of its execution.
b. Explain in your own words the mechanics of how your new query works to address
Marcus’ use case.
10. Some use cases have a distinct part that requires the existence of a particular item.
Below is one such example.
Adina travels regularly, has already decided she wants to purchase a
Cashmere sweater from one of the store locations, and is considering
purchasing other products as well. For each location that sells Cashmere
sweaters, she wants to see all products and their prices in U.S. Dollars. Then
she can make an informed decision of where the purchase the sweater in
addition to any other products she may want.
Just as in prior steps, we can identify the parts, write queries for them, and then put
them together. One part comes from this sentence fragment in the use case, “she
wants to see all products and their prices in U.S. Dollars”. This is straightforward and
we can steal a query from step 19 to address this part, shown below.
1:
2:
3:
4:
5:
6:
SELECT Store_location.store_name,
Product.product_name,
Product.price_in_us_dollars
FROM
Store_location
JOIN
Sells ON Sells.store_location_id = Store_location.store_location_id
JOIN
Product ON Product.product_id = Sells.product_id
Page 36 of 45
The other part comes from this sentence fragment, “she wants to purchase a
Cashmere sweater from one of the store locations”; this use case has a distinct part
that requires a location to sell Cashmere sweaters in order to be considered. The SQL
for this part is similar to the SQL for the prior part, differing by an extra condition that
accepts only the “Cashmere Sweater” product, and retrieving only columns related to
the Store_location table.
1:
2:
3:
4:
5:
6:
SELECT Store_location.store_location_id,
Store_location.store_name
FROM
Store_location
JOIN
Sells ON Sells.store_location_id = Store_location.store_location_id
JOIN
Product ON Product.product_id = Sells.product_id
AND Product.product_name = 'Cashmere Sweater'
Only store locations that sell Cashmere sweaters are retrieved by this query, as
illustrated in the screenshot from Oracle below.
Notice that the Cancun Extension is excluded because it does not sell Cashmere
sweaters. This is also illustrated in the PostgreSQL screenshot below.
Page 37 of 45
Notice that the Cancun Extension is excluded because it does not sell Cashmere
sweaters. This is also illustrated in the SQL Server screenshot below.
The initial strategy to solve this use case is similar to the initial strategies used to
solve use cases in other steps.
The SQL construct used when combining the queries for each part for this use case,
however, is quite different than those used in other steps. The EXISTS clause is useful
to address use cases such as this that test for the existence of a certain item. Unlike
some SQL constructs that work with a single value, a list of values, or a table of
values, the EXISTS clause only works with a subquery. An EXISTS clause is a Boolean
expression that returns only true and false, true if the subquery returns any rows at
all, and false if the subquery returns no rows. EXISTS does not consider the number
of columns or the column’s datatypes retrieved by the subquery, and even does not
consider whether any values are NULL; rather, EXISTS only tests the existence of at
least one row. So if we were to provide an English description of what EXISTS tests, it
Page 38 of 45
could be “Is any row retrieved from this subquery?” EXISTS is different than most
other SQL constructs because it is designed specifically for subqueries.
At this point we would expect to simply combine the two independent queries with
EXISTS, but this will not get us the results we need for Adina’s use case. Let us try it
so we see what happens with the query below.
1:
2:
3:
4:
5:
6:
7:
8:
9:
10:
11:
SELECT Store_location.store_name,
Product.product_name,
Product.price_in_us_dollars
FROM
Store_location
JOIN
Sells ON Sells.store_location_id = Store_location.store_location_id
JOIN
Product ON Product.product_id = Sells.product_id
WHERE EXISTS (SELECT Store_location.store_location_id, Store_location.store_name
FROM
Store_location
JOIN
Sells ON Sells.store_location_id = Store_location.store_location_id
JOIN
Product ON Product.product_id = Sells.product_id
AND Product.product name = 'Cashmere Sweater')
You will notice that the first query is found on lines 1‐6, and the second is found on
lines 7‐11 embedded inside of the EXISTS clause. The screenshot below shows part of
the items returned in Oracle (there are too many rows to show them all).
And the screenshot below is for PostgreSQL screenshot below.
Page 39 of 45
And the screenshot below is for SQL Server.
Page 40 of 45
You will immediately notice the problem with the results in either screenshot. The
products for Cancun Extension are included in the results, but Adina only wants
products for locations that sell Cashmere sweaters! We can see what is wrong with
the query’s logic by summarizing in English what the query does, “Retrieve all
products and their prices for all locations if any location sells Cashmere sweaters.”
Therein lies the problem. EXISTS only checks for the existence of any row, so if any
location sells a Cashmere sweater, EXISTS indicates a true value, and the products for
all locations are retrieved. Simple combining does not work in this case.
EXISTS usually demands a more complex method of combining the queries for each
part. The subquery usually must be correlated with the outer query to get the results
we want. A correlated subquery references at least one table from the outer query,
which means that conceptually, the subquery is not an independent query. Unlike
subqueries in prior steps, which are termed uncorrelated subqueries, we cannot
execute correlated subqueries on their own; correlated subqueries only make sense
in the context of the outer query into which they are embedded. And unlike
uncorrelated subqueries which are executed once and retrieve results once,
correlated subqueries are executed once for each row in the outer query and
therefore retrieve one result set for each row in the outer query. We can say that
during the execution of an overall query, the results of an uncorrelated subquery are
fixed, and the results of a correlated subquery are relative to each row in the outer
query. EXISTS is typically coupled with a correlated subquery to achieve meaningful
results.
Altering our subquery for Adina’s use case by correlating the subquery gives us the
logic we want.
1:
2:
3:
4:
5:
6:
7:
8:
9:
10:
11:
12:
SELECT Store_location.store_name,
Product.product_name,
Product.price_in_us_dollars
FROM
Store_location
JOIN
Sells ON Sells.store_location_id = Store_location.store_location_id
JOIN
Product ON Product.product_id = Sells.product_id
WHERE EXISTS (SELECT Cashmere_location.store_location_id, Cashmere_location.store_name
FROM
Store_location Cashmere_location
JOIN
Sells ON Sells.store_location_id = Cashmere_location.store_location_id
JOIN
Product ON Product.product_id = Sells.product_id
AND Product.product_name = 'Cashmere Sweater'
WHERE Cashmere_location.store_location_id = Store_location.store_location_id)
Lines 1‐6 are the same as in the first solution, but lines 7‐12 are different. The first
difference you may notice is that there is the alias Cashmere_location for the
Store_location table introduced the subquery. This is necessary to eliminate any
ambiguity between the Store_location table introduced in the outer query on line 4,
and the Store_location introduced in the subquery on line 8. With the alias, it is clear
Page 41 of 45
that a reference to Store_location is a reference to the table introduced in the
outer query, and a reference to Cashmere_location is a reference to the table
introduced in the subquery. We use the identifier Cashmere_location to highlight
the fact that locations in the subquery are only those that sell Cashmere sweaters.
The second difference is found on line 12, WHERE
Cashmere_location.store_location_id =
Store_location.store_location_id. It is this line that correlates the subquery
with the outer query! Notice that the ID of Cashmere_location must equal the ID
of Store_location, and it is this equality that forces the subquery into correlation.
In English, we could summarize the logic of the subquery as follows: “Retrieve the
store location found in the current row of the outer query only if that store location
sells Cashmere sweaters”. This logic, coupled with the EXISTS keyword, means that if
the current row in the outer query does not contain a store location that sells
Cashmere sweaters, it is excluded from the result set. This is exactly what we want!
The modified solution containing a correlated subquery gives us the results we want.
Below is the screenshot for Oracle, which shows enough rows to see that our query
works, but excludes some rows for brevity.
And below is the screenshot for PostgreSQL for the same.
Page 42 of 45
And below is the screenshot for SQL Server for the same.
Page 43 of 45
Notice that Cancun Extension, which does not sell Cashmere sweaters, has been
excluded from the result set. This is exactly what Adina wants! She now has a list of
all products and their prices for store locations that sell Cashmere sweaters.
Since the topic of correlated subqueries is complex, let us summarize the steps we go
through to solve Adina’s use case, and any use case requiring a correlated subquery.
First, we identify the distinct parts of the use case that require different SQL queries
and constructs. Second, we write independent queries that address each part. Third,
we combine the independent queries in such a way that the subquery is correlated to
the outer query. Correlating the subquery involves changing it from an independent
(uncorrelated) subquery to one that references at least one table introduced in the
outer query, thus ensuring the subquery retrieves results based upon the current row
of the outer query. The steps needed to solve all use cases requiring subqueries are
similar, but those that require correlated subqueries necessitate a different method
of combining the queries that make up the parts.
It should be noted that use of EXISTS combined with a correlated subquery is not
limited to addressing use cases that test for the existence of a single item. This EXISTS
combination can be used to test for the existence of mostly any set of conditions. In
fact, many use cases that make use of uncorrelated subqueries in the WHERE clause
can be rewritten to use a correlated subquery with EXISTS. EXISTS may perform
better in some situations, and an uncorrelated subquery may perform better in
others. Knowing how to use EXISTS gives us another tool we can use to help address
more complex use cases.
Lastly, correlated subqueries can be used with other constructs, such as the IN
construct. However, it is difficult to think of use cases where using correlated
subqueries with other constructs makes for the best solution. Correlated subqueries
commonly are coupled with EXISTS.
11. Explain in your own words the differences between a correlated subquery and an
uncorrelated subquery.
12. In step 7, you created a solution to Marcus’ use case, which is reproduced again
below.
Like Jill, Marcus also travels internationally, and he is interested in
considering some products to purchase. Because his employer sends him to
various locations throughout the world with little notice, he only wants to
consider a product if it is available in all store locations, and is not interested
in products that are only available in some but not all store locations. This
way, should he decide to purchase a product, he has the assurance that can
purchase it at any of the locations. Lastly, he is interested in viewing the
sizing options for each product that meets his criteria.
Page 44 of 45
You identified the distinct parts, wrote queries for them, and then combined them to
solve the use case. For this step, modify your solution to make use of the EXISTS
clause and a correlated subquery.
a. Capture a screenshot of the query and the results of its execution.
b. Explain what changes you needed to make to your solution to make use of a
correlated subquery.
c. Explain which solution, the one in step 20 or the one in this step, you believe to be
more appropriate to solve Marcus’ use case, and why. There is no one correct
answer, but it is useful to know why you would choose one solution over the other.
Lab Summary
Congratulations! You have learned how to solve more complex use cases by using
subqueries. You are now better equipped to address the wide variety of demanding
use cases presented to database designers and developers.
Page 45 of 45
Purchase answer to see full
attachment