Set Operators in Postgres

Postgres offers set operators that combine the results of two or more SELECT statements into a single result set. These operators are UNION, UNION ALL, INTERSECT, and EXCEPT. Each can be used to construct queries across multiple tables and filter for the specific data that you need.

To return the combined results of two SELECT statements, we use the UNION set operator. This operator removes duplicates from the combined result, listing only one row for each duplicated result. To keep the duplicates, use the UNION ALL set operator instead. The INTERSECT set operator lists only the rows returned by both SELECT queries; conversely, the EXCEPT set operator returns the rows of the first SELECT query that do not appear in the second. Like UNION, INTERSECT and EXCEPT return results without duplicates.
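As a quick illustration of the four operators, here is a minimal sketch that uses inline VALUES lists instead of real tables. The two lists are made-up sample data, and the row order in each result is not guaranteed.

```sql
-- First list: 1, 2, 2    Second list: 2, 3

VALUES (1), (2), (2) UNION     VALUES (2), (3);  -- rows: 1, 2, 3
VALUES (1), (2), (2) UNION ALL VALUES (2), (3);  -- rows: 1, 2, 2, 2, 3
VALUES (1), (2), (2) INTERSECT VALUES (2), (3);  -- rows: 2
VALUES (1), (2), (2) EXCEPT    VALUES (2), (3);  -- rows: 1
```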

UNION and EXCEPT share the same precedence and are evaluated from left to right, while INTERSECT binds more tightly than either of them. Parentheses override this default order, so they can be used to control which operation is evaluated first.
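The effect of precedence can be seen with a small sketch built from constant SELECTs (the values are arbitrary and chosen only to make the difference visible):

```sql
-- INTERSECT is evaluated first, so this reads as
-- SELECT 1 UNION (SELECT 2 INTERSECT SELECT 3)
SELECT 1 AS n
UNION
SELECT 2
INTERSECT
SELECT 3;
-- rows: 1

-- Parentheses force the UNION to be evaluated first
(SELECT 1 AS n
 UNION
 SELECT 2)
INTERSECT
SELECT 3;
-- no rows
```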

Both queries must return the same number of columns, and the data types of corresponding columns must be compatible. Postgres converts corresponding columns to a common type only when the types belong to the same category, for example INT and NUMERIC. If a column in the first query is of type INT and the corresponding column in the second query is of type CHAR, Postgres will not perform an implicit conversion; instead, it will raise a type error.
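The following sketch shows how column types are matched, using constant values with explicit casts. The column alias val is only illustrative, and the exact wording of the error message may differ between Postgres versions.

```sql
-- integer and numeric are in the same type category, so Postgres
-- resolves the output column to numeric
SELECT pg_typeof(val) AS resolved_type
FROM (
    SELECT 1::integer AS val
    UNION
    SELECT 2.5::numeric
) AS t;

-- integer and char are in different categories, so this fails with an
-- error similar to: UNION types integer and character cannot be matched
SELECT 1::integer AS val
UNION
SELECT 'a'::char(1);

-- An explicit cast on one side lets the query run
SELECT 1::integer::text AS val
UNION
SELECT 'a'::char(1);
```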

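To sort the result set, add a single ORDER BY clause at the end of the query; it applies to the combined result and can refer to output columns by name or position. An individual SELECT inside a set operation can only carry its own ORDER BY if that SELECT is wrapped in parentheses.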

The order of the queries in UNION and INTERSECT operations is not important and doesn’t change the rows in the final result: UNION and INTERSECT are commutative. EXCEPT is not, so swapping its two queries generally changes the result.

UNION ALL generally performs better than UNION because no resources are spent identifying and removing duplicates, a step that typically requires sorting or hashing the combined rows.

  • It is possible to use set operators as part of subqueries.
  • The restriction on SELECT statements containing TABLE collection expressions comes from Oracle; Postgres has no such construct, so the rule does not apply here.
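
As a minimal sketch of a set operator inside a subquery, the query below counts the distinct titles produced by a UNION. The titles are inline constants and the alias combined is arbitrary.

```sql
-- The UNION in the subquery removes the duplicate 'Shooter' row
-- before the outer query counts the rows
SELECT count(*) AS distinct_titles
FROM (
    SELECT 'Shooter' AS title
    UNION
    SELECT 'Outpost'
    UNION
    SELECT 'Shooter'
) AS combined;
-- distinct_titles: 2
```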

The examples that follow use two sample tables:

DROP TABLE IF EXISTS top_rated_films;

CREATE TABLE top_rated_films(
    title VARCHAR NOT NULL,
    release_year SMALLINT
);

DROP TABLE IF EXISTS most_popular_films;

CREATE TABLE most_popular_films(
    title VARCHAR NOT NULL,
    release_year SMALLINT
);

INSERT INTO top_rated_films(title, release_year)
VALUES
('Fast and furious 7', 2011),
('The redemption', 2015),
('Shooter', 2017);

INSERT INTO most_popular_films(title, release_year)
VALUES
('American sniper', 2015),
('Doulou continent', 2018),
('Outpost', 2019),
('Shooter', 2017);

Combine Results from Multiple SELECT Queries Using UNION

When the UNION operator joins multiple SELECT queries, Postgres displays the combined results of all of them after removing duplicate rows. The order of the returned rows is not guaranteed unless the query ends with an ORDER BY clause:

SELECT * FROM top_rated_films
UNION
SELECT * FROM most_popular_films;

Note: the selected columns must have compatible data types; otherwise, Postgres raises a type error.
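
One way to trigger such an error with the sample tables is to list the columns in a different order on each side. This is only an illustrative sketch, and the exact error text may vary by Postgres version.

```sql
-- The first column is VARCHAR in one query and SMALLINT in the other,
-- so Postgres cannot match them and reports an error similar to:
-- UNION types character varying and smallint cannot be matched
SELECT title, release_year FROM top_rated_films
UNION
SELECT release_year, title FROM most_popular_films;
```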

The Difference Between UNION and UNION ALL

The UNION and UNION ALL operators are similar. The difference is that UNION ALL returns the result set without removing duplicates, so it skips the extra work that duplicate removal requires.

Compare the following query with the one in the UNION section above. Because duplicates are not removed, the row for 'Shooter' appears twice in the output:

SELECT * FROM top_rated_films
UNION ALL
SELECT * FROM most_popular_films;
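
To see where the extra work in UNION comes from, the two query plans can be compared with EXPLAIN. This is a sketch only: the plan nodes Postgres picks depend on the server version, settings, and table statistics, so no output is shown here.

```sql
-- Plan for UNION: expect a duplicate-removal step on top of the
-- node that appends the two scans
EXPLAIN
SELECT * FROM top_rated_films
UNION
SELECT * FROM most_popular_films;

-- Plan for UNION ALL: the two scans are simply appended
EXPLAIN
SELECT * FROM top_rated_films
UNION ALL
SELECT * FROM most_popular_films;
```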

Find Overlapping Data from Multiple SELECT Queries Using INTERSECT

To display only the rows that are common to both SELECT statements, with no duplicates, use the INTERSECT operator. The rows are returned in no guaranteed order unless ORDER BY is added.
In the query below, 'Shooter' is returned because it appears in both the top-rated and most-popular film tables:

SELECT * FROM most_popular_films
INTERSECT
SELECT * FROM top_rated_films;

Returning Unique Rows with EXCEPT in Postgres

To display rows that are present in the first query but absent from the second one, the EXCEPT operator is used. It returns no duplicates, and the order of the rows is not guaranteed unless ORDER BY is added.

SELECT * FROM most_popular_films
EXCEPT
SELECT * FROM top_rated_films;
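
Because EXCEPT depends on which query comes first, swapping the two queries gives a different result. The sketch below reverses the query above using the same sample tables:

```sql
-- Rows in top_rated_films that are not in most_popular_films
SELECT * FROM top_rated_films
EXCEPT
SELECT * FROM most_popular_films;
-- rows: ('Fast and furious 7', 2011) and ('The redemption', 2015)
```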

Using the ORDER BY Clause in Set Operations

In a query containing a compound SELECT statement, an unparenthesized ORDER BY clause can only appear once, at the end, where it sorts the combined result. An individual SELECT statement can contain its own ORDER BY clause only if it is enclosed in parentheses. The output column names come from the first SELECT query, so the final ORDER BY must refer to those names or to column positions. Sorting by column position is often recommended because it does not depend on how the columns are named.

SELECT * FROM most_popular_films
UNION ALL
SELECT * FROM top_rated_films
ORDER BY title;
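
The same sample tables can be used to sketch two variations: sorting by column position, and giving each operand its own ORDER BY and LIMIT by wrapping it in parentheses. The LIMIT of 2 is an arbitrary choice for illustration.

```sql
-- Sort the combined result by the second column (release_year),
-- newest first, then by the first column (title)
SELECT title, release_year FROM most_popular_films
UNION ALL
SELECT title, release_year FROM top_rated_films
ORDER BY 2 DESC, 1;

-- Parenthesized operands can carry their own ORDER BY and LIMIT:
-- the two newest films from each table
(SELECT title, release_year FROM most_popular_films
 ORDER BY release_year DESC LIMIT 2)
UNION ALL
(SELECT title, release_year FROM top_rated_films
 ORDER BY release_year DESC LIMIT 2);
```

Note that the second query does not sort the combined result; a final ORDER BY after the last parenthesis would be needed for that.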

Resource: https://arctype.com/blog/postgres-set-operators/
