Join Expressions

On this page Carat arrow pointing down
Warning:
CockroachDB v19.1 is no longer supported. For more details, see the Release Support Policy.

Join expressions, also called "joins", combine the results of two or more table expressions based on conditions on the values of particular columns (i.e., equality columns).

Join expressions define a data source in the FROM sub-clause of simple SELECT clauses, or as parameter to TABLE. Joins are a particular kind of table expression.

Tip:

New in v19.1: The cost-based optimizer supports hint syntax to force the use of a specific join algorithm. For more information, see Join hints.

Synopsis

( joined_table ) table_ref CROSS NATURAL FULL LEFT RIGHT OUTER INNER JOIN table_ref FULL LEFT RIGHT OUTER INNER JOIN table_ref USING ( name , ) ON a_expr

Parameters

Parameter Description
joined_table Another join expression.
table_ref A table expression.
a_expr A scalar expression to use as ON join condition.
name A column name to use as USING join condition

Supported join types

CockroachDB supports the following join types:

Inner joins

Only the rows from the left and right operand that match the condition are returned.

<table expr> [ INNER ] JOIN <table expr> ON <val expr>
<table expr> [ INNER ] JOIN <table expr> USING(<colname>, <colname>, ...)
<table expr> NATURAL [ INNER ] JOIN <table expr>
<table expr> CROSS JOIN <table expr>

Left outer joins

For every left row where there is no match on the right, NULL values are returned for the columns on the right.

<table expr> LEFT [ OUTER ] JOIN <table expr> ON <val expr>
<table expr> LEFT [ OUTER ] JOIN <table expr> USING(<colname>, <colname>, ...)
<table expr> NATURAL LEFT [ OUTER ] JOIN <table expr>

Right outer joins

For every right row where there is no match on the left, NULL values are returned for the columns on the left.

<table expr> RIGHT [ OUTER ] JOIN <table expr> ON <val expr>
<table expr> RIGHT [ OUTER ] JOIN <table expr> USING(<colname>, <colname>, ...)
<table expr> NATURAL RIGHT [ OUTER ] JOIN <table expr>

Full outer joins

For every row on one side of the join where there is no match on the other side, NULL values are returned for the columns on the non-matching side.

<table expr> FULL [ OUTER ] JOIN <table expr> ON <val expr>
<table expr> FULL [ OUTER ] JOIN <table expr> USING(<colname>, <colname>, ...)
<table expr> NATURAL FULL [ OUTER ] JOIN <table expr>

Supported join conditions

CockroachDB supports the following conditions to match rows in a join:

  • No condition with CROSS JOIN: each row on the left is considered to match every row on the right.
  • ON predicates: a Boolean scalar expression is evaluated to determine whether the operand rows match.
  • USING: the named columns are compared pairwise from the left and right rows; left and right rows are considered to match if the columns are equal pairwise.
  • NATURAL: generates an implicit USING condition using all the column names that are present in both the left and right table expressions.
Warning:
NATURAL is supported for compatibility with PostgreSQL; its use in new applications is discouraged because its results can silently change in unpredictable ways when new columns are added to one of the join operands.

Join algorithms

CockroachDB supports the following algorithms for performing a join:

Merge joins

By default, CockroachDB uses merge joins whenever possible because they are more performant than hash joins, computationally and in terms of memory. A merge join requires both tables to be indexed on the equality columns and the indexes must have the same ordering. When these conditions are not met, CockroachDB resorts to the slower hash joins. Merge joins can be used only with distributed query processing.

Merge joins are performed on the indexed columns of two tables as follows:

  1. CockroachDB checks for indexes on the equality columns and that they are ordered the same (i.e., ASC or DESC).
  2. CockroachDB takes one row from each table and compares them.
    • For inner joins:
      • If the rows are equal, CockroachDB returns the rows.
      • If there are multiple matches, the cartesian product of the matches is returned.
      • If the rows are not equal, CockroachDB discards the lower-value row and repeats the process with the next row until all rows are processed.
    • For outer joins:
      • If the rows are equal, CockroachDB returns the rows.
      • If there are multiple matches, the cartesian product of the matches is returned.
      • If the rows are not equal, CockroachDB returns NULL for the non-matching column and repeats the process with the next row until all rows are processed.

Hash joins

If a merge join cannot be used, CockroachDB uses a hash join. Hash joins are computationally expensive and require additional memory.

Hash joins are performed on two tables as follows:

  1. CockroachDB reads both tables and attempts to pick the smaller table.
  2. CockroachDB creates an in-memory hash table on the smaller table. If the hash table is too large, it will spill over to disk storage (which could affect performance).
  3. CockroachDB then scans the large table, looking up each row in the hash table.

Lookup joins

The cost-based optimizer decides when it would be beneficial to use a lookup join. Lookup joins are used when there is a large imbalance in size between the two tables, as it only reads the smaller table and then looks up matches in the larger table. A lookup join requires that the right-hand (i.e., larger) table is indexed on the equality column.

Lookup joins are performed on two tables as follows:

  1. CockroachDB reads each row in the small table.
  2. CockroachDB then scans (or "looks up") the larger table for matches to the smaller table and outputs the matching rows.

You can override the use of lookup joins using join hints.

Performance best practices

Note:
CockroachDBs is currently undergoing major changes to evolve and improve the performance of queries using joins. The restrictions and workarounds listed in this section will be lifted or made unnecessary over time.
  • Joins over interleaved tables are usually (but not always) processed more effectively than over non-interleaved tables.
  • When no indexes can be used to satisfy a join, CockroachDB may load all the rows in memory that satisfy the condition one of the join operands before starting to return result rows. This may cause joins to fail if the join condition or other WHERE clauses are insufficiently selective.
  • Outer joins (i.e., left outer joins, right outer joins, and full outer joins) are generally processed less efficiently than inner joins. Use inner joins whenever possible. Full outer joins are the least optimized.
  • Use EXPLAIN over queries containing joins to verify that indexes are used.
  • Use indexes for faster joins.

See also


Yes No
On this page

Yes No