Performance Tuning Recipes

On this page Carat arrow pointing down
Warning:
Cockroach Labs will stop providing Assistance Support for v22.2 on June 5, 2024. Prior to that date, upgrade to a more recent version to continue receiving support. For more details, see the Release Support Policy.

This page provides recipes for fixing performance issues in your applications.

Problems

This section describes how to use CockroachDB commands and dashboards to identify performance problems in your applications.

Observation Diagnosis Solution
  • Your application takes a long time to return results.
  • Excess network latency.
  • Your application is experiencing degraded performance with the following transaction retry errors:
    • SQLSTATE: 40001
    • RETRY_WRITE_TOO_OLD
    • RETRY_SERIALIZABLE
  • The SQL Statement Contention dashboard in the DB Console is showing spikes over time.
  • The SQL Statement Errors graph in the DB Console is showing spikes in retries over time.
  • The statement plan produced by EXPLAIN or EXPLAIN ANALYZE indicates that the statement uses a full table scan.
  • Querying the crdb_internal.node_statement_statistics table indicates that you have full table scans in some statement's plans.
  • Viewing the statement plan on the Statement Fingerprint page in the DB Console indicates that the plan contains full table scans.
  • Running the SHOW FULL TABLE SCANS statement returns results.
  • The Full Table/Index Scans graph in the DB Console is showing spikes over time.
  • Poor quality statement plans retrieve more rows than are required, leading to longer execution times.
  • The Hardware metrics dashboard in the DB Console shows high resource usage per node.
  • The Problem Ranges report on the Advanced Debug page in the DB Console indicates a high number of queries per second on a subset of ranges or nodes.
  • You have resource contention.
  • The Overview dashboard in the DB Console shows high service latency and QPS for INSERT and UPDATE statements.
  • Your tables have long write times.
  • You experience high latency on queries that cannot be explained by high contention or a suboptimal query plan. You might also see high CPU on one or more nodes.
  • You may be scanning over large numbers of MVCC versions. This is similar to how a full table scan can be slow.

Solutions

This section provides solutions for common performance issues in your applications.

Transaction contention

Transaction contention occurs when transactions issued from multiple clients at the same time operate on the same data. This can cause transactions to wait on each other (like when many people try to check out with the same cashier at a store) and decrease performance.

Indicators that your application is experiencing transaction contention

Fix transaction contention problems

Find the transactions and statements within the transactions that are experiencing contention. CockroachDB has several tools to help you track down such transactions and statements:

After you identify the transactions or statements that are causing contention, follow the steps in the next section to avoid contention.

Note:

If you experience a hanging or stuck query that is not showing up in the list of contended transactions and statements on the Transactions or Statements pages in the DB Console, the process described above will not work. You will need to follow the process described in Hanging or stuck queries instead.

Statements with full table scans

Full table scans often result in poor statement performance.

Indicators that your application has statements with full table scans

  • The following query returns statements with full table scans in their statement plan:

    icon/buttons/copy
    SHOW FULL TABLE SCANS;
    
  • The following query against the crdb_internal.node_statement_statistics table returns results:

    icon/buttons/copy
    SELECT count(*) as total_full_scans
    FROM crdb_internal.node_statement_statistics
    WHERE full_scan = true;
    
  • Viewing the statement plan on the Statement Fingerprint page in the DB Console indicates that the plan contains full table scans.

  • The statement plans returned by the EXPLAIN and EXPLAIN ANALYZE commands indicate that there are full table scans.

  • The Full Table/Index Scans graph in the DB Console is showing spikes over time.

Fix full table scans in statements

Not every full table scan is an indicator of poor performance. The cost-based optimizer may decide on a full table scan when other index or join scans would result in longer execution time.

Examine the statements that result in full table scans and consider adding secondary indexes.

In the DB Console, visit the Schema Insights tab on the Insights page and check if there are any insights to create missing indexes. These missing index recommendations are generated based on slow statement execution. A missing index may cause a statement to have a suboptimal plan. If the execution was slow, based on the insights threshold, then it's likely the create index recommendation is valid. If the plan had a full table scan, it's likely that it should be removed with an index.

Also see Table scans best practices.

Suboptimal primary keys

Indicators that your tables are using suboptimal primary keys

  • The Hardware metrics dashboard in the DB Console shows high resource usage per node.
  • The Problem Ranges report on the Advanced Debug page in the DB Console indicates a high number of queries per second on a subset of ranges or nodes.

Fix suboptimal primary keys

Evaluate the schema of your table to see if you can redistribute data more evenly across multiple ranges. Specifically, make sure you have followed best practices when selecting your primary key.

If your application with a small dataset (for example, a dataset that contains few index key values) is experiencing resource contention, consider splitting your tables and indexes to distribute ranges across multiple nodes to reduce resource contention.

Slow writes

Indicators that your tables are experiencing slow writes

If the Overview dashboard in the DB Console shows high service latency when the QPS of INSERT and UPDATE statements is high, your tables are experiencing slow writes.

Fix slow writes

Secondary indexes can improve application read performance. However, there is overhead in maintaining secondary indexes that can affect your write performance. You should profile your tables periodically to determine whether an index is worth the overhead. To identify infrequently accessed indexes that could be candidates to drop, do one of the following:

  • In the DB Console, visit the Schema Insights tab on the Insights page and check if there are any insights to drop unused indexes.
  • In the DB Console, visit the Databases page and check databases and tables for Index Recommendations to drop unused indexes.
  • Run a join query against the crdb_internal.index_usage_statistics and crdb_internal.table_indexes tables:

    icon/buttons/copy
    SELECT ti.descriptor_name as table_name, ti.index_name, total_reads, last_read
    FROM crdb_internal.index_usage_statistics AS us
    JOIN crdb_internal.table_indexes ti
    ON us.index_id = ti.index_id
    AND us.table_id = ti.descriptor_id
    ORDER BY total_reads ASC;
    
                  table_name     |                  index_name                   | total_reads |           last_read
    -----------------------------+-----------------------------------------------+-------------+--------------------------------
      vehicle_location_histories | vehicle_location_histories_pkey               |           1 | 2021-09-28 22:59:03.324398+00
      rides                      | rides_auto_index_fk_city_ref_users            |           1 | 2021-09-28 22:59:01.500962+00
      rides                      | rides_auto_index_fk_vehicle_city_ref_vehicles |           1 | 2021-09-28 22:59:02.470526+00
      user_promo_codes           | user_promo_codes_pkey                         |         456 | 2021-09-29 00:01:17.063418+00
      promo_codes                | promo_codes_pkey                              |         910 | 2021-09-29 00:01:17.062319+00
      vehicles                   | vehicles_pkey                                 |        3591 | 2021-09-29 00:01:18.261658+00
      users                      | users_pkey                                    |        5401 | 2021-09-29 00:01:18.260198+00
      rides                      | rides_pkey                                    |       45658 | 2021-09-29 00:01:18.258208+00
      vehicles                   | vehicles_auto_index_fk_city_ref_users         |       87119 | 2021-09-29 00:01:19.071476+00
    (9 rows)
    

    Use the values in the total_reads and last_read columns to identify indexes that have low usage or are stale and can be dropped.

Too many MVCC values

Indicators that your tables have too many MVCC values

In the Databases page in the DB Console, the Tables view shows the percentage of live data for each table. For example:

Table live data

In this example, at 37.3% the vehicles table would be considered to have a low percentage of live data. In the worst cases, the percentage can be 0%.

A low percentage of live data can cause statements to scan more data (MVCC values) than required, which can reduce performance.

Configure CockroachDB to purge MVCC values

Reduce the gc.ttlseconds zone configuration of the table as much as possible.

See also

If you aren't sure whether SQL query performance needs to be improved, see Identify slow queries.


Yes No
On this page

Yes No